TAX-I: How we predicted the outcomes of Dutch tax cases with an average performance score of 70%
And what could it mean for the legal industry?
Author: Marc Derksen
Feb 2021 - 3 min read
In January 2021, our data science team at Deloitte’s tax-i collected over 16,000 Dutch tax cases from Rechtspraak.nl and built a model that predicts, based on the facts of the case, whether the taxpayer or the tax authorities will win an appeal. We tested the model by predicting the outcomes of 1,218 tax cases that Rechtspraak.nl published in 2020.
It turned out that the model was able to predict correctly with a yearly average score of 70% (Figure 1).
What could this mean for the legal industry?
Let’s start by describing our methods. First, we needed to label each case as either “Tax Authority Wins” or “Tax Payer Wins”. We were able to do this for over 16,000 cases predating 2020 by applying text extraction methods informed by subject matter experts. Next, we had to extract the factual information (text, court, previous decisions, if any) of each case without contaminating the data with references to the dictum or other legal considerations by the judge, since those would reveal the outcome to the model (which would be considered cheating). In the last step, we trained an XGBoost model on this training data and then applied it to the facts of the 2020 tax cases.
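The data preparation described above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the TaxCase structure, the section names and the toy cases are all hypothetical, and the XGBoost training step is left out. It shows only the two safeguards from the text: training on cases predating 2020 while testing on cases from 2020, and keeping only factual sections so that the dictum cannot leak into the input.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical section names; the article does not specify the real case structure.
FACT_SECTIONS = ("facts", "court", "previous_decisions")


@dataclass
class TaxCase:
    published: date
    sections: dict  # section name -> text
    label: str      # "Tax Authority Wins" or "Tax Payer Wins"


def extract_features(case):
    """Keep only factual sections; the dictum and the judge's considerations are dropped."""
    return {name: case.sections[name] for name in FACT_SECTIONS if name in case.sections}


def temporal_split(cases, test_year=2020):
    """Train on cases published before test_year, test on cases published in test_year."""
    train = [c for c in cases if c.published.year < test_year]
    test = [c for c in cases if c.published.year == test_year]
    return train, test


# Toy cases, invented purely for illustration.
cases = [
    TaxCase(date(2018, 5, 1),
            {"facts": "invoice dispute", "court": "district court", "dictum": "appeal unfounded"},
            "Tax Authority Wins"),
    TaxCase(date(2019, 9, 12),
            {"facts": "late filing", "court": "court of appeal", "dictum": "appeal upheld"},
            "Tax Payer Wins"),
    TaxCase(date(2020, 3, 3),
            {"facts": "company car", "court": "district court",
             "considerations": "the court finds that", "dictum": "appeal unfounded"},
            "Tax Authority Wins"),
]

train, test = temporal_split(cases)
X_train = [extract_features(c) for c in train]
y_train = [c.label for c in train]
X_test = [extract_features(c) for c in test]
```

A classifier would then be fitted on X_train and y_train and evaluated on X_test, so that the model never sees text that states the outcome.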
It is noteworthy that we did not make a distinction between different tax types in the training data. This is a model with generalized understanding of tax law across multiple domains, including but not limited to value-added tax, corporate income tax and wage tax.
From a tax-technical perspective, one might question the use of information from a value-added tax case to predict the outcome of a corporate income tax case. However, since the performance is quite good, one could also argue that the model seems to capture general rules, not specific to any one tax type, that are relevant for cases of various tax types.
Unfortunately, no measurement of human predictive performance on this dataset currently exists that could serve as a benchmark for this model. However, several studies seem to show that the predictive accuracy of legal experts often does not exceed 66%.
Which brings us to our next point, usability.
The main obstacle to usability is interpretability, as machine learning models are largely seen as black boxes. However, using explainable A.I. (“ExAI”) we can shine a light into the black box and understand how the model arrives at a particular decision. Our ExAI makes use of SHAP values, visible in Figure 3, below.
In short, SHAP values are used to explain the prediction of an instance by computing the contribution of each feature to the prediction.
Figure 3: SHAP value chart showing the 20 features with the highest impact on the model output. In this specific example there are only textual features (T_) in the top 20. Depending on the prediction, court (C_) or other features could also be part of the SHAP chart.
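To make the idea of a feature's contribution concrete, the sketch below computes exact Shapley values by brute force for a toy scoring function. It is an illustration only and not the method behind Figure 3: the toy model, its coefficients and the feature names (borrowing the T_ and C_ prefixes from the figure) are assumptions, and practical SHAP tooling relies on much faster algorithms than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: the weighted average marginal contribution of each
    feature over all subsets of the other features. Features outside a subset
    are set to their baseline value."""
    names = list(instance)
    n = len(names)

    def value(subset):
        x = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return predict(x)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        result[name] = total
    return result


def toy_model(x):
    """Hypothetical score with one interaction term; not the real tax model."""
    return (0.5 + 0.2 * x["T_invoice"] - 0.1 * x["T_deadline"]
            + 0.05 * x["T_invoice"] * x["C_supreme"])


instance = {"T_invoice": 1, "T_deadline": 1, "C_supreme": 1}
baseline = {"T_invoice": 0, "T_deadline": 0, "C_supreme": 0}

phi = shapley_values(toy_model, instance, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

The contributions add up to the difference between the prediction for this instance and the prediction for the baseline, which is what allows a chart like Figure 3 to be read as a breakdown of a single prediction.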
Legal decision support
As explainability technology develops further, we will see the increased adoption of these types of models. One could suggest that the mere increase in predictive power is already a reason to turn to such models. This could typically be of interest to actors that deal with a very high volume of cases, like litigation financers or legal aid insurers. These kinds of models could for example support much of the underwriting business for these actors.
As we come to understand the possibilities and restrictions of machine learning and natural language understanding/processing, we enable ourselves to better identify viable use cases for the legal domain. These experiments pave the way for legal decision support solutions that will increase legal productivity, improve access to justice and create new insights in the legal system.
The Supreme Court Forecasting Project: Legal and Political Science Approaches to Predicting Supreme Court Decisionmaking, Ruger et al., 2004