Accountability of Artificial Intelligence: how to see what is inside the black box?
Organizations are increasingly adopting machine learning and Artificial Intelligence (AI) to enhance existing business processes and develop new products and services. Consequently, more and more decisions that affect our daily lives are in the hands of algorithms—decisions like who to recruit for a new job opening, who to grant credit to, or how to plan a certain medical treatment.
Stefan van Duin - 7 May 2018
This development has raised concerns over the fairness of machine learning and AI, given that the decisions implied by algorithms are the consequence of complex models that use heterogeneous data sources and are therefore not readily interpretable. Because of this, the systematic auditing of algorithms by third parties may well be desirable to ensure that they operate appropriately—especially when it involves delicate issues, as discussed below.
Considerable progress has been made to open the proverbial black box of machine learning to improve transparency—promising examples are open-source projects like ELI5 and LIME (both available for Python), which can reveal which specific elements in a photo or which words in a piece of unstructured text are decisive in a black-box model's decision. Although these tools are convenient for assessing what drives an algorithm's decisions, they do not resolve a source of unfairness that is apparent in virtually all real-life applications of AI, namely: training data bias.
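For intuition, the core idea these explainers build on can be sketched with a simple leave-one-out probe: perturb the input and watch how the black-box score changes. The classifier below is a hypothetical stand-in, not the actual LIME API:

```python
# Toy illustration of the intuition behind explainers such as LIME:
# probe a black-box text classifier by perturbing the input and
# recording how the predicted score changes per word.

def black_box_score(text):
    """Hypothetical black-box model: scores how 'positive' a review is."""
    weights = {"excellent": 0.6, "great": 0.4, "terrible": -0.7, "boring": -0.3}
    return sum(weights.get(word, 0.0) for word in text.lower().split())

def word_importance(text):
    """Leave-one-out probe: drop each word and record the score change."""
    base = black_box_score(text)
    words = text.split()
    importance = {}
    for i, word in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        importance[word] = base - black_box_score(perturbed)
    return importance

scores = word_importance("An excellent but slightly boring film")
# Words with the largest absolute importance drive the decision.
print(sorted(scores.items(), key=lambda kv: -abs(kv[1])))
```

LIME itself goes a step further by fitting a small interpretable model on many such perturbations, but the principle is the same: explain a single prediction by probing the black box locally.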
Training Data Bias
Training data bias is a collective term used to indicate that a dataset provides a distorted view of reality and can therefore lead to undesirable outcomes when used for decision-making. Often training data bias originates from prejudices of prior decision-makers, from biases that have historically been embedded in our society, or because the training data does not represent the population appropriately (so-called selection bias).
In a recent study, for example, it was found that due to selection bias, the performance of gender classification software varies tremendously across ethnicities and genders. Amongst darker-skinned females, misclassification rates spiked to around 34%, whereas lighter-skinned males were misclassified in only 0.8% of all cases. Evidently, it is of great importance to prevent such anomalies from occurring in, for example, medical applications, where AI should conform to the strict ethical guidelines applicable in that domain.
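Detecting such disparities requires little more than breaking a model's error rate down by subgroup during evaluation. A minimal sketch (the evaluation records below are made up for illustration):

```python
from collections import defaultdict

def error_rates_by_group(records):
    """Misclassification rate per subgroup.
    records: iterable of (group, true_label, predicted_label)."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for group, truth, pred in records:
        totals[group] += 1
        errors[group] += int(truth != pred)
    return {g: errors[g] / totals[g] for g in totals}

# Illustrative (made-up) evaluation records: (subgroup, truth, prediction)
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 0), ("B", 0, 1),
]
print(error_rates_by_group(records))  # {'A': 0.0, 'B': 0.75}
```

A model that looks accurate in aggregate can hide exactly this kind of gap, which is why the breakdown belongs in any evaluation pipeline.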
In another case, in an attempt to give meaning to words, one of the most widely used deep learning algorithms for natural language processing relates “man” to “computer programmer” in much the same way as it relates “woman” to “homemaker”. The fact that the algorithm was trained on millions of real-life news articles reveals that such biases are embedded in society at large and that, when the algorithm is used for future decision-making, they are unintentionally incorporated into business processes.
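The analogy arithmetic behind this finding can be illustrated with hand-crafted toy vectors. Note the coordinates below are fabricated purely for illustration; real word embeddings have hundreds of dimensions learned from text:

```python
import math

# Hand-crafted 2-d toy vectors (fabricated for illustration; real
# embeddings such as word2vec are learned from large text corpora).
vecs = {
    "man":        [1.0, 0.2],
    "woman":      [1.0, 0.8],
    "programmer": [0.6, 0.2],
    "doctor":     [0.6, 0.5],
    "homemaker":  [0.6, 0.8],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vecs[a], vecs[b], vecs[c])]
    candidates = [w for w in vecs if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vecs[w], target))

# "man" is to "programmer" as "woman" is to ...?
print(analogy("programmer", "man", "woman"))  # homemaker
```

Because the toy "programmer" and "homemaker" vectors differ along the same axis as "man" and "woman", the analogy completes to "homemaker"—exactly the kind of association the study found in embeddings trained on news text.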
Accounting for Unfairness through Bias: Disparate Impact
Many precautions can and should be undertaken to mitigate the risk of inducing unfairness through biased (training) data, such as verifying that the distribution of the training data is consistent with that of the population, benchmarking algorithms in standardized scenarios, and analyzing the variations in performance across subpopulations.
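The first of these precautions, checking the training data's composition against known population shares, can be sketched in a few lines. The group names, counts, and the 5% tolerance below are illustrative assumptions:

```python
def distribution_check(train_counts, population_shares, tolerance=0.05):
    """Flag groups whose share in the training data deviates from the
    (assumed known) population share by more than `tolerance`."""
    total = sum(train_counts.values())
    flagged = {}
    for group, pop_share in population_shares.items():
        train_share = train_counts.get(group, 0) / total
        if abs(train_share - pop_share) > tolerance:
            flagged[group] = (round(train_share, 3), pop_share)
    return flagged

# Illustrative counts: the training set underrepresents group "C".
train_counts = {"A": 490, "B": 470, "C": 40}
population_shares = {"A": 0.45, "B": 0.45, "C": 0.10}
print(distribution_check(train_counts, population_shares))  # {'C': (0.04, 0.1)}
```

In practice one would use a formal statistical test rather than a fixed tolerance, but even this simple check would have flagged the selection bias in the gender classification example above.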
Alternatively, a quantitative measure of (unintentional) bias that is particularly simple to compute and interpret is the “80 percent rule” advocated by the US Equal Employment Opportunity Commission. Informally speaking, this rule qualifies an algorithm as having disparate impact if its decisions are disproportionately negative for people who possess certain sensitive attributes, such as gender. We will explain this notion by means of an example and emphasize that the “80 percent rule” can readily be generalized to the “X percent rule”.
Let’s say that a black-box algorithm has been put in place to make credit-granting decisions for consumer loan applications, and that historically credit has often been declined to applicants from a certain neighborhood that is considered ‘bad’. Suppose the credit-granting institution intends to grant loans based on an individual’s financial position and earning capacity, and that someone’s place of residence is therefore not used as an explanatory variable. However, it could well be that the algorithm relies on data that correlates with place of residence, such as credit card transactions or profession, thereby implicitly encoding place of residence in the decision. To quantify the reliance on place of residence (or any other sensitive attribute that is not explicitly included in the model), the “80 percent rule” would state that the algorithm has disparate impact if the following holds:
P(loan granted | ‘bad’ neighborhood) / P(loan granted | other neighborhood) < 0.8

Hence, the ratio of the probability of a positive outcome for an individual with a specific sensitive attribute to that for an individual without it may not be disproportionately small.
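Computed on historical decision data, the rule reduces to comparing two approval rates. A minimal sketch, using made-up credit decisions:

```python
def disparate_impact_ratio(outcomes):
    """Ratio of positive-outcome rates between the protected and the
    unprotected group. outcomes: iterable of (is_protected, is_positive)."""
    pos = {True: 0, False: 0}
    tot = {True: 0, False: 0}
    for protected, positive in outcomes:
        tot[protected] += 1
        pos[protected] += int(positive)
    rate_protected = pos[True] / tot[True]
    rate_unprotected = pos[False] / tot[False]
    return rate_protected / rate_unprotected

def has_disparate_impact(outcomes, threshold=0.8):
    """'X percent rule': flag the model if the ratio falls below X/100."""
    return disparate_impact_ratio(outcomes) < threshold

# Illustrative (made-up) decisions: (from the 'bad' neighborhood?, granted?)
decisions = (
    [(True, True)] * 30 + [(True, False)] * 70      # 30% approval
    + [(False, True)] * 60 + [(False, False)] * 40  # 60% approval
)
print(disparate_impact_ratio(decisions))  # 0.3 / 0.6 = 0.5
print(has_disparate_impact(decisions))    # True: 0.5 < 0.8
```

Changing the `threshold` parameter gives the generalized “X percent rule” mentioned above.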
The benefit of using the “X percent rule” is that it easily accommodates various levels of fairness and that it is readily available and easy to compute for any black-box classification model. As such, this rule allows one to certify that an algorithm’s decisions do not have disparate impact based on certain sensitive attributes. By doing this systematically, we improve the accountability of algorithms and ensure that the implied actions are in harmony with applicable legislation, values, and principles. In case disparate impact turns out to be a problem, methodology exists to modify either the dataset or the algorithm to correct for it; the technicalities involved, however, are beyond the scope of this article.
The potential of AI is tremendous, given the levels of sophistication and scalability that we are currently witnessing. To live up to that promise in a sustainable manner, considerable effort is required to ensure that complex algorithms operate in ways that are beneficial to both their owners and their users—systematic auditing of algorithms has the potential to provide the accountability required to warrant this.
About the authors
Ruben van de Geer is a PhD candidate at the Department of Mathematics and is affiliated with the Amsterdam Center for Business Analytics, a cooperation between academia and industry in which Deloitte and Vrije Universiteit participate. He studied “Econometrics and Financial Mathematics” and “Operations Research and Business Econometrics”. His PhD project is joint research with Deloitte and focuses on dynamic pricing in the retail industry.
Stefan van Duin is partner Analytics & Information Management at Deloitte Consulting. He has 20 years of experience in business intelligence and data analytics and loves working on complex, strategic projects, especially when these take place in an international context.
Sandjai Bhulai is full professor of Business Analytics at Vrije Universiteit Amsterdam. He studied “Mathematics” and “Business Mathematics and Informatics” and obtained a PhD on Markov decision processes for the control of complex, high-dimensional systems. He is co-founder of the Amsterdam Center for Business Analytics, co-founder of the postgraduate program Business Analytics / Data Science, and also co-founder of Prompt Business Analytics.