Unboxing the Box with GlassBox

A toolkit to create transparency in Artificial Intelligence

Artificial intelligence (AI) models can become so complex that we no longer understand the output. This undermines the trust of companies and customers. Therefore, Deloitte has developed GlassBox: a toolkit that looks inside the proverbial ‘black box’ of AI-powered algorithms.

Artificial intelligence (AI) models, in particular data-driven models like machine learning, can become highly complex. These algorithms are typically presented as a ‘black box’: you feed them with data and there is an outcome, but what happens in the meantime is hard to explain.

This lack of understanding of AI technology creates major risks for companies, says Roald Waaijer, director Risk Advisory at Deloitte. “AI-powered algorithms are increasingly used for decisions that affect our daily lives. Therefore, if an algorithm runs awry, the consequences can be disastrous. For a company it can cause serious reputational damage and lead to fines of tens of millions of euros.” Worst of all, he adds, it may hurt customers, for instance by unintentionally treating them unfairly if there are biases in the algorithm or training data. “This may lead to a serious breach of trust, which can take years to rebuild.”

To help companies look inside the proverbial black box of AI, Deloitte has developed GlassBox. This technical toolkit is designed to validate AI models and to expose possible bias and unfairness – in short, to check whether AI-powered algorithms are doing what they are supposed to do. “It’s just like bringing your car to the garage,” explains Waaijer. “You occasionally need to look under the bonnet to see whether everything is working properly. That is what GlassBox does: we look under the bonnet of an algorithm to check the AI engine.”

Moreover, Deloitte offers tools to help explain the decision-making process of AI models to employees and customers, for instance by visualising how an AI-powered algorithm came to a decision. “With the GDPR regulation that recently came into force, consumers have the right to receive a meaningful explanation of how their data was used to get to a decision,” says Waaijer. “‘Computer says no’ is not a sufficient answer. You have to explain the decision and give insight into what happens inside the black box.”
 

Inspection methods

There is a wide variety of AI models, such as neural networks, discriminant-based methods and tree-based methods. The GlassBox toolkit has different inspection methods for each of them (see illustration). Some tools have been developed in-house; others, like ELI5 or LIME, are open source. Bojidar Ignatov, junior manager Financial Risk Management at Deloitte with a focus on advanced modelling, has been involved in assembling the GlassBox toolkit. “There are various ways to open the black box and get an idea of how these algorithms operate,” he says.
 


Take, for instance, random forest models. With this AI technology, you randomly generate a large number of decision trees: a forest. Each tree uses a different combination of interacting variables, and the algorithm combines the predictions of all the trees, by majority vote or by averaging, into a single outcome. The GlassBox toolkit offers various ways to validate random forest models: for example, it is possible to reconstruct how features are selected, to take a deep dive into individual features or to understand how the features interact.

One example of a feature deep dive for random forest models is to replicate the local optimum that the model used for a single decision, and to estimate the global optimum for the model as a whole. “By comparing these, you can figure out what features have a strong impact on a decision, and which are the key features overall,” explains Ignatov.
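The comparison between a single decision and the model as a whole can be illustrated with a simple perturbation test. This is a minimal sketch and not the GlassBox method itself, which the article does not describe in detail: the `black_box` scoring function, the feature names and the choice of the data set mean as baseline are all assumptions made for illustration. Each feature is replaced by its baseline value and the change in the prediction is recorded, once for one row (local) and once averaged over all rows (global).

```python
from statistics import mean


def black_box(row):
    # Hypothetical stand-in for a trained model: returns a risk score.
    return 0.5 * row["income"] + 2.0 * row["debt"] + 0.0 * row["age"]


def local_impact(model, row, baseline):
    """Change in the prediction for one row when each feature is
    replaced, one at a time, by its baseline value."""
    original = model(row)
    impact = {}
    for name in row:
        perturbed = dict(row, **{name: baseline[name]})
        impact[name] = original - model(perturbed)
    return impact


def global_impact(model, rows, baseline):
    """Mean absolute local impact of each feature over all rows."""
    per_row = [local_impact(model, r, baseline) for r in rows]
    return {name: mean(abs(p[name]) for p in per_row) for name in baseline}


# Hypothetical data set of three decisions.
rows = [
    {"income": 1.0, "debt": 0.0, "age": 30.0},
    {"income": 3.0, "debt": 1.0, "age": 50.0},
    {"income": 2.0, "debt": 5.0, "age": 40.0},
]
# Assumption: the baseline is the mean of each feature.
baseline = {name: mean(r[name] for r in rows) for name in rows[0]}

local = local_impact(black_box, rows[2], baseline)   # one decision
overall = global_impact(black_box, rows, baseline)   # whole data set
print("local:", local)
print("global:", overall)
```

In this toy data, debt dominates both the single decision and the overall picture, while age has no impact at all. When the local and global rankings differ, that decision is a natural candidate for a closer look by a human expert.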

To understand how features interact in random forest models, Deloitte developed the ‘Interaction Matrix’, a tool that shows which features are relatively often placed together within the trees. “These interactions can be visualised in a heat map, so that it is easy to see which combination of factors often contribute to the outcome,” says Ignatov. “The warmer the heat map, the more often two factors are connected.” In the end, a human expert who understands the context of the algorithm can judge whether these features are indeed important for the decision, or whether the model needs tweaking.
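The idea behind such a matrix can be sketched as a co-occurrence count. The article does not specify how Deloitte's Interaction Matrix is computed, so the following is an assumption: two features count as 'placed together' when the same tree splits on both. The forest, represented here simply as lists of feature names, is hypothetical.

```python
from collections import Counter
from itertools import combinations


def interaction_matrix(trees):
    """Count, for every pair of features, the number of trees in which
    both features are used for a split."""
    counts = Counter()
    for features in trees:
        # set() so that a feature used twice in one tree counts once.
        for a, b in combinations(sorted(set(features)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical forest: each tree is listed by the features it splits on.
forest = [
    ["income", "debt", "age"],
    ["income", "debt"],
    ["age", "postal_code"],
    ["debt", "income", "income"],
]

matrix = interaction_matrix(forest)
for pair, count in matrix.most_common():
    print(pair, count)
```

The resulting counts are the raw values behind a heat map: the higher the count for a pair, the 'warmer' its cell. With a real model, the feature lists would be extracted from the fitted trees rather than typed in by hand.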
 

Avoiding bias in data

An important dimension in validating AI models is checking the data quality. An AI model can work perfectly fine, but if the input data is flawed or biased, this will affect the outcome. Even if you don’t use race or gender explicitly in your model, it doesn’t mean these factors do not play a role in the output, explains Waaijer: “AI models are famous for proxying features via multiple factors, like postal code, height, etc. When the data input is biased, the AI model will find a way to replicate the bias in the outcomes, even if the bias isn’t explicitly included in the variables of the model.”
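One way to make the proxy effect visible is to test how well a seemingly neutral feature predicts a protected attribute. The sketch below is an illustration under stated assumptions, not a GlassBox tool: the records, the `postal_code` and `group` field names and the majority-vote measure are all hypothetical. It compares how often the protected attribute is guessed correctly without any information against how often it is guessed correctly when the feature value is known.

```python
from collections import Counter, defaultdict


def proxy_strength(records, feature, protected):
    """Return (baseline, accuracy): the share of records for which the
    protected attribute is guessed correctly without and with
    knowledge of `feature`, always guessing the majority group."""
    overall = Counter(r[protected] for r in records)
    baseline = overall.most_common(1)[0][1] / len(records)

    by_value = defaultdict(Counter)
    for r in records:
        by_value[r[feature]][r[protected]] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return baseline, correct / len(records)


# Hypothetical records; "group" stands for a protected attribute.
records = [
    {"postal_code": "1011", "group": "A"},
    {"postal_code": "1011", "group": "A"},
    {"postal_code": "1011", "group": "B"},
    {"postal_code": "2022", "group": "B"},
    {"postal_code": "2022", "group": "B"},
    {"postal_code": "2022", "group": "B"},
    {"postal_code": "3033", "group": "A"},
    {"postal_code": "3033", "group": "A"},
]

baseline, accuracy = proxy_strength(records, "postal_code", "group")
print(f"without feature: {baseline:.3f}, with feature: {accuracy:.3f}")
```

A large jump from the baseline to the accuracy with the feature suggests that the feature carries information about the protected attribute, so a model could reproduce a bias through it even when the attribute itself is left out.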

The GlassBox toolkit has different tools to expose bias in data sets. Part of the data validation is to check whether all the required steps have been taken before data goes into the AI model, such as scaling the input data, data transformations, data enrichments and enhancements, manual and implicit binning, or sample balancing. Next, the data set is checked against a predetermined set of biases that may be present, and fresh data is checked for creeping bias. It all comes down to testing different scenarios, says Waaijer. “For instance, if you want to test whether a data set has a gender bias, you can test how the distribution of men and women plays out with a certain combination of parameters. If the outcome is very different than expected, something is wrong.”
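A scenario test of this kind can be sketched in a few lines. The example below is a hedged illustration rather than the toolkit's own check: the applicant records, the `income_band` scenario, the `approved` outcome and the tolerance of 0.2 are all assumptions. It compares the share of positive outcomes for men and women within one combination of parameters and flags the scenario when the gap exceeds the tolerance.

```python
def positive_rate(records, gender, condition):
    """Share of approved records for one gender within a scenario.
    Returns None when the scenario contains no such records."""
    selected = [r for r in records if r["gender"] == gender and condition(r)]
    if not selected:
        return None
    return sum(r["approved"] for r in selected) / len(selected)


# Hypothetical decisions: approved is 1 (yes) or 0 (no).
records = [
    {"gender": "M", "income_band": "mid", "approved": 1},
    {"gender": "M", "income_band": "mid", "approved": 1},
    {"gender": "M", "income_band": "mid", "approved": 1},
    {"gender": "M", "income_band": "mid", "approved": 0},
    {"gender": "F", "income_band": "mid", "approved": 1},
    {"gender": "F", "income_band": "mid", "approved": 0},
    {"gender": "F", "income_band": "mid", "approved": 0},
    {"gender": "F", "income_band": "mid", "approved": 0},
    {"gender": "M", "income_band": "high", "approved": 1},
    {"gender": "F", "income_band": "high", "approved": 1},
]


def scenario(r):
    # The combination of parameters under test (an assumption).
    return r["income_band"] == "mid"


TOLERANCE = 0.2  # assumed threshold; to be set by a human expert

rate_m = positive_rate(records, "M", scenario)
rate_f = positive_rate(records, "F", scenario)
gap = abs(rate_m - rate_f)
flagged = gap > TOLERANCE
print(f"men: {rate_m:.2f}, women: {rate_f:.2f}, gap: {gap:.2f}, flagged: {flagged}")
```

A flagged scenario is not proof of unfair treatment; it marks a combination of parameters where the outcome differs from what was expected and where a closer look is warranted.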

One difficulty is that, in most cases, it is only possible to detect bias when a model is in use, adds Waaijer. “You only really see it happening in practice. For instance, research has shown that a combination of seemingly innocent factors like height and postal code can disadvantage people of a certain background. This is not something you would expect; you can only discover this when the model is in use.” This means that ethical considerations in AI models should be in place not just at the beginning, but continuously, says Waaijer. “You need a monitoring cycle in which you continuously, or at least periodically, monitor the results.”
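Such a monitoring cycle can be approximated by recomputing a fairness measure for every batch of decisions the model produces. The following is a minimal sketch under assumptions: the period labels, the groups, the outcomes and the tolerance of 0.3 are hypothetical, and the gap between the highest and lowest approval rate is only one of many possible measures.

```python
from collections import defaultdict


def approval_gap(decisions):
    """Gap between the highest and lowest approval rate across groups.
    `decisions` is a list of (group, outcome) pairs, outcome 1 or 0."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, outcome in decisions:
        totals[group] += 1
        approved[group] += outcome
    rates = [approved[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def monitor(batches, tolerance):
    """Return a list of (period, gap, alert) tuples, one per batch."""
    report = []
    for period, decisions in batches.items():
        gap = approval_gap(decisions)
        report.append((period, gap, gap > tolerance))
    return report


# Hypothetical decisions per monitoring period.
batches = {
    "period 1": [("A", 1), ("A", 1), ("A", 1), ("A", 0),
                 ("B", 1), ("B", 1), ("B", 1), ("B", 0)],
    "period 2": [("A", 1), ("A", 1), ("A", 1), ("A", 0),
                 ("B", 1), ("B", 1), ("B", 0), ("B", 0)],
    "period 3": [("A", 1), ("A", 1), ("A", 1), ("A", 0),
                 ("B", 1), ("B", 0), ("B", 0), ("B", 0)],
}

report = monitor(batches, tolerance=0.3)  # assumed tolerance
for period, gap, alert in report:
    print(period, f"gap={gap:.2f}", "ALERT" if alert else "ok")
```

In this toy data the gap grows from period to period and only crosses the tolerance in the third, which is the kind of creeping bias that a one-off validation before deployment would miss.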
 

Human judgement

The GlassBox toolkit offers various tools to gain insight into advanced or AI-powered algorithms. However, human judgement is needed to make sense of the results. “Only humans can gauge the context in which an algorithm operates and can understand the ramifications of an outcome,” says Waaijer. “Unboxing the algorithm is step one; step two is to be sure that the algorithm operates in line with the values of the company. For that, you need human expertise.”

The GlassBox toolkit enables organisations to take control over the various AI models they have in use. It allows them to ensure the outcomes of AI-powered algorithms are explainable and make sense. “Opening the black box of AI will become a business priority,” says Waaijer. “If you can make sure you use AI-powered algorithms correctly and responsibly, not only will you avoid risks, but you will also be able to realise the full potential with AI.”
