Article
Ensuring AI Models are Robust Outside the Lab
An examination of what it takes for AI to be robust, and how to identify robustness risks and deftly manage them
Many experts agree that, for an AI model to be trustworthy, it must be robust. But what does robustness mean in the context of machine learning models? Research and discussion on AI robustness are not as advanced as on the related Trustworthy AI topics of explainability and bias. In this paper, we define what we see as the critical – if perhaps not the only – characteristics of robustness. Robust AI is stable, resilient and reliable – at launch and throughout the model’s lifecycle. We confront each characteristic head-on, proposing effective metrics and strategies to improve robustness. We also address the components of model drift – data drift and concept shift – proposing how to measure and deal with them. Our findings and conclusions derive not only from methodological research, but also from rigorous exploration and experimentation in putting theory into practice, culminating in the development of a dedicated tool to audit AI model robustness.
We all have high expectations for AI. Technological advances and accumulation of success stories across domains give us good reason to do so. In many respects, however, current AI systems have a long way to go to earn our trust, especially in critical applications.
Implemented correctly, ML systems can have a truly transformational effect. Proper implementation requires careful consideration, from the algorithm/architecture to hyperparameters and, most importantly, the data used to train the model. The growing reliance on nascent AI technologies, especially in more critical or sensitive applications, raises concerns about whether they are reliable and robust enough to handle less-than-perfect, real-world conditions beyond the safe confines of the lab.
Our interest in robustness methods is not to evaluate failures post-mortem, but to build models that are inherently robust before they are deployed. This avoids propagation of risk and unnecessary downstream costs. We propose a proactive strategy that integrates methods and associated metrics into a logical workflow. (We coded this methodology into a toolset to enable our ML engineers and auditors to proactively identify potential failure modes and resolve them in future model iterations.)
The three critical characteristics of AI robustness are:
- Reliability: the prediction accuracy meets expectations consistently over time, avoiding too many oversights or false alarms.
- Stability: the model performs well both generally and under stress conditions, such as in edge cases. It is overly sensitive neither to naturally occurring noise nor to intentional, targeted noise.
- Resilience: model behavior is not easily manipulated through exploitation of vulnerabilities (within either the code or the training data).
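As one illustration of the reliability characteristic above, "oversights" and "false alarms" can be read as false negatives and false positives tracked per time period. The following is a minimal sketch under that assumption, not the paper's tool: the function names, the record layout and the 20% tolerances are all hypothetical.

```python
from collections import defaultdict

def error_rates_by_period(records):
    """records: iterable of (period, y_true, y_pred) with binary labels 0/1.

    Returns {period: (false_negative_rate, false_positive_rate)}.
    """
    counts = defaultdict(lambda: {"fn": 0, "fp": 0, "pos": 0, "neg": 0})
    for period, y_true, y_pred in records:
        c = counts[period]
        if y_true == 1:
            c["pos"] += 1
            c["fn"] += int(y_pred == 0)  # oversight: a real positive was missed
        else:
            c["neg"] += 1
            c["fp"] += int(y_pred == 1)  # false alarm: a negative was flagged
    rates = {}
    for period, c in counts.items():
        fnr = c["fn"] / c["pos"] if c["pos"] else 0.0
        fpr = c["fp"] / c["neg"] if c["neg"] else 0.0
        rates[period] = (fnr, fpr)
    return rates

def unreliable_periods(rates, max_fnr=0.2, max_fpr=0.2):
    """Periods in which either error rate exceeds its (assumed) tolerance."""
    return sorted(p for p, (fnr, fpr) in rates.items()
                  if fnr > max_fnr or fpr > max_fpr)
```

Calling `unreliable_periods(error_rates_by_period(records))` on logged predictions would list the periods that deserve a closer look; the appropriate tolerances depend on the application.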
Considering resilience, we find it particularly important to consider new attack vectors introduced by machine learning. Bad actors can exploit these vulnerabilities in subsequent stages of the AI processing chain, posing multiple threats that could potentially add up to an aggregated risk of system failure. Many forms of attack require knowledge of the model parameters – so-called white box adversarial attacks. However, black box adversarial attacks (requiring no inside knowledge) present another very real risk.
However, robustness concerns far more than adversarial attacks. In fact, the quality, completeness and volume of (unmanipulated) data have a substantially larger impact on AI model performance – and not just in the pre-launch phase, but also throughout the model’s lifecycle. We explore the causes of model deficiencies, including unrepresentative data, poor annotation (labeling), over-fitting and under-specification.
We also examine gradual degradation of model performance: even models that begin their useful lives reliably producing accurate predictions will likely deteriorate over time. We are able to detect this phenomenon, known as model decay, through a variety of approaches – windows, ensembles and statistical methods – noting the advantages and drawbacks of each. Once detected, we measure the degradation based on either covariate shift (data drift), concept shift or a mix of both.
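To make the window-based and statistical approaches concrete, the sketch below compares one feature in a reference window (for example, training data) against a recent window of production data, using the two-sample Kolmogorov-Smirnov statistic. This is an assumed illustration rather than the method used in the paper; the function names are hypothetical, and the constant 1.358 is the standard large-sample critical coefficient for a 5% significance level. It addresses only covariate shift on a single feature; detecting concept shift generally also requires labels or a proxy for them.

```python
def ks_statistic(reference, recent):
    """Largest gap between the empirical CDFs of the two samples."""
    ref, rec = sorted(reference), sorted(recent)
    i = j = 0
    max_gap = 0.0
    while i < len(ref) and j < len(rec):
        x = min(ref[i], rec[j])
        # Advance both pointers past every value <= x, then compare the CDFs at x.
        while i < len(ref) and ref[i] <= x:
            i += 1
        while j < len(rec) and rec[j] <= x:
            j += 1
        max_gap = max(max_gap, abs(i / len(ref) - j / len(rec)))
    return max_gap

def drift_detected(reference, recent, c_alpha=1.358):
    """Flag drift when the KS statistic exceeds the approximate critical value."""
    n, m = len(reference), len(recent)
    critical = c_alpha * ((n + m) / (n * m)) ** 0.5
    return ks_statistic(reference, recent) > critical

# Usage: compare a training-time window with a shifted production window.
reference_window = [k / 100 for k in range(100)]
shifted_window = [v + 1.0 for v in reference_window]
print(drift_detected(reference_window, shifted_window))
```

A fixed reference window is simple but can go stale; a sliding reference adapts to slow change at the cost of possibly masking gradual drift.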
We intentionally subject the model to stress situations to determine how well it will handle more ambiguous cases. Handling approximate situations is a strong selling point for AI models – unambiguous cases may be treated by simpler methods. Models often make incorrect predictions when they are overly sensitive to unfamiliar “edge cases”, which conventional testing has failed to sufficiently address. To earn our trust, we need AI models to be flexible enough to handle edge cases as well as other imperfect situations where traditional models would fail.
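One simple way to run such a stress test is to perturb inputs with random noise and count how often the prediction changes. The sketch below assumes Gaussian noise and a stand-in classifier; `flip_rate` and `toy_model` are hypothetical names introduced for illustration, not part of the paper's toolset.

```python
import random

def flip_rate(predict, inputs, noise_std, trials=20, seed=0):
    """Fraction of noisy predictions that differ from the clean prediction."""
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    flips = total = 0
    for x in inputs:
        clean = predict(x)
        for _ in range(trials):
            noisy = [v + rng.gauss(0.0, noise_std) for v in x]
            flips += int(predict(noisy) != clean)
            total += 1
    return flips / total if total else 0.0

def toy_model(x):
    """Hypothetical stand-in classifier: positive when the features sum above 1."""
    return int(sum(x) > 1.0)

# An input on the decision boundary is far more sensitive than a clear-cut one.
print(flip_rate(toy_model, [[0.5, 0.5]], noise_std=0.1))
print(flip_rate(toy_model, [[5.0, 5.0]], noise_std=0.1))
```

Sweeping `noise_std` over a range and plotting the flip rate gives a rough sensitivity profile; inputs whose predictions flip under small noise are candidates for the edge cases described above.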
A system that is robust will guard against the numerous risks of faulty implementation and continue to serve as a powerful predictive tool – whether as a decision aid or an automation enabler. And that is one of the critical differentiators between AI and trustworthy AI.
Download the paper for more information.