ML Ops for business
Building an AI enterprise to solve real-world problems
Machine learning for business is evolving from a small, locally owned discipline to a fully functional industrial operation. ML operations, or MLOps, builds on DevOps—but it can be tricky to scale. Here’s why, along with a set of practices to help you smooth out the journey.
Operationalizing AI and ML for stronger impact
Artificial intelligence (AI) and machine learning (ML) are pervasive due to powerful trends affecting all industries and sectors.
- Businesses continually need a nuanced understanding of their customers and their evolving behaviors to maintain a competitive edge. Plus, competition from nontraditional sources is forcing a revisit of the existing strategic and operational paradigm.
- Massive proliferation of data continues to add more texture to what insights can be gleaned and how decisions can be made. It’s predicted that 175 zettabytes of data will exist by 2025.²
- In keeping with the proliferation of data, there have been rapid advancements in data storage and computational availability. Big data (streaming and static) can now be mined, housed, and analyzed.
- The rise of auto-ML tools and platforms has democratized the effort of data mining and decision sciences, and the breed of citizen developers is growing.
This presents both the need for and the potential to capture continuous insights that can inform business decisions. AI and ML today need to be adopted widely and operationalized. It is how organizations can drive stronger outcomes through human and machine collaboration and realize scale with speed, data with understanding, decisions with confidence, and outcomes with accountability—the Age of With™.
“64% of respondents believe that AI enables a competitive advantage”
“54% are spending 4x more than last year on AI initiatives”
“74% plan to integrate AI into all enterprise applications within three years”
When machine learning was a small discipline, locally owned, and contained in divisions and functions by a small group of experts, this entire process happened quietly, even smoothly, and was manageable. As AI and ML started getting to the core of enterprise transformations and bearing expectations of being sustainable at scale, there came the need for them to track to a fully functional development, operationalization, and automation cycle. This is the realm of ML operations (MLOps).
There is a chasm between ML and MLOps that can be tricky to scale, and MLOps can turn into ML-Oops.
Tales from the front lines
In our own experiences helping clients realize impact from what’s possible with ML and translate that insight into trustworthy performance, enterprises have faced significant challenges around MLOps due to a number of factors.
Organizations that want to scale AI and ML across all areas must focus on implementing a set of standards and a framework to create production-capable AI and ML building blocks. It is also imperative to focus on building foundations of processes that are reliable and repeatable. It will not be possible to industrialize machine learning if the reliance is on a few talented practitioners in niche techniques and technologies; industrialization will require the coming together of a varied mix of talent and technologies.
That’s because the AI and ML needs of the enterprise are too big and too complex for any small group to run. Meeting them requires method, process, and organization-wide orchestration.
From ML to MLOps
MLOps drives this through the entire life cycle of ML models, from design to implementation to management.
If enterprises develop only a few models for limited product lines in project cycles of a few months, they will see limited value in AI and ML adoption. Sustainable impact will come from a portfolio of machine learning models that are designed, productionized, automated, operationalized, and embedded into ongoing business functions at scale for enterprise-level use.
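One building block of such a productionized portfolio is model versioning: each retrained model becomes a new, tracked version that can be promoted or archived. The sketch below is illustrative only; the class and model names are assumptions, and real registries (such as MLflow’s Model Registry) track far more metadata.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ModelVersion:
    name: str
    version: int
    stage: str = "staging"  # illustrative life cycle: staging -> production -> archived
    registered_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

class ModelRegistry:
    """Toy in-memory registry; a real one persists artifacts and metadata."""

    def __init__(self):
        self._versions = {}  # model name -> list of ModelVersion

    def register(self, name):
        versions = self._versions.setdefault(name, [])
        mv = ModelVersion(name, version=len(versions) + 1)
        versions.append(mv)
        return mv

    def promote(self, name, version):
        # Archive whatever is currently in production, then promote the new version.
        for mv in self._versions[name]:
            if mv.stage == "production":
                mv.stage = "archived"
        self._versions[name][version - 1].stage = "production"

    def production_version(self, name):
        return next(v for v in self._versions[name] if v.stage == "production")

registry = ModelRegistry()
registry.register("churn-model")   # v1, initial build
registry.register("churn-model")   # v2, after retraining
registry.promote("churn-model", 2)
print(registry.production_version("churn-model").version)  # -> 2
```

Keeping promotion explicit, rather than overwriting models in place, is what makes an embedded portfolio auditable and reversible.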
MLOps is a process, in classic Lean Six Sigma parlance. It is not dependent on a few experts, niche use, bespoke designs, or custom development.
MLOps builds on DevOps
MLOps aims to achieve the core principles of DevOps: automation (as opposed to siloed custom dev); deployment (proliferation, as opposed to one-time use); process (integration, testing, and releasing); and infrastructure considerations.
That said, MLOps builds on and goes beyond DevOps:
- Core team structures. For MLOps to be successful, data science and ML modelers need to be in lockstep with MLOps engineers, data engineers, and process experts. This requires a diverse, cross-functional team considerably more complex than a typical DevOps team.
- Experimentation. ML models are iterative and involve many experiments in their development phase. They also need to stay tuned to the evolving core business issues they are trying to solve—pricing strategies, customer behaviors, competitive intelligence, omnichannel, and industry- and domain-specific issues like the future of work or consumerism.
- Versatility of testing. In addition to standard unit and integration testing, ML testing needs to validate model quality against held-out data and trigger retraining when performance degrades.
- Production and training are subject to changes in business fundamentals. Once models are in production, a lot can change: data profiles evolve and affect downstream processes, and revalidations of critical assumptions and parameters need to be incorporated.
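The testing point above can be made concrete with a validation gate: before a candidate model is promoted, its score on a holdout set must clear a minimum bar. This is a minimal sketch; the function names, the toy rule-based "model," and the 0.80 threshold are all assumptions for illustration, not a standard.

```python
MIN_ACCURACY = 0.80  # assumed promotion threshold; set per use case in practice

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def validate_candidate(predict_fn, X_holdout, y_holdout, threshold=MIN_ACCURACY):
    """Return (ok, score): ok is True only when the candidate may be promoted."""
    score = accuracy(y_holdout, [predict_fn(x) for x in X_holdout])
    return score >= threshold, score

# Toy example: a rule-based "model" scored on a tiny holdout set.
holdout_X = [0.2, 0.4, 0.6, 0.9, 0.1]
holdout_y = [0, 0, 1, 1, 0]
model = lambda x: int(x > 0.5)

ok, score = validate_candidate(model, holdout_X, holdout_y)
print(ok, score)  # all five toy predictions are correct here
```

In a real pipeline this gate would run automatically on every retrain, blocking deployment (and raising an alert) when quality regresses.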
How to make the journey
The path to MLOps and more effective ML development and deployment hinges on selecting the right people, processes, technologies, and operating models with a clear linkage to business issues and outcomes.
This is an evolved state and very much possible in the Age of With, in which human-machine collaboration through next-gen assets and platforms predicts what is possible and translates that insight into trustworthy performance. Companies invest in bringing AI practitioners and data scientists together into a practice while also investing in preconfigured solutions. Business and domain experts can build use cases around signature issues. Data science experts can drive innovation in machine learning models. Data and ML engineers can use auto-ML tools to stitch together quick ML models.
Aligned people with common goals
Bringing people together in a coherent, cohesive, and inclusive effort toward a common goal is key. Roles must be identified and clarified, and collaboration across teams driven through multilevel governance.
In addition to these core roles, the data and MLOps governance framework must include business program managers, finance and technology, legal counsel, enterprise and model risk, and the enterprise data office and audit.
Automation and efficiency in the process; tools and technology stack woven into the process
MLOps aspires to deliver:
- Reusable plugins and frameworks, automated data preparation and collaboration, and versioning of models so a data scientist can reuse existing models as-is or accelerate new use cases from them
- Identification of an ML pipeline that feeds into applications, portals, enterprise analytics platforms, and databases
- Cross-pollination and a continuous feedback loop into data stores and feature stores, as well as building and designing automated learnings into repositories
- Model maintenance, with a cadence for data updates, management of variables, scheduling, and deployment of models from anywhere
- Monitoring of model drift and model performance for all models in production; notifications and alerts on events in the ML life cycle; centralized interfaces and dashboards to monitor ML pipelines
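One common way to monitor drift is the Population Stability Index (PSI), which compares the distribution of a feature or score at training time against what the model sees in production. The sketch below is a simplified illustration; the bin edges, the sample values, and the 0.2 alert threshold are conventional rules of thumb, not a standard.

```python
import math

def psi(expected, actual, bins):
    """PSI between a training-time (expected) and a production (actual) sample."""
    def proportions(values):
        counts = [0] * (len(bins) - 1)
        for v in values:
            for i in range(len(bins) - 1):
                if bins[i] <= v < bins[i + 1]:
                    counts[i] += 1
                    break
        total = len(values)
        # Small floor avoids log(0) when a bin is empty in one sample.
        return [max(c / total, 1e-4) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

bins = [0.0, 0.25, 0.5, 0.75, 1.0001]
train_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
prod_scores = [0.1, 0.15, 0.2, 0.22, 0.3, 0.35, 0.4, 0.45]  # shifted lower

value = psi(train_scores, prod_scores, bins)
print("PSI:", round(value, 3), "drift!" if value > 0.2 else "stable")
```

A dashboard would compute this on a schedule for every model in production and raise an alert when the index crosses the threshold, feeding the retraining loop described above.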
The way forward
Aligning metrics and measures to vision: It is necessary to establish a vision at the outset, then assess readiness. What cannot be measured cannot be improved (a timeless Lean Six Sigma adage). Standards for how to design and measure the efficiency and effectiveness of MLOps are evolving and need to be integrated into performance management.
“The boat matters more than the rowing”³ – Focus on process and system: MLOps sits at the intersection of skills and process. It pulls together a range of skills and relies on automation, workflows, and systems to drive impact on a sustained basis.
Design for innovation and change: Process-centricity can sometimes obscure that innovation is at the core of AI and ML. The MLOps framework should promote innovation such that the ML itself stays relevant and future-ready. Further, in Black Swan⁴ events like a global pandemic, some established processes will be rendered ineffective and dysfunctional. It is important to provide a forum and empower data scientists, AI practitioners, and ML champions to explore, to innovate, and to stay at the cutting edge of this fast-evolving discipline.
Change management: Given that MLOps requires many teams, it also necessitates consumption of models developed by others. This is not easy to implement and requires change management. Model consumers are concerned with the quality and reliability of models not built by them. Different units tend to build their own data science teams and create their own AI setup. This duplicates efforts and causes redundancies, and worse, the best-in-class that exists in the organization might not be known or might not get leveraged.
As AI and ML proliferate across all industries and are adopted enterprise wide, machine learning and AI models need to be explainable in their construct, trustworthy in their genesis and underlying data, measurable in their impact, sustainable in their outcomes, scalable in their design, and self-correcting in their behavior.
ML is just like any other powerful tool. When used correctly, it can help build. On the flip side, incorrect deployment leads to damage. A major advantage of AI and ML capabilities is speed of analysis and insight on a huge scale, but if misdirected, models can cause suboptimal and even bad decisions at the same speed and scale. To avoid this, or what we call ML-Oops, we need to embed MLOps into all our AI and ML efforts at scale at the design phase itself.
1. Deloitte AI Institute, AI Survey.
2. Tom Coughlin, “175 Zettabytes by 2025,” Forbes, November 27, 2018.
3. Rolf Dobelli, The Art of Thinking Clearly (Farrar, Straus and Giroux, 2013).
4. Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable (Random House, 2007).
5. Deloitte internal research.