Posted: 04 May 2023 5 min. read

Can public health forecasting predict the next emergency?

By Juergen Klenk, Ph.D., principal, Deloitte Consulting LLP, and Beth Meagher, principal, Deloitte Consulting LLP

When the US Centers for Disease Control and Prevention (CDC) launched its Center for Forecasting and Outbreak Analytics (CFA) in 2022, its vision was to be a National Weather Service (NWS) for public health threats: a data-driven early warning system that uses modeling and analytics to strengthen the United States’ ability to prepare for and respond to infectious diseases.1

Data and analytical capabilities can amplify and accelerate response efforts to infectious disease outbreaks and guide interventions to improve outcomes. CFA’s work focuses on three key areas: predict, inform, and innovate. It produces models and forecasts to characterize the state of an outbreak and its course. It also informs public health decision makers on the potential consequences of deploying control measures and supports innovation to continuously improve the science of outbreak analytics and modeling.2

CFA works alongside public- and private-sector partners to inform data-driven decisions to protect lives and prevent infections.3 CFA supports decision makers at multiple levels, including CDC leadership; federal agencies; the White House; Congress; state, territorial, local, and tribal (STLT) leaders; the private sector (including academia); and the general public.4 Among early successes, CFA, working with Kaiser Permanente Southern California and UC Berkeley, estimated the COVID-19 Omicron variant’s severity as well as the surge’s timing and impact in the US, which provided several weeks of advance notice for key planning activities. CFA also partnered with the Multi-National Mpox Response team to produce detailed technical reports of that outbreak.5

Balancing public health data access and privacy concerns

Establishing an NWS for infectious diseases and other public health threats may be an achievable goal. But unlike in weather forecasting, the CDC faces additional challenges and complexities inherent in working with health care and public health data. NWS gets its data in real time from many locations, and that data is not subject to privacy restrictions.6 In contrast, the CDC has neither the ability nor the authority to direct public health data collection and has to form new data-use agreements with each jurisdiction for each new public health issue.7 Other challenges may include:

  • Getting better data faster: A model’s predictions are only as good as the data that the model was trained on. Data the CDC receives from both health care entities and STLT partners often lacks standardization, requires time-intensive manual cleaning and transformation, and may be limited due to privacy concerns and data-use agreements of varying scope.
  • Maximizing accuracy while protecting privacy: Various universities, federal agencies, and other entities are developing and training similar forecasting models in silos, largely due to privacy issues that inhibit or prevent universities, states, and health care providers from sharing data to collaborate on model development.
  • Getting enough computing power: Training and deploying large, complex models requires access to powerful processors and graphics processing units (GPUs), as well as scalable storage for data and model parameters.
  • Translating complex data into decision-making tools: The CDC must be able to translate complex statistical findings into understandable, actionable insights for a wide variety of audiences, including top decision-making officials and the general public.

Federated learning (FL) can help overcome these challenges and improve the CDC’s ability to forecast and model emerging health threats to guide response efforts. FL is a privacy-preserving approach to training machine learning models across a decentralized network of data providers. By harnessing the power of artificial intelligence (AI) and high-performance computing (HPC), FL can help enable access to the large amounts of siloed, sensitive training data needed to create high-accuracy predictive models without exchanging the data itself.8 Built-in privacy-preserving algorithms and workflow strategies can further reduce the risk of compromising data security and privacy. (See Federated learning in health care and life sciences.)
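To make the idea concrete, the following is a minimal sketch of federated averaging, one common FL scheme, written in plain Python. It is an illustration only and is not the CDC's or Deloitte's implementation: the tiny linear model, the function names (`local_update`, `federated_average`, `train_federated`, `make_site`), the learning rate, and the synthetic site data are all assumptions made for this example. Each simulated site trains on its own records and sends back only model parameters; the coordinator never sees the raw data.

```python
import random


def local_update(weights, data, lr=0.5, epochs=5):
    """Run a few gradient-descent steps on one site's private data.

    The model is y ~ w * x + b and `weights` is the pair (w, b).
    Only the updated (w, b) is returned; `data` stays at the site.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Combine site updates, weighting each by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train_federated(sites, rounds=50):
    """Coordinator loop: broadcast weights, collect updates, average."""
    global_weights = (0.0, 0.0)
    sizes = [len(data) for data in sites]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_site(n, seed):
    """Hypothetical site data: noisy samples of y = 2x + 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.05)) for x in xs]


# Three simulated data providers of different sizes.
sites = [make_site(40, 1), make_site(80, 2), make_site(20, 3)]
w, b = train_federated(sites)
print(f"learned w={w:.2f}, b={b:.2f}")
```

In a real deployment the sites would be separate organizations communicating over a network, and the parameter exchange would typically be combined with additional safeguards such as secure aggregation or differential privacy, which this sketch omits.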

Using federated learning in health care

FL use cases among health care and life sciences organizations are becoming increasingly evident. For example, a consortium of pharmaceutical companies participated in Project MELLODDY9 to collaboratively build federated drug discovery models without revealing patient data or commercial secrets. The EXAM (EMR CXR AI Model) study, led by Mass General Brigham, brought together 20 hospitals across five continents to train a neural network that predicts the level of supplemental oxygen a patient with COVID-19 symptoms may need.10

Deloitte’s AI Foundry built an FL proof of concept (POC) using cloud-deployed NVIDIA FLARE, a domain-agnostic, open-source, and extensible software development kit (SDK) for federated learning. The POC simulated and administered an FL system with multiple data providers to train and evaluate a deep-learning model using decentralized medical image datasets. We then designed a benchmarking study to compare FL models with models trained on centralized datasets and with models trained individually on siloed datasets. Our study results showed that in 70% of experiments, the FL model’s area under the curve (AUC) was within 1% of the centrally trained model’s AUC, while in only 35% of experiments were individually trained models’ AUCs within 1% of the FL model’s AUC.
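The benchmark statistic quoted above (the share of experiments in which two models' AUCs fall within 1% of each other) can be sketched as follows. This is a hypothetical illustration, not the study's code: the function name `share_within_tolerance` is invented here, the AUC values are made up purely to show the arithmetic, and the sketch assumes "within 1%" means an absolute AUC difference of at most 0.01, which the article does not specify.

```python
def share_within_tolerance(candidate_aucs, reference_aucs, tolerance=0.01):
    """Fraction of paired experiments whose AUCs differ by at most `tolerance`.

    Assumes an absolute difference; a relative definition would divide
    the difference by the reference AUC instead.
    """
    if len(candidate_aucs) != len(reference_aucs):
        raise ValueError("AUC lists must have one entry per experiment")
    hits = sum(
        1
        for cand, ref in zip(candidate_aucs, reference_aucs)
        if abs(cand - ref) <= tolerance
    )
    return hits / len(candidate_aucs)


# Made-up AUCs for four hypothetical experiments (not study results).
fl_aucs = [0.91, 0.88, 0.93, 0.90]
central_aucs = [0.915, 0.90, 0.932, 0.925]

share = share_within_tolerance(fl_aucs, central_aucs)
print(f"{share:.0%} of experiments within tolerance")
```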

In addition to these two examples, several other studies have demonstrated the performance of FL models, thus validating the importance of this approach.11,12,13

Public health continually presents opportunities for new computing paradigms to enable breakthroughs in data science that can allow for timely and accurate prediction, prevention, and response to outbreaks and management of health risks. Federated learning can enable a paradigm shift in the CDC’s public health forecasting capabilities through improved privacy, collaboration, innovation, and ease of use. Importantly, FL allows organizations to retain ownership and control of their data, reducing the need for negotiation of complex data use agreements and transfer processes. It also can facilitate and simplify collaborative efforts across a complex network of STLT and health care partners to help enhance knowledge sharing, unlock the potential of nationwide health data, and, potentially, better position the CDC, health care industry stakeholders, and the public to prepare for and respond to the next disease outbreak—whenever and wherever it occurs.

Acknowledgements: Rebecca Schultz, Margaret Anderson, Julia Dahl, Ashton Christina Astbury

This publication contains general information only and Deloitte is not, by means of this publication, rendering accounting, business, financial, investment, legal, tax, or other professional advice or services. This publication is not a substitute for such professional advice or services, nor should it be used as a basis for any decision or action that may affect your business. Before making any decision or taking any action that may affect your business, you should consult a qualified professional advisor.

Deloitte shall not be responsible for any loss sustained by any person who relies on this publication.

Endnotes:

1. History of the Center for Forecasting and Outbreak Analytics, Centers for Disease Control and Prevention, February 23, 2023

2. CFA frequently asked questions, February 24, 2023

3. Ibid.

4. Ibid.

5. CFA (about us), March 20, 2023

6. NOAA Observation Systems

7. CDC launches forecasting center to be like a National Weather Service for infectious diseases, CNN, April 19, 2022

8. CDC-ONC Industry Days, CDC Foundation, February 27, 2023

9. Machine Learning Ledger Orchestration for Drug Discovery (MELLODDY)

10. Federated learning for predicting clinical outcomes in patients with COVID-19, Nature Medicine, September 15, 2021

11. Federated learning improves site performance in multicenter deep learning without data sharing, Journal of the American Medical Informatics Association, February 4, 2021

12. Multi-institutional deep learning modeling without sharing patient data, Springer Nature Switzerland, January 26, 2019

13. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data, Scientific Reports, July 2020
