The C3.ai Digital Transformation Institute has selected 26 research proposals that advance Digital Transformation Science to mitigate COVID-19 and future pandemics. A total of $5.4 million and access to the C3 AI Suite and Microsoft Azure computing and storage have been awarded to support the following multidisciplinary projects.

On June 18, 2020, a virtual roundtable was held with some of the COVID-19 researchers, a short video from which can be found below. The full video of the roundtable is here.



Adding Audio-Visual Cues to Signs and Symptoms for Triaging Suspected or Diagnosed COVID-19 Patients

University of Illinois at Urbana-Champaign




The COVID-19 pandemic has placed unprecedented stress on hospital capacity. Increased emergency department (ED) patient volumes and admission rates have led to a scarcity of beds and the need to construct field hospitals in some regions. Bed-sparing protocols that identify COVID-19 patients stable for discharge from the ED or early hospital discharge have proven elusive given this population’s propensity to rapidly deteriorate up to one week after illness onset. Consequently, a significant number of stable patients are unnecessarily admitted to the hospital, while some discharged patients decompensate at home and subsequently require emergency transport to the ED. In order to conserve hospital beds, there is an urgent need for improved methods for assessing clinical stability of COVID-19 patients. Our project goal is to develop audiovisual tools to predict cardiopulmonary decompensation from facial videos captured in the ED and at home via telemedicine platforms. We will use explainable artificial intelligence (AI) and machine learning (ML) algorithms to derive criteria for predicting impending deterioration from health-relevant audiovisual features and provide explanations in terms of the clinical details within the electronic medical record. Successful completion of this project will provide the groundwork for prospective evaluation of these tools in a COVID-19 patient population. Once validated, these tools will augment provider clinical assessments of COVID-19 patients both at the bedside and across telemedicine platforms during virtual follow-up. More broadly, the techniques and algorithms developed in this project are likely to be applicable to other high-risk patient populations and emerging platforms such as telemedicine.

Narendra Ahuja
Research Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

David Beiser
Associate Professor of Medicine
University of Chicago

David Chestek
Assistant Professor of Clinical Emergency Medicine
University of Illinois, Chicago

Mark Hasegawa-Johnson
Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

Jerry Krishnan
Associate Vice Chancellor for Population Health Sciences
University of Illinois, Chicago

Arun Singh
National Advisor
All India Institute of Medical Sciences, Jodhpur






Mining Diagnostics Sequences for SARS-CoV-2 Using Variation-Aware, Graph-Based Machine Learning Approaches Applied to SARS-CoV-1, SARS-CoV-2, and MERS Datasets

University of Illinois at Urbana-Champaign




On March 11, 2020, the WHO determined that an outbreak of a novel coronavirus that had begun in Wuhan, China had reached pandemic status. Deep meta-transcriptomic RNA sequencing of bronchoalveolar lavage fluid samples from COVID-19 patients hospitalized in Wuhan in late December 2019 revealed sequence similarity to SARS-like coronaviruses. This genus, Betacoronavirus, includes the viral etiologic agent of the 2002-2003 SARS outbreak in humans (i.e., SARS-CoV-1). Rapid and precise bacterial and viral diagnostics are extremely important in multiple clinical settings, ranging from regular visits to quick epidemic responses. This need is especially relevant given the current COVID-19 outbreak, caused by the SARS-CoV-2 coronavirus. The goal of this project is to use human and viral whole transcriptome analysis (RNA-Seq) and genomic datasets to identify SARS-CoV-2 “within host” polymorphisms that may interfere with diagnostic platforms and to develop novel, graph-based approaches to study co-occurrence patterns for both consensus-level and low-frequency variants. We will compare these results to SARS-CoV-1 and MERS genomic data to glean population-level differences, elucidate biologically relevant differences specific to SARS-CoV-2, and allow for sensitive and accurate identification and transmission analysis of SARS-CoV-2.

Nancy Amato
Abel Bliss Professor of Engineering
University of Illinois at Urbana-Champaign

Lawrence Rauchwerger
Professor of Computer Science
University of Illinois at Urbana-Champaign

Todd Treangen
Assistant Professor of Computer Science
Rice University






Pandemic Resilient Urban Mobility: Learning Spatiotemporal Models for Testing, Contact Tracing, and Reopening Decisions

Massachusetts Institute of Technology




This project focuses on the design of actionable information and effective intervention strategies to support safe mobilization of economic activity and reopening of mobility services in urban systems. We develop a spatiotemporal modeling, inference, and stochastic control framework to: 1) capture the dynamic interplay between the epidemiological state of different populations in an urban region and their mobility patterns; 2) estimate the local exposure rates and fractions of infections within populations using fine-grained contact tracing data, testing data, and occupancy measurements of key facilities/services; 3) compare and evaluate different response strategies (testing rates, partial capacity or access restrictions, and social distancing guidelines) in selected urban regions, in particular, Boston and the San Francisco Bay Area; and 4) design stochastic control strategies for increasing mobility in a phased manner by adaptively adjusting the operational capacity and testing rates. Our methodological approach is grounded in inference and learning-based control of Marked Temporal Point Processes (MTPPs) defined on a mobility network, and enables both qualitative and quantitative evaluation of contact tracing, testing, and response strategies. We will advance the inference algorithms for MTPPs by investigating both likelihood-based and likelihood-free approaches to learn model parameters from heterogeneous data. Our approach is useful for predicting the intensities of exposure and infections in different regions, conditioned on specific response strategies. Furthermore, we will design learning-based control algorithms for MTPPs to compute the optimal recovery and testing rates, which maximize the gains from sectoral reopening while limiting the risk of exposure.

Saurabh Amin
Robert N. Noyce Career Development Associate Professor of Civil and Environmental Engineering
Massachusetts Institute of Technology

Patrick Jaillet
Dugald C. Jackson Professor of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Jitendra Malik
Arthur J. Chick Professor of Electrical Engineering and Computer Sciences
University of California, Berkeley






Effective cocktail treatments for SARS-CoV-2 based on modeling lung single cell response data

Carnegie Mellon University




Fully modeling the impact of SARS-CoV-2 requires the integration of several different types of molecular and cellular data. SARS-CoV-2 is known to primarily impact cells via two viral entry factors, ACE2 and TMPRSS2. However, much less is currently known about the virus activity within lung cells. To model host response to viral infection, and to develop potential treatments, we will extend methods based on Continuous State Hidden Markov models (CSHMM) and further combine them with additional graphical models and graph analysis algorithms. Combined, these methods would allow us to reconstruct pathways leading from virus proteins via their host interactions to regulators and finally to the observed expression profiles in each cell. Such models identify not just the optimal individual targets but also combinations of targets that, together, lead to a large decrease in viral loads. In parallel, we will extend ChemProp, our deep learning method that learns to relate molecules and their associated activity values, by developing advanced representations involving optimal transport and molecular prototypes, by accumulating evidence about each compound’s profile from multiple sources using transfer learning, and by extending it to learn the activity of compound mixtures to treat multiple targets. Using our recently developed protocols, we will differentiate human induced pluripotent stem cells to lung cells and perform time series single cell expression studies to profile cells infected with SARS-CoV-2. Following modeling, the predicted CSHMM targets and their predicted ChemProp compounds will be experimentally validated to identify treatments that reduce viral loads in lung cells.

Ziv Bar-Joseph
FORE Systems Professor of Computer Science
Carnegie Mellon University

Regina Barzilay
Delta Electronics Professor of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Tommi Jaakkola
Thomas Siebel Professor of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Darrell Kotton
Professor of Medicine and Pathology
Boston University School of Medicine






Using Data Science to Understand the Heterogeneity of SARS-CoV-2 Transmission and COVID-19 Clinical Presentation in Mexico

University of California, Berkeley




In late April 2020, Mexico confirmed 15,529 positive cases of COVID-19 within its borders and the Mexican government announced the start of “Phase 3” of the pandemic, acknowledging widespread community transmission, thousands of cases of infection, and increased numbers of patients requiring hospitalization. The epidemic continues to grow rapidly in Mexico. There is a critical need in the country to use data science to advance COVID-19 prevention and treatment, and to support policy response. Country-level data and cooperation are essential to curbing the global pandemic and to supporting binational/cross-border relations with the U.S. and its Latin American neighbors. The health data available to our team via the Mexican Social Security Institute/Instituto Mexicano del Seguro Social (IMSS) are vast. These assets, combined with the computational and data management strengths of the C3.ai tools and data lake and the disciplinary strengths of the binational team, create an unprecedented opportunity to analyze a multitude of clinical, individual, facility and structural determinants of exposure and susceptibility to SARS-CoV-2, and the determinants of effective health services responses to the pandemic. This research project will reveal predictors of SARS-CoV-2 infection and severity that will help guide prevention efforts, allowing Mexico to focus treatment on those at greatest risk. The findings should also generate hypotheses about mechanisms that might improve COVID-19 patient outcomes. We can also estimate disease burden and economic impact, and provide a best practice model for other countries with similar data systems to guide their policy decisions around pandemic management.

Stefano Bertozzi
Dean Emeritus and Professor of Health Policy and Management
University of California, Berkeley

Ziad Obermeyer
Acting Associate Professor of Health Policy and Management
University of California, Berkeley

Alan Hubbard
Professor of Biostatistics
University of California, Berkeley

Juan Pablo Gutierrez
Professor, Center for Policy, Population & Health Research
School of Medicine, UNAM

Gustavo Olaiz
Professor and Coordinator General, Center for Policy, Population & Health Research
School of Medicine, Universidad Nacional Autónoma de México (UNAM)

Alberto Rascón
Coordinator of Epidemiological Surveillance, Division of Health Services
Mexican Institute of Social Security






Toward analytics-based clinical and policy decision support to respond to the COVID-19 pandemic

Massachusetts Institute of Technology




The COVID-19 pandemic creates unprecedented challenges for healthcare providers and policy makers. How to triage patients when healthcare resources are limited? Whom to test? And how to design social distancing policies to contain the disease and its socioeconomic impact? Analytics can provide data-driven answers. We have collected comprehensive data from hundreds of clinical studies, case counts, and hospital collaborations. We have developed a new epidemiological model of the disease’s dynamics, a machine-learning model of mortality risk, and a resource allocation model, published on www.covidanalytics.io. In this project, we develop automated, interpretable, and scalable decision-making systems based on machine learning and artificial intelligence (ML/AI) to support clinical practices and public policies as they respond to the COVID-19 pandemic. We tackle four research questions: 1) How can we predict admissions in intensive care units (ICU) using machine learning? 2) How does COVID-19 impact different demographic and socioeconomic populations? 3) How does mobility impact the disease’s spread, and how to optimize social distancing policies? 4) How to augment COVID-19 tests with data-driven warnings that identify high-risk subjects? This project leverages large-scale datasets (from C3 AI and our own collection efforts), high-performance computing (using the C3 AI Suite), and advanced ML/AI. Specifically, this project develops end-to-end ML/AI methods, spanning epidemiological modeling (to model the disease’s spread), machine learning (to predict ICU admissions and test results), causal inference (to investigate disparities across populations) and optimal control (to support social distancing guidelines). We will disseminate results to healthcare providers, policy makers, researchers, and the public.

Dimitris Bertsimas
Associate Dean of Business Analytics & Boeing Professor of Operations Research
MIT Sloan School of Management

Alexandre Jacquillat
Assistant Professor of Operations Research and Statistics
MIT Sloan School of Management






Dynamic Resource Management in Response to Pandemics

University of Illinois at Urbana-Champaign




Business-as-usual medical and public health practices cannot cope with the large shocks of a pandemic. Coordinated and well-timed response mechanisms are required. In this proposal, we aim to build a data analytic framework to optimize resource management for testing, prevention, and care, both prior to a potential spread and during a rapidly developing outbreak. The proposed research builds upon our current work to help the State of Illinois mitigate the ongoing COVID-19 outbreak. Key outcomes of our research include: 1) a risk-aware dynamic equipment and workforce allocation mechanism implemented on the C3.ai platform; 2) a game-theoretic comparison of federalized resource allocation with allocation in which states compete with one another for resources; and 3) flexible capacity provisioning in medical supply chains and dynamic inventory management for perishable goods such as N95 masks. While part of our research will directly help combat the current crisis, the rest will forge operational resilience and enhance the readiness to fight a potential future outbreak. The team is multidisciplinary and combines expertise in optimization and control, healthcare management, operations research, data analytics and game theory.

Subhonmesh Bose
Assistant Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

Anton Ivanov
Assistant Professor of Business Administration
University of Illinois at Urbana-Champaign

Ujjal Mukherjee
Assistant Professor of Business Administration
University of Illinois at Urbana-Champaign

Sridhar Seshadri
Alan J. and Joyce D. Baltz Endowed Professor of Business Administration
University of Illinois at Urbana-Champaign

Yuqian Xu
Assistant Professor of Business Administration and R.C. Evans Data Analytics Fellow
University of Illinois at Urbana-Champaign

Sebastian Souyris
Postdoctoral Research Associate, Department of Business Administration
University of Illinois at Urbana-Champaign






COVIDScholar: An NLP hub for COVID-19 research literature

University of California, Berkeley




The urgency of dealing with COVID-19 requires knowledge gathering and dissemination in novel ways. Building on our previous work using natural language processing (NLP) to extract latent knowledge from literature in the physical sciences, we have set out to apply similar techniques to the COVID-19 literature. As part of this effort we will build a knowledge portal tailored specifically for the needs of COVID-19 researchers that leverages state-of-the-art NLP techniques to synthesize the information spread across tens of thousands of emergent research articles, patents, and clinical trials into actionable insights and new knowledge. We plan to create the largest and most current database of research findings for COVID-19 related work, automatically updated to remain current, and make this text data easily accessible to the research community. Moreover, we will use NLP to design unique search tools powered by custom machine learning models that allow researchers to engage with the literature more effectively and complete their research faster. Using our team’s expertise in large-scale data acquisition from diverse sources, natural language processing, and genomics, and engaging deeply with the biology and genomics community for feedback and guidance, we have already made significant progress towards this goal, letting users search within more COVID-related literature than any other repository through our prototype website (covidscholar.com).

Gerbrand Ceder
Daniel M. Tellep Distinguished Professor in Engineering
University of California, Berkeley

Kristin Persson
Associate Professor of Materials Science and Engineering
University of California, Berkeley

Marcin P. Joachimiak
Staff Researcher and Developer
Lawrence Berkeley National Laboratory






Housing Precarity, Eviction, and Inequality in the Wake of COVID-19

University of California, Berkeley




Ensuring housing security is vital to mitigating the spread of the COVID-19 virus and sustaining health, economic security, and family stability. This joint, interdisciplinary project will bring together academics and data scientists to track, analyze, and respond to pandemic-driven spikes in eviction and displacement risks. Doing so requires development of: 1) an innovative system for tracking real-time eviction filings after the outbreak; and 2) a housing precarity risk model using machine learning, to better analyze and predict areas at disproportionate risk of displacement in the wake of the COVID-19 pandemic. This project will provide major new sources of data and inform research and public policy regarding U.S. housing and inequality.

Karen Chapple
Professor and Chair of the Department of City and Regional Planning
University of California, Berkeley

Matthew Desmond
Maurice P. During Professor of Sociology
Princeton University

Joshua Blumenstock
Assistant Professor, School of Information
University of California, Berkeley






Reinforcement Learning to Safeguard Schools and Universities Against the COVID-19 Outbreak

Massachusetts Institute of Technology




The COVID-19 outbreak has disrupted normal activities in nearly all aspects of higher education. To reopen our universities, we need new technology and innovative practices to safeguard students against a potential second wave of the virus outbreak. In this proposal, we seek to develop analytical methods for modeling and mitigating the COVID-19 situation based on students’ location and symptom data collected via mobile apps. We adopt an optimal control approach and seek intervention policies that strike a balance between containing the virus and maintaining productive on-campus activities. This problem is highly challenging due to the prevalence of hidden states, unknown dynamics, and high dimensionality. By leveraging recent advances in system identification, reinforcement learning, and adaptive control, we will develop predictive methods to infer the hidden health states of individual students and develop algorithms to recommend optimal interventions (e.g., testing and quarantine) for decision makers. We will develop simplified models to assess the impact of such policies on the stability of the system captured in the growth rate of infections. The methods will be validated using simulation and available data. We expect to apply and further develop the methods to analyze real campus data from MIT in the fall semester of 2020. By using the computing capabilities of the C3.ai Suite and Microsoft Azure Cloud, we expect to analyze large volumes of location data as they are collected and adapt the intervention policy. We will make our research outcomes, including software, non-confidential data sets, and analysis, shareable on the C3.ai platform.

Munther Dahleh
William A. Coolidge Professor, Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Mengdi Wang
Associate Professor, Center for Statistics and Machine Learning, Electrical Engineering
Princeton University

Anette Hosoi
Associate Dean of Engineering, Neil and Jane Pappalardo Professor of Mechanical Engineering
Massachusetts Institute of Technology






Improving Fairness & Equity in COVID-19 Policy Applications of Machine Learning

Carnegie Mellon University




As governments and social service providers attempt to understand the COVID-19 pandemic (including the significant and asymmetrical health, social, and economic risks to their constituents) and plan for the future through acquiring and allocating scarce resources, AI researchers and practitioners have been developing detection, forecasting, and mitigation tools to support those efforts. When policy planning and resource allocation decisions are made using these AI methods, there is a risk that they could result in inequitable and unfair outcomes for vulnerable populations. Disparate impacts of the COVID-19 pandemic on racial minorities and economically disadvantaged populations are already evident, and the risk that applications of AI could worsen these disparities is substantial. This proposal is focused on developing bias detection/audit, reduction, and mitigation methods and tools to ensure that the policy actions taken using AI and ML reduce the risk of inequitable outcomes for vulnerable populations. While our work will be broadly applicable, we focus on four use-cases: 1) COVID-19 forecasting to improve policy decision-making, 2) identifying individuals in California facing social and economic challenges due to the epidemic, 3) understanding potential disparities in the use of contact tracing and immunity passport technologies, and 4) mental health interventions to break the cycle of incarceration in Kansas.

Rayid Ghani
Distinguished Career Professor in the Machine Learning Department and the Heinz College of Information Systems and Public Policy
Carnegie Mellon University

Kit Rodolfa
Research Project Scientist, Machine Learning Department
Carnegie Mellon University

Aziz Huq
Frank and Bernice J. Greenberg Professor of Law
University of Chicago Law School

Ryan Tibshirani
Associate Professor, Department of Statistics and Machine Learning Department
Carnegie Mellon University






Machine Learning Based Vaccine Design and HLA Based Risk Prediction for Viral Infections

Massachusetts Institute of Technology




We will develop and apply new methods for vaccine design and viral disease severity prediction based upon our recent developments in the prediction of the presentation of viral antigens by Class I and Class II MHC complexes. These methods will be implemented on the C3 AI Suite and other platforms, and use cloud resources for vaccine combinatorial optimization that involves considering upwards of 39,000 peptides to select a compact set for vaccine formulation. A benefit of our approach is that it can be targeted to any virus given its genome sequence. While our current work is based upon the development of vaccines and risk models for COVID-19, it is directly applicable to new viral strains. Given the potential escape of SARS-CoV-2 from vaccination by mutation, it is valuable to have methods to rapidly develop new vaccines and predict the severity of evolved viruses based on an individual’s genotype. Our efforts are organized in two specific aims. We will develop a new computational platform for selecting epitopes that will be focused on stimulating both CD4+ and CD8+ T cell responses to viral infection based upon the combinatorial optimization of peptide display (Aim 1). We will develop a risk model of viral disease severity based upon an individual’s HLA type, which determines what viral components will be displayed to their adaptive immune system (Aim 2).

David Gifford
Professor of Electrical Engineering and Computer Science & Professor of Biological Engineering
Massachusetts Institute of Technology






Medical Imaging Domain-Expertise Machine Learning for Interrogation of COVID

University of Chicago




The COVID-19 pandemic represents a pressing public health need for computational techniques to augment the interpretation of medical images in their role for: 1) surveillance, detection, and triaging of COVID-19 medical images given potential resurgence; 2) differential diagnosis of COVID-19 patients; and 3) prognosis, as well as prediction and monitoring of treatment response, to help in patient management. While thoracic imaging, including chest radiography and computed tomography (CT), is being re-examined for its role in patient management, the limitations for improved interpretation are partially due to the qualitative interpretation of the images, and thus this project’s aim is to develop machine intelligence methods to aid in the interrogation of medical images from COVID-19 patients. Successful completion of the research will demonstrate cascade-based deep transfer learning between similar but different thoracic disease states (e.g., interstitial diseases to COVID-19) and a clinical tool to aid in the triaging of COVID-19 patients in terms of detection, treatment planning, and monitoring.

Maryellen L. Giger
A.N. Pritzker Professor of Radiology, Committee on Medical Physics, and the College
University of Chicago

Jonathan Chung
MD, Vice Chair of Quality and Section Chief of Cardiopulmonary Imaging
University of Chicago

Samuel Armato
Associate Professor of Radiology
University of Chicago

Ravi Madduri
Computational Scientist in the Mathematics and Computer Science Division
Argonne National Laboratory
Senior Research Fellow
Computation Institute at University of Chicago

Hui Li
Research Associate Professor
University of Chicago




Colloquium on Digital Transformation Science, July 9, 2020



Scoring Drugs: Small Molecule Drug Discovery for COVID-19 using Physics-Inspired Machine Learning

University of California, Berkeley




The rapid spread of SARS-CoV-2 has spurred the scientific world into action for therapeutics to help minimize fatalities from COVID-19. Molecular modeling is combating the current global pandemic through the traditional process of drug discovery, but the slow turnaround time for identifying leads for antiviral drugs, analyzing structural effects of genetic variation in the evolving virus, and targeting relevant virus-host protein interactions is still a great limitation during an acute crisis. The first component of drug discovery, the structure of potential drugs and the target proteins, has driven functional insight into biology ever since Watson, Crick, Franklin, and Wilkins solved the structure of DNA. What could we do with structural models of host and virus proteins and small molecule therapeutics? We can further enrich structure with dynamics for discovery of new surface sites exposed by fluctuations to bind drugs and peptide therapeutics not revealed by a static structural model. These “cryptic” binding sites offer new leads in drug discovery but will only yield fruit if they can be assessed rapidly for binding affinity for new small molecule drugs. We offer physics-inspired data-driven models to: 1) extend the chemical space of new drugs beyond those available; 2) create reliable scoring functions to evaluate drug binding affinities to cryptic binding sites of COVID-19 targets; 3) accelerate computation of binding affinities by training machine learning models; and 4) close the loop of design and evaluation to bias the distribution of new drug candidates towards desired metrics enabled by the C3 AI Suite.

Teresa Head-Gordon
Chancellor’s Professor, Department of Chemistry, Chemical and Biomolecular Engineering, and Bioengineering
University of California, Berkeley

Rommie Amaro
Distinguished Professor in Theoretical and Computational Chemistry, Department of Chemistry and Biochemistry
University of California, San Diego






Modeling the Impact of Social Determinants of Health on COVID-19 Transmission and Mortality to Understand Health Inequities

University of Chicago




The COVID-19 pandemic has highlighted drastic health inequities, which are particularly pronounced in cities such as Chicago, Detroit, New Orleans, and New York City. Reducing COVID-19 morbidity and mortality will likely require increased focus on social determinants of health given their disproportionate impact on populations most heavily affected by COVID-19. A better understanding of how factors such as financial hardship, housing instability, health care access, and incarceration contribute to COVID-19 transmission and mortality is needed to inform policies around social distancing and testing and vaccination scale-up. This proposal will build upon an existing agent-based model of COVID-19 transmission (CityCOVID) for the city of Chicago. Using multiple sources of existing data, including local COVID-19 contact tracing surveys and public health surveillance, we will apply machine learning methods to quantify the impact of social determinants of health on COVID-19 transmission dynamics and generate a more granular synthetic population with which to evaluate intervention approaches. The extended CityCOVID model will provide a more realistic model to guide local policy and intervention development.

Anna Hotton
Research Assistant Professor, Department of Medicine
University of Chicago

Aditya Khanna
Research Assistant Professor, Director of Network Modeling
University of Chicago

Jonathan Ozik
Computational Scientist
Argonne National Laboratory

Charles Macal
Senior Technical Advisor & Social, Behavioral, and Decision Science Group Leader
Argonne National Laboratory

Harold Pollack
Helen Ross Professor, School of Social Service Administration
University of Chicago

John Schneider
Professor of Medicine and Epidemiology
University of Chicago






Secure Federated Learning for Clinical Informatics with Applications to the COVID-19 Pandemic

University of Illinois at Urbana-Champaign




Enabling health care providers to respond faster and with greater precision to pandemics requires both advanced machine learning and quickly accessible clinical data. Yet, the necessary medical data is often inaccessible across hospitals due to privacy and intellectual property concerns. This proposal leverages distributed machine learning and modern cryptography to introduce a computational protocol and software tools for securely training machine learning models with data spread over several medical establishments, while preserving privacy and IP rights. Our scientific contributions include innovative techniques that trade off computation and communication to improve the predictive performance of federated learning in clinical settings and novel cryptographic techniques that trade off computation and robustness to enhance security. To complement our technical aims, we will develop open source software. We will evaluate our approach for COVID diagnosis using data available on the C3.ai Data Lake combined with clinical data from OSF HealthCare, to illustrate how private data can significantly improve prediction quality compared to public data alone. We also propose to serve as a hub for other C3 AI projects to enable the secure use of privately held clinical datasets, which will improve results by other teams. Our broader vision and objective are to provide Secure Federated Learning as a Service (FLaaS) freely available to any hospital during a declared crisis. We envision that a robust, secure federated learning system will enable fast responses to minimize the impact of disease in the earliest stages.

Sanmi Koyejo
Assistant Professor of Computer Science
University of Illinois at Urbana-Champaign

Dakshita Khurana
Assistant Professor of Computer Science
University of Illinois at Urbana-Champaign

William Bond
Director of Research, Jump Simulation
OSF HealthCare

Joerg Heintz
Lead, Healthcare Engineering Systems Center
University of Illinois at Urbana-Champaign

Roopa Foulger
Vice President, Data Delivery Healthcare Analytics
OSF HealthCare






Data-Driven, High-Dimensional Design for Trustworthy Drug Discovery

University of California, Berkeley




Machine learning-based predictive modeling tools have been applied to a wide variety of tasks in computational biology and chemistry, such as predicting protein binding and stability, small molecule antibiotic properties, synthesizability, and drug-likeness. However, when such data-driven models are used to produce new designs, they are likely to encounter a major challenge. Learning-based design involves optimizing over the input to a predictive model. For example, a model that predicts how well a small molecule binds to a particular drug target takes as input some representation of the molecule, and outputs the binding efficiency. Hence, finding the best small molecule involves performing an optimization over the input to the model when its output is fixed to be, say, as large as possible. We refer to this problem setting as “high-dimensional model inversion” (HDMI). Critically, by definition of the design problem, the predictive model will never have seen any molecules with precisely the desired property, and thus we are asking the model to extrapolate. What does it mean to extrapolate in this context? Can we extrapolate? How far can we extrapolate? How can we trust such decisions? We will develop a new formal framework and associated algorithms for solving HDMI with high capacity models such as neural networks and high-dimensional inputs, which will enable us to answer these questions. We will draw on ideas from learning-based decision making (reinforcement learning), robust uncertainty estimation, and probabilistic modelling. We will focus on data-driven drug design, including a collaboration toward developing a therapeutic for COVID-19.
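The extrapolation issue described above can be seen in a deliberately tiny sketch. A linear model is fitted on inputs in [0, 1], and gradient ascent is then run over the model's input. Without any constraint, the search walks far outside the training range; with a quadratic penalty that keeps the design near the training data, it stays close. The one-dimensional "molecule", the penalty form, and all function names are assumptions for illustration and do not represent the project's method.

```python
from typing import Callable, List, Tuple


def fit_linear(data: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Ordinary least squares for y ~ a*x + b; stands in for a trained predictor."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    a = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x, _ in data)
    b = my - a * mx
    return lambda x: a * x + b


def invert(model: Callable[[float], float], x0: float, center: float,
           penalty: float, lr: float = 0.1, steps: int = 500,
           eps: float = 1e-4) -> float:
    """Gradient ascent over the INPUT of a fixed model.

    The objective is the model's prediction minus penalty * (x - center)^2,
    a simple (assumed) way to discourage leaving the training region.
    The gradient is estimated by central finite differences.
    """
    def objective(x: float) -> float:
        return model(x) - penalty * (x - center) ** 2

    x = x0
    for _ in range(steps):
        grad = (objective(x + eps) - objective(x - eps)) / (2 * eps)
        x += lr * grad
    return x


# Hypothetical training set: inputs only cover the interval [0, 1].
train = [(0.0, 0.0), (0.25, 0.25), (0.5, 0.5), (0.75, 0.75), (1.0, 1.0)]
model = fit_linear(train)
center = sum(x for x, _ in train) / len(train)

x_free = invert(model, x0=0.5, center=center, penalty=0.0)   # unconstrained design
x_trust = invert(model, x0=0.5, center=center, penalty=1.0)  # penalized design
print(x_free, x_trust)
```

The unconstrained design ends up many times farther from the data than anything the model was trained on, so its predicted score there is pure extrapolation, while the penalized design stays at the edge of the observed range.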

Jennifer Listgarten
Professor of Electrical Engineering and Computer Science
University of California, Berkeley

Sergey Levine
Assistant Professor of Electrical Engineering and Computer Science
University of California, Berkeley






Algorithms and Software Tools for Testing and Control of COVID-19

University of Illinois at Urbana-Champaign




Extensive and ongoing testing of populations for viremia and antibodies is expected to play a major role in shaping and managing the process of reopening the states. The goal of this interdisciplinary project is to bring together epidemiologists, systems theorists, and data scientists to develop models, algorithms, and software tools to support the state-level PCR (polymerase chain reaction) and serological testing efforts. Specifically, the team will develop: 1) algorithms to assimilate real-time testing data into networked epidemiological models; and 2) mean-field type control strategies to inform and evaluate the effect of social distancing and other control measures on the progression of the disease. Successful completion of this project will result in epidemiological models that are more realistic along two dimensions: 1) they are able to assimilate noisy data from ongoing population-level testing; and 2) they include the effect of population-level feedback that may result from control measures. Such models are expected to be useful to inform testing guidelines, such as what groups to sample, with which tests, and with what frequency, and to better evaluate the effects of deploying control measures. The algorithms will be implemented as efficient, scalable, and open-source software and made available to policy makers and the public via an interactive website, assimilating daily observational data to generate real-time disease maps (with quantified uncertainty) and tools to allow simulation under different control policies. This will require substantial backend computation, with simulation and learning running on the C3.ai/Azure platform.

Prashant Mehta
Associate Professor of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

Tamer Başar
Swanlund Endowed Chair, CAS Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

Carolyn Beck
Professor of Industrial and Enterprise Systems Engineering
University of Illinois at Urbana-Champaign

Philip E. Paré
Assistant Professor of Electrical and Computer Engineering
Purdue University

Rebecca Smith
Assistant Professor of Epidemiology
University of Illinois at Urbana-Champaign

Matthew West
Associate Professor of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign






Machine Learning Support for Emergency Triage of Pulmonary Collapse in COVID-19

University of Chicago




In emergency rooms across the world, doctors facing bed shortages must decide if patients with suspected or confirmed COVID-19 are safe to go home or need hospital-level monitoring. The current state of medical knowledge is failing here: some patients in the hospital ultimately do not require advanced care, wasting beds; others are sent home, only to deteriorate rapidly. Our goal is to produce an algorithm that helps physicians make better triage decisions, by predicting pulmonary collapse on the basis of X-rays that nearly all patients with respiratory complaints get in the ER. Our in-depth discussions with frontline doctors treating COVID-19 have identified this as an area of genuine need. And thanks to our close relationship with one of the largest healthcare systems in the Northwest, we already have a signed agreement in place, with access to 4 million chest X-rays linked to physiological markers of pulmonary collapse: acute respiratory distress syndrome (ARDS), the “final common pathway” for many infections including COVID-19; hypoxia; and mortality from linked Social Security data. This enables the modern machine learning toolkit to be deployed, and complements our own collective expertise in medical decision making, machine learning, and understanding of healthcare systems and behavior. If successful, we will deploy the algorithm in our partner’s 51 hospital-based ERs. More broadly, our work is a general prediction toolkit for pulmonary collapse, meant to transfer across healthcare systems. We will thus provide pro bono consultation to health systems wishing to integrate the tool, and open-source the prototype algorithms we develop.

Sendhil Mullainathan
Roman University Professor of Computation and Behavioral Science
University of Chicago Booth School of Business

Aleksander Madry
Professor of Computer Science
Massachusetts Institute of Technology

Ziad Obermeyer
Acting Associate Professor of Health Policy and Management
University of California, Berkeley






Targeted Interventions in Networked and Multi-Risk SIR Models: How to Unlock the Economy During a Pandemic

Massachusetts Institute of Technology




The main objective of the proposed research is to study optimal lockdown and testing policies for the containment of disease spread in networked environments. Our motivation comes from three peculiar aspects of the COVID-19 pandemic. First, since a vaccine is not yet available, one needs to rely on virus containment strategies (such as lockdowns) as intervention tools, which have serious economic repercussions. Our first objective is to develop lockdown models that consider epidemic and economic aspects simultaneously. Since individuals in the population may have different risks and different productivity levels, this requires models that explicitly account for the heterogeneity of different groups and their network of interactions. Our second objective is to exploit these network models to study targeted interventions. In particular, we aim to understand the best policy for gradually reopening the economy, taking into account the role of the different groups in terms of their productivity and their risk level. Finally, since COVID-19 is a pandemic, interventions enacted by local governments will have ripple effects on neighboring states; hence coordination of efforts from different governments is of paramount importance. As the third objective, we aim to study how state interconnections and mobility patterns affect the optimal response to an epidemic in networked environments. Results from the proposed research will help leaders and decision makers understand how to optimally unlock the economy within a state and how to optimally coordinate efforts between states.
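To make the notion of a multi-group SIR model with targeted lockdowns concrete, here is a minimal sketch under stated assumptions: two population groups, a symmetric contact matrix, and a per-group lockdown level that scales down that group's contacts. The parameter values, the way lockdown enters the force of infection, and the function name are illustrative assumptions, and the economic side of the problem is left out.

```python
from typing import List, Tuple


def simulate(beta: float, gamma: float, contact: List[List[float]],
             s0: List[float], i0: List[float], lockdown: List[float],
             days: float, dt: float = 0.1) -> Tuple[List[float], List[float], List[float], float]:
    """Forward-Euler integration of a multi-group SIR model.

    s0 and i0 are fractions of the total population in each group.
    lockdown[g] in [0, 1] is the share of group g's contacts that is removed.
    Returns final S, I, R per group and the peak of total infections.
    """
    n = len(s0)
    s, i, r = list(s0), list(i0), [0.0] * n
    peak = sum(i)
    for _ in range(int(days / dt)):
        new_inf = []
        for g in range(n):
            # Force of infection on group g from every group h it meets.
            force = sum(contact[g][h] * (1 - lockdown[g]) * (1 - lockdown[h]) * i[h]
                        for h in range(n))
            new_inf.append(beta * s[g] * force * dt)
        for g in range(n):
            recovered = gamma * i[g] * dt
            s[g] -= new_inf[g]
            i[g] += new_inf[g] - recovered
            r[g] += recovered
        peak = max(peak, sum(i))
    return s, i, r, peak


# Hypothetical parameters: two equally sized groups with more contact within groups.
contact = [[1.0, 0.5], [0.5, 1.0]]
s0, i0 = [0.495, 0.495], [0.005, 0.005]

*_, peak_open = simulate(0.3, 0.1, contact, s0, i0, lockdown=[0.0, 0.0], days=300)
s, i, r, peak_targeted = simulate(0.3, 0.1, contact, s0, i0, lockdown=[0.5, 0.0], days=300)
total = sum(s) + sum(i) + sum(r)
print(f"peak without lockdown: {peak_open:.3f}, with targeted lockdown: {peak_targeted:.3f}")
```

Restricting only one group already lowers the infection peak in this toy setting. A policy study of the kind described above would weigh that reduction against each group's lost productivity.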

Asuman Ozdaglar
Distinguished Professor of Engineering; Department Head, Electrical Engineering and Computer Science; Deputy Dean of Academics, Schwarzman College of Computing
Massachusetts Institute of Technology

Daron Acemoglu
Institute Professor, Department of Economics
Massachusetts Institute of Technology

Francesca Parise
Postdoctoral Research Fellow, Laboratory for Information and Decision Systems
Massachusetts Institute of Technology






Bringing Social Distancing to Light: Crowd Management Using AI and Interactive Floor Projection

Princeton University




With the spread of COVID-19, social distancing has become an integral part of our everyday lives. Worldwide, efforts are focused on identifying ways to reopen public spaces, restart businesses, and reintroduce physical togetherness. We believe that architecture plays a key role in the return to a healthy public life by providing a means for controlling distances between people. Making use of computational processing power and data accessibility, we propose a multipronged approach that will promote healthy and efficient movement through public space. The goal of our research is to develop: 1) a computational tool that utilizes machine learning to predict people’s movement and provides suggestions for adapting existing spaces through local physical interventions; and 2) a physical intervention system based on light projections that provides direct real-time information about safe trajectories and movement behavior for pedestrians. The computational tool will use existing visual data from target case study spaces, identifying movement patterns and translating those into behavior rules. This data will be combined with swarm behavior knowledge from natural systems to provide an initial movement prediction. At the same time, the installation of the camera-projection system will allow us to evaluate the efficiency of the proposed measures, monitor flow, and inform the predictive model. Ultimately, we expect to identify strategies for efficient trajectory planning and repurposing of public space, while continually learning from their direct implementation. As such, we hope to identify novel spatial typologies pertaining to improved public health, resulting in planning rules that will reshape the built environment.

Stefana Parascho
Assistant Professor of Architecture
Princeton University

Corina Tarnita
Associate Professor of Ecology and Evolutionary Biology
Princeton University






Modeling and Control of COVID-19 Propagation for Assessing and Optimizing Intervention Policies

Princeton University



A key scientific goal concerning COVID-19 is to develop mathematical models that help us to understand and predict its spreading behavior, as well as to provide guidelines on what can be done to limit its spread. As such, this project pursues: 1) analysis and prediction of the spread of COVID-19 through a new mathematical model incorporating virus mutations; and 2) optimal and robust control of the spread of COVID-19 by carefully timed interventions. Expected outcomes could give authorities another tool to better assess the effectiveness of existing or potential countermeasures in limiting the spread of COVID-19. They could also help leaders assess the outcomes of eliminating existing countermeasures. Finally, they could help better prepare for different mutation scenarios, including worst cases (for the current or a future pandemic).

H. Vincent Poor
Michael Henry Strater University Professor of Electrical Engineering and Interim Dean, School of Engineering and Applied Science
Princeton University

Simon A. Levin
James S. McDonnell Distinguished University Professor in Ecology and Evolutionary Biology and Director of the Center for BioComplexity
Princeton University

Osman Yagan
Associate Research Professor of Electrical and Computer Engineering and Dean’s Early Career Fellow
Carnegie Mellon University

Joshua Plotkin
Walter H. and Leonore C. Annenberg Professor of the Natural Sciences
University of Pennsylvania





Spatial Modeling of COVID-19: Optimizing PDE and Metapopulation Models for Prediction and Spread Mitigation

University of Illinois at Urbana-Champaign




This research proposal offers a comprehensive approach to the spatial dynamics of COVID-19 based on partial differential equation and metapopulation models. We aim to fill the modeling gap between studies that describe the disease dynamics in detail but lack spatial dynamics, and those that, while spatial in nature, do not account for the intricacies of the COVID-19 disease. We will use a diverse array of techniques ranging from dynamic traffic assignment models, which will inform the links in the metapopulation model, to manifold learning techniques for model parameterization. We will fully utilize the C3.ai platform to manage and integrate data, implement code, and build a user interface to increase research outcome accessibility.

Zoi Rapti
Associate Professor, Department of Mathematics
University of Illinois at Urbana-Champaign

Yannis G. Kevrekidis
Professor, Department of Applied Mathematics and Statistics
Johns Hopkins University Whiting School of Engineering
Professor, Department of Urology
Johns Hopkins University School of Medicine

Panayotis G. Kevrekidis
Professor of Mathematics and Statistics
University of Massachusetts at Amherst

Eleftheria Kontou
Assistant Professor of Civil and Environmental Engineering
University of Illinois at Urbana-Champaign






Detection and Containment of Emerging Diseases Using AI Techniques

University of California, Berkeley




This proposal focuses on developing data-efficient and reliable algorithms for healthcare AI in response to emerging diseases such as COVID-19. In particular, we aim to: 1) develop machine learning methods targeting unseen disease types; 2) adapt existing medical AI tools to emerging diseases to rapidly mitigate the outbreak; 3) develop robust methods to deploy and proliferate new disease models to disparate healthcare institutions across geographies; and 4) utilize collaborative AI approaches that leverage experiences of medical professionals to enable rapid cycle development of emerging disease models. We will deploy our experience in developing algorithms for incipient fault detection in cyber-physical systems and in generative modeling for synthetic data generation in computer vision applications and interactive machine learning. Proposed healthcare AI models will be made robust and explainable for clinical use in collaboration with the School of Medicine, University of California, San Francisco (UCSF). We will make extensive use of the C3.ai Data Lake, along with additional data available from UCSF. Further, we intend to use the C3.ai Suite extensively to facilitate our algorithmic and code development work. We will make our results openly available to foster further research. We also plan to integrate our results in the C3.ai Suite to make it more powerful in AI applications to healthcare.

Alberto Sangiovanni Vincentelli
Edgar L. and Harold H. Buttner Chair of Electrical Engineering and Computer Sciences
University of California, Berkeley

Geoffrey Tison
Assistant Professor of Cardiology
School of Medicine, University of California, San Francisco

Yuxin Chen
Assistant Professor of Electrical Engineering
University of Chicago




Colloquium on Digital Transformation Science, July 9, 2020



COVID-19 Medical Best Practice Guidance System

University of Illinois at Urbana-Champaign




The surge of COVID-19 patients exceeds the available medical staff trained to care for them. To minimize the risk of preventable medical errors, we propose a medical best practice guidance system for COVID-19. Similar to how GPS calculates routes in real time, our medical guidance system will provide real-time treatment guidance based on patient conditions with explanations according to COVID-19 guidelines. We will also provide a training system, based on the same model. Collaborating with physicians from OSF Children’s Hospital of Illinois and the University of Chicago Medical School, we will create a guidance system for COVID-19 as a web-based service, backed by a mathematically verifiable computational pathophysiology model to improve the efficacy of medical interventions. We will first develop the real-time guidance for Acute Respiratory Distress Syndrome (ARDS), as it is the most complex and deadliest phase of COVID-19 pneumonia. We have developed a simplified prototype for screening and management of ARDS. Next, we will add a COVID-19 cardiopulmonary resuscitation guidance module, followed by tuning, integration, and consistency checking. Our guidance system will be reviewed and clinically validated by our collaborating hospitals before patient use. The training system will be reviewed and deployed first. Both systems and the verifier will use the C3.ai platform.

Lui Sha
Donald B. Gillies Chair in Computer Science
University of Illinois at Urbana-Champaign

Maryam Rahmaniheris
Postdoctoral Research Associate, Cyber Physical Systems Integration Lab
University of Illinois at Urbana-Champaign

Grigore Rosu
Professor of Computer Science
University of Illinois at Urbana-Champaign

Paul M. Jeziorczak
Pediatric Surgeon
OSF HealthCare Children’s Hospital of Illinois
Clinical Assistant Professor of Surgery
University of Illinois College of Medicine, Peoria

Priti Jani
Assistant Professor of Pediatrics
University of Chicago






AI Enabled Deep Mutational Scanning of Interaction between SARS-CoV-2 Spike Protein S and Human ACE2 Receptor

University of Illinois at Urbana-Champaign




We employ a recently developed platform, TLmutation, which could enable rapid investigation of the sequence-structure-function relationship between SARS-CoV-2 Spike Protein S and Human ACE2 Receptor. In particular, we employ a transfer learning approach to generate high-fidelity scans from noisy experimental data, and to transfer knowledge from single amino-acid substitution data to generate higher-order mutational scans. Using deep mutagenesis, variants of ACE2 will be identified with increased binding to the receptor binding domain of S at a cell surface. In our preliminary results, we identify mutations across the interface and also at buried sites where they are predicted to enhance folding and presentation of the interaction epitope. The mutational landscape offers a blueprint for engineering high-affinity ACE2 receptors to meet this unprecedented challenge. We plan to use the information from the preliminary mutational landscape to generate higher-order mutations in the ACE2 receptor that could enhance binding to the S protein and help in the design of future vaccines for treatment of SARS-CoV-2. We also aim to investigate this problem using distributed computing approaches to understand the underlying physics of protein S and ACE2. In particular, we aim to perform molecular dynamics simulations to identify thermodynamic interactions that could enhance ACE2 binding. Our preliminary results show that ACE2 variants identified from the deep mutational scan not only stabilize the structural fluctuations but also strongly couple the motions of the two proteins. These simulations would be performed using Microsoft Azure and NCSA Blue Waters.

Diwakar Shukla
Blue Waters Assistant Professor, Department of Chemical and Biomolecular Engineering
University of Illinois at Urbana-Champaign

Erik Procko
Professor of Biophysics and Quantitative Biology
Assistant Professor of Biochemistry
University of Illinois at Urbana-Champaign