The Digital Transformation Institute has selected 26 research proposals that advance Digital Transformation Science to mitigate COVID-19 and future pandemics.

A total of $5.4 million and access to the C3 AI Suite and Microsoft Azure computing and storage have been awarded to support the following multidisciplinary projects.

On June 18, 2020, a virtual roundtable was held with some of the COVID-19 researchers; a short video from the roundtable can be found below. The full video of the roundtable is here.

DTI Announces 1st Grant Winners for Using AI to Mitigate COVID-19 and Future Pandemics

Adding Audio-visual Cues to Signs and Symptoms for Triaging Suspected or Diagnosed COVID-19 Patients

The COVID-19 pandemic has placed unprecedented stress on hospital capacity. Increased emergency department (ED) patient volumes and admission rates have led to a scarcity of beds and the need to construct field hospitals in some regions. Bed-sparing protocols that identify COVID-19 patients stable for discharge from the ED or early hospital discharge have proven elusive given this population’s propensity to rapidly deteriorate up to one week after illness onset. Consequently, a significant number of stable patients are unnecessarily admitted to the hospital, while some discharged patients decompensate at home and subsequently require emergency transport to the ED. In order to conserve hospital beds, improved methods for assessing the clinical stability of COVID-19 patients are critical. Our goal is to develop audio-visual tools to predict cardiopulmonary decompensation from facial videos captured from EDs and homes through telemedicine platforms. We will use explainable artificial intelligence (AI) and machine learning algorithms to derive criteria for predicting impending deterioration from health-relevant audio-visual features and provide explanations in terms of the clinical details within the electronic medical record. Successful completion of this project will provide the groundwork for prospective evaluation of these tools in a COVID-19 patient population. Once validated, these tools will augment provider clinical assessments of COVID-19 patients both at the bedside and across telemedicine platforms during virtual follow-up. More broadly, the techniques and algorithms developed in this project are likely to be applicable to other high-risk patient populations and emerging platforms such as telemedicine.

Narendra Ahuja
Research Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

David Beiser
Associate Professor of Medicine
University of Chicago

David Chestek
Assistant Professor of Clinical Emergency Medicine
University of Illinois, Chicago

Mark Hasegawa-Johnson
Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

Jerry Krishnan
Associate Vice Chancellor for Population Health Sciences
University of Illinois, Chicago

Arun Singh
National Advisor
All India Institute of Medical Sciences, Jodhpur

Mining Diagnostics Sequences for SARS-CoV-2 Using Variation-aware, Graph-based Machine Learning Approaches Applied to SARS-CoV-1, SARS-CoV-2, and MERS Datasets

On March 11, 2020, the World Health Organization determined that an outbreak of a novel coronavirus from Wuhan, China had reached pandemic status. Deep meta-transcriptomic RNA sequencing of bronchoalveolar lavage fluid samples from COVID-19 patients hospitalized in Wuhan in December 2019 revealed sequence similarity to SARS-like coronaviruses. A member of this genus, Betacoronavirus, was the viral etiologic agent of the 2002-2003 SARS outbreak in humans (i.e., SARS-CoV-1). Rapid and precise bacterial and viral diagnostics are extremely important in multiple clinical settings, ranging from regular visits to quick epidemic responses. This is especially relevant given the current COVID-19 outbreak, caused by the SARS-CoV-2 coronavirus. This project aims to use human and viral whole transcriptome analysis (RNA-Seq) and genomic datasets to identify SARS-CoV-2 “within-host” polymorphisms that may interfere with diagnostic platforms, and to develop novel graph-based approaches to study co-occurrence patterns for both consensus-level and low-frequency variants. We will compare these results to SARS-CoV-1 and MERS genomic data to glean population-level differences, elucidate biologically relevant differences specific to SARS-CoV-2, and allow for sensitive and accurate identification and transmission analysis of SARS-CoV-2.

Nancy Amato
Abel Bliss Professor of Engineering
University of Illinois at Urbana-Champaign

Lawrence Rauchwerger
Professor of Computer Science
University of Illinois at Urbana-Champaign

Todd Treangen
Assistant Professor of Computer Science
Rice University

Pandemic-resilient Urban Mobility: Learning Spatiotemporal Models for Testing, Contact Tracing, and Reopening Decisions

This project focuses on the design of actionable information and effective intervention strategies to support safe mobilization of economic activity and reopening of mobility services in urban systems. We develop a spatiotemporal modeling, inference, and stochastic control framework to: 1) capture the dynamic interplay between the epidemiological state of different populations in an urban region and their mobility patterns; 2) estimate the local exposure rates and fractions of infections within populations using fine-grained contact tracing data, testing data, and occupancy measurements of key facilities/services; 3) compare and evaluate different response strategies (testing rates, partial capacity or access restrictions, and social distancing guidelines) in selected urban regions, in particular, Boston and the San Francisco Bay Area; 4) design stochastic control strategies for increasing mobility in a phased manner by adaptively adjusting the operational capacity and testing rates. Our methodological approach is grounded in inference and learning-based control of Marked Temporal Point Processes (MTPPs) defined on a mobility network, and enables both qualitative and quantitative evaluation of contact tracing, testing, and response strategies. We will advance the inference algorithms for MTPPs by investigating both likelihood-based and likelihood-free approaches to learn model parameters from heterogeneous data. Our approach is useful for predicting the intensities of exposure and infections in different regions, conditioned on specific response strategies. Furthermore, we will design learning-based control algorithms for MTPPs to compute the optimal recovery and testing rates, which maximize the gains from sectoral reopening while limiting the risk of exposure.

Saurabh Amin
Robert N. Noyce Career Development Associate Professor of Civil and Environmental Engineering
Massachusetts Institute of Technology

Patrick Jaillet
Dugald C. Jackson Professor of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Jitendra Malik
Arthur J. Chick Professor of Electrical Engineering and Computer Sciences
University of California, Berkeley

Effective cocktail treatments for SARS-CoV-2 based on modeling lung single-cell response data

Fully modeling the impact of SARS-CoV-2 requires the integration of several different types of molecular and cellular data. SARS-CoV-2 is known to primarily impact cells via two viral entry factors, ACE2 and TMPRSS2. However, much less is currently known about the virus activity within lung cells. To model host response to viral infection, and to develop potential treatments, we will extend methods based on Continuous State Hidden Markov models (CSHMM) and further combine them with additional graphical models and graph analysis algorithms. Combined, these methods would allow us to reconstruct pathways leading from virus proteins via their host interactions to regulators and finally to the observed expression profiles in each cell. Such models identify not just the optimal individual targets but also combinations of targets that, together, lead to a large decrease in viral loads. In parallel, we will extend ChemProp, our deep learning method that learns to relate molecules and their associated activity values, in three ways: by developing advanced representations involving optimal transport and molecular prototypes, by accumulating evidence about each compound’s profile from multiple sources using transfer learning, and by extending it to learn the activity of compound mixtures to treat multiple targets. Using our recently developed protocols, we will differentiate human induced pluripotent stem cells to lung cells and perform time series single cell expression studies to profile cells infected with SARS-CoV-2. Following modeling, predicted CSHMM targets and their predicted ChemProp compounds will be experimentally validated to identify treatments that reduce viral loads in lung cells.

Ziv Bar-Joseph
FORE Systems Professor of Computer Science
Carnegie Mellon University

Regina Barzilay
Delta Electronics Professor of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Tommi Jaakkola
Thomas Siebel Professor of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Darrell Kotton
Professor of Medicine and Pathology
Boston University School of Medicine

Using Data Science to Understand the Heterogeneity of SARS-CoV-2 Transmission and COVID-19 Clinical Presentation in Mexico

In late April 2020, Mexico confirmed 15,529 positive cases of COVID-19 within its borders and the Mexican government announced the start of “Phase 3” of the pandemic, acknowledging widespread community transmission, thousands of cases of infection, and increased numbers of patients requiring hospitalization. The epidemic continues to grow rapidly in Mexico. There is a critical need in the country to use data science to advance COVID-19 prevention and treatment, and to support policy response. Country-level data and cooperation are essential to curbing the global pandemic and to supporting binational/cross-border relations with the U.S. and its Latin American neighbors. The health data available to our team via the Mexican Social Security Institute/Instituto Mexicano del Seguro Social (IMSS) are vast. These assets, combined with the computational and data management strengths of the tools and data lake and the disciplinary strengths of the binational team, create an unprecedented opportunity to analyze a multitude of clinical, individual, facility, and structural determinants of exposure and susceptibility to SARS-CoV-2, and the determinants of effective health services responses to the pandemic. This research project will reveal predictors of SARS-CoV-2 infection and severity that will help guide prevention efforts, allowing Mexico to focus treatment on those at greatest risk. The findings should also generate hypotheses about mechanisms that might improve COVID-19 patient outcomes. We can also estimate disease burden and economic impact, and provide a best practice model for other countries with similar data systems to guide their policy decisions around pandemic management.

Stefano Bertozzi
Dean Emeritus and Professor of Health Policy and Management
University of California, Berkeley

Ziad Obermeyer
Acting Associate Professor of Health Policy and Management
University of California, Berkeley

Alan Hubbard
Professor of Biostatistics
University of California, Berkeley

Juan Pablo Gutierrez
Professor, Center for Policy, Population & Health Research
School of Medicine, UNAM

Gustavo Olaiz
Professor and Coordinator General, Center for Policy, Population & Health Research
School of Medicine, Universidad Nacional Autónoma de México (UNAM)

Alberto Rascón
Coordinator of Epidemiological Surveillance, Division of Health Services
Mexican Institute of Social Security

Toward analytics-based clinical and policy decision support to respond to the COVID-19 pandemic

The COVID-19 pandemic creates unprecedented challenges for healthcare providers and policy makers. How to triage patients when healthcare resources are limited? Whom to test? And how to design social distancing policies to contain the disease and its socioeconomic impact? Analytics can provide data-driven answers. We have collected comprehensive data from hundreds of clinical studies, case counts, and hospital collaborations. We have developed a new epidemiological model of the disease’s dynamics, a machine-learning model of mortality risk, and a resource allocation model, published online. In this project, we develop automated, interpretable, and scalable decision-making systems based on machine learning and artificial intelligence (ML/AI) to support clinical practices and public policies as they respond to the COVID-19 pandemic. We tackle four research questions: 1) How can we predict admissions in intensive care units (ICU) using machine learning? 2) How does COVID-19 impact different demographic and socioeconomic populations? 3) How does mobility impact the disease’s spread, and how can social distancing policies be optimized? 4) How can COVID-19 tests be augmented with data-driven warnings that identify high-risk subjects? This project leverages large-scale datasets (from C3 AI and our own collection efforts), high-performance computing (using the C3 AI Suite), and advanced ML/AI. Specifically, this project develops end-to-end ML/AI methods, spanning epidemiological modeling (to model the disease’s spread), machine learning (to predict ICU admissions and test results), causal inference (to investigate disparities across populations), and optimal control (to support social distancing guidelines). We will disseminate results to healthcare providers, policy makers, researchers, and the public.

Dimitris Bertsimas
Associate Dean of Business Analytics and Boeing Professor of Operations Research
MIT Sloan School of Management

Alexandre Jacquillat
Assistant Professor of Operations Research and Statistics
MIT Sloan School of Management

Dynamic Resource Management in Response to Pandemics

Business-as-usual medical and public health practices cannot cope with the large shocks of a pandemic. Coordinated and well-timed response mechanisms are required. In this proposal, we aim to build a data analytic framework to optimize resource management for testing, prevention, and care, both prior to a potential spread and during a rapidly developing outbreak. The proposed research builds upon our current work to help the State of Illinois mitigate the ongoing COVID-19 outbreak. Key outcomes of our research include: 1) a risk-aware dynamic equipment and workforce allocation mechanism implemented on the C3 AI platform, 2) a game-theoretic comparison of federalized resource allocation to one in which states compete with one another for resources, 3) flexible capacity provisioning in medical supply chains and dynamic inventory management for perishable goods such as N95 masks. While part of our research will directly help combat the current crisis, the rest will forge operational resilience and enhance the readiness to fight a potential future outbreak. The multidisciplinary team combines expertise in optimization and control, healthcare management, operations research, data analytics, and game theory.

Subhonmesh Bose
Assistant Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

Anton Ivanov
Assistant Professor of Business Administration
University of Illinois at Urbana-Champaign

Ujjal Mukherjee
Assistant Professor of Business Administration
University of Illinois at Urbana-Champaign

Sridhar Seshadri
Alan J. and Joyce D. Baltz Endowed Professor of Business Administration
University of Illinois at Urbana-Champaign

Yuqian Xu
Assistant Professor of Business Administration
University of Illinois at Urbana-Champaign

COVIDScholar: An NLP hub for COVID-19 research literature

The urgency of dealing with COVID-19 requires novel ways of gathering and disseminating knowledge. Building on our previous work using natural language processing (NLP) to extract latent knowledge from literature in the physical sciences, we set out to apply similar techniques to COVID-19 literature. We will build a knowledge portal tailored for the needs of COVID-19 researchers that leverages state-of-the-art NLP techniques to synthesize information spread across tens of thousands of emergent research articles, patents, and clinical trials into actionable insights. We aim to create the largest and most current database of research findings for COVID-19-related work, automatically updated to remain current, and make text data easily accessible to the research community. Moreover, we will use NLP to design unique search tools powered by custom machine learning models that allow researchers to engage with the literature more effectively and complete research faster. Using the team’s expertise in large-scale data acquisition from diverse sources, NLP, and genomics, and engaging deeply with biology and genomics communities for feedback and guidance, we have already made significant progress towards this goal, letting users search within more COVID-related literature than any other repository through our prototype website.

Gerbrand Ceder
Daniel M. Tellep Distinguished Professor in Engineering
University of California, Berkeley

Kristin Persson
Associate Professor of Materials Science and Engineering
University of California, Berkeley

Marcin P. Joachimiak
Staff Researcher and Developer
Lawrence Berkeley National Laboratory

Housing Precarity, Eviction, and Inequality in the Wake of COVID-19

Ensuring housing security is vital to mitigating the spread of the COVID-19 virus and sustaining health, economic security, and family stability. This joint interdisciplinary project will bring together academics and data scientists to track, analyze, and respond to pandemic-driven spikes in eviction and displacement risks. Doing so requires development of: 1) an innovative system for tracking real-time eviction filings after the outbreak; and 2) a housing precarity risk model using machine learning, to better analyze and predict areas at disproportionate risk of displacement in the wake of the COVID-19 pandemic. This project will provide major new sources of data and inform research and public policy regarding U.S. housing and inequality.

Karen Chapple
Professor and Chair of the Department of City and Regional Planning
University of California, Berkeley

Matthew Desmond
Maurice P. During Professor of Sociology
Princeton University

Joshua Blumenstock
Assistant Professor, School of Information
University of California, Berkeley

Reinforcement Learning to Safeguard Schools and Universities Against the COVID-19 Outbreak

The COVID-19 outbreak has disrupted normal activities in nearly all aspects of higher education. To reopen our universities, we need new technology and innovative practices to safeguard students against a potential second wave of the virus outbreak. In this proposal, we seek to develop analytical methods for modeling and mitigating the COVID-19 situation based on students’ location and symptom data collected via mobile apps. We adopt an optimal control approach and seek intervention policies that strike a balance between containing the virus and maintaining productive on-campus activities. This problem is highly challenging due to the prevalence of hidden states, unknown dynamics, and high dimensionality. By leveraging recent advances in system identification, reinforcement learning, and adaptive control, we will develop predictive methods to infer the hidden health states of individual students and develop algorithms to recommend optimal interventions (e.g., testing and quarantine) for decisionmakers. We will develop simplified models to assess the impact of such policies on the stability of the system, as captured in the growth rate of infections. The methods will be validated using simulation and available data. We expect to apply and further develop the methods to analyze real campus data from MIT in the fall of 2020. By using the computing capabilities of the C3 AI Suite and Microsoft Azure Cloud, we expect to analyze large volumes of location data as they are collected and adapt the intervention policy. We will make our research outcomes, including software, non-confidential data sets, and analysis, sharable on the C3 AI platform.

Munther Dahleh
William A. Coolidge Professor, Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Mengdi Wang
Associate Professor, Center for Statistics and Machine Learning, Electrical Engineering
Princeton University

Anette Hosoi
Associate Dean of Engineering, Neil and Jane Pappalardo Professor of Mechanical Engineering
Massachusetts Institute of Technology

Improving Fairness & Equity in COVID-19 Policy Applications of Machine Learning

As governments and social service providers attempt to understand the COVID-19 pandemic – including the significant and asymmetrical health, social, and economic risks to their constituents – and plan for the future by acquiring and allocating scarce resources, AI researchers have been developing detection, forecasting, and mitigation tools to support those efforts. When policy planning and resource allocation decisions are made using AI methods, there is a risk that they could result in inequitable and unfair outcomes for vulnerable populations. Disparate impacts of the COVID-19 pandemic on racial minorities and the economically disadvantaged are already evident, and the risk that these disparities could worsen through AI applications is substantial. This proposal focuses on building bias detection/audit, reduction, and mitigation methods and tools to ensure that policy actions taken using AI and ML reduce the risk of inequitable outcomes for vulnerable populations. While our work will be broadly applicable, we focus on four use-cases: 1) COVID-19 forecasting to improve policy decision-making, 2) identifying individuals in California facing social and economic challenges due to the epidemic, 3) understanding potential disparities in the use of contact-tracing and immunity passport technologies, and 4) mental health interventions to break the cycle of incarceration.

Rayid Ghani
Distinguished Career Professor in the Machine Learning Department and Heinz College of Information Systems and Public Policy
Carnegie Mellon University

Kit Rodolfa
Research Project Scientist, Machine Learning Department
Carnegie Mellon University

Aziz Huq
Frank and Bernice J. Greenberg Professor of Law
University of Chicago Law School

Ryan Tibshirani
Associate Professor, Department of Statistics and Machine Learning Department
Carnegie Mellon University

Machine Learning-Based Vaccine Design and HLA-Based Risk Prediction for Viral Infections

We will develop and apply new methods for vaccine design and viral disease severity prediction based on our recent developments in the prediction of the presentation of viral antigens by Class I and Class II MHC complexes. These methods will be implemented on the C3 AI Suite and other platforms, and use cloud resources for vaccine combinatorial optimization that involves considering over 39,000 peptides to select a compact set for vaccine formulation. Our novel approach can be targeted to any virus given its genome sequence. While our current work is based upon the development of vaccines and risk models for COVID-19, it is directly applicable to new viral strains. Given the potential escape of SARS-CoV-2 from vaccination by mutation, it is valuable to have methods to rapidly develop new vaccines and predict the severity of evolved viruses based on individual genotypes. Aim 1: We will develop a new computational platform for selecting epitopes focused on stimulating both CD4+ and CD8+ T cell responses to viral infection based upon the combinatorial optimization of peptide display. Aim 2: We will develop a risk model of viral disease severity based on individual HLA types, which determine what viral components will be displayed to each individual’s adaptive immune system.

David Gifford
Professor of Electrical Engineering and Computer Science and Professor of Biological Engineering
Massachusetts Institute of Technology

Medical Imaging Domain-Expertise Machine Learning for Interrogation of COVID

The COVID-19 pandemic represents a pressing public health need for computational techniques to augment the interpretation of medical images in their role for: 1) surveillance, detection, and triaging of COVID-19 medical images given potential resurgence; 2) differential diagnosis of COVID-19 patients; and 3) prognosis, as well as prediction and monitoring of treatment response, to help in patient management. While thoracic imaging, including chest radiography and computed tomography (CT), is being re-examined for its role in patient management, the limitations for improved interpretation are partially due to the qualitative interpretation of the images. This project’s aim is therefore to develop machine intelligence methods to aid in the interrogation of medical images from COVID-19 patients. Successful completion of the research will demonstrate cascade-based deep transfer learning between similar but different thoracic disease states (e.g., interstitial diseases to COVID-19) and a clinical tool to aid in the triaging of COVID-19 patients in terms of detection, treatment planning, and monitoring.

Maryellen L. Giger
A.N. Pritzker Professor of Radiology, Committee on Medical Physics
University of Chicago

Jonathan Chung
MD, Vice Chair of Quality and Section Chief of Cardiopulmonary Imaging
University of Chicago

Samuel Armato
Associate Professor of Radiology
University of Chicago

Ravi Madduri
Computational Scientist, Mathematics and Computer Science Division
Argonne National Laboratory
Senior Research Fellow, Computation Institute
University of Chicago

Hui Li
Research Associate Professor
University of Chicago

Scoring Drugs: Small Molecule Drug Discovery for COVID-19 using Physics-Inspired Machine Learning

The rapid spread of SARS-CoV-2 has spurred the scientific world into action for therapeutics to help minimize fatalities from COVID-19. Molecular modeling is combating the global pandemic through the traditional process of drug discovery, but slow turnaround times for identifying leads for antiviral drugs, analyzing structural effects of genetic variation in the evolving virus, and targeting relevant virus-host protein interactions are a great limitation during an acute crisis. The first component of drug discovery, the structure of potential drugs and the target proteins, has driven functional insight into biology ever since Watson, Crick, Franklin, and Wilkins solved the structure of DNA. What could we do with structural models of host and virus proteins and small molecule therapeutics? We can further enrich structure with dynamics for discovery of new surface sites exposed by fluctuations to bind drugs and peptide therapeutics not revealed by a static structural model. These “cryptic” binding sites offer new leads in drug discovery but will only yield fruit if they can be assessed rapidly for binding affinity for new small molecule drugs. We offer physics-inspired, data-driven models to: 1) extend the chemical space of new drugs beyond those available; 2) create reliable scoring functions to evaluate drug binding affinities to cryptic binding sites of COVID-19 targets; 3) accelerate computation of binding affinities by training machine learning models; and 4) close the loop of design and evaluation to bias the distribution of new drug candidates towards desired metrics enabled by the C3 AI Suite.

Teresa Head-Gordon
Chancellor’s Professor, Department of Chemistry, Chemical and Biomolecular Engineering, and Bioengineering
University of California, Berkeley

Rommie Amaro
Distinguished Professor in Theoretical and Computational Chemistry, Department of Chemistry and Biochemistry
University of California, San Diego

Modeling the Impact of Social Determinants of Health on COVID-19 Transmission and Mortality to Understand Health Inequities

The COVID-19 pandemic highlights drastic health inequities, particularly in cities such as Chicago, Detroit, New Orleans, and New York City. Reducing COVID-19 morbidity and mortality will likely require increased focus on social determinants of health given their disproportionate impact on populations most heavily affected by COVID-19. We need a better understanding of how factors such as financial hardship, housing instability, health care access, and incarceration contribute to COVID-19 transmission and mortality to inform policies around social distancing and testing and vaccination scale-up. This proposal builds on an existing agent-based model of COVID-19 transmission (CityCOVID) for the city of Chicago. Using multiple sources of existing data, including local COVID-19 contact tracing surveys and public health surveillance, we will apply machine learning methods to quantify the impact of social determinants of health on COVID-19 transmission dynamics and generate a more granular synthetic population that we will use to evaluate intervention approaches. The extended CityCOVID model will provide a more realistic model to guide local policy and intervention development.

Anna Hotton
Research Assistant Professor, Department of Medicine
University of Chicago

Aditya Khanna
Research Assistant Professor, Director of Network Modeling
University of Chicago

Jonathan Ozik
Computational Scientist
Argonne National Laboratory

Charles Macal
Senior Technical Advisor and Social, Behavioral, and Decision Science Group Leader
Argonne National Laboratory

Harold Pollack
Helen Ross Professor, School of Social Service Administration
University of Chicago

John Schneider
Professor of Medicine and Epidemiology
University of Chicago

Secure Federated Learning for Clinical Informatics with Applications to the COVID-19 Pandemic

Enabling healthcare providers to respond faster and with greater precision to pandemics requires both advanced machine learning and quickly accessible clinical data. Yet the necessary medical data is often inaccessible across hospitals due to privacy and intellectual property concerns. This proposal leverages distributed machine learning and modern cryptography to introduce a computational protocol and software tools for securely training machine learning models with data spread over several medical establishments, while preserving privacy and IP rights. Our scientific contributions include innovative techniques to trade off computation and communication to improve the predictive performance of federated learning in clinical settings, along with novel cryptographic techniques to trade off computation and robustness to enhance security. We will develop open-source software to complement our technical aims. We will evaluate our approach for COVID-19 diagnosis using data available on the C3 AI Data Lake combined with clinical data from OSF HealthCare to illustrate how private data can significantly improve prediction quality compared to public data alone. We propose to serve as a hub for other C3 AI projects to enable the secure use of privately-held clinical datasets to improve results by other teams. Our broader vision is to provide Secure Federated Learning as a Service (FLaaS) freely available to any hospital during a declared crisis. We envision that a robust, secure federated learning system will enable fast responses to minimize the impact of disease in the earliest stages.

Sanmi Koyejo
Assistant Professor of Computer Science
University of Illinois at Urbana-Champaign

Dakshita Khurana
Assistant Professor of Computer Science
University of Illinois at Urbana-Champaign

William Bond
Director of Research, Jump Simulation
OSF HealthCare

Joerg Heintz
Assistant Director, Health Data Analytics
University of Illinois at Urbana-Champaign

Roopa Foulger
Vice President, Data Delivery Healthcare Analytics
OSF HealthCare

Data-Driven, High-Dimensional Design for Trustworthy Drug Discovery

Machine learning-based predictive modeling tools have been applied to a wide variety of tasks in computational biology and chemistry, such as predicting protein binding and stability, small molecule antibiotic properties, synthesizability, and drug-likeness. However, when such data-driven models are used to produce new designs, they are likely to encounter a major challenge. Learning-based design involves optimizing over the input to a predictive model. For example, a model that predicts how well a small molecule binds to a particular drug target takes as input some representation of the molecule, and outputs the binding efficiency. Hence, finding the best small molecule involves performing an optimization over the input to the model when its output is fixed to be, say, as large as possible. We refer to this problem setting as “high-dimensional model inversion” (HDMI). Critically, by definition of the design problem, the predictive model will never have seen any molecules with precisely the desired property, and thus we are asking the model to extrapolate. What does it mean to extrapolate in this context? Can we extrapolate? How far can we extrapolate? How can we trust such decisions? We will develop a new formal framework and associated algorithms for solving HDMI with high capacity models such as neural networks and high-dimensional inputs, which will enable us to answer these questions. We will draw on ideas from learning-based decision making (reinforcement learning), robust uncertainty estimation, and probabilistic modeling. We will focus on data-driven drug design, including a collaboration toward developing a therapeutic for COVID-19.
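The idea of optimizing over the input of a fixed predictive model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_binding_model` is a hypothetical stand-in for a trained predictor, its peak is a made-up value, and plain gradient ascent with finite differences is used only because it is simple, not because the proposal specifies it.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def toy_binding_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained predictor; higher means better binding."""
    peak = [1.0, -2.0, 0.5]  # made-up optimum, for illustration only
    return -sum((xi - pi) ** 2 for xi, pi in zip(x, peak))


def numerical_gradient(
    model: Callable[[Sequence[float]], float], x: Sequence[float], eps: float = 1e-5
) -> Vector:
    """Central-difference gradient of the model output with respect to its input."""
    grad = []
    for k in range(len(x)):
        up, down = list(x), list(x)
        up[k] += eps
        down[k] -= eps
        grad.append((model(up) - model(down)) / (2.0 * eps))
    return grad


def invert_model(
    model: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    step: float = 0.1,
    iters: int = 200,
) -> Vector:
    """Gradient ascent on the input while the model itself stays fixed."""
    x = list(x0)
    for _ in range(iters):
        grad = numerical_gradient(model, x)
        x = [xi + step * gi for xi, gi in zip(x, grad)]
    return x


start = [0.0, 0.0, 0.0]
design = invert_model(toy_binding_model, start)
print(design, toy_binding_model(design))
```

The sketch deliberately ignores the extrapolation problem raised above: with a real learned model, nothing guarantees that the predicted score remains trustworthy once the input moves away from the training data.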

Jennifer Listgarten
Professor of Electrical Engineering and Computer Science
University of California, Berkeley

Sergey Levine
Assistant Professor of Electrical Engineering and Computer Science
University of California, Berkeley

Algorithms and Software Tools for Testing and Control of COVID-19

Extensive and ongoing testing of populations for viremia and antibodies will play a major role in shaping and managing the process of reopening the states. This interdisciplinary project aims to bring together epidemiologists, systems theorists, and data scientists to develop models, algorithms, and software tools to support state-level PCR (polymerase chain reaction) and serological testing efforts. Specifically, the team will develop: 1) algorithms to assimilate real-time testing data into networked epidemiological models; and 2) mean-field type control strategies to inform and evaluate the effect of social distancing and other control measures on the progression of the disease. At its successful completion, this project will result in epidemiological models that are better and more realistic along two dimensions: 1) they assimilate noisy data from ongoing population-level testing; and 2) they include the effect of population-level feedback that may result as a consequence of control measures. Such models are expected to be useful to inform testing guidelines, including what groups to sample, with which tests, and at what frequency, and to better evaluate effects of deploying control measures. The algorithms will be implemented as efficient, scalable, open-source software and made available to policy makers and the public through an interactive website, assimilating daily observational data to generate real-time disease maps (with quantified uncertainty) and tools to allow simulation under different control policies. This will require substantial backend computation, with simulation and learning running on the C3 AI and Azure platforms.

Prashant Mehta
Associate Professor of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

Tamer Başar
Swanlund Endowed Chair, CAS Professor of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign

Carolyn Beck
Professor of Industrial and Enterprise Systems Engineering
University of Illinois at Urbana-Champaign

Philip E. Paré
Assistant Professor of Electrical and Computer Engineering
Purdue University

Rebecca Smith
Assistant Professor of Epidemiology
University of Illinois at Urbana-Champaign

Matthew West
Associate Professor of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

Machine Learning Support for Emergency Triage of Pulmonary Collapse in COVID-19

In emergency rooms across the world, doctors facing bed shortages must decide if patients with suspected or confirmed COVID-19 are safe to release or need hospital-level monitoring. The current state of medical knowledge is failing here: some patients in the hospital ultimately do not require advanced care, wasting beds; others are sent home, only to deteriorate rapidly. Our goal is to produce an algorithm that helps physicians improve triage decisions by predicting pulmonary collapse from the X-rays that most patients with respiratory complaints receive in the ER. Our discussions with frontline doctors treating COVID-19 identify this as an area of genuine need. Thanks to our close relationship with one of the largest healthcare systems in the Northwest, we already have a signed agreement in place to access 4 million chest X-rays linked to physiological markers of pulmonary collapse: acute respiratory distress syndrome (ARDS, the “final common pathway” for many infections including COVID-19), hypoxia, and mortality (from linked Social Security data). This enables the modern machine learning toolkit to be deployed, and complements our own collective expertise in medical decision-making, machine learning, and understanding healthcare systems and behavior. If successful, we will deploy the algorithm in our partner’s 51 hospital-based ERs. More broadly, our work is a general prediction toolkit for pulmonary collapse, designed to transfer across healthcare systems. We aim to provide pro bono consultation to health systems wishing to integrate the tool, and provide open-source algorithms for the prototypes we develop.

Sendhil Mullainathan
Roman University Professor of Computation and Behavioral Science, Booth School of Business
University of Chicago 

Aleksander Madry
Professor of Computer Science
Massachusetts Institute of Technology

Ziad Obermeyer
Blue Cross of California Distinguished Associate Professor of Health Policy and Management
University of California, Berkeley

Targeted Interventions in Networked and Multi-risk SIR Models: How to Unlock the Economy During a Pandemic

We propose to study optimal lockdown and testing policies for the containment of disease spread in networked environments. We are motivated by three peculiar aspects of the COVID-19 pandemic. First, without an available vaccine, the only available interventions are virus containment strategies (such as lockdowns), with their serious economic repercussions. Our first objective is to develop lockdown models that consider both epidemic and economic aspects. Since individuals have different risks and productivity levels, this requires models that account for the heterogeneity of different groups and networks of interactions. Our second objective is to exploit these network models to study targeted interventions. We aim to identify the best policy for gradually reopening the economy by considering the role of different groups in terms of their productivity and risk level. Pandemic interventions by local governments will have ripple effects on neighboring states, so coordinating efforts from different governments is critically important. The third objective is to study how state interconnections and mobility patterns affect optimal responses to epidemics in networked environments. Results from this research will help leaders and decision makers to understand how to unlock a state’s economy and how to optimally coordinate efforts between states.
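A multi-group SIR model with group-specific lockdowns can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `simulate_multigroup_sir`, the contact matrix, the rates, and the multiplicative way lockdown levels scale contacts are all hypothetical choices, not the project's model, and the economic side of the problem is left out entirely.

```python
from typing import List, Sequence, Tuple


def simulate_multigroup_sir(
    contact: Sequence[Sequence[float]],
    beta: float,
    gamma: float,
    s0: Sequence[float],
    i0: Sequence[float],
    lockdown: Sequence[float],
    days: float,
    dt: float = 0.1,
) -> Tuple[List[float], List[float], List[float]]:
    """Forward-Euler simulation of a multi-group SIR model.

    State values are fractions of each group's population. lockdown[g] in
    [0, 1] is the share of group g kept out of circulation (assumed form).
    """
    n = len(s0)
    s, i, r = list(s0), list(i0), [0.0] * n
    openness = [1.0 - level for level in lockdown]
    for _ in range(int(round(days / dt))):
        # Force of infection on each group, computed from the current state.
        force = [
            beta * openness[g]
            * sum(openness[h] * contact[g][h] * i[h] for h in range(n))
            for g in range(n)
        ]
        for g in range(n):
            infected = min(force[g] * s[g] * dt, s[g])
            recovered = gamma * i[g] * dt
            s[g] -= infected
            i[g] += infected - recovered
            r[g] += recovered
    return s, i, r


# Made-up parameters: two equally sized groups that mix with each other.
params = dict(
    contact=[[1.0, 0.5], [0.5, 1.0]],
    beta=0.3,
    gamma=0.1,
    s0=[0.99, 0.99],
    i0=[0.01, 0.01],
    days=200,
)
open_s, open_i, open_r = simulate_multigroup_sir(lockdown=[0.0, 0.0], **params)
targeted_s, targeted_i, targeted_r = simulate_multigroup_sir(lockdown=[0.0, 0.6], **params)
print(sum(open_s), sum(targeted_s))
```

Comparing the two runs shows how locking down only the second group changes the share of each group that remains uninfected, which is the kind of trade-off a targeted policy would weigh against productivity losses.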

Asuman Ozdaglar
Distinguished Professor of Engineering; Department Head, Electrical Engineering and Computer Science; Deputy Dean of Academics, Schwarzman College of Computing
Massachusetts Institute of Technology

Daron Acemoglu
Professor, Department of Economics
Massachusetts Institute of Technology

Bringing Social Distancing to Light: Crowd Management Using AI and Interactive Floor Projection

With the spread of COVID-19, social distancing has become an integral part of our everyday lives. Worldwide efforts are focused on identifying ways to reopen public spaces, restart businesses, and reintroduce physical togetherness. We believe that architecture plays a key role in the return to a healthy public life, by providing a means for controlling distances between people. Making use of computational processing power and data accessibility, we propose an approach to promote healthy and efficient movement through public space. The research aims to develop: 1) a computational tool utilizing machine learning to predict people’s movement and adapt spaces with local physical interventions; and 2) a physical intervention system based on light projections that provides direct real-time information about safe trajectories and movement behavior for pedestrians. The computational tool will use existing visual data from case-study spaces, identifying movement patterns and translating those into behavior rules. This data will be combined with swarm behavior knowledge from natural systems to provide initial movement predictions. Installing the camera-projection system will enable us to evaluate the efficiency of proposed measures, monitor flow, and inform the predictive model. Ultimately, we expect to identify strategies for efficient trajectory planning and repurposing of public space, while learning from their direct implementation. As such, we hope to identify novel spatial typologies pertaining to improved public health, resulting in planning rules that will reshape the built environment.

Stefana Parascho
Assistant Professor of Architecture
Princeton University

Corina Tarnita
Associate Professor of Ecology and Evolutionary Biology
Princeton University

Modeling and Control of COVID-19 Propagation for Assessing and Optimizing Intervention Policies

A key scientific goal regarding COVID-19 is to develop mathematical models that enable us to understand and predict its spreading behavior, as well as provide guidelines on what can be done to limit its spread. As such, this project pursues: 1) analysis and prediction of the spread of COVID-19 through a new mathematical model incorporating virus mutations; and 2) optimal and robust control of the spread of COVID-19 by carefully timed interventions. Expected outcomes could give authorities a tool to better assess the effectiveness of existing or potential countermeasures in limiting the spread of COVID-19. They could also help leaders assess outcomes of eliminating existing countermeasures. Finally, they could help prepare for different mutation scenarios, including worst cases for this or future pandemics.

H. Vincent Poor
Michael Henry Strater University Professor of Electrical Engineering
Interim Dean, School of Engineering and Applied Science
Princeton University

Simon A. Levin
James S. McDonnell Distinguished University Professor in Ecology and Evolutionary Biology, Director of the Center for BioComplexity
Princeton University

Osman Yagan
Associate Research Professor of Electrical and Computer Engineering, Dean’s Early Career Fellow
Carnegie Mellon University

Joshua Plotkin
Walter H. and Leonore C. Annenberg Professor of the Natural Sciences
University of Pennsylvania

Spatial Modeling of COVID-19: Optimizing PDE and Metapopulation Models for Prediction and Spread Mitigation

This research proposal offers a comprehensive approach to the spatial dynamics of COVID-19 based on partial differential equation and metapopulation models. We aim to fill the modeling gap between studies that describe the disease dynamics in detail but lack spatial dynamics, and those that, while spatial in nature, do not account for the intricacies of the COVID-19 disease. We will use a diverse array of techniques ranging from dynamic traffic assignment models, which will inform the links in the metapopulation model, to manifold learning techniques for model parameterization. We will use the C3 AI platform to manage and integrate data, implement code, and build a user interface to increase research outcome accessibility.

Zoi Rapti
Associate Professor, Department of Mathematics
University of Illinois at Urbana-Champaign

Yannis G. Kevrekidis
Professor, Department of Applied Mathematics and Statistics, Whiting School of Engineering 
Professor, Department of Urology, School of Medicine
Johns Hopkins University 

Panayotis G. Kevrekidis
Professor of Mathematics and Statistics
University of Massachusetts at Amherst

Eleftheria Kontou
Assistant Professor of Civil and Environmental Engineering
University of Illinois at Urbana-Champaign

Detection and Containment of Emerging Diseases Using AI Techniques

This proposal focuses on developing data-efficient and reliable algorithms for healthcare AI in response to emerging diseases such as COVID-19. In particular, we aim to: 1) develop machine learning methods targeting unseen disease types; 2) adapt existing medical AI tools to emerging diseases to rapidly mitigate the outbreak; 3) develop robust methods to deploy and proliferate new disease models to disparate healthcare institutions across geographies; and 4) use collaborative AI approaches that leverage experiences of medical professionals to enable rapid-cycle development of emerging disease models. We will deploy our experience in developing algorithms for incipient fault detection in cyber-physical systems and in generative modeling for synthetic data generation in computer-vision applications and interactive machine learning. Proposed healthcare AI models will be made robust and explainable for clinical use in collaboration with the School of Medicine, University of California, San Francisco. We will make extensive use of the C3 AI Data Lake, along with additional data available from UCSF. Further, we intend to use the C3 AI Suite to facilitate algorithmic and code development. We will make results openly available to foster further research. We also plan to integrate our results in the C3 AI Suite to make it more powerful in AI healthcare applications.

Alberto Sangiovanni Vincentelli
Edgar L. and Harold H. Buttner Chair of Electrical Engineering and Computer Sciences
University of California, Berkeley

Geoffrey Tison
Assistant Professor of Cardiology, School of Medicine
University of California, San Francisco

Yuxin Chen
Assistant Professor of Electrical Engineering
University of Chicago

COVID-19 Medical Best-Practice Guidance System

The surge of COVID-19 patients exceeds the available medical staff trained to care for them. To minimize risks of preventable medical errors, we propose a medical best-practice “guidance system” for COVID-19. Similar to how GPS calculates routes in real time, our medical guidance system will provide real-time treatment guidance based on patient conditions with explanations according to COVID-19 guidelines. We will also build a training system based on the same model. Collaborating with physicians from OSF Children’s Hospital of Illinois and the University of Chicago Medical School, we will create a guidance system for COVID-19 as a web-based service, backed by a mathematically verifiable computational pathophysiology model to improve the efficacy of medical interventions. We will first develop the real-time guidance for Acute Respiratory Distress Syndrome (ARDS), the most complex and deadliest phase of COVID-19 pneumonia. We have developed a simplified prototype for ARDS screening and management. Next, we will add a COVID-19 cardiopulmonary resuscitation guidance module, followed by tuning, integration, and consistency checking. Our guidance system will be reviewed and clinically validated by collaborating hospitals before patient use. The training system will be reviewed and deployed first. Both systems and the verifier will use the C3 AI platform.

Lui Sha
Donald B. Gillies Chair in Computer Science
University of Illinois at Urbana-Champaign

Grigore Rosu
Professor of Computer Science
University of Illinois at Urbana-Champaign

Paul M. Jeziorczak
Pediatric Surgeon
OSF HealthCare Children’s Hospital of Illinois
Clinical Assistant Professor of Surgery, College of Medicine
University of Illinois, Peoria

Priti Jani
Assistant Professor of Pediatrics
University of Chicago

AI-Enabled Deep Mutational Scanning of Interaction between SARS-CoV-2 Spike Protein S and Human ACE2 Receptor

We employ a recently developed platform, TLmutation, which could enable rapid investigation of the sequence-structure-function relationship between SARS-CoV-2 Spike Protein S and Human ACE2 Receptor. In particular, we employ a transfer learning approach to generate high-fidelity scans from noisy experimental data, and transfer the knowledge from single-point mutation data to generate higher-order mutational scans from the single amino-acid substitution data. Using deep mutagenesis, variants of ACE2 will be identified with increased binding to the receptor binding domain of S at a cell surface. In preliminary results, we identify mutations across the interface and also at buried sites where they are predicted to enhance folding and presentation of the interaction epitope. The mutational landscape offers a blueprint for engineering high-affinity ACE2 receptors to meet this unprecedented challenge. We plan to employ the information from the preliminary mutational landscape to generate high-order mutations in ACE2 receptors that could enhance binding to S protein and help in the design of future vaccines for treatment of SARS-CoV-2. We aim to investigate this using distributed computing approaches to understand the underlying physics of protein S and ACE2. In particular, we aim to perform molecular dynamics simulations to identify thermodynamic interactions to enhance the ACE2 binding. Preliminary results show that ACE2 variants identified from the deep mutational scan not only stabilize the structural fluctuations but also strongly couple the motions of the two proteins. These simulations would be performed using Microsoft Azure and NCSA Blue Waters resources.

Diwakar Shukla
Blue Waters Assistant Professor, Department of Chemical and Biomolecular Engineering
University of Illinois at Urbana-Champaign

Erik Procko
Professor of Biophysics and Quantitative Biology, Assistant Professor of Biochemistry
University of Illinois at Urbana-Champaign