Frontiers in Digital Transformation Science

Inaugural C3.ai Digital Transformation Institute Annual Research Symposium 2021

The annual C3.ai DTI Research Symposium brings together leaders and students of the new science of digital transformation from around the globe. This year’s inaugural symposium showcases leading research and applications of Artificial Intelligence and Machine Learning that are advancing the science of digital transformation, including the use of AI and ML to mitigate the COVID-19 pandemic.

January 20-21, 2021

Watch the recorded talks on our YouTube Channel.

Day One (Wednesday, Jan 20)

8:00 am – 8:05 am
Welcome and C3.ai DTI Overview

SPEAKER

Shankar Sastry (C3.ai DTI Co-Director, University of California, Berkeley) and R. Srikant (C3.ai DTI Co-Director, University of Illinois at Urbana-Champaign)

8:05 am – 8:15 am
Opening Remarks

SPEAKER

Thomas Siebel, Chairman and CEO, C3 AI

8:15 am – 9:00 am
Keynote: Modeling the Spread and Mitigation of COVID-19 in a Large Public University

ABSTRACT

This talk will describe agent-based models to predict the spread of COVID-19 at a large public university. The purpose of the modeling was to determine whether or not it is possible to reopen such a university without exponential growth of the epidemic on campus, and what surveillance testing and other non-pharmaceutical interventions are required to accomplish this goal. Distinctive features of this modeling approach are: (1) Physical modeling of transmission of SARS-CoV-2 by aerosols; (2) Representation of student behavioral heterogeneity through a range of infection zones, with differing transmission characteristics and risks; (3) Evaluation of infection through classroom exposure using a social network generated by the timetable of more than 45,000 students; (4) Estimation of the effective reproduction number achievable by a series of non-pharmaceutical interventions. The models show that as long as surveillance testing of the entire university community is performed at a high enough frequency, and compliance is high, the epidemic can be contained on campus with a moderate daily incidence rate. One of the outcomes of the model was the finding that the transmission of COVID-19 was strongly dependent on the time since infection to isolation, due to the viral dynamics in the host. We verified that inevitable delays in testing arising from logistical bottlenecks did not drive infection spikes, and that cases were correlated with social activity as reflected in the number of police reports from the previous week. Our work shows how a blend of physical, behavioral, and epidemic modeling can be used to predict trends and semi-quantitative features of COVID-19 spread in congregate settings such as a large public university.

SPEAKER

Nigel Goldenfeld, who holds a Swanlund Endowed Chair, is a Center for Advanced Study Professor in Physics, Director of the NASA Astrobiology Institute for Universal Biology, and leader of the Biocomplexity Group at the Institute for Genomic Biology, all at the University of Illinois at Urbana-Champaign. Nigel received his Ph.D. from the University of Cambridge (U.K.) in 1982, was a postdoctoral fellow at the Institute for Theoretical Physics, University of California at Santa Barbara, and co-founded NumeriX, a company that specializes in high-performance software for the derivatives marketplace. He has served on the editorial boards of several journals, including The Philosophical Transactions of the Royal Society and Physical Biology. Selected honors include: Alfred P. Sloan Foundation Fellow, University Scholar of the University of Illinois, the Xerox Award for research, the A. Nordsieck award for excellence in graduate teaching, and the American Physical Society’s Leo P. Kadanoff Prize. Nigel is a Fellow of the American Physical Society and the American Academy of Arts and Sciences and a Member of the U.S. National Academy of Sciences.

9:00 am – 9:20 am
Using Data Science to Understand the Heterogeneity of SARS-COV-2 Transmission and COVID-19 Clinical Presentation in Mexico

ABSTRACT

In late April 2020, Mexico confirmed 15,529 positive cases of COVID-19 within its borders and the Mexican government announced the start of “Phase 3” of the pandemic, acknowledging widespread community transmission, thousands of cases of infection, and increased numbers of patients requiring hospitalization. The epidemic continues to grow rapidly in Mexico. There is a critical need in the country to use data science to advance COVID-19 prevention and treatment, and to support policy response. Country-level data and cooperation are essential to curbing the global pandemic and to supporting binational/cross-border relations with the U.S. and its Latin American neighbors. The health data available to our team via the Mexican Social Security Institute/Instituto Mexicano del Seguro Social (IMSS) are vast. These assets, combined with the computational and data management strengths of the C3.ai tools and data lake and the disciplinary strengths of the binational team, create an unprecedented opportunity to analyze a multitude of clinical, individual, facility, and structural determinants of exposure and susceptibility to SARS-CoV-2, and the determinants of effective health services responses to the pandemic. This research project will reveal predictors of SARS-CoV-2 infection and severity that will help guide prevention efforts, allowing Mexico to focus treatment on those at greatest risk. These findings should also generate hypotheses about mechanisms that might improve COVID-19 patient outcomes. We can also estimate disease burden and economic impact, and provide a best practice model for other countries with similar data systems to guide their policy decisions around pandemic management.

SPEAKER

Stefano M. Bertozzi is Dean Emeritus and Professor of Health Policy and Management at the University of California, Berkeley School of Public Health and Interim Director of University of California systemwide programs with Mexico (UC-MEXUS, the UC-Mexico Initiative and Casa de California). Previously, he worked at the Bill and Melinda Gates Foundation, the Mexican National Institute of Public Health, the World Health Organization, UNAIDS, the World Bank, and the Government of the Democratic Republic of the Congo. He recently co-edited the Disease Control Priorities (DCP3) volume on HIV/AIDS, Malaria & Tuberculosis, has served on governance and advisory boards for the East Bay Community Foundation, HopeLab, UNICEF, WHO, UNAIDS, the Global Fund, PEPFAR, the NIH, Duke University, the University of Washington, and the AMA, has advised NGOs and ministries of health and social welfare in Asia, Africa, and Latin America, and is a member of the National Academy of Medicine.

Juan Pablo Gutierrez is Professor at the Center for Policy, Population & Health Research, National Autonomous University of Mexico (UNAM), Chair of the Technical Committee of the Morelos’ Commission on Evaluation of Social Development, and Member of GAVI Evaluation Advisory Committee. His research focuses on comprehensive evaluation of social programs and policies, universal health coverage and effective access, and social inequalities in health. He has been responsible for the evaluation of social and health programs in Mexico, Ecuador, Guatemala, Dominican Republic, Honduras, and India, as well as several population-based health surveys both in households and facilities. He is a member of the National Observatory on Health Inequalities in Mexico and has authored or co-authored more than 60 papers in peer-reviewed journals.

Presentation Video

9:20 am – 9:40 am
Effective Cocktail Treatments for SARS-CoV-2 Based on Modeling Lung Single Cell Response Data

ABSTRACT

To fully model the impact of SARS-CoV-2 requires the integration of several different types of molecular and cellular data. SARS-CoV-2 is known to primarily impact cells via two viral entry factors, ACE2 and TMPRSS2. However, much less is currently known about the virus activity within lung cells. To model host response to viral infection, and to develop potential treatments, we will extend methods based on Continuous State Hidden Markov models (CSHMM) and further combine them with additional graphical models and graph analysis algorithms. Combined, these methods would allow us to reconstruct pathways leading from virus proteins via their host interactions to regulators and finally to the observed expression profiles in each cell. Such models identify not just the optimal individual targets but also combinations of targets that, together, lead to a large decrease in viral loads. In parallel, we will extend ChemProp, our deep learning method which learns to relate molecules and their associated activity values, by developing advanced representations involving optimal transport and molecular prototypes, accumulating evidence about each compound’s profile from multiple sources, using transfer learning, and extending it to learn the activity of compound mixtures to treat multiple targets. Using our recently developed protocols, we will differentiate human induced pluripotent stem cells to lung cells and perform time series single cell expression studies to profile cells infected with SARS-CoV-2. Following modeling, predicted CSHMM targets and their predicted ChemProp compounds will be experimentally validated to identify treatments that reduce viral loads in lung cells.

SPEAKER

Ziv Bar-Joseph is the FORE Systems Professor of Computational Biology and Machine Learning at Carnegie Mellon University. His work focuses on the development of machine learning methods for the analysis, modeling, and visualization of time series high throughput biological data. Bar-Joseph is the recipient of the Overton Prize, an NSF CAREER Award, and several conference Best Paper awards. He is currently leading the Computational Tools Center for the National Institutes of Health Human BioMolecular Atlas Program (HuBMAP). He has served on the advisory board of several national efforts including the National Institute for Allergy and Infectious Diseases Systems Biology Program. Software tools developed by his group are widely used for the analysis of genomics data.

Presentation Video

9:40 am – 10:00 am
Secure Federated Learning for Clinical Informatics with Applications to the COVID-19 Pandemic

ABSTRACT

Enabling health care providers to respond faster and with greater precision to pandemics requires both advanced machine learning and quickly accessible clinical data. Yet, the necessary medical data is often inaccessible across hospitals due to privacy and intellectual property concerns. This proposal leverages distributed machine learning and modern cryptography to introduce a computational protocol and software tools for securely training machine learning models with data spread over several medical establishments, while preserving privacy and IP rights. Our scientific contributions include innovative techniques that trade off computation and communication to improve the predictive performance of federated learning in clinical settings and novel cryptographic techniques that trade off computation and robustness to enhance security. To complement our technical aims, we will develop open source software. We will evaluate our approach for COVID diagnosis using data available on the C3.ai Data Lake combined with clinical data from OSF HealthCare, to illustrate how private data can significantly improve prediction quality compared to public data alone. We also propose to serve as a hub for other C3 AI projects to enable the secure use of privately-held clinical datasets, which will improve results by other teams. Our broader vision and objective are to provide Secure Federated Learning as a Service (FLaaS) freely available to any hospital during a declared crisis. We envision that a robust, secure federated learning system will enable fast responses to minimize the impact of disease in the earliest stages.
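As a rough illustration of the federated setting described in this abstract, the sketch below shows plain federated averaging on a toy one-parameter regression. It is not the speakers' protocol: it includes no cryptography or secure aggregation, and the function names (`local_update`, `federated_average`), the learning rate, and the three "hospital" datasets are all hypothetical. The only point it makes is that each site trains on its own records and shares just a model weight with the coordinator.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (feature, label) pairs held privately by one site


def local_update(w: float, data: Dataset, lr: float = 0.05, epochs: int = 20) -> float:
    """Run a few gradient-descent steps for y ~ w * x using only one site's data."""
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w: float, sites: List[Dataset], rounds: int = 30) -> float:
    """Each round: sites train locally, then the server averages the returned weights.

    The average is weighted by the number of records at each site.
    Raw records never leave a site; only the scalar weight is shared.
    """
    total = sum(len(d) for d in sites)
    for _ in range(rounds):
        local_weights = [local_update(w, d) for d in sites]
        w = sum(len(d) * wl for d, wl in zip(sites, local_weights)) / total
    return w


# Hypothetical data: three sites whose records all follow y = 2 * x.
sites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
w_global = federated_average(0.0, sites)
```

In a secure variant, the averaging step would operate on encrypted or secret-shared updates so that the coordinator never sees an individual site's weights; that part is deliberately omitted here.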

SPEAKER

Dakshita Khurana is an Assistant Professor of Computer Science at the University of Illinois at Urbana-Champaign. Khurana works in Cryptography and related topics in Privacy, Security and Theoretical Computer Science. She obtained her Ph.D. from UCLA, under the supervision of Professors Rafail Ostrovsky and Amit Sahai. She was recently a Google Research Fellow at the Simons Institute, UC Berkeley, and was named to Forbes List of 30 under 30 in Science. She has previously received the 2017-18 Dissertation Year Fellowship, the 2017-18 UCLA CS Outstanding Graduating Ph.D. Student Award, 2017-18 Symantec Outstanding Graduate Student Research Award, and the 2016-17 Cisco Outstanding Graduate Student Research Award.

10:00 am – 10:15 am
Break

10:15 am – 10:35 am
Adding Audio-Visual Cues to Signs and Symptoms for Triaging Suspected or Diagnosed COVID-19 Patients

ABSTRACT

The COVID-19 pandemic has placed unprecedented stress on hospital capacity. Increased emergency department (ED) patient volumes and admission rates have led to a scarcity in beds and the need to construct field hospitals in some regions. Bed-sparing protocols that identify COVID-19 patients as stable for discharge from the ED or early hospital discharge have proven elusive, given this population’s propensity to rapidly deteriorate up to one week after illness onset. Consequently, a significant number of stable patients are unnecessarily admitted to the hospital, while some discharged patients decompensate at home and subsequently require emergency transport to the ED. In order to conserve hospital beds, there is an urgent need for improved methods for assessing clinical stability of COVID-19 patients. Our project goal is to develop audiovisual tools to predict cardiopulmonary decompensation from facial videos captured from the ED and at home via telemedicine platforms. We will use explainable artificial intelligence (AI) and machine learning (ML) algorithms to derive criteria for predicting impending deterioration from health-relevant audiovisual features and provide explanations in terms of the clinical details within the electronic medical record. Successful completion of this project will provide the groundwork for prospective evaluation of these tools in a COVID-19 patient population. Once validated, these tools will augment provider clinical assessments of COVID-19 patients both at the bedside and across telemedicine platforms during virtual follow-up. More broadly, the techniques and algorithms developed in this project are likely to be applicable to other high-risk patient populations and emerging platforms such as telemedicine.

SPEAKER

Narendra Ahuja is a Research Professor of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. His research is in Artificial Intelligence fields of computer vision, pattern recognition, machine learning, and image processing and their applications, including on problems in developing societies. Ahuja has co-authored over 400 papers in journals and conferences and supervised research of over 50 Ph.D.s, 15 MS and 100 undergraduate students and 10 Postdoctoral Scholars. He received his Ph.D. from the University of Maryland, College Park, in 1979. He is a fellow of the Institute of Electrical and Electronics Engineers, the American Association for Artificial Intelligence, the International Association for Pattern Recognition, the Association for Computing Machinery, the American Association for the Advancement of Science, and the International Society for Optical Engineering.

Presentation Video

10:35 am – 10:55 am
Algorithms and Software Tools for Testing and Control of COVID-19

ABSTRACT

Extensive and ongoing testing of populations for viremia and antibodies is expected to play a major role in shaping and managing the process of reopening the states. The goal of this interdisciplinary project is to bring together epidemiologists, systems theorists, and data scientists to develop models, algorithms, and software tools to support the state-level PCR (polymerase chain reaction) and serological testing efforts. Specifically, the team will develop: 1) algorithms to assimilate real-time testing data into networked epidemiological models; and 2) mean-field type control strategies to inform and evaluate the effect of social distancing and other control measures on the progression of the disease. A successful completion of this project will result in epidemiological models that are better and more realistic along two dimensions: 1) they are able to assimilate noisy data from ongoing population level testing; and 2) they include the effect of population level feedback that may result as a consequence of control measures. Such models are expected to be useful to inform testing guidelines, such as what groups to sample, with which tests, and with what frequency, and to better evaluate the effects of deploying control measures. The algorithms will be implemented as efficient, scalable, and open-source software and made available to policy makers and the public via an interactive website, assimilating daily observational data to generate real-time disease maps (with quantified uncertainty) and tools to allow simulation under different control policies. This will require substantial backend computation, with simulation and learning running on the C3.ai/Azure platform.

SPEAKER

Prashant Mehta is an Associate Professor of Mechanical Science and Engineering at the University of Illinois at Urbana-Champaign. Mehta co-founded and served as Chief Science Officer of the startup Rithmio; he also worked at United Technologies Research Center. His research is in the area of mathematical and computational aspects of dynamics and control theory. He has served as the Associate Editor for IEEE Transactions on Automatic Control (2019-), Systems and Control Letters (2011-14), and the ASME Journal of Dynamic Systems, Measurement and Control (2012-16) and guest-edited the special issue of the ASME Journal of Dynamic Systems, Measurement and Control to commemorate the life, achievements, and impact of Rudolph E. Kalman.

Presentation Video

10:55 am – 11:15 am
Optimal Adaptive Testing for Epidemic Control: Combining Molecular and Serology Tests

ABSTRACT

This research studies optimal lockdown and testing policies for the containment of disease spread in networked environments. Our motivation comes from three peculiar aspects of the COVID-19 pandemic. First, since a vaccine is not yet available, one needs to rely on virus containment strategies (such as lockdowns) as intervention tools, which have serious economic repercussions. Our first objective is to develop lockdown models that consider epidemic and economic aspects simultaneously. Since individuals in the population may have different risks and different productivity levels, this requires models that explicitly account for the heterogeneity of different groups and their network of interactions. Our second objective is to exploit these network models to study targeted interventions. In particular we aim at understanding what is the best policy to gradually reopen the economy by taking into account the role of the different groups in terms of their productivity and their risk level. Finally, since COVID-19 is a pandemic, it is important to note that interventions actuated by local governments will have ripple effects on neighboring states, hence coordination of efforts from different governments is of paramount importance. As the third objective, we aim at studying how state interconnections and mobility patterns affect the optimal response to an epidemic in networked environments. Results from the proposed research will help leaders and decision makers in understanding how to optimally unlock the economy within a state and how to optimally coordinate efforts between states.
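To make the idea of heterogeneous groups with a network of interactions concrete, here is a minimal sketch of a two-group SIR model in which each group can be locked down to a different degree. It is an illustration only, not the model used in this research: the contact matrix, the rates `beta` and `gamma`, the lockdown levels, and the function name `simulate` are all assumed values chosen for the example.

```python
from typing import List, Tuple


def simulate(
    contact: List[List[float]],
    lockdown: List[float],
    beta: float = 0.3,
    gamma: float = 0.1,
    days: float = 200.0,
    dt: float = 0.1,
) -> Tuple[List[float], List[float], List[float], float]:
    """Forward-Euler multi-group SIR; returns final (s, i, r) fractions and peak mean prevalence.

    contact[g][h] is the relative contact rate between groups g and h.
    lockdown[g] in [0, 1] is the fraction of group g's activity that is suppressed.
    """
    n = len(contact)
    s, i, r = [0.99] * n, [0.01] * n, [0.0] * n
    peak = sum(i) / n
    for _ in range(int(days / dt)):
        # Force of infection on group g, scaled by the activity of both groups involved.
        lam = [
            beta * (1 - lockdown[g])
            * sum(contact[g][h] * (1 - lockdown[h]) * i[h] for h in range(n))
            for g in range(n)
        ]
        new_inf = [lam[g] * s[g] * dt for g in range(n)]
        new_rec = [gamma * i[g] * dt for g in range(n)]
        s = [s[g] - new_inf[g] for g in range(n)]
        i = [i[g] + new_inf[g] - new_rec[g] for g in range(n)]
        r = [r[g] + new_rec[g] for g in range(n)]
        peak = max(peak, sum(i) / n)
    return s, i, r, peak


contact = [[1.0, 0.5], [0.5, 1.0]]  # hypothetical two-group contact matrix
_, _, _, peak_open = simulate(contact, lockdown=[0.0, 0.0])
s_t, i_t, r_t, peak_targeted = simulate(contact, lockdown=[0.5, 0.0])
```

Comparing `peak_open` with `peak_targeted` shows the kind of question a targeted policy asks: how much the epidemic peak changes when only one group's activity is reduced. An economic model would additionally attach a productivity cost to each group's lockdown level.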

SPEAKER

Francesca Parise joined the School of Electrical and Computer Engineering at Cornell University as an assistant professor in July 2020. Before then, she was a postdoctoral researcher at the Laboratory for Information and Decision Systems at MIT. She defended her PhD at the Automatic Control Laboratory, ETH Zurich, Switzerland in 2016 and she received the B.Sc. and M.Sc. degrees in Information and Automation Engineering in 2010 and 2012, from the University of Padova, Italy, where she simultaneously attended the Galilean School of Excellence. Francesca was recognized as an EECS rising star in 2017 and is the recipient of the Guglielmo Marin Award from the “Istituto Veneto di Scienze, Lettere ed Arti,” the SNSF Early Postdoc Fellowship, the SNSF Advanced Postdoc Fellowship, and the ETH Medal for her doctoral work.

Presentation Video

11:15 am – 11:35 am
Spatial Modeling of COVID-19: Optimizing PDE and Metapopulation Models for Prediction and Spread Mitigation

ABSTRACT

This research offers a comprehensive approach to the spatial dynamics of COVID-19 based on partial differential equations and metapopulation models. We aim to fill the modeling gap between studies with a detailed description of disease dynamics, but lacking spatial dynamics, and those that while spatial in nature do not account for the intricacies of the COVID-19 disease. We will use a diverse array of techniques, ranging from dynamic traffic assignment models, which will inform the links in the metapopulation model, to manifold learning techniques for model parameterization. We will fully utilize the C3.ai platform to manage and integrate data, implement code, and build a user interface to increase research outcome accessibility.

SPEAKER

Zoi Rapti is an Associate Professor in the Department of Mathematics at the University of Illinois at Urbana-Champaign and the Carl R. Woese Institute for Genomic Biology. Her research focus is on nonlinear dynamics with applications to mathematical biology (DNA denaturation, epidemiology, community assembly, phage-bacteria interactions, data-theory coupling) and physics (nonlinear Schrödinger-type equations, discrete Klein-Gordon equations). Rapti holds a bachelor’s degree in Mechanical Engineering from the National Technical University of Athens, Greece and a Ph.D. in Mathematics from the University of Massachusetts, Amherst.

Presentation Video

11:35 am – 11:55 am
Detection and Containment of Emerging Diseases Using AI Techniques

ABSTRACT

This work focuses on developing data-efficient and reliable algorithms for healthcare AI in response to emerging diseases such as COVID-19. In particular, we aim to: 1) develop machine learning methods targeting unseen disease types; 2) adapt existing medical AI tools to emerging diseases to rapidly mitigate the outbreak; 3) develop robust methods to deploy and proliferate new disease models to disparate healthcare institutions across geographies; and 4) utilize collaborative AI approaches that leverage experiences of medical professionals to enable rapid cycle development of emerging disease models. We will deploy our experience in developing algorithms for incipient fault detection in cyber-physical systems and in generative modeling for synthetic data generation in computer vision applications and interactive machine learning. Healthcare AI models will be made robust and explainable for clinical use in collaboration with the UCSF School of Medicine. We will make extensive use of the C3.ai Data Lake, along with additional data available from UCSF. Further, we intend to use the C3 AI Suite extensively to facilitate our algorithmic and code development work, integrate our results in the C3 AI Suite to make it more powerful in AI applications to healthcare, and make our results openly available to foster further research.

SPEAKER

Alberto Sangiovanni-Vincentelli is the Edgar L. and Harold H. Buttner Chair of Electrical Engineering and Computer Science at University of California, Berkeley, where he has served on the faculty since 1975. He is Special Advisor to the Dean of Engineering for Entrepreneurship and Chair of the Faculty Advisors to SkyDeck, UC Berkeley’s accelerator. His many honors include the IEEE-EDAA Kaufman Award for pioneering contributions to electronic design automation (EDA) and the IEEE/RSE Maxwell Medal “for groundbreaking contributions that have had an exceptional impact on the development of electronics and electrical engineering or related fields.” He co-founded Cadence and Synopsys, two leading EDA companies; is a board member of Cadence, KPIT, ISEO, ExpertSystem, Cogisen, and Cy4gate; has consulted for companies worldwide, including Intel, IBM, ST, Mercedes, BMW, UTC, and GM; and is International Advisory Council Chair of the Milano Innovation District. He is a member of the NAE, an IEEE and ACM Fellow, and holds Honorary Doctorates from Aalborg University and KTH. He has published over 1,000 papers and 19 books with an h-index of 117 (Google Scholar) and graduated over 100 doctoral students.

Presentation Video

11:55 am – 12:15 pm
AI Enabled Deep Mutational Scanning of Interaction between SARS-CoV-2 Spike Protein S and Human ACE2 Receptor

ABSTRACT

We employ a recently developed platform, TLmutation, which could enable rapid investigation of the sequence-structure-function relationship between SARS-CoV-2 Spike Protein S and Human ACE2 Receptor. In particular, we employ a transfer learning approach to generate high-fidelity scans from noisy experimental data, and transfer the knowledge from single point mutation data to generate higher-order mutational scans from the single amino-acid substitution data. Using deep mutagenesis, variants of ACE2 will be identified with increased binding to the receptor binding domain of S at a cell surface. In our preliminary results, we identify mutations across the interface and also at buried sites, where they are predicted to enhance folding and presentation of the interaction epitope. The mutational landscape offers a blueprint for engineering high affinity ACE2 receptors to meet this unprecedented challenge. We plan to employ the information from the preliminary mutational landscape to generate the higher-order mutations in the ACE2 receptor that could enhance binding to S protein and help in the design of future vaccines for treatment of SARS-CoV-2. We also aim to investigate this problem using distributed computing approaches to understand the underlying physics of protein S and ACE2. In particular, we aim to perform molecular dynamics simulations to identify thermodynamic interactions that could enhance the ACE2 binding. Our preliminary results show that ACE2 variants identified from the deep mutational scan not only stabilize the structural fluctuations but also strongly couple the motions of the two proteins. These simulations would be performed using Microsoft Azure and NCSA Blue Waters.

SPEAKER

Diwakar Shukla is the Blue Waters Assistant Professor, Department of Chemical and Biomolecular Engineering at the University of Illinois at Urbana-Champaign. His research is focused on understanding complex biological processes using novel physics-based models and techniques. He received his B.Tech. and M.Tech. degrees from the Indian Institute of Technology in Bombay and his MS and Ph.D. degrees from the Massachusetts Institute of Technology. His postdoctoral work was at Stanford University. He has received several awards for his research, including the Peterson award from ACS, Innovation in Biotechnology award from AAPS, COMSEF Graduate student award from AIChE, and the Institute Silver Medal and Manudhane Award from IIT Bombay.

Presentation Video

12:15 pm – 12:20 pm
Closing Remarks and Day 2 Preview

Day Two (Thursday, Jan 21)

8:00 am – 8:45 am
Keynote: AI Partners in the Classroom: Challenges and Opportunities

ABSTRACT

In tomorrow’s classroom, will intelligent computers work side-by-side with groups of students to support their engagement in meaningful and productive learning experiences designed by their teachers? This is the goal of a new $20M, 5-year AI Institute for Student-AI Teaming at the University of Colorado, funded by NSF. The Institute encompasses nine U.S. universities in a close collaboration with two public school districts, private companies, and community leaders. It will focus on three main challenges. First, researchers will work to develop new advances in the fundamental science of how machines process human language, gestures, and emotions. Next, the team will strive to better understand how students, AI Partners, and teachers can collaborate effectively in both classrooms and remote learning contexts. Finally, researchers will go to classrooms in Denver Public Schools and other school partners (virtually, during the age of COVID-19) to work hand-in-hand with students and teachers to think up new technologies. This talk will focus on the technical issues inherent in the first challenge, the AI Partner’s capabilities for analyzing and generating spoken language and detecting gestures, facial expressions, and emotions. All of these tasks face unique algorithmic and ethical challenges, especially in this environment.

SPEAKER

Martha Palmer is an Arts and Sciences Professor of Distinction for Linguistics, and the Helen & Hubert Croft Professor of Engineering in the Computer Science Department at the University of Colorado at Boulder. She is also an Institute of Cognitive Science Faculty Fellow, a co-Director of CLEAR, an Association for Computational Linguistics (ACL) Fellow, and an Association for the Advancement of Artificial Intelligence (AAAI) Fellow. She received a BFA 2010 Research Award, was the Director of the 2011 Linguistics Institute at CU-Boulder, and was named an Outstanding Graduate Advisor in 2014. She was the first woman to obtain a Ph.D. in Artificial Intelligence from the University of Edinburgh in 1985, and was an Associate Professor in Computer and Information Sciences at Penn for six years before coming to Colorado in 2005. Her research is focused on capturing elements of the meanings of words that can comprise automatic representations of complex sentences and documents. She co-edits Linguistic Issues in Language Technology (LiLT) and has been a co-editor of Natural Language Engineering and on the Computational Linguistics Editorial Board.

Presentation Video

8:45 am – 9:05 am
Data-Driven, High-Dimensional Design for Trustworthy Drug Discovery

ABSTRACT

Machine learning-based predictive modeling tools have been applied to a wide variety of tasks in computational biology and chemistry, such as predicting protein binding and stability, small molecule antibiotic properties, synthesizability, and drug-likeness. However, when such data-driven models are used to produce new designs, they are likely to encounter a major challenge. Learning-based design involves optimizing over the input to a predictive model. For example, a model that predicts how well a small molecule binds to a particular drug target takes as input some representation of the molecule, and outputs the binding efficiency. Hence, finding the best small molecule involves performing an optimization over the input to the model when its output is fixed to be, say, as large as possible. We refer to this problem-setting as “high-dimensional model inversion” (HDMI). Critically, by definition of the design problem, the predictive model will never have seen any molecules with precisely the desired property, and thus we are asking the model to extrapolate. What does it mean to extrapolate in this context? Can we extrapolate? How far can we extrapolate? How can we trust such decisions? We will develop a new formal framework and associated algorithms for solving HDMI with high-capacity models such as neural networks and high-dimensional inputs, which will enable us to answer these questions. We will draw on ideas from learning-based decision making (reinforcement learning), robust uncertainty estimation, and probabilistic modelling. We will focus on data-driven drug design, including a collaboration toward developing a therapeutic for COVID-19.
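The notion of optimizing over the input to a predictive model can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the HDMI framework proposed in this talk: the "predictive model" is a hand-written quadratic stand-in (`predict`) with a known gradient (`predict_grad`), the hypothetical `invert_model` routine is ordinary gradient ascent, and the two-dimensional input stands in for what would really be a high-dimensional molecular representation.

```python
from typing import Callable, List

Vector = List[float]

# Hypothetical stand-in for a trained predictor: the score is highest at OPTIMUM.
OPTIMUM: Vector = [1.0, -2.0]


def predict(x: Vector) -> float:
    """Toy 'binding efficiency' model; a real one would be a learned network."""
    return -sum((xi - oi) ** 2 for xi, oi in zip(x, OPTIMUM))


def predict_grad(x: Vector) -> Vector:
    """Gradient of predict with respect to its input."""
    return [-2.0 * (xi - oi) for xi, oi in zip(x, OPTIMUM)]


def invert_model(
    grad: Callable[[Vector], Vector],
    x0: Vector,
    step: float = 0.1,
    iters: int = 200,
) -> Vector:
    """Gradient ascent over the model input: move x to increase the predicted output."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x


start = [0.0, 0.0]
best = invert_model(predict_grad, start)
```

With a learned model, the input that maximizes the prediction may lie far from any training example, which is exactly the extrapolation concern the abstract raises. A practical method would therefore pair this search with some measure of how much the model's output can be trusted at the proposed input.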

SPEAKER

Sergey Levine is an Assistant Professor of Electrical Engineering and Computer Sciences at the University of California, Berkeley. He received a BS and MS in Computer Science in 2009 and a Ph.D. in Computer Science in 2014, all from Stanford University, and joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as computer vision and graphics. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, and deep reinforcement learning algorithms. His work has been featured in many popular press outlets, including the New York Times, the BBC, MIT Technology Review, and Bloomberg Business.

Presentation Video

9:05 am – 9:25 am
Medical Imaging Domain-Expertise Machine Learning for Interrogation of COVID

ABSTRACT

The COVID-19 pandemic represents a pressing public health need for computational techniques to augment the interpretation of medical images in their role for: 1) surveillance, detection, and triaging of COVID-19 medical images given potential resurgence; 2) differential diagnosis of COVID-19 patients; and 3) prognosis, as well as prediction and monitoring of treatment response, to help in patient management. While thoracic imaging, including chest radiography and computed tomography (CT), is being re-examined for its role in patient management, improved interpretation is limited in part by the qualitative way the images are read. This project therefore aims to develop machine intelligence methods to aid in the interrogation of medical images from COVID-19 patients. Successful completion of the research will demonstrate cascade-based deep transfer learning between similar but different thoracic disease states (e.g., interstitial diseases to COVID-19) and a clinical tool to aid in the triaging of COVID-19 patients in terms of detection, treatment planning, and monitoring.

SPEAKER


Maryellen L. Giger is the A.N. Pritzker Professor of Radiology and Medical Physics at the University of Chicago. For decades, she has worked on computer-aided diagnosis/machine learning/deep learning in medical imaging and cancer diagnosis and management. Her AI research in breast cancer for risk assessment, diagnosis, prognosis, and therapeutic response has yielded various translated components, and she is using these “virtual biopsies” in imaging-genomics association studies. Giger is a former president of AAPM and of SPIE, and is Editor-in-Chief of the Journal of Medical Imaging. She is a member of the National Academy of Engineering; Fellow of AAPM, AIMBE, SPIE, SBMR, IEEE, and IAMBE; and was cofounder, equity holder, and scientific advisor of Quantitative Insights (now Qlarity Imaging), which produces QuantX, the first FDA-cleared, machine-learning driven CADx system.
9:25 am – 9:45 am
Modeling the Impact of Social Determinants of Health on COVID-19 Transmission and Mortality to Understand Health Inequities

ABSTRACT

The COVID-19 pandemic has highlighted drastic health inequities, which are particularly pronounced in cities such as Chicago, Detroit, New Orleans, and New York City. Reducing COVID-19 morbidity and mortality will likely require increased focus on social determinants of health given their disproportionate impact on populations most heavily affected by COVID-19. A better understanding of how factors such as financial hardship, housing instability, health care access, and incarceration contribute to COVID-19 transmission and mortality is needed to inform policies around social distancing and testing and vaccination scale-up. This proposal will build upon an existing agent-based model of COVID-19 transmission (CityCOVID) for the city of Chicago. Using multiple sources of existing data, including local COVID-19 contact tracing surveys and public health surveillance, we will apply machine learning methods to quantify the impact of social determinants of health on COVID-19 transmission dynamics and generate a more granular synthetic population with which to evaluate intervention approaches. The extended CityCOVID model will provide a more realistic model to guide local policy and intervention development.

SPEAKER

Anna Hotton is a Research Assistant Professor in the Section of Infectious Diseases and Global Health at the University of Chicago Department of Medicine. She earned her B.S. degree at Cornell University and her MPH and Ph.D. at the School of Public Health at the University of Illinois at Urbana-Champaign. As staff scientist at the Chicago Center for HIV Elimination, Hotton studied the relationship between social factors and viral spread. Her C3.ai DTI-funded project aims to adapt that work to COVID-19, using machine learning to identify data elements that are most important to include in modeling to better simulate various scenarios of disease spread and virtually test how different public health or social policy strategies can help mitigate the disease.

Presentation Video

9:45 am – 10:05 am
Toward Analytics-Based Clinical and Policy Decision Support to Respond to the COVID-19 Pandemic

ABSTRACT

The COVID-19 pandemic creates unprecedented challenges for healthcare providers and policy makers. How to triage patients when healthcare resources are limited? Whom to test? And how to design social distancing policies to contain the disease and its socioeconomic impact? Analytics can provide data-driven answers. We have collected comprehensive data from hundreds of clinical studies, case counts, and hospital collaborations. We have developed a new epidemiological model of the disease’s dynamics, a machine-learning model of mortality risk, and a resource allocation model — published on www.covidanalytics.io. In this project, we develop automated, interpretable, and scalable decision-making systems based on machine learning and artificial intelligence (ML/AI) to support clinical practices and public policies as they respond to the COVID-19 pandemic. We tackle four research questions: 1) How can we predict admissions in intensive care units (ICU) using machine learning? 2) How does COVID-19 impact different demographic and socioeconomic populations? 3) How does mobility impact the disease’s spread, and how to optimize social distancing policies? 4) How to augment COVID-19 tests with data-driven warnings that identify high-risk subjects? This project leverages large-scale datasets (from C3 AI and our own collection efforts), high-performance computing (using the C3 AI suite), and advanced ML/AI. Specifically, this project develops end-to-end ML/AI methods, spanning epidemiological modeling (to model the disease’s spread), machine learning (to predict ICU admissions and test results), causal inference (to investigate disparities across populations) and optimal control (to support social distancing guidelines). We will disseminate results to healthcare providers, policy makers, researchers, and the public.

SPEAKER

Dimitris Bertsimas is the Boeing Professor of Operations Research and the Associate Dean of Business Analytics at the Sloan School of Management at the Massachusetts Institute of Technology. His research interests include machine learning, optimization, and their applications in health care. He has co-authored more than 250 scientific papers and five graduate-level textbooks. He is currently Editor in Chief of INFORMS Journal on Optimization and former Area Editor of Management Science in Optimization and of Operations Research in Financial Engineering. He has supervised 76 doctoral students and is currently supervising 25. He is a member of the National Academy of Engineering and recipient of the John von Neumann Theory Prize for fundamental contributions in Operations Research and Management Science and the INFORMS President Award for significant impact in society, both in 2019. Since March 2020, he has led a group of 30 doctoral students, postdocs, and professors to study multiple aspects of COVID-19. These efforts are detailed at COVID Analytics. He has co-founded several companies including Dynamic Ideas, a financial services company sold to American Express in 2002, D2 Hawkeye, sold to Verisk in 2009, Benefit Sciences, ReClaim, and Savvi Financial.
10:05 am – 10:20 am
Break
10:20 am – 10:40 am
Improving Fairness & Equity in COVID-19 Policy Applications of Machine Learning

ABSTRACT

As governments and social service providers attempt to understand the COVID-19 pandemic, including the significant and asymmetrical health, social, and economic risks to their constituents, and plan for the future by acquiring and allocating scarce resources, AI researchers and practitioners have been developing detection, forecasting, and mitigation tools to support those efforts. When policy planning and resource allocation decisions are made using these AI methods, there is a risk that they could result in inequitable and unfair outcomes for vulnerable populations. Disparate impacts of the COVID-19 pandemic on racial minorities and economically disadvantaged populations are already evident, and the risk that applications of AI could worsen these disparities is substantial. This proposal is focused on developing bias detection/audit, reduction, and mitigation methods and tools to ensure that the policy actions taken using AI and ML reduce the risk of inequitable outcomes for vulnerable populations. While our work will be broadly applicable, we focus on four use cases: 1) COVID-19 forecasting to improve policy decision-making, 2) identifying individuals in California facing social and economic challenges due to the epidemic, 3) understanding potential disparities in the use of contact tracing and immunity passport technologies, and 4) mental health interventions to break the cycle of incarceration in Kansas.

SPEAKER

Rayid Ghani is a Distinguished Career Professor in the Machine Learning Department and the Heinz College of Information Systems and Public Policy at Carnegie Mellon University. Ghani is focused on using large-scale Artificial Intelligence/Machine Learning/Data Science to solve public policy and social challenges in a fair and equitable manner. He works with governments and nonprofits on policy in health, criminal justice, education, public safety, economic development, and urban infrastructure. Ghani is passionate about teaching practical data science and started the Data Science for Social Good Fellowship that trains computer scientists, statisticians, and social scientists to work on data science problems with social impact. Before joining Carnegie Mellon University, Ghani was the Founding Director of the Center for Data Science & Public Policy, a Research Associate Professor in Computer Science, a Senior Fellow at the Harris School of Public Policy at the University of Chicago, and Chief Scientist of the Obama 2012 Election Campaign.

Presentation Video

10:40 am – 11:00 am
Model Choice and Structure – Connections to Prediction, Policy, and Values

ABSTRACT

A key scientific goal concerning COVID-19 is to develop mathematical models that help us understand and predict its spreading behavior, as well as to provide guidelines on what can be done to limit its spread. As such, this project pursues: 1) analysis and prediction of the spread of COVID-19 through a new mathematical model incorporating virus mutations; and 2) optimal and robust control of the spread of COVID-19 by carefully timed interventions. Expected outcomes could give authorities another tool to better assess the effectiveness of existing or potential countermeasures in limiting the spread of COVID-19. They could also help leaders assess the outcomes of eliminating existing countermeasures. Finally, they could help better prepare for different mutation scenarios, including worst-cases for the current or a future pandemic.

SPEAKER

Ben Schaffer is a Postdoctoral Research Fellow at Princeton University, working jointly between the Ecology and Electrical Engineering departments on COVID-19 modeling problems. He received his PhD from Princeton in 2018 in Civil and Environmental Engineering, with a focus on the ecohydrology of water-limited systems. He has worked in a variety of other areas, including theoretical ecology, geomorphology, and climate dynamics. His primary interest is the mathematical structure of complex systems.

Presentation Video

11:00 am – 11:20 am
Housing Precarity, Eviction, and Inequality in the wake of COVID-19

ABSTRACT

Ensuring housing security is vital to mitigating the spread of the COVID-19 virus and sustaining health, economic security, and family stability. This joint, interdisciplinary project will bring together academics and data scientists to track, analyze, and respond to pandemic-driven spikes in eviction and displacement risks. Doing so requires development of: 1) an innovative system for tracking real-time eviction filings after the outbreak; and 2) a housing precarity risk model using machine learning, to better analyze and predict areas at disproportionate risk of displacement in the wake of the COVID-19 pandemic. This project will provide major new sources of data and inform research and public policy regarding U.S. housing and inequality.

SPEAKER

Karen Chapple, Ph.D., is a city planner by training who studies inequalities in the planning, development, and governance of regions in the U.S. and Latin America, with a focus on economic development and housing. Her most recent book is Transit-Oriented Displacement or Community Dividends? Understanding the Effects of Smarter Growth on Communities.
11:20 am – 11:40 am
Dynamic Resource Management in Response to Pandemics

ABSTRACT

Business-as-usual medical and public health practices cannot cope with the large shocks of a pandemic. Coordinated and well-timed response mechanisms are required. In this proposal, we aim to build a data analytic framework to optimize resource management for testing, prevention, and care, both prior to a potential spread, as well as during a rapidly developing outbreak. The proposed research builds upon our current work to help the State of Illinois to mitigate the ongoing COVID-19 outbreak. Key outcomes of our research include: 1) risk-aware dynamic equipment and workforce allocation mechanism implemented on the C3.ai platform, 2) comparison of federalized resource allocation to that where states compete with one another for resources via game theory, 3) flexible capacity provisioning in medical supply chains and dynamic inventory management for perishable goods such as N95 masks. While part of our research will directly help combat the current crisis, the rest will forge operational resilience and enhance the readiness to fight a potential future outbreak. The multidisciplinary team combines expertise in optimization and control, healthcare management, operations research, data analytics, and game theory.

SPEAKER

Ujjal Mukherjee is an Assistant Professor at the Gies College of Business at the University of Illinois at Urbana-Champaign. His research is focused on healthcare analytics, particularly the use of machine learning and statistical methods in healthcare, including precision medicine for cancer treatment, and healthcare process and technology management. He has conducted in-depth field studies of operational issues related to the effective use of surgical robots for delivering critical surgical care to OB/GYN and Urological patients and, using data analysis, he has helped select the right surgical procedures and patients to optimize cost and quality of care. Recently, he has been active with COVID-19 research and mitigation, managing the manufacturing and supply of critical care ventilators, Rapidvent, developed by UIUC, and supporting the state of Illinois COVID-19 response and the UIUC SHIELD testing program.

Presentation Video

11:40 am – 12:00 pm
Impacts of COVID-19 Interventions: Health, Economics, and Inequality

ABSTRACT

The COVID-19 outbreak has disrupted normal activities in nearly all aspects of higher education. To reopen our universities, we need new technology and innovative practices to safeguard students against the potential second wave of the virus outbreak. In this work, we seek to develop analytical methods for modeling and mitigating the COVID-19 situation based on students’ location and symptom data collected via mobile apps. We adopt an optimal control approach and seek intervention policies that strike a balance between containing the virus and keeping productive on-campus activities. This problem is highly challenging due to the prevalence of hidden states, unknown dynamics, and high dimensionality. By leveraging recent advances in system identification, reinforcement learning, and adaptive control, we will develop predictive methods to infer the hidden health states of individual students and develop algorithms to recommend optimal interventions (e.g., testing and quarantine) for decision makers. We will develop simplified models to assess the impact of such policies on the stability of the system captured in the growth rate of infections. The methods will be validated using simulation and available data. We expect to apply and further develop the methods to analyze real campus data from MIT and, by using the computing capabilities of the C3 AI Suite and Microsoft Azure Cloud, we expect to analyze large volumes of location data as they are collected and adapt the intervention policy.

SPEAKER

Munther Dahleh is the William A. Coolidge Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology. Dahleh received his Ph.D. from Rice University in 1987 in Electrical and Computer Engineering. Since then, he has been with the Department of Electrical Engineering and Computer Science (EECS), a faculty affiliate of the Sloan School of Management, and Director of the Institute for Data, Systems, and Society (IDSS), all at MIT. Previously, he was Associate Department Head of EECS, Acting Director of the Engineering Systems Division, and Acting Director of the Laboratory for Information and Decision Systems. Dahleh is well-known for his seminal contributions to the field of networked systems and robust control. His research has impacted transportation and autonomous systems, power grid, financial systems, and social networks. Dahleh’s work has appeared in economics and operations research venues, as well as eight different IEEE Transactions. Dahleh is a four-time recipient of the George Axelby outstanding paper award for best paper in IEEE Transactions on Automatic Control. He is also the recipient of the Donald P. Eckman award from the American Automatic Control Council in 1993 for the best control engineer under 35.

Presentation Video

12:15 pm – 12:20 pm
Closing Remarks