C3.ai DTI’s quarterly newsletter covers news of the Institute’s Principal Investigators and digital transformation research around the consortium. You can sign up to receive the newsletter here.

The winter edition covers this news:

  • MIT Group Releases White Papers on AI Governance
  • From Fawkes, to Glaze, to Nightshade, According to Ben Zhao
  • The Global Project to Make a General Robotic Brain
  • From Ground, to Air, to Space, Tillage Estimates Get Tech Boost
  • Learn about GenAI from the Experts
  • Updates and Quick Takes

See the Winter 2024 C3.ai DTI newsletter pdf here.

Nature Biotechnology: In this first-person piece, C3.ai DTI COVID-19 researcher and UC Berkeley Professor of EECS and Bioengineering Jennifer Listgarten writes, “As a longtime researcher at the intersection of artificial intelligence (AI) and biology, for the past year I have been asked questions about the application of large language models and, more generally, AI in science. For example: ‘Since ChatGPT works so well, are we on the cusp of solving science with large language models?’ or ‘Isn’t AlphaFold2 suggestive that the potential of AI in biology and science is limitless?’ And inevitably: ‘Can we use AI itself to bridge the lack of data in the sciences in order to then train another AI?'”

Listgarten continues, “I do believe that AI — equivalently, machine learning — will continue to advance scientific progress at a rate not achievable without it. I don’t think major open scientific questions in general are about to go through phase transitions of progress with machine learning alone. The raw ingredients and outputs of science are not found in abundance on the internet, yet the tremendous power of machine learning lies in data — and lots of them.”

Read more here.

Two C3.ai DTI researchers were quoted in Quanta about their work on autonomous driving.

Sayan Mitra, a computer scientist at the University of Illinois Urbana-Champaign, leads a team that has managed to prove the safety of lane-tracking capabilities for cars and landing systems for autonomous aircraft. Their strategy is now being used to help land drones on aircraft carriers, and Boeing plans to test it on an experimental aircraft this year. “Their method of providing end-to-end safety guarantees is very important,” said Corina Pasareanu, a research scientist at Carnegie Mellon University and NASA’s Ames Research Center.

Their work involves guaranteeing the results of the machine-learning algorithms that are used to inform autonomous vehicles.

The aerospace company Sierra Nevada is currently testing these safety guarantees while landing a drone on an aircraft carrier. This problem is in some ways more complicated than driving cars because of the extra dimension involved in flying.
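The article does not detail Mitra’s technique, but a common ingredient of such end-to-end guarantees is bounding every output a controller could produce over a whole set of possible inputs. The sketch below is a toy illustration of that idea (not Mitra’s actual method): interval analysis through an assumed linear lane-keeping controller, with all numbers invented for the example.

```python
# Toy sketch of a worst-case output bound, the flavor of guarantee used to
# certify perception-based control. The controller and numbers are invented.

def interval_affine(lo, hi, w, b):
    """Propagate an input interval [lo, hi] through y = w*x + b."""
    y1, y2 = w * lo + b, w * hi + b
    return (min(y1, y2), max(y1, y2))

# Suppose a lane-keeping controller steers with y = -0.8 * offset + 0.1,
# and perception reports a lane offset anywhere in [-0.5, 0.5] meters.
lo, hi = interval_affine(-0.5, 0.5, -0.8, 0.1)
# Every possible steering command now provably lies within [lo, hi],
# roughly [-0.3, 0.5] here, regardless of the exact offset reported.
print(lo, hi)
```

Real verification tools chain such bounds through every layer of a neural controller and the vehicle dynamics, which is far harder, but the guarantee has the same shape: no admissible input can drive the output outside a proven envelope.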

Read more here.

Image: Señor Salme for Quanta Magazine

Industry forecasting firm Visiongain released its Antibacterial Drugs Market Report 2024-2034 today, which projected the antibacterial drug market to grow at a compound annual growth rate of 3.6 percent during the 10-year forecast period.

One reason cited for this growth is the impact of artificial intelligence and machine learning on the drug discovery process. The report highlights the recent discovery of a drug to treat the bacterium Acinetobacter baumannii, a breakthrough that built on earlier work by two C3.ai DTI Co-P.I.s, MIT professors Regina Barzilay and Tommi Jaakkola, investigating interventions for COVID-19 under funding from the C3.ai Digital Transformation Institute.

The new drug was identified from a library of ~7,000 potential drug compounds using a machine-learning model trained to evaluate whether a chemical compound will inhibit the growth of A. baumannii. Once approved, the drug could help combat A. baumannii infections acquired in hospitals, which can lead to pneumonia, meningitis, and other serious conditions.
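The screening workflow described above — train a classifier on assayed compounds, then rank an unscreened library by predicted activity — can be sketched in a few lines. This is a minimal stand-in with synthetic data, not the MIT team’s model: the fingerprints, labels, and simple logistic regression here are all illustrative assumptions.

```python
# Hedged sketch of ML-guided virtual screening: fit a classifier on labeled
# compounds, then rank a ~7,000-compound library by predicted inhibition.
# All data here is synthetic; real pipelines use molecular fingerprints and
# deep models trained on wet-lab growth-inhibition assays.
import numpy as np

rng = np.random.default_rng(0)
d = 64                                                 # bits in a toy fingerprint
X = rng.integers(0, 2, size=(500, d)).astype(float)    # assayed compounds
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)                     # synthetic "inhibits growth" labels

# Train a logistic-regression classifier by gradient descent.
w = np.zeros(d)
for _ in range(500):
    p = 1 / (1 + np.exp(-np.clip(X @ w, -30, 30)))
    w -= 0.1 * X.T @ (p - y) / len(y)

# Score the unscreened library and keep the top-ranked candidates,
# which would then go on to laboratory validation.
library = rng.integers(0, 2, size=(7000, d)).astype(float)
scores = 1 / (1 + np.exp(-np.clip(library @ w, -30, 30)))
top_hits = np.argsort(scores)[::-1][:10]
print(top_hits)
```

The payoff of this pattern is economic: the model scores thousands of compounds in seconds, so expensive lab assays are spent only on the highest-ranked candidates.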

See the Visiongain report summary here.

See our news brief on the drug discovery study here.

Image: CDC

CBS News Pittsburgh interviews C3.ai DTI researcher Zico Kolter of Carnegie Mellon University about his discovery of how easily chatbot safeguards can be “jailbroken” to generate harmful information.

These vulnerabilities can make it easier for people to use chatbots for all sorts of dangerous purposes, such as generating hate speech or creating fake social media accounts to spread false information. The author fears such misuse in the upcoming presidential election could deepen divisions and make all information suspect.

“I think the biggest risk of all of this isn’t that we believe all the false information, it’s that we stop trusting information period. I think this is already happening to a degree,” Kolter said. “Used well, these can be useful, and I think a lot of people can use them and can use them effectively to improve their lives if used properly as tools.”

Read the KDKA-TV story here.

KDKA-TV/CBS News Pittsburgh graphic

AIThority: A novel antibiotic that can kill a type of bacterium responsible for many drug-resistant diseases has been identified by researchers at the Massachusetts Institute of Technology (MIT) and McMaster University using an artificial intelligence algorithm.

Regina Barzilay and Tommi Jaakkola, MIT professors and co-authors of the paper, “Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii,” set out a few years ago to tackle antibiotic resistance using machine learning. Their screens yielded nine candidate antibiotics, including one highly effective compound: a chemical first investigated as a diabetes medication that proved highly potent against A. baumannii.

The C3.ai Digital Transformation Institute was among the organizations that contributed to the funding of this research.

Barzilay and Jaakkola were co-P.I.s on a 2020 C3.ai DTI grant awarded to Ziv Bar-Joseph of Carnegie Mellon University for a project using AI to mitigate COVID-19, research that led to this later discovery.

Read the full AIThority story here.

Read the paper here.

Image from Nature Chemical Biology paper

Chicago Booth Review: In this episode of the Capitalisn’t podcast, hosts Bethany McLean and Luigi Zingales sit down with Chicago Booth’s Sendhil Mullainathan to discuss whether AI is really “intelligent” and whether a profit motive is always bad. In the process, they shed light on what it means to regulate in the collective interest and whether we can escape the demands of capitalism when capital is the very thing that’s required for progress.

Says Mullainathan, “My view is these technologies are going to make us all better off, for sure. The question is, how do we make sure that happens? Because there is risk associated with them. And for me, governance, regulation, it’s all just the way to get us to what I think is a really good state that we couldn’t imagine before.

“That’s not to minimize the risk, but I think I’m fundamentally optimistic that there’s a much better world out there because of these technologies. For me, that’s what makes me excited about governance and regulation. I feel like it’s stuff in the service of getting us to good places we couldn’t otherwise get to.”

Sendhil Mullainathan served as C3.ai DTI Principal Investigator on the project “Machine Learning Support for Emergency Triage of Pulmonary Collapse in COVID-19.”

Listen to the podcast here.

Forbes covers the work of Dandelion Health, a startup sparked by the work of two C3.ai DTI researchers, Ziad Obermeyer of the University of California, Berkeley, and Sendhil Mullainathan of the University of Chicago Booth School of Business.

In 2019, the two co-authored a research paper on bias in healthcare algorithms that was published in Science. That paper’s findings would inspire them to start Dandelion, along with two other colleagues.

The paper revealed how differences in access to healthcare services between Black and white patients could ultimately result in fewer Black patients being flagged by an algorithm that used overall healthcare costs as a proxy for which patients need extra care.

That’s because an algorithm that assumes the sickest patients are the ones with the highest bills will skew toward people who can afford to go to the doctor. The result was that only around half of the Black patients who should have received extra services were identified.

Access can vary wildly “depending on where you live, who you are, the color of your skin, the language you speak,” Obermeyer told Forbes. In this case, white patients were more likely to go to clinics and get treatment or surgery and had higher costs, while Black patients were more likely to use the emergency room once their untreated conditions were spiraling out of control. The end result? “The bias just piles up.”
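The mechanism the paper describes can be shown with a small simulation. The numbers below are purely illustrative, not from the Science paper: two groups have identical underlying health needs, but one incurs systematically lower costs because of access barriers, so a cost-based flag under-selects it.

```python
# Illustrative simulation of cost-as-proxy bias (invented parameters, not
# the paper's data): equal true need, unequal costs, biased flagging.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
need = rng.normal(size=n)              # true health need, identical across groups
group = rng.integers(0, 2, size=n)     # 0 = better access, 1 = worse access
# Cost tracks need, but the low-access group incurs systematically lower costs.
cost = need - 0.5 * group + 0.3 * rng.normal(size=n)

truly_needy = need > np.quantile(need, 0.9)   # who should get extra care
flagged = cost > np.quantile(cost, 0.9)       # who the cost proxy flags

for g in (0, 1):
    mask = truly_needy & (group == g)
    print(f"group {g}: flagged {np.mean(flagged[mask]):.0%} of truly needy patients")
```

Running this shows the flag catching a markedly smaller share of truly needy patients in the low-access group, even though need itself is identically distributed — the proxy, not the patients, carries the bias.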

Dandelion is creating a massive, de-identified dataset from millions of patient records so that developers can build and test the performance of their algorithms across diverse types of patients. The founding team hopes they can help establish a framework for testing and validating healthcare AI “while regulators play catchup.”

Read the full Forbes story here.

Read the 2019 Science paper here.

Forbes photo via Dandelion Health

In its annual report on the global economy’s most compelling facts and figures, the Atlantic Council includes generative AI among its 26 highlights, as described by Giulia Fanti, assistant professor of electrical and computer engineering at Carnegie Mellon University, a nonresident senior fellow at the Council’s GeoEconomics Center, and a Principal Investigator on cybersecurity for the C3.ai Digital Transformation Institute.

Fanti highlights the staggering parameter counts of state-of-the-art Large Language Models (LLMs), then explains why they matter.

This year, generative artificial intelligence (AI) captured the public’s imagination with its ability to generate photorealistic images, videos, audio, and text. Many believe that models such as GPT-4, PaLM 2, Llama 2, and Mistral will revolutionize how humans interact with computers for government services, education, and enterprise settings, to name a few. However, the amazing capabilities of generative models come at a cost.

Today, the leading models are growing quickly in size (as measured by their number of parameters, the values that control LLMs’ behavior). This matters because larger models are more expensive to train and more expensive to use once trained. For example, the Llama 2 (70B) model has 70 billion parameters and required a staggering 1.7 million graphics processing unit (GPU) hours, or the equivalent of almost two hundred years, to train. (This was sped up in practice by using these resources in parallel.)

The geoeconomic implications of these trends are likely to become more severe in the coming years. To train or host these models, organizations will need access to data centers with many GPUs. Moreover, due to data use and data locality restrictions in many regions, such data centers may need to be local. However, data centers are distributed inequitably across the world, with the vast majority of data centers located in the United States and Europe. This is likely to lead to a massive disparity in the ability to train, use, and benefit from generative AI.

Read the full story here.

Atlantic Council image: Mark Schiefelbein/Pool via REUTERS