The hype and hope of AI-assisted drug discovery

It was the sort of shop talk that you hear often in Silicon Valley these days. “The data amount alone is just obscene. If you compare it to ImageNet, which is a couple hundred thousand images — a million to 10 million for some of the biggest ones — we’re generating datasets of that size quite routinely,” Max Salick said, pointing to his computer. “I’m crashing every AWS server all the time.”

But the images winking on the screen weren’t cats or dogs or blueberry muffins or stop signs. They were human cells, captured by a high-resolution microscope. “Each of these little dots is actually telling us —” Salick pointed to the fluorescent pinpoints flashing on the screen — “the mutation that cell received.”

On this spring day in South San Francisco, Salick was showing me around the mixed-use offices at Insitro, a biotech startup that uses machine learning to help tackle cancer and neurological and metabolic diseases. Salick, Insitro’s director of functional genomics, is an effervescent redhead with a knack for simplifying complicated scientific concepts, and as he led me from room to room, he stopped occasionally to sketch things out on the whiteboards. All around us, the lab whirred with quiet efficiency: wave machines and echo acoustic liquid handlers, flow cytometers and custom microscopes — all the gadgetry that $700 million in venture funding can buy. Less visible was the stuff through which money hums invisibly: cloud storage and the computing capacity to train machine-learning models, as well as the employees with the expertise to run them.

Insitro was founded in 2018 by Daphne Koller, the Stanford professor and a co-founder of Coursera, the online education platform. Coursera made Koller a star, but now, at 57, following its IPO, she had come back to her original academic focus: the intersection of computation, AI, and biology.

To a certain extent, the first phase of the current deep-learning craze, which took off in 2012 on the strength of the ImageNet dataset assembled by her friend and colleague Fei-Fei Li, passed Koller by. “All that happened while I was at Coursera, working on things that had nothing to do with AI. So I kind of watched it from the sidelines,” she says. But she has devoted the last six years to Insitro because she believes in the power of those same algorithms and computer architectures to transform another legacy industry, perhaps the most capricious and technically challenging one of all — drug discovery.

Koller is hardly alone in this regard. Over the past seven years, hundreds if not thousands of biotech startups have emerged with the mission of applying artificial intelligence to every segment of the drug discovery process: target discovery, molecular design, and candidate selection for clinical trials. These AI-assisted drug discovery (AIDD) startups have collectively received billions of dollars in funding, promising to shorten development timelines and turn drug discovery from a haphazard process with tremendous uncertainty into something more akin to software: repeatable and systematic.

They've had early successes. In 2020, Exscientia announced the first AI-designed drug molecule to enter human clinical trials with DSP-1181, a candidate for treating obsessive-compulsive disorder. In 2022, Insilico Medicine reached another milestone when it began clinical trials for a drug that represented a double first: an AI-discovered molecule targeting a novel protein that AI had also identified. More recently, Salt Lake City-based Recursion (which has since acquired Exscientia) identified RBM39 as a target for treating solid tumors and lymphomas, and then advanced a drug candidate from initial discovery to pre-clinical testing in just 18 months, less than half the typical industry timeline.

All this said, while there are currently about 20 drugs discovered by AI-assisted drug discovery processes in clinical trials, none have yet been approved by the FDA. Once you get a glimpse of what it takes to bring a successful, novel drug to market, it’s not hard to understand why.

“I think as far as industrial R&D goes, pharma is probably the closest to fundamental science of any of the major industries,” says Chaitanya Rastogi, the chief executive and co-founder of Metric, an early-stage AIDD startup. “It’s the largest industry out there with so much scientific uncertainty.”

The promise of AIDD startups is not so much that they will lessen the uncertainty but that they will be better at navigating it — by using AI to see things, generate insights, and explore molecular spaces that are inaccessible to humans.

Take Insitro. The startup’s scientists are growing stem cell lines into disease models — liver cells for the study of liver diseases, neurons for the study of neurological disorders. Insitro’s scientists then use CRISPR to perturb one gene in each cell, observing which of these perturbations causes the cell to revert to a healthy state. The perturbed gene with the desired effect is a potential drug target. Because millions of cells are involved, it’s impossible for a person to look through them one by one to determine whether they have reverted to a healthy state, so high-definition images of each cell are used to train machine-learning models that identify the genetic changes that correct the disease. Cell imaging provides a broader picture of what happens inside a cell than focusing on a single gene or protein. It also allows for time-course analysis, where the same cells can be tracked over time. In the past, analyzing cell images was challenging because the data was completely unstructured: it was just a bunch of pixels that didn’t immediately tell you what was going on. The advent of AI models has changed this.

Six years into its existence, Insitro has a half-dozen drug candidates in the pipeline. It will probably take another five years before any of those candidates come to market — and that’s the best-case scenario.

The management guru Peter Drucker once described the pharmaceutical industry as an information industry. “If you think of it like that — that it’s about data and decision making — then actually, all the new technologies and approaches that we now use in machine learning should apply,” says Andrew Hopkins, the CEO and co-founder of late-stage biotech startup Exscientia. After all, the models and algorithms that have driven the current AI boom are uniquely good at taking huge volumes of unstructured data and finding patterns. AI luminaries like Dario Amodei, the Anthropic co-founder, have spoken publicly about how AI models “already have all the information they need to make biological breakthroughs.”

And yet there’s a part of research that isn’t just pattern matching, that goes beyond combing through information. That part requires judgment, reasoning, and genuine insight, things that AI algorithms, for all their “pausing to think” mimicry, have yet to truly demonstrate.

The truth about AI’s transformative potential is that it varies dramatically based on the complexity and stakes of the domain. In areas like customer support or content creation, where imperfection is acceptable and the costs of errors are relatively low, AI is already reshaping how businesses operate. But in drug discovery, which comprises an unforgiving chain of precise decisions, where a single mistake can derail years of work, AI remains only one tool among many. Drug discovery has been changed by AI, yes, but not dramatically more than it’s been changed by the invention of iPSCs, or single-cell sequencing, or CRISPR.

Traditionally, the process of developing a drug starts in the lab, where scientists identify a disease hypothesis, some biological reason for why a disease occurs. For example, for many years one popular disease hypothesis was that Alzheimer’s was caused by a buildup of a specific protein into plaques in the brain. The disease hypothesis informed the identification of a target — usually a protein or a set of proteins that, if regulated in some way, would alleviate the disease. This portion of drug development is called “target discovery.”

With a protein target in hand, companies then proceed to the second step in drug development: molecular design. In this phase, they try either to find an existing molecule or to create a new molecule from scratch that will bind to and modulate the target in some way. This proto-drug is refined, then tested in a sequence of successively more “realistic” models — first in vitro in the lab, then in mice, monkeys and, finally, humans.

In clinical trials, not only does the drug have to be proven to be safe and effective; it also has to be more effective than the standard-of-care drugs already on the market. For many common diseases, this is a high bar. Of the drugs that manage to make it to clinical trials, more than 90 percent of them fail. Often that’s because the biology of the disease was poorly understood and the wrong target was selected. Recently, it became clear after a series of failed clinical trials that protein plaques were in fact not the cause of Alzheimer’s — the molecules inhibited the protein perfectly, but the patients didn’t get any better.

Given the complexity of the drug development pipeline, AI-assisted drug discovery startups largely begin by concentrating on just one segment of the process. Insitro, for example, is focused primarily on target discovery, which it prefers to refer to as “therapeutic hypotheses.”

Another company that is using machine learning primarily for target discovery is Verge Genomics, a San Francisco-based startup focused on neurodegenerative diseases like ALS; it currently has an ALS drug in Phase 1 trials. Verge’s key insight is that many diseases of the brain are driven not by a single gene, but by networks of genes, or a combination of genetic and environmental factors. While ALS has around 56 known genetic drivers, the majority of ALS patients are classified as “sporadic” or “idiopathic,” meaning they have no known genetic mutation associated with their disease. Verge’s founder, Alice Zhang, told me that the “holy grail of ALS, and a lot of these diseases, is how do we go from the 2 percent genetic to the 99 percent of non-genetic patients using human data?”

Verge’s answer to that question is a proprietary dataset of neurological data, derived from post-mortem brain tissue samples from public brain banks and nonprofit institutions, as well as bespoke sources like a neurosurgeon at Mt. Sinai who performs deep brain stimulation surgeries.

In Verge’s labs, they dissect these brain samples further, separating white matter from gray matter to extract RNA and perform transcriptomic profiling (gene expression analysis). This data is then fed into Verge’s machine learning platform, which identifies the genetic “signatures” of disease. Further analysis identifies genes that could be master regulators of the disease network, turning dysregulated networks back towards a healthy state. These master regulators become potential drug targets.

One afternoon in March, I visited the Verge offices in South San Francisco, not five minutes from where Insitro is located. The company had just moved in, and the rows of desks in the open floor plan were mostly empty. To enter the lab spaces, I donned a lab coat and plastic eye shields. All the biotech labs I visited were spotless and sparsely populated, the scientists mostly stationed behind machines, whether computers or hulking laboratory equipment, while robotic arms with mechanical pincers rotated on their axes, pipetting reagents and switching out plates. The automation is one of the things that has changed over the last 70 years: historically, protein engineering work evoked the butcher’s shop. Tierchemie ist nur Schmierchemie, an old German phrase goes: “Animal chemistry is just the chemistry of slimes and messes.”

Like Insitro’s, Verge’s research relies heavily on stem cells, in particular, induced pluripotent stem cells (iPSCs). Creating an iPSC is an act of near-alchemy, the invention of which won the 2012 Nobel Prize in medicine: by taking a sample of mature cells, such as skin or blood cells, from a donor, and manipulating them genetically, the mature cells can be turned back to stem cells. These stem cells can then be grown into anything — neurons for the study of neurological diseases, liver cells for the study of liver diseases. By generating and culturing many different iPSC lines from both healthy individuals and patients with genetic diseases, Verge scientists can create disease models in a dish.

But at Verge, unlike at Insitro, the iPSC lines are used primarily for target validation, rather than target discovery. After the iPSCs are cultured for two to three months and exhibit the relevant disease, they’re frozen in place with a chemical fixation, then imaged. Measurements are taken of, for instance, the energy of the cell and protein turnover. Using this information, Verge scientists are then able to interrogate how the target behaves within a cell, and whether it’s a promising target.

For all the automation, it’s clear that validating targets in the wet lab is a laborious process. iPSCs are tricky to culture, and slow to image. “You know, cells — they don’t do what you want them to do. They don’t do the same thing twice over,” Koller, whose company Insitro uses a similar technique for target discovery, told me. “You have to develop machine-learning models able to deal with the kind of variation that we see in live samples as opposed to things that are electronically derived.” On a daily basis, Verge’s machine-learning platform, called Converge, churns out hundreds of possible drug targets. (One of the criticisms of AI-powered target discovery is that it actually increases the amount of time to market because it increases the search space.)

“There are approaches in genome engineering where you literally test hundreds of targets at once,” Heika Blockus, one of Verge’s target validation scientists, told me. “And that’s something we are really interested in for the future — to increase the throughput of our biological wet lab to keep pace with our platform, which is always churning out new targets.”

In Barry Werth’s 1994 book “The Billion-Dollar Molecule,” a history of the early days of Vertex Pharmaceuticals, there’s a passage about what happens in the molecular design stage of drug discovery, after a target has been generated and validated. “Once a drug target is identified, the next task is to solve its structure — to reveal the lock’s inner workings with its tumblers exposed. Steeped in 50 years of advanced science, this remains something of a black art.”

And so it remains today. In the early 2000s, what was in vogue was rational design, or designing protein sequences that would fold to a particular structure (and thus have a particular function). By 2015, an alternative approach, directed evolution, had gained mainstream popularity. Directed evolution invoked a geneticist’s approach: just as Mendel bred two strains of peas together to see what they produced, protein engineers could start with a gene, mutate various parts of it to create a library of variants, then subject these variants to selection. The variants that performed the best, according to the chosen criteria, would proceed to the next round, where this test-tube evolution would be repeated. Directed evolution was in many ways a less arrogant paradigm than rational design, and it worked. In 2018, work on directed evolution won the Nobel Prize in Chemistry.

In the last few years, a third method has emerged, championed by AIDD startups like Genesis Therapeutics. This approach focuses on the molecular design component of the drug development pipeline. These startups are using generative AI models similar to OpenAI’s DALL-E to “dream up” new molecules that will lock onto a protein target.

Evan Feinberg, Genesis’ CEO and founder, came of age at Stanford in the 2010s, completing his Ph.D. under Vijay Pande, previously director of Stanford’s biophysics program and, like Koller, a crossover star, who is also a general partner at the venture capital firm Andreessen Horowitz. At the time, Stanford was a hotbed of AI excitement, but it was also clear that AI was not having the same impact on biology and chemistry that it was having on images or language. “A drug is a small molecule. It’s a system of atoms and bonds, spatial interactions, it’s not a cute picture of a cat. It’s not a string of text waiting to be translated from a different language, it’s a different modality entirely,” Feinberg says. “So it stood to reason that a new type of AI had to be invented that was as natural and expressive for molecules in the same way that convnets were for images.” Feinberg’s Ph.D. research had to do with training a new kind of machine-learning model that learned the principles of synthetic chemistry to generate plausible molecules. And that work forms the core of Genesis’ molecular design platform today.

When we spoke in March, Feinberg and his co-founder, Ben Skaroff, gave me a demo of the Genesis platform, a user interface for the company’s chemists. They start by drawing out a molecule or fragment in 2D — and how they want to modify or build off it. They also specify some desired properties for the molecule, like absorption, distribution, metabolism, and excretion, which can be predicted from the 2D structure. A language model then generates millions of molecular ideas within the specified region of chemical space. These generated molecules are fed into a suite of 3D structure-based models that predict the binding pose and potency, and the top ones are returned to the chemist, along with visualizations of the pose.

Of course, once a molecule has been suggested, it still has to be tested and validated in the wet lab, which is a months-long process involving more experimental work — cell culturing, imaging, testing. But the generative models are able to imagine many more molecules, of much higher chemical diversity, much more cheaply, compared to traditional methods like high-throughput screening. “HTS can be, call it a million dollars. The number of molecules screened, call it, hundreds of thousands of compounds. And then you look at in terms of chemical diversity, usually pretty low. Oftentimes, they’re already molecules that exist, therefore their novelty is limited,” Feinberg says. In particular, pharma companies are turning to Genesis for hard-to-drug targets, ones where, for example, there’s a lack of a well-defined binding pocket on the protein surface, or highly dynamic targets that don’t have a single, stable conformation.

While Genesis’ user interface for generating molecules seems like it could be monetized through a SaaS or licensing model, Feinberg and his team have gone in a different direction, mostly because SaaS models have largely been failures in biotech. The company is working on two sets of targets: one given to it by its main pharmaceutical partners, Genentech and Eli Lilly, and another that it plans on taking through the whole drug discovery pipeline itself.

For Genesis, the partnerships are a way to get early revenue without huge risk. The company is given an upfront payment, followed by subsequent payments depending on success criteria. “We're doing only in silico work for them [Eli Lilly and Genentech],” Skaroff said. “So they do all the chemistry and biology, versus our internal pipeline, where we are doing chemistry and biology and obviously, taking on all of the sort of target responsibility.”

For the latter set of their own in-house targets, Genesis is generally choosing “well-understood targets that have minimal biological or clinical risk, ideally with a tight genetic link to the disease.” The idea is to hold some variables constant, to decrease the uncertainty in every part of the pipeline except their core technology, which is doing chemistry. In a world where the odds are already against you, you can’t innovate everywhere. You can’t run both delivery risk and target risk. You’ve got to pick one.

Every drug that goes into a human body needs to pass through a set of gates that begin with pre-clinical trials and proceed through Phase 1, Phase 2, and Phase 3 trials. In pre-clinical trials, data is gathered on everything from test tubes to monkeys. The drug has to be shown to be non-toxic in mice, then in rabbits, then in monkeys. Next come Phase 1 trials: showing that the drug is safe in humans. Then Phase 2 trials: showing the efficacy of the drug. Phase 3 trials are about proving that the drug is more effective than the current standard of care. Only after running this gauntlet can the drug be approved by the FDA. Phase 4 trials, which follow approval, might deal with different dosages and patient populations.

This last stage of the drug discovery pipeline is the most expensive, mostly because it involves humans, and is difficult to shorten. A trial measuring the safety of a drug over 12 months, after all, has to run for at least 12 months. There’s no way to cheat time. There’s also the problem that animal and cellular models are often not good proxies for efficacy in humans. You could have the wrong target, or have designed the wrong molecule, and not know it until you’ve spent $200 million on a clinical trial.

Which isn’t to say that there isn’t room to improve the odds. A number of AIDD companies have begun to use AI not just in target discovery or molecular design, but in clinical trial design and patient management.

Insilico, a late-stage AIDD startup that currently has a rare lung disease drug in Phase 2 trials, for example, is using machine-learning algorithms to select the parameters — everything from the sites to the criteria for participation, like thresholds for blood pressure — that will maximize the drug’s chance of successfully advancing through Phase 2 trials. The company is also using biological information to target drugs at patients in whom the drug is more likely to induce a positive response. “By utilizing a very precise biomarker, you can accurately identify and select the appropriate patients for the drug — effectively linking the patients with the target of choice,” said Insilico CEO Alex Zhavoronkov. “This also allows you to narrow down the patient population.”

There’s some debate as to whether using AI to pick the right participants for a clinical trial moves the needle as much as using AI for molecular design does, or whether applying AI to molecular design is an “easier problem” than applying AI to target discovery. “As impact increases, feasibility decreases,” one analyst wrote in the journal. “AI in target discovery, for example, would be highly impactful, but is less feasible. Clinical trial optimization is lower impact, but highly feasible.”

But these arguments feel pedantic, in part because it seems obvious that all the biotech startups, no matter their original focus, are going to apply AI everywhere, if that’s fruitful.

In other words, the startups like Genesis currently focused on molecular design are eventually going to expand into target discovery, and the ones focused on target discovery like Insitro and Verge are going to move into molecular design. Even the work being done to optimize patient selection in clinical trials is only possible if biotech startups verticalize. “A lot of our discovery right now is done via clinical samples,” Koller says. “They often reveal which patient populations a particular drug might work for. And that patient selection is a critical enabler in … running an effective clinical trial. Without those insights, how are we going to make sure that it's done in the right way? And then commensurately, how do we take the insights from our trials and bring them back into our discovery effort if it's all in someone else's hands and we probably don't even get access to the data?”

One consequence of this convergence and competition is that biotech will likely remain a much more closed industry than tech is. So much of the tech industry has been enabled by infrastructure work that was largely done under open-source licenses and for free. Meanwhile, the incentives in biotech do not incline towards openness. Even in the pharmaceutical partnerships that are the lifeblood of biotech startups, there’s perhaps less sharing than there should be. The two parties, after all, are somewhat competitive.

Late last year, I talked to Dan Skovronsky, the chief scientific officer at Eli Lilly, about how Lilly made decisions around which startups to partner with, and the new generation of AIDD startups like Genesis doing AI-powered chemistry.

He told me that the companies today with the really big datasets, the kinds of datasets necessary to train AI algorithms, are almost exclusively large pharmaceutical companies like Lilly, which have spent decades testing millions of chemical compounds against tens of thousands of targets.

“So we have the data. They have the expertise and the computational models. How do we match that together so we can have a system that learns? That’s complicated and right now, there aren’t too many great examples of success. Because of course, each side sees what they have as so valuable and proprietary. There are targets in biology that we've been working on for a decade. We don't want to just give away all of that hard work.”

There is a tendency, when it comes to sentiment around AIDD startups, to swing between the extremes. Either drug discovery is about to be solved by AI, or AIDD is useless. The reality is probably somewhere in the middle. And it probably won’t become apparent for a long time.

“I think a 'true victory for AI drug discovery' would be hard to characterize via a single drug,” said Abhishaike Mahajan, a machine learning engineer at the AIDD startup Dyno Therapeutics and author of the Owl Posting biotech blog. “There's lots of ways that these sorts of metrics can be cheated a little; potentially [the drug] looks very similar to some other drug and it being repurposed…isn't actually very impressive. Or maybe it's not actually that efficacious. Or a long list of other goalposts that people often have.”

Take REC-1245, a drug candidate for solid tumors that was recently announced by Recursion Pharmaceuticals in Salt Lake City. REC-1245 was discovered via a novel approach that evokes the fever dreams of Silicon Valley. Traditionally, cancer researchers focus on a handful of well-documented targets mentioned frequently in scientific literature. One such target is CDK12, a protein known to play a role in cancer but notoriously difficult to drug effectively, in large part because drugs that affect CDK12 tend to also inhibit a similar protein, CDK13, creating too much toxicity. But Recursion has been able to leverage AI and cell imaging to identify an alternative target, RBM39, which functionally resembles CDK12, and to optimize small molecules that can block it without affecting CDK12 or CDK13.

Today, there are those who argue that RBM39 isn’t actually as novel as Recursion is trying to make it out to be, that it’s been a known target for some time in oncology. Others argue that the use of AI to rediscover it is nevertheless validating, while still others note that it’s the speed with which the ensuing REC-1245 was able to get to pre-clinical trials that is impressive. But even if REC-1245 eventually ends up receiving FDA approval, what value should one assign to that result? The history of drug discovery is riddled with one-hit wonders, both of companies and of scientific techniques.

Similarly, for AI in drug discovery, “true victory will look more like a trend line rather than any single asset being useful,” Mahajan said. Perhaps this is the reason why all the surviving AIDD startups spend a lot of time talking about their platforms, with the implication that there may need to be multiple shots on goal in order to prove success. Indeed, the history of drug discovery is riddled, also, with things happening slowly, and then all at once.

Reading an article from 1989 by Malcolm Gladwell, then a young science writer at the Washington Post, gives at first a sense of déjà vu: “These are uncertain times for biotechnology. Investors have become wary of the long development times associated with bringing biotech products to market. ‘The general perception among investors is pretty jaundiced. ... Unless a biotech company has a plan for near-term revenues and earning, most people aren’t interested in playing.’”

The article goes on to describe Viral Technologies, a two-year-old Washington company trying to develop a vaccine for AIDS. “Company officials say the results of their work so far has been promising. But they admit that they are probably five years away from reaching any definitive verdict on whether their research will be successful.”

As it turns out, Viral Technologies was not successful. The company shut down in 1994, after running out of money. But PrEP, a medication that prevents HIV infection, was approved in 2012, and today AIDS is no longer the scourge it was in those terrifying years. Those who contract the virus can expect to live a full, long, healthy life.

Science, it turns out, marches inexorably, if slowly, on.