Good afternoon, everyone. Thank you so much for joining us. We're really pleased to have Michael Secora, CFO of Recursion, with us. And with that, Michael, perhaps you could start here with an overview of your platform, clinical programs, and upcoming milestones. And when thinking about the TechBio space broadly, how would you recommend that investors delineate between the many players and types of technology? So I know I just threw two questions at you.
Yeah, well, appreciate that, and again, thank you for being here at the conference. Always good to chat with you. To start with that first part, the overview of the platform, I think it's important first to acknowledge the big problem that we are looking to solve. We're quite acquainted with the time and cost it takes to develop a new drug, and we believe that to have impact on that problem, it takes a full-stack, holistic solution. And I think at this time and place, we find an interesting confluence of technologies across compute, data, automation, and tools to control biology, like CRISPR, and all of those come together in the Recursion operating system.
That operating system begins with an automated wet laboratory that has now done over 250 million experiments. We do up to 2 million experiments per week across many different kinds of biological and chemical contexts, and it serves a primary purpose: data generation. In that data generation, we are either mapping biological, chemical, and patient-centric insights, or we are using that same operating system to validate a given insight. Complementing that wet lab is our computational dry lab: our supercomputer and all the different software tools that we have developed, which allow us to analyze that data and, from there, map all of those relationships, again, across biological, chemical, and patient-centric contexts.
With that tool set, one has a number of connected modules. One starts with map-based search across a number of different properties to give rise to a certain, say, target-compound relationship, which then goes through a sequence of validations (phenomics, transcriptomics, chemoproteomics), all in service of trying to find something that is chemically and clinically tractable. Then one goes through a process of optimization, both in an in silico way, looking at novel molecule design, and in a physical way, in which one is experimentally testing and refining a given molecule, going back and forth between dry lab and wet lab capabilities. From there, one goes into translation across animal models, then into IND-enabling studies, and then into the clinic.
Now, as for how we've been driving value through that operating system, one part is the pipeline that you alluded to, another is the large partnerships that we have, and another is our data strategy. To unpack the pipeline that Recursion has built a little, I think it's important, big picture, to call out that we are gonna have seven readouts in the span of about 18 months, five of those being phase II and two being phase I. It begins next quarter with a phase II readout in cerebral cavernous malformation. This is a rare disease with no approved therapy that affects approximately 360,000 patients in the U.S. and EU5. And then, you know, Q4 of next year...
Sorry, Q4 of this year, we have a phase II readout in neurofibromatosis type 2, which affects 33,000 patients in the U.S. and EU5. Following that, in the first half of next year, we have data coming for familial adenomatous polyposis, as well as AXIN1 or APC mutant cancers. Those populations are approximately 50,000 and over 100,000 patients, respectively. We also have a phase II in C. difficile infection. That'll be initiating this year, and it affects over 700,000 patients. And then we have phase I trials. We have, first, HR-proficient cancers, where we have RBM39 as a target, as well as Target Epsilon, a novel target in fibrotic disease. Watch for those to continue through IND-enabling studies, and watch for an IND submission for HR-proficient cancers this year.
So a lot of programs are moving forward in the pipeline, and taking a step back, I look at the pipeline and I see breadth, I see depth, I see a platform that is able to be plastically applied across a number of therapeutic domains. So that's a little about the platform, the pipeline, and the milestones that you called out. To get into the second question you brought up, about the drivers of differentiation within this TechBio space, maybe I'll first give a little color, a bit of definition, or my definition, of what TechBio is. You know, I'd been a scientist and then an investor, and now I'm an operator.
I believe TechBio is embracing a technology industry-first perspective, whereby one is generating, integrating, and utilizing all these different forms of data, biological, chemical, patient-centric, and that data has certain important properties like relatability, so that it can be used by tools like AI and ML, from which one is able to derive novel insights, and with volume. And I don't know if you remember this, but I remember a conference we had done years back, where it was some tech, some bio, bio meets tech, tech meets bio. So in my mind, that was when that term came to life for me, when we did that conference a few years back. That's right. That's right.
And I think from that perspective, that definition, there are three factors that I really look at in trying to make sense of this space: the scale of proprietary data, the scale of compute, and the scale of impact. To unpack those a bit more, with respect to the scale of proprietary data, it's not just the size but, perhaps more importantly, the means by which one is able to continually scale a data set, so that one then has the substrate to uncover new insight. It's then also the scale of the compute that couples to the data, with which one is able to analyze the data and extract those insights.
Then the scale of the impact is around what kind of novel insights you are finding. What kind of programs are you driving forward? How are you having impact for partners?
When you think about that, what do you see as your key advantage or differentiating factor, and how are you thinking of maintaining it going forward in a rapidly evolving field? I guess, simply, one could say you have a problem you're trying to solve, you find the tool, you adopt it, you integrate it, and then you modify as needed. But help us understand how you then differentiate and stand out in this field.
Yeah. Absolutely. Well, I think you highlight it well. There are three things that I think Recursion really tries to do and continues to do: the integration of a new technology, the scaling of that technology, and then getting that technology to be automated. And then doing it again and again, and trying to do it along the entirety of the drug discovery and development value chain, from target identification to novel chemistry, into translation and biomarker development, and then into the clinic, and even thinking about how you industrialize there as well. I think that premise of operation can affect those three primary dimensions that I talked about: data, compute, and impact.
If we unpack that for Recursion a little bit, on data, over the last year we've scaled our proprietary data set to over 50 PB. We have now sequenced our 1 millionth transcriptome, so we're really building out a complementary bolus of data to augment the phenomics that we had. We struck partnerships with Tempus and Helix around accessing multimodal patient-centric data, which gets into not just clinical records but DNA sequencing and RNA sequencing, and even having relatability there that can then transit back to the cellular level to derive all these different kinds of causal effects. On the compute side, we've acquired companies in the last year, like Cyclica and Valence, that help bolster our digital chemistry as well as our generative AI capabilities.
We've also constructed some vast chemoproteomic layers using some of the Cyclica software, helped by our partner, NVIDIA. And we have designed and built our next-generation supercomputer, BioHive-2, again with our partner, NVIDIA, which is now a top 50 supercomputer across any industry. And then on the impact, we talked a little bit about the impact we're already having on the pipeline, with seven readouts over the span of about 18 months. On the partnership side, we have some, I think, terrific partners. Watch for potential program options in the near term, watch for potential map-building initiatives in the near term, and also watch for new partnerships in the near term as well.
Let's turn to your clinical pipeline here, where you have multiple upcoming data readouts, as you noted, starting with the phase II data of REC-994 in cerebral cavernous malformation in the third quarter. Can you walk us through the mechanistic rationale here and the prior data that gives you confidence in this program, recognizing that you've made changes internally as well?
Sure. Yeah, so cerebral cavernous malformation, again, is a rare disease, a very large rare disease with no approved therapy. It is characterized by vascular malformations in the CNS, lesions in the brain, driven by loss-of-function mutations of the CCM1, CCM2, or CCM3 genes. If we unpack this a little bit mechanistically, the CCM genes are involved in the regulation of the FoxO1 transcription factor, which is involved in the regulation of superoxide dismutase 2, which in turn is involved in the regulation of reactive oxygen species. When that is all appropriately regulated, you can have healthy functioning of the endothelium, particularly of endothelial nitric oxide synthase, which allows the vasculature to maintain a certain healthy state.
If you have any kind of mutation of a CCM gene, you see the breakdown of that chain and a rise in reactive oxygen species, which then causes breakdown of the cell barrier and gives rise to the lesions that characterize this disease. If we look at the preclinical data and the data that's given us confidence, I think, number one, it is the insights that we had from our operating system, where we had taken a target-agnostic approach that gave rise to thinking about the molecule in relation to this disease. It's also in the preclinical data, where we were seeing the number of lesions changing and the size of lesions changing, having real impact at a preclinical level.
And then I think it's also in the phase I data that we’ve been able to show, which showed a very safe and tolerable molecule. And I think even beyond that, the fact that, you know, we’re now completing the phase II trial, and after a year of treatment, the vast majority, nearly all patients, have chosen to opt into long-term extension, which I think continues to highlight the overall safety of the molecule.
Why have there been few industry-sponsored clinical trials in this indication, despite the relatively large population? I think what I'm trying to understand is the complexity here and the read-through to a probability of success.
Sure. Sure. Well, I think it's been really interesting. For a disease that has such a large patient population, approximately 360,000 patients in the U.S. and EU5, it's a relatively unknown disease. As I've had interactions with investors, sometimes it's only the investors who have had a medical background, say as a trauma doctor, who know it, because usually that's when it's identified. Someone who's had a concussion goes and gets an MRI, and they see that these lesions are there, and they themselves may be a familial carrier of the disease. It may even be known in the family, but not mentioned from, say, parent to child.
I think there are a few factors that have stymied traction in the clinic for this disease. Number one is just the overall vicious cycle of rare disease: we don't know anything about this disease, therefore there is no therapy. Because there's no therapy, we don't diagnose it. We don't diagnose it because we don't know about it, and so on and so on. Secondly, I think it gets into the scaffolding proteins that are involved here. If you look at the core CCM proteins that are involved in signaling, there has been a perspective that those proteins could not be drugged directly.
In our context, we're utilizing a different mechanism of action rather than trying to drug those proteins directly. And then thirdly, to your point, Salveen, around complexity, there is a lot of variation in the lesions themselves, in size, rate of change, location, and overall number. Complementing that is variation in the symptoms that manifest, in terms of seizure, hemorrhage, and other focal neurological deficits. I think all of that together makes for a disease that has been relatively unknown, despite the overall patient population.
Could you frame the format and content of the third quarter disclosure, and given the lack of precedent here, how you're thinking about what would be clinically meaningful?
Yeah. Well, happy to. So with respect to this disease and this trial, I believe we are the first industry-sponsored phase II trial for this disease. So we have a real first-in-disease opportunity here. Because of that, we've been working closely with the FDA on a number of potential efficacy measures, where we're effectively defining the guardrails for this disease. To unpack some of the measurements that we take into account, there are more objective measures like MRI imaging of lesions, where we're looking at number, size, and rate of change. We are also looking at patient- and clinician-measured outcomes, as well as more subjective measures like patient-reported outcomes.
And, you know, we and our KOLs believe that movement in any one or more of these variables certainly warrants next steps for this program. But I also think that the combination of some of these more objective measures, like imaging, with some of the more subjective measures around patient-reported outcomes really makes for a compelling discussion with the FDA, and helps chart a path to registration.
For the next two programs, REC-2282 in NF2 and REC-4881 in familial adenomatous polyposis as well as specific mutant cancers, could you walk through the clinical measures here that are key for us to focus on, what you'll present at the first data release, and how you're thinking about establishing proof of concept?
Sure, sure. So we have a number of programs with data coming. Let's talk about NF2, neurofibromatosis type 2, first. That, again, is a rare disease with no approved therapy. It is characterized by benign tumors in the CNS, and it affects 33,000 patients in the U.S. and EU5. That data is coming in the fourth quarter of this year. This disease is characterized by mutations of the NF2 tumor suppressor gene. Here, the things that we're gonna be looking to measure are, of course, safety and tolerability, as well as progression-free survival, overall response rate, time to progression, and change in the progression slope. So there are a number of different measures that we'll be looking at.
If we then look at familial adenomatous polyposis, this is also a rare disease with no approved therapy. It is characterized by polyps in the GI tract, which have a high risk of malignant transformation. It affects approximately 50,000 patients in the U.S. and EU5, and it is driven by mutation of the APC tumor suppressor gene. Here, as with AXIN1 or APC mutant cancers, expect to have data in the first half of next year. The measurements we have here are, again, safety and tolerability, but predominantly we're looking at change from baseline in polyp burden at 12 weeks.
We'll be looking at how the number of polyps, effectively the overall polyp expression, has changed before and after. Going on then to AXIN1 or APC mutant cancers. Again, this affects over 100,000 patients, an oncology indication with substantial need. This is driven by mutations of the AXIN1 or APC tumor suppressor genes, such as loss of APC. Here, we'll be looking at, again, safety and tolerability, and in addition to that, overall response rate.
... Can you specify the differences between how these assets were developed versus how you plan to develop these assets in the long term?
Sure, sure. So if we think about the platform that we started talking about at the beginning, that depiction of the platform has been 10 years in the making, and the Recursion operating system is always evolving as new capabilities and new ways to think about data are being integrated, scaled, and automated. So if we think about different vintages of programs, they represent the different versions of the operating system from which they came.
If we think about some of the earlier programs, like CCM, NF2, and FAP, this vintage of programs highlights an operating system that was predominantly driven by phenotypic identification of novel biological insight, which was then complemented with known chemical entities. If we then move to a program like C. difficile infection, we again were using the operating system to find novel biological insight, but it also highlights a time when we started to wade into novel chemical design. So now you see biology being coupled with novel chemistry. And then if we move to the more recent programs, like HR-proficient cancers with our target RBM39, you see the OS being used not just for novel biological insight identification and novel chemical design, but also including patient-centric data for biomarker identification, as well as patient targeting and patient stratification.
To your point about new programs to be developed in the future, either for us or for our partners, we always want to be using the bolus of capabilities that we have and that we continue to refine across biology, chemistry, and patient-centric data going forward.
When might we see the C. difficile data?
Yeah. So the phase II is going to be initiated this year. If we think about the data that we've seen thus far, at the end of last year we had very positive phase I data. We had positive data across SAD/MAD cohorts: a very safe molecule, one that showed no severe adverse effects, and there were no treatment-related discontinuations. That sets us up for a nice phase II, and I think, you know, watch for that phase II to initiate this year.
You also have a Download Day coming up at the end of the month. Help us understand what you're looking to highlight at that event. Interestingly, you seem to have a guest speaker there that you had earlier this year, so what is it about Recursion that seems to attract so much attention from the CEO of NVIDIA?
Yeah, well, I think we're looking forward to our Investor Day, our R&D Day. To highlight a little bit of the run of show for that day, we'll be going over the state of the company and a way of looking at the state of TechBio. We'll be talking more deeply about the operating system: what capabilities have we added recently? What capabilities are we looking to add? Why are they important? What impact are they driving? We'll be talking about some of our preclinical work.
Here, we'll be talking about the impact in identifying new preclinical programs, and I think we'll be working through RBM39 as an example program, highlighting how novel biology was identified, how novel chemistry was identified, and what tools were utilized across optimization, translation, and biomarker development. Then we'll have some discussion around our clinical programs, where we want to walk through all of these different clinical programs, highlighting what we can expect, what is clinically meaningful, how this potentially sets us up for what's next, and what that could look like.
We'll also have some discussion around our partnerships, and at this event we'll certainly welcome some external speakers, some of those being our partners. There'll be some KOLs, and, as we often do with Recursion events, we really try to make it an enjoyable moment for all in attendance. It's a very collegial conversation around these tools and how they ultimately can impact life science.
Could you provide a brief overview of your recent partnerships and business development deals and how that has impacted the Recursion OS?
Sure, sure. So we've been fortunate to form two kinds of partnerships. There are partnerships on the discovery side with large pharma companies in therapeutic areas, and then there are partnerships on the technology side, mostly looking at accessing new technologies or new relatable datasets. We'll look at the large pharma partnerships first. We have a partnership with Roche-Genentech in the space of neuroscience and one GI oncology indication, a substantial partnership, where last fall we had our first program optioned by them for a novel target in GI oncology. That partnership continues to go quite well, and watch for potential program options and map-building initiatives to play out.
There's also a partnership we have with Bayer in the space of undruggable oncology. That partnership is also substantial in size, and we're working across seven different targets in undruggable oncology, making a lot of progress there and looking forward to continuing to work with those partners. On the technology side, we've had partnerships first with NVIDIA. NVIDIA has been a partner of ours in high-performance compute. We've had a relationship with them for probably about four or five years at this point. Most recently, they helped us design and build our next-generation supercomputer, BioHive-2.
We have a foundation model that they host on their platform, BioNeMo, and they've certainly been a great partner in thinking about what the future of this industry can look like. We also have partnerships with Tempus as well as Helix. These are around patient-centric multimodal data, so data that comes with not just patient records but DNA sequencing and RNA sequencing. We use that data not just to derive biomarker signals or think about how to target or stratify patient populations, but also to take that relatable data at the patient level and connect it back to the cellular level, from which we wanna derive these large-scale causal models linking what you're able to perturb in the cell to what it means to have an outcome in a patient.
And then lastly is a partnership with Enamine in the space of cheminformatics and chemical synthesis. We've done some great work with them, and also with NVIDIA once we'd acquired Cyclica, to go about building out this digital chemoproteomic layer, looking at the entirety of the Enamine REAL Space and how each of those molecules could be interacting with the entirety of the human proteome. We're also working with them on constructing a number of orthogonal chemical libraries. Now, in terms of the impact that you asked about, what it's meant for Recursion, I think ultimately these partnerships serve one fundamental purpose, and that is to drive value, but how they drive value is slightly different.
On the technology side, it's around the addition of new capabilities to augment the OS in order to drive value. And on the large pharma partnership side, it's adding new channels to apply the OS to drive value, which again highlights the overall plasticity of the platform across therapeutic areas.
You've also spoken about an emerging third business segment focused on data strategy. Maybe help us understand the monetization aspect here and how your NVIDIA collaboration comes into play?
Sure. So there are three pillars to the Recursion business strategy: our pipeline, our partnerships, and our data strategy. That data strategy is around the potential licensing of subsets of our data or subsets of the tools that we have available, like LOWE or Phenom, which is our phenomics foundation model. With that strategy, we would potentially look to commercialize. That could be with us hosting a model, perhaps on NVIDIA's BioNeMo platform. But there are other ways we could think about monetizing this, too. Perhaps in the context of a partnership, maybe a new partnership, accessing some of Recursion's technology tools is part of that partnership, maybe on some kind of recurring revenue basis.
And so what's kind of interesting there is maybe under the umbrella of a large-scale enterprise-scale pharma partnership, you have elements of a recurring SaaS business model that is scoped, again, in trying to advance those efforts forward.
So with that, let me open it up to the audience for any questions.
If you're not here for me, you've acquired two AI-enabled drug discovery companies, Cyclica and Valence, and are partnered with Tempus and two other... How have the technologies and data from these companies been integrated into your platform, and what early benefits have you observed here?
Yeah. Well, I think, let's talk a little bit about... May of last year to bolster our digital chemistry capabilities, as well as our generative AI capabilities. And I think some of the benefits that we've seen in the last year or so are, of course, improvements in the in silico drug design capabilities that we have. We've applied these tools to construct that chemoproteomic layer, and we've also worked with these teams to develop some of the models that we've put forward, like LOWE, which, you know, partners. And then on the data side of it all, there's data with Tempus and Helix, and then cheminformatics with Enamine.
On the patient-centric side, it's about how we've been applying that to think about biomarker development, patient targeting, patient stratification, and these kinds of causal AI models. And with Enamine, on the cheminformatics side, it's around the construction of some of these orthogonal, diverse chemical libraries.
Maybe you'll put this [audio distortion]
$1 million.
Yes.
What is your bottleneck? Is it kind of diseases? Is it gene profiles? What is your bottleneck?
Yes. Well, I think the funny thing about bottlenecks is that they represent some kind of engineering problem that needs to be solved. And if we look at Recursion's application of technology, it's always thinking about what's the next thing to try to debottleneck. We began with the large-scale mapping of biological insight, knocking out every gene in the genome and looking at how one gene could relate to another gene. Then we started to look at chemical perturbation. We have an internal library of about 2 million chemical molecules, so now you start to look at how a gene can relate to a gene, or a gene to a compound, and you start to map that out.
That starts to debottleneck how to find novel insights around target identification and compound identification. Then you get into how you're validating that. You want to debottleneck that across the multi-omic validation experiments: phenomics, transcriptomics, et cetera. Then you want to think about how to debottleneck optimization. There, it's around improving your in silico tools to find certain scaffolds, as well as having different kinds of modules that give rise to ADME data, so that those kinds of chemical properties start to refine molecular structure, but also refine your digital chemistry models. You start to debottleneck that and move into translation. So we've applied that approach of integrating technology, automating technology, and scaling technology along the drug discovery and development value chain.
The greatest bottleneck, simply, is how one thinks about increasing the transition probability from some kind of validated lead or optimized candidate into a development candidate on a path to the clinic, and even how one starts to industrialize clinical trial design with data-driven patient targeting and data-driven patient selection. And I think there's a theoretical bound for a clinical trial. There's a time to run a clinical trial, let's say one year of dosing. That's the theoretical bound. How much can you shrink the margins to enroll, to run the trial, and then again post-trial? And if I said a billion transcriptomes, I apologize. I meant a million. I don't know if I said that.
Great. Well, with that, thank you, Michael. Really appreciate the discussion.
Thank you so much. I have a really broad question regarding competition. There are a lot of private companies working on AI-generated drugs, whether tissue-based or genome-based, and many are powered by AI-based companies. I just wanted to understand how Recursion sees the upcoming competition, and how Recursion's own platform is differentiated from all the others. Thank you. What part?
Well, I think it would really follow along the dimensions that I... Without knowing the company specifically or understanding the technology, I would look along the dimensions that I talked about before: the scale of the data, the ability to continue to scale the data, the compute that's applied to make sense of the data, and the impact that one is able to derive from having made sense of the data to find novel insight.
And so I think there's a great opportunity to define and scale whole new omics across many different kinds of layers that help us augment our understanding of biology and chemistry and math, and I certainly look forward to all these tools having more and more impact on how we go about getting drugs to patients, where we can have impact on lives.
Great. With that, Michael, thank you so much.
All right. Thank you, Salveen.