All right. We're gonna go ahead and get started. Thanks, everybody, for joining here in the room, and those that are joining us online. Really appreciate it. I'm Nick Smith. I'm the CFO here at Alto Neuroscience. We're excited for everybody to join us for what is an important investor day, here ahead of our ALTO-100 data readout that's coming in October. As we jump in, just a quick reminder that we will be making forward-looking statements today. A really important disclaimer is that nobody you'll hear from today, nor anyone else at the company, has any access to the data yet. The data remain blinded, so we're making this presentation with no knowledge of the clinical data coming in from the ALTO-100 Phase 2b study.
We have an exciting program here for you today, but really important to ground us on the mission of the company and what we're here to do. We are a company entirely focused on developing novel medicines for neuropsychiatric disorders, and we do that leveraging a precision approach. We believe there's obviously a massive opportunity to bring precision medicine to neuropsychiatric disorders, and you'll hear a lot about that today. We are now on the cusp of our large Phase 2b study, which will be an important data point in moving us closer to that mission. We have a great group of speakers today.
I'm excited to have them join us and share with you a lot about the background of the company, the background of the molecule, and a lot of the work that we've now done in the clinic. We have Amit Etkin, who's our Founder and CEO. We have Adam Savitz, who's our Chief Medical Officer. We have Mike Hanley, who's our Chief Operating Officer. Jessica Powell, who's our Chief Development Officer. And then we're really excited to have Dr. Jerry Sanacora from Yale join us and give an external expert perspective, both on the clinical side and on the patient population. All of these panelists and speakers will be available during the Q&A, and I'll now turn it over to Amit to talk through the agenda and walk us through ALTO-100.
Thank you, and thank you all for coming. There's a lot to talk about today, and we'll really try to do a pretty deep dive into the people, the biomarker, the drug, the plans, really top to bottom. We'll give you background on depression, but especially the part of depression that we're going after: depression with poor memory, poor cognition, the link to the drug mechanism and to the biomarker mechanism, and how that pulls it all together. We'll give you information on the Phase 2a trials that we've conducted and how that's rolled into the Phase 2b, including, importantly, baseline data, which is being shared here for the first time. We'll then take that into commercial considerations, and then have a really thorough discussion of this approach, and more generally precision psychiatry, with Jerry Sanacora.
Before doing that, let me talk a little bit about Alto as a whole and where we are now. We have five Phase 2 trials ongoing and have already dosed over 800 patients with novel compounds using our platform over the past few years. The biomarker-defined subpopulations we're targeting already total over 25 million people in the US alone, and we have cash runway into 2027 to support all of these activities across the five Phase 2b or Phase 2 proof-of-concept trials. It's important to realize also where our reality is now as we think about new drugs coming in and our approach specifically. The current approach, as you know, is a trial-and-error approach, both for development of compounds and for deployment in the clinic, leading to what look like small effect sizes.
But those small effect sizes are actually driven by a small population that responds well, whereas most people respond poorly and really not that differently from placebo. So our precision lens is one where we're going after that responsive population here using biomarkers, ways to characterize those people objectively that allow us to decrease development risk as we go through, which might either mean advancing a drug for a select population or learning early that a drug is not worth advancing. And then once through trials, that leads to a differentiated clinical profile compared to everything else that's out there, which is an all-comer approach.
It's important also to understand that this is, while new to psychiatry, something that takes a lot of learnings from elsewhere in biomedicine, and especially in oncology. And this is actually a very interesting chart on a couple of different levels. So what we've charted here is a number of citations tying treatment to precision medicine in oncology versus psychiatry over the years. You can see that the beginning of precision oncology, with Herceptin coming in, 25 years ago, really began that trend, but that real inflection point happened when immuno-oncology came in.
In oncology, the biomarkers are both targeted, in terms of what a particular gene does and how the drug interacts with it, and phenotypic, things like tumor mutational burden and microsatellite instability. The phenotypic ones are really the translation between oncology and psychiatry; that's the kind of biomarker we're working with as we think about cognition and EEG. Where we are as a field is slowly trending up. We're actually at exactly where oncology was in 2009.
So if you think of psychiatry as 15 years behind oncology, that's literally where the science puts it. So what are biomarkers for us? They're ways to characterize brain function, directly or indirectly, that are objective and quantitative, and that go beyond symptoms. Looking, for example, at EEG to measure brain activity directly. Cognition- or behavior-based, performance-based assessments of various aspects of brain function, which, much like the battery some of you did on a computer, yield an objective score. Wearables to look at sleep and activity metrics. And then, as we use this information to identify responder populations, as well as to understand what a drug does to people, we're also looking to make this as commercially tractable as possible.
So it's not just about R&D and developing the right drugs; it's being able to scale this approach to the level of millions and millions of people. The battery that you took, you can see, becomes an easy thing to scale online, and EEG very much so. We're already doing that in patients' homes. Let me give you just an example of what we've had to do to get ourselves there. We have our own in-house engineering team that builds the tools to do the kinds of assessments that you've done. This is one test, for example, you see on the left here: a traditional, well-understood, well-validated neurocognitive test, adapted for self-administered online testing. That's through our Spectra battery, and you did the sort of commercial, 15-to-20-minute battery that we ultimately anticipate putting out.
But it's not just behavior and cognition that we can assess readily. We can also do that with EEG. This, for example, is data that a patient collected on themselves in their home with a low-electrode-count system, and we have built software, out of years and years of experience collecting this kind of data, to do real-time quality control. So the patch of data you just saw come in from the right was a bad patch of data, very noisy, and that's picked up by these little bubbles on the left turning red, all in real time. These are all things being used at our sites for clinical trials now, and we'll talk about quality of the data as we go into the Phase 2b. And it's not just ALTO-100, of course, even though that's what we're talking about today.
There are five Phase 2 programs going on across four different novel assets. Each of them has its own interesting attributes. Three of them target MDD in one way or another, but in different biomarker-derived subpopulations. The first study to read out will be, as you all know, ALTO-100 in October, followed by two antidepressants, ALTO-300 as a Phase 2b and ALTO-203 as a Phase 2 proof of concept, in the first half of next year. So that's a lot of activity just in depression, and our lens is bigger than that, with schizophrenia, with bipolar depression as a recently initiated Phase 2b for ALTO-100, and we've done PTSD as well in the past. So it's a really broad gamut that we're after, with a lens that is aimed to generalize across mechanisms and across disorders.
Today, we're going to go through all of the background here on ALTO-100, but we particularly put these questions up here as things that we often get from investors. In fact, some of this may seem very similar to wording you may even have used in posing these questions to us. And we'll try to be as direct as possible about the kinds of questions we get, and therefore the answers. How do I think about this novel mechanism of action, given that nobody else is targeting this pro-plasticity, pro-neurogenesis mechanism? What weight and interpretation do I give the original third-party Phase 2 study? How much does the single-arm Phase 2a trial, with its prospectively replicated enrichment, reduce development risk? We'll talk a lot about that.
To what degree has placebo response been accounted for, either in the biomarker selection or in our clinical trial execution? What does a potential Phase 3 program look like, and can we execute that in-house with our clinical operations team? And, given that this approach is new for clinicians, how readily will they adopt biomarker tests? We hope to address all of this very thoroughly. As we think about the drug itself, the approach that we've taken to developing it, and then ultimately commercializing it, there are a number of salient points, and this is almost an outline of how we'll talk through the data themselves. So the target right now is depression, MDD, patients with poor memory, this poor-cognition biomarker.
Also, as I mentioned, a bipolar depression program has kicked off with very similar logic, the clinical and biological background being much the same in bipolar depression as in major depression. The drug itself is an oral, well-tolerated, potentially first-in-class small molecule that enhances neuroplasticity and neurogenesis. We're aiming for a clinical effect size, a Cohen's d, of 0.3 or higher. That's how we've set up our Phase 2b, and that's also where we think a drug's effect size should be to be competitive with everything else that's either out there now, in a heavily genericized market, or coming down the pike as development programs, and we'll talk much more about effect sizes. Tolerability here contrasts well, especially with antipsychotics.
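As an aside on what a Cohen's d of 0.3 means concretely: it is just the standardized mean difference between two arms, the difference in mean improvement divided by the pooled standard deviation. A minimal generic sketch in Python (this is illustrative only, not the company's analysis code):

```python
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Standardized mean difference: (mean_a - mean_b) / pooled SD."""
    na, nb = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    # Pooled SD weights each arm's variance by its degrees of freedom.
    pooled_sd = (((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)) ** 0.5
    return (mean(group_a) - mean(group_b)) / pooled_sd
```

On this scale, d = 0.3 means the treated arm's average improvement sits about a third of a standard deviation beyond the comparison arm's.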
And then our IP and commercial strategy give us IP protection into the early 2040s, focusing especially on how we select patients here based on the biomarker, and a test that goes with that which, as you saw, is readily scalable. So let me start by telling you more about the patient population itself and going through some of the background thinking. We talk a lot about neuroplasticity, so let me define the terms and what that means in the context of depression. Neuroplasticity broadly relates to the ability of the brain to adapt to changing environments or cues, internal or external. The neuroplasticity framework for depression is, as you can see on the right-hand side here, something that we've thought about as a field for a long time.
Jerry Sanacora's late colleague, Ron Duman, did a lot of work at Yale around this, especially in various animal models. The framework argues that in depression there's a reduction in neuroplasticity, in particular within structures like the hippocampus, which are important for mood and for cognition, and patients essentially get, to put it colloquially, stuck in a rut. They're less adaptive, less able to incorporate new information, including about themselves, which reinforces all the negative emotional biases, so you have emotion and mood coming together with cognition here.
And there's a lot of literature we'll talk about in a moment, but really, fundamentally, what's different here is not just understanding the neuroplasticity hypothesis in depression, but taking the next logical step and selecting the patients in whom you can actually say neuroplasticity is deficient in the hippocampus, and here that's done with memory. So taking something that's well understood and then translating it into clinical practice. And this product, ALTO-100, was specifically developed, as you'll see, to target hippocampal neuroplasticity, so it really comes together mechanistically. Let's talk for a moment about the evidence for neuroplasticity and hippocampal abnormalities in depression. It's extensive. In humans, in vivo, there's evidence, for example, for hippocampal volume reductions; that's been one of the most consistently observed brain changes in depression.
You also have a change in memory, which is underpinned by the hippocampus, and that has also been very consistent, study after study after study. More recently, people have actually imaged synaptic vesicles directly using PET, which tells you about synaptic function, and that's reduced in depression. All of these things correlate with each other: you have volumetric reductions correlating with poor memory, and reductions in SV2A PET signal correlating with poor memory. So that's all in people, who we can interrogate in different ways. But then that's backed up by postmortem data.
Postmortem data in humans show a decrease in cell number and synapses, and decreased expression of things important for neuroplasticity, which includes glutamate receptors but, very importantly, also BDNF, or brain-derived neurotrophic factor, which you'll see is at the heart of the mechanism for ALTO-100; we'll talk about mechanisms more in a second. This is then mirrored by work in animal models, which you can look at in different ways. Of course, take all of that with a grain of salt; we know humans and animals differ in important ways. But there's a lot of convergence of those findings, whether it's in the form of memory itself, plasticity, structure, gene expression, or neurogenesis. All of this comes together as one coherent story. So we talked about memory as a way to assess hippocampal plasticity.
The way that we do it in our trials draws on decades of work on understanding how to measure memory with well-validated scales. For our test, which is called VM React, we adapted a well-known test that's been around for over eighty years, the Rey Auditory Verbal Learning Test. That test has been linked to hippocampal function over and over and over. Part of the reason we developed our version instead of using the traditional one is that the traditional test requires a rater, and to make this scalable, you've got to do it on a computer. As you all experienced when you did the test, this is a free-recall-based memory test.
In other words, you have to come up with the words yourself, which is a more sensitive test than the other cognitive batteries out there that rely on recognition: is this word old or new? That was all deliberate. That was work we did years ago in my lab at Stanford, recognizing the importance of this measure and the importance of measuring it right, in a self-administered way that patients can do on their own, and then validating it. You can see that validation in the upper left. This traces the learning, the number of words recalled correctly here out of 15, as you go through the learning trials. There are five learning trials, and then we give you an interference list, an unrelated list.
Of course, that knocks performance on that list down, because it's the first time you're seeing it, and then we come back and test your original list learning later. That design and the results from VM React very closely mirror those of the Rey Auditory Verbal Learning Test. So there's a lot of validity built into this, and there's reliability both in the original and in our version. There is a 0.8 correlation in performance on this memory test across two months, and in fact much longer. What that means is it's not a good-day or bad-day thing, or whether you had your coffee or whatnot; it's really a fundamental measure of that person that's stable over time. So that's a great place to start to characterize the biomarker. Now, what about the people?
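For reference, that 0.8 stability figure is a plain test-retest correlation: score the same people twice, two months apart, and correlate the two sessions. A minimal sketch (generic Pearson correlation, not the actual study code):

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between paired scores from two test sessions."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den
```

A value near 1 means people keep their rank order across sessions, i.e. the measure reflects a stable trait rather than day-to-day noise.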
One common question we get is, given the deficit in memory in depression, is this related to age? In norming our battery, as is standard for any cognitive measure, we take out the effect of age. Obviously, cognition changes with age; that's not what we want to be looking at. We also take out effects like education and IQ, all of those kinds of factors, so that you're truly measuring a process in people. And you can see here on the left, in a large study of a thousand patients with depression and three hundred controls, that the Z-score in patients, that is, performance normed against healthy controls, is pretty similar in terms of deficit (being below zero here is a deficit relative to healthy) whether you're younger than twenty or at the upper end of the study, in the sixties, and you see this in first-episode patients. You see it in kids just the same. Then, interestingly, and this is very important, existing treatments (these were two SSRIs and an SNRI in the study) do not change cognition; you can see pre- and post-treatment there's no change. Moreover, there's no relationship between cognition and symptom change. On the right, patients are grouped in red and green by whether they went on to symptomatically remit, so their cognitive complaints, subjectively, along with the rest of their symptoms, reduced, and yet their cognitive deficit did not change.
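Mechanically, an age-adjusted Z-score of this kind is usually computed by regressing the score on age (and similar covariates) in a healthy reference sample, then expressing each patient as standard deviations from their age-expected value. A minimal single-covariate sketch with invented numbers, assuming simple OLS norming (real norming typically includes more covariates, such as education):

```python
from statistics import mean, stdev

def fit_age_norm(ages, scores):
    """OLS fit of memory score on age in a healthy reference sample."""
    mx, my = mean(ages), mean(scores)
    slope = (sum((a - mx) * (s - my) for a, s in zip(ages, scores))
             / sum((a - mx) ** 2 for a in ages))
    intercept = my - slope * mx
    residuals = [s - (intercept + slope * a) for a, s in zip(ages, scores)]
    return intercept, slope, stdev(residuals)

def age_adjusted_z(age, score, norm):
    """Patient score in SD units relative to the age-matched healthy expectation."""
    intercept, slope, resid_sd = norm
    return (score - (intercept + slope * age)) / resid_sd
```

With this adjustment, a below-zero Z means the patient recalls fewer words than a healthy person of the same age would, which is why the patient deficit can look flat across age groups.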
So this is really a fundamental aspect of depression per se, not well targeted by existing treatments, that sticks around no matter what your symptoms are. The other thing about cognition is that it's a really consistent way to identify a more ill, more in-need population. You see in the upper left here that treatment-resistant depression is strongly predicted by cognition, by verbal memory here, much more so than symptoms would predict TRD. On the upper right, you see that hospitalization rates are again uniquely predicted by cognition: poor cognition, poor memory, leads to higher rates of hospitalization. On the lower left are results from the study I just mentioned, where we see lower treatment response across standard-of-care treatments if you have poor cognition.
And then we just published the paper on the lower right, showing that vortioxetine, which is the most pro-cognitive of the antidepressants we have (that's part of how it's marketed), still doesn't work clinically any better for patients with poor cognition and poor memory than for those without, even though it changes some aspects of cognition. So this is really, fundamentally, a population in need. And one of the most consistent things you see about these patients is not just that they're more likely to be TRD and less responsive, but that they're also more functionally impaired, in sample after sample after sample. Here are two examples. The first is a recent analysis we presented in a poster that's on our website, where this patient population had two forms of assessment of functional impairment.
On the left here is the UPSA, which is a performance-based test of function. In the right-hand column is the FAST, which is a clinician-rated measure of functional impairment. In both cases, poor cognition strongly predicted lower functioning. As you can see, things like the MADRS did not; they certainly didn't survive correction for multiple comparisons here. So quite a dissociation between cognition and symptoms. Very similarly, in a large, longitudinally followed sample in Texas, we saw that patients with poor cognition had lower functioning, not just at a snapshot, where you take one time point, but consistently over six months of follow-up. This is a quality-of-life measure, and you see the same on multiple other measures.
As you think of the relationship then between cognition and depression, you also see that risk for depression, in the form of genetic risk, is associated consistently with poor cognition. So it's part and parcel of the disease itself throughout all stages, leading us to our hypothesis that if you focus on these people and you enhance hippocampal neuroplasticity, you can get really unique benefits for these people. I'd like to pause for a second, invite Jerry up to the stage here, and just give his perspectives from a clinical view of what these patients are like, as well as what's available now to treat them. Jerry?
Sure. Would you like me to-
Yeah.
Or just talk, or?
Yeah, just talk a couple words.
Yeah.
Yeah, that'd be great.
So, yeah, the interesting thing is that when we see these patients, cognition is not usually the first thing they complain about. You actually have to get to that point a little bit. And it's really difficult to do this in many cases because some of the most standard tests we would do, something like a MoCA, may not even be sensitive to this; at the level we're seeing it, these general tests aren't. And if you're being seen in a primary care doc's office, where the majority of prescriptions for antidepressants, at least first-line, are written, you usually have four to eight minutes to talk to that clinician. You're not going to get this level of information at all, so it's really not being seen.
It isn't until we go further on that we start to really get into the specialist. So you have some level that this cognitive impairment may be present. And once you start talking to the patient about it, that's when it really comes out. Yeah, this has become a major issue, but it's not, for many of them, the presenting issue. This isn't what they usually come and tell you about first. "My cognition is not there." However, you talk to them for a little bit, and it really comes out that this is a major impairment.
Do you want to say a little bit about just how clinically you would think about these patients?
Yeah. So, patients in general-
The patients with poor cognition in particular.
Yeah. So clinically, you know, the way to think about these people is, how do you get them to improved cognition? And really, we don't have any specific targeted treatment; vortioxetine is probably the one that, right now, more than anything, you would use. A lot of the clinicians' thought is: how do I not give them something that's going to worsen their cognition? That's usually one of the big things. They're usually on a benzodiazepine, or maybe another medication that will in some way impair their cognition. But we don't really have anything. A lot of times people may go to treatments like stimulants, but there's really no good evidence that that's actually going to improve their cognition.
Good. Thank you. Thanks, Jerry.
Yeah.
So you can see a really high-need population, poorly addressed by existing treatments, and well understood, especially by psychiatrists and specialists, because these are the people you deal with. They really start piling up in your clinic because they're more chronic and less responsive. So how do we address that? For us, that ties into the mechanism of ALTO-100, as you'll see; it really all comes together well. ALTO-100 was identified based on a functional assay for neurogenesis; we'll get into some of that in a minute. It was found to enhance neuroplasticity at multiple different timescales. You can see one of the fastest timescales in the upper right; this is a long-term potentiation study in a brain slice. The mechanism works through BDNF, and we'll talk more about that momentarily. And this is really a potential first-in-class.
We'll talk more about that target in a second. You can see it's not just plasticity that's changed acutely; even hippocampal volume, on the bottom right, is increased, which is very much something that, as we discussed, is reduced in patients. So let's take a step back to the genesis of this compound and its preclinical validation. This is an area, by the way, that I've been interested in for a very, very long time. I did my MD-PhD uptown at Columbia with Eric Kandel, working on memory, and the floor above us was René Hen's lab, which was interested in neurogenesis.
In 2003, Luca Santarelli published the Science paper showing that when you ablate neurogenesis, X-ray irradiating the hippocampus to knock out just the dividing cells, you eliminate the antidepressant effect of a variety of standard antidepressants, Prozac and the like. That really motivated a lot of people to start thinking about neurogenesis, and it motivated Neuralstem, the originator of this compound, to identify ALTO-100 through a screen at that time because of that finding. They took hippocampal stem cells, looked for drugs that enhanced division and differentiation of the stem cells, and did a number of other screening steps to come out with a candidate that was then tested in a number of different models.
I won't go through all of them here, but you can see in the bottom left that every single animal model that was tested showed evidence of improved cognition or neuroprotection, the latter in the form of, for example, diabetic neuropathy, where you're looking at peripheral nerves. So a very broad, very consistent preclinical profile. Moreover, what they saw is that this pro-plasticity phenotype occurred at multiple different timescales: on the left, usually on the span of weeks, and on the right, changes in plasticity within hours. So it's a really broad effect on plasticity, and part of that is because the drug's effects are on mechanisms that relate to many aspects of plasticity.
For example, we've seen in our own work, really trying to drill down on the direct molecular mechanism itself, that the drug leads to an increase in BDNF release by neurons as quickly as you can measure it, within minutes or less. And then if you block the activity of BDNF, as you see on the right (this is calcium release induced by the drug), you block the effect of the drug. The drug itself doesn't bind BDNF as its target; we know that. We have done a lot of work, which I'll outline in a second, to identify what that direct target is, and we believe it has one target.
The other way you can see, causally, the role of neurogenesis in the drug's effect in animals is in this work, done also by René Hen's lab. This is a test that they've developed over the last 20 years that depends on hippocampal neurogenesis: if you ablate those neurons with X-ray irradiation, you alter performance on the test. What the animals are doing is separating between two very similar spatial contexts, two rooms, in one of which they get shocked and in the other they don't. It takes a while for them to learn. On the upper left, in red, is that reinforced, shock-based context, and they learn eventually.
But if you X-ray irradiate their hippocampi, you eliminate that learning; that's on the bottom left. On the top right, you see that when you give them ALTO-100, there's a substantial enhancement of that learning, because ALTO-100 enhances neuroplasticity and neurogenesis. And if you X-ray irradiate those animals, you knock that enhancement out completely. So this really says that, with this drug, we are targeting the kinds of mechanisms we would want to target based on first principles from human work. And we have since done a lot of work to identify the direct molecular target of this drug, for example, by blocking a variety of downstream effects in neurons, attacking the target either pharmacologically or genetically using CRISPR.
We have expressed that target exogenously in other cell lines, which endows those cell lines with responsivity to ALTO-100, and we've demonstrated physical interactions with the target, including both physical binding experiments and computational docking experiments. All of that now leads to the ability to develop new compounds around this potential first-in-class target in depression. That itself is incredibly exciting: we could be opening the door to entirely new biology in depression, and we'll disclose the target in due course as that work is completed and matures. But let's go on now to the clinical data, and really what underpins the design, execution, and interpretation of our Phase 2b study. This is essentially the overview. So, you know, two weddings and a funeral, or however you want to group things.
We have two prospective datasets, two retrospective datasets, and two placebo datasets here. I'll go through them in brief and then dive into each. We started, on the left-hand side, by reanalyzing data from the originator's Phase 2 study, which was itself grossly underpowered: it was powered at a Cohen's d of 0.5, much higher than an all-comer study is ever expected to reach. But they had collected some aspects of cognition, a more limited set of tests, so we looked at a global measure of cognitive impairment that correlates with memory and found that even though the trial failed in the all-comer population, in those people with poor cognition there was not only a bigger, statistically significant effect size, but a dose-response relationship.
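To make "grossly underpowered" concrete: the standard normal-approximation formula for a two-arm comparison is n per arm ≈ 2 × (z_alpha/2 + z_beta)^2 / d^2. A sketch assuming alpha = 0.05 and 80% power (real trial powering involves more detail, such as dropout and the specific endpoint and test):

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm to detect effect size d."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = nd.inv_cdf(power)          # power requirement
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

n_per_arm(0.5)  # 63 per arm: what powering at d = 0.5 implies
n_per_arm(0.3)  # 175 per arm: what a realistic all-comer effect demands
```

So a study sized for d = 0.5 enrolls roughly a third of the patients needed to detect a more realistic all-comer effect near 0.3, which is one way to see why the original all-comer trial was unlikely to succeed.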
That motivated us to acquire the compound, but our data and our confidence come from our prospective data, to the right. It also allowed us to take the dose identified, 40 milligrams BID, which we sometimes call 80 milligrams, into our own studies. At the top right, you see our Phase 2a study, which was divided into a discovery dataset and then a prospective replication dataset, focused on memory; we'll talk about that in more detail, but it was a successful replication. Additionally, we did a smaller pilot of our decentralized clinical trial approach, and you can see that that's another prospective replication of poor memory predicting better outcome. And then we looked in two placebo datasets directly at those patients with poor memory.
They'd measured memory in exactly the same way that we did, just using the traditional tool, and showed that placebo response was not greater. So that's a lot of clinical data and a lot of prospective replication, and that's the basis on which our Phase 2b design and execution rested. Let me talk for a moment about what that means from the dataset and prospective-element perspective. Thinking about the typical way things are done in biotech, where a lot of analyses are post hoc, might motivate the next trial, and often don't end up working out, we needed to protect ourselves from that opportunity for false discovery.
You could play around with severity, look at five different levels, and convince yourself you're seeing something, but there's no better way to understand whether you really have something than to put it to the test in a totally independent, blinded population and see whether the result holds. That is our bottom line, not just for ALTO-100 but for all of our programs. So here we took our large Phase 2a population, divided it into a discovery subset, where we finalized the memory measure itself, and then prospectively tested the measure in an independent, locked, and blinded group of patients who'd gone through the same procedures. Nobody in these trials knows their biomarker status. We then tested whether those people with the poor-memory marker responded better than those with good memory.
When that was successful, of course, you then have to verify that placebo response is not also enriched, which we did. So that marker specifically enhances response to ALTO-100. That allows us to go on to the Phase IIb, where we select for patients with this memory phenotype but also bring along a small group of people who are negative for the memory marker, to be able to show enrichment. So at every step, we're checking and double-checking our work. These are the baseline data and the patient flow for the Phase IIa. What I'm gonna focus on is that boxed population there, the prospective test set of ninety-three people, and we'll really dig in on that. Importantly, the study itself allowed both monotherapy and adjunctive treatment.
That is, however you came in, we added ALTO-100. For patients not coming in on an antidepressant (often those patients had been treated before, knew the typical course and outcome, and were interested in something else), ALTO-100 was given on its own; that's the monotherapy group. For folks coming in on a stable, adequate dose of an antidepressant that didn't change throughout the trial, but who hadn't responded to that antidepressant, we added ALTO-100 on top, and that's the adjunctive group. These are results from that replication dataset. Here, everybody gets drug; it's a single-arm design. Remember, throughout this, everybody is blinded to the biomarker, both sites and patients.
The data themselves are blinded, so our analysts don't know the results for the prospective replication. And we're very pleased to see the pattern that we've now seen over and over in all these different trials and analyses, where patients with poor memory responded substantially better than those with good memory. Here you can see a Cohen's d of 0.6, which is roughly double the drug-placebo difference you typically see in depression. The numbers at the bottom are the effect sizes at week six, our primary endpoint timepoint for the Phase 2b, split by the monotherapy and adjunctive arms. Very similar enrichment: 0.66 in monotherapy, 0.56 in adjunctive. And there's a poster on our website from ACNP last year where you can look at the whole time course more directly.
You can also look at outcomes here and enrichment as a response rate. The reason for doing this on the same measure, that is, a 50% or greater reduction in MADRS score, is that it gives you a perspective on what the clinician and the patient would care about. Nobody does the MADRS in the clinic; they don't follow the individual score change. What they care about is: Is my patient better? Response here was roughly double in patients with the poor memory marker versus those with the good memory marker. And to anchor this, for a mixed population of monotherapy and adjunctive patients, the clinic is probably looking at something like 30%, at most 40%, response to standard-of-care treatment.
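The two statistics being compared here, the Cohen's d effect size and the MADRS response rate, can be sketched as follows. This is an illustrative computation, not Alto's actual analysis code, and every number in the example is hypothetical:

```python
# Illustrative sketch of the two outcome statistics discussed above.
# All inputs below are hypothetical, chosen only to land near the
# d ≈ 0.6 enrichment effect mentioned in the talk.

def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference using a pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / pooled_var**0.5

def is_responder(baseline_madrs, endpoint_madrs):
    """MADRS response = 50% or greater reduction from baseline."""
    return (baseline_madrs - endpoint_madrs) / baseline_madrs >= 0.5

# Hypothetical poor-memory vs. good-memory MADRS improvement scores.
d = cohens_d(mean_a=16.0, mean_b=11.0, sd_a=8.0, sd_b=8.5, n_a=45, n_b=48)
print(round(d, 2))           # standardized group difference, ≈ 0.6
print(is_responder(32, 15))  # a 53% reduction counts as response -> True
```

The response-rate view is coarser than the continuous Cohen's d, but it maps directly onto the clinical question of whether a given patient got better.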
So seeing 60% here in our poor memory group is very notably different from what they're otherwise experiencing. And that was even higher with monotherapy, which had the somewhat larger effect size. We then looked at placebo. So we've now prospectively replicated after two retrospective analyses: the original Neuralstem data, our own retrospective discovery dataset, and now the prospective test in our own Phase 2a. We then looked at patient-level data from patients who'd gotten placebo and had done the exact same Rey Auditory Verbal Learning Test, just in its traditional form. That allowed us to calculate the same marker in those patients as we'd done in our own. And you can see, in two different studies with over 200 patients, there's not a better response to placebo in the poor memory patients. In fact, it's maybe slightly lower.
On top of these two datasets, which were drawn from the vortioxetine Phase III program, we had six other datasets where some aspect of cognition was measured, unfortunately not memory, but you can calculate the global composite that correlates with memory. Across those, we again saw, very consistently, a slightly worse response to placebo in poor cognition patients. That's a lot of data on the alternative to ALTO-100, and especially the most relevant alternative in our Phase IIb, which is placebo. Beyond this, we took it another step further with yet another prospective analysis: a smaller pilot study of our decentralized clinical trial approach. Psychiatry, as a clinical practice, is now a heavily telehealth experience: virtual care, where you're online with your doctor and doing everything from home.
We wanted to parallel that in our trials as well. About 20% of our patients in the Phase IIb were entirely decentralized. In the pilot, we gave participants their ALTO-100 and used all of our trial procedures with them as decentralized participants. We also sent them a link to the battery, much like you got the link and did it on your own computers, and all of the procedures went well. Everybody completed the trial. That lets us look, in about the most real-world way possible within R&D, at what commercialization might look like, literally sending them the battery, and we saw the same pattern of results we'd seen before.
This is patient-level data, observed data at six weeks, where you can see that patients with the poor memory marker do substantially better than those without, and you can see that across all weeks of the study. Importantly, in our Phase 2b we standardized equipment, so patients at home perform these tests on the same Chromebook that is used at sites. But this pilot was literally in the wild, what commercialization would look like, and it was very encouraging. Having established how to target our drug and population, we then think about the effect size we see in our trials, what the effect size is for standard of care in clinical practice, and therefore where we would want to target our powering and effect size for the Phase 2b.
In blue, to the right, is the presumed placebo-adjusted effect size based on our prospective data in the Phase 2a, and that puts the effect size over a Cohen's d of 0.5. But the important anchor here is actually where standard of care is, because that's ultimately the real commercial competition: the big market that's already out there. In gray, you see the monotherapy treatments; these are meta-analyses of all the available studies out there. In dark blue are adjunctive treatments. They all come in around a Cohen's d of 0.3. So in our mind, hitting a Cohen's d of 0.4 is really the heart of clinical significance, where you would be clearly above standard of care.
That's where we powered our Phase 2b. But even as you cast your eye across this, beginning at 0.3, you're already very competitive with standard of care, especially for a well-tolerated drug, especially when you're looking at a broad population, monotherapy and adjunctive, and in particular when you're addressing the patients in most clinical need, those with poor memory. In fact, most of these studies didn't measure cognition, but based on the available data, these are probably overestimates of the effect size you'd see in poor cognition patients, because we know those patients respond worse, and these are all-comers meta-analyses. So we're targeting 0.4 for our Phase 2b, but 0.3 and above is really where a drug would be quite successful in the marketplace. I alluded to tolerability.
This is data from our Phase IIa. It's a really well-tolerated drug, and most importantly, the data are very consistent between our study and the prior Neuralstem study. The only signal versus placebo is a slight elevation in headaches, and it wasn't a significant driver of discontinuation. You can see discontinuation rates themselves were very low, driven by ones and twos; there's nothing systematic there. Headache is a generic signal for a CNS-active drug. And then in our Phase IIa, we not only replicated prospectively, but once we had that successful replication, we went back and stress-tested the data as much as we could to see: is there something we were missing?
Importantly, what potential confounders are there in the data and the design that could directly inform how we design the Phase IIb? For example, background medications: we already talked about monotherapy and adjunctive having similar effect sizes, and that led us to include both in the Phase IIb. Even within adjunctive, we asked, "Does an SSRI versus an SNRI work any differently as a background medication?" Again, we saw very similar effect sizes. Clinical factors: we know real-world patients are highly comorbid for things like anxiety and PTSD. Often, those patients are excluded from clinical trials with the idea that that leads to a, quote-unquote, "cleaner population." In reality, it leads to a less representative population, one more likely to fail in Phase III.
So we looked at our data and saw that half the people had GAD, and we saw similar enrichment in those with and without it, and similarly for PTSD. We also looked at treatment-resistant patients as a subset of the adjunctive patients and again saw similar enrichment. So the clinical background picture that people focus on so much is really not an issue once you're at the level of replicated biology. We also looked at self-reported cognition, including asking people about their memory, and this was quite striking, echoing some of the work I showed you earlier: across multiple studies, self-reported cognition did not predict outcome. You have to measure cognition directly with an objective, highly reliable tool, like our memory test.
Other cognitive factors we also considered. While focusing on memory and replicating it, could we have missed something else that was better? We looked back afterwards and found, yet again, that memory was the most consistently predictive signal of anything in cognition. Then, importantly, we looked at severity. That will come into play momentarily when we get to the Phase 2b baseline data, but we saw no relationship between baseline severity and outcome. Let me say that again: no relationship between baseline severity and outcome once you're looking at cognition. That's really important because in our field, in psychiatry, and especially in depression, there's a long history of people playing with severity thresholds for inclusion, but that has consistently not worked, and it's not something that we do.
We really listen to our data, focusing on the biology, and there's a citation here for a really great meta-analysis demonstrating that any effect of baseline on drug-placebo difference is due to infrequently endorsed symptoms, not the core depressive symptoms. With that, I'm going to invite Adam up. Let me just introduce Adam first, and then also introduce Jessica, who will speak after him. Adam Savitz is an MD, PhD, like myself. We, I think, like to collect MD, PhDs at Alto for a range of reasons. He's also a psychiatrist, and I think that's a really important facet of understanding Alto and our kind of mindset here. There's two psychiatrists leading the organization and driving the trials.
But most importantly, Adam was at Janssen for a decade prior to coming to Alto. He led the seltorexant program into Phase III, which uses a patient-selection enrichment approach. So if there's anybody with clinical experience in this space, it's Adam, and we're very happy to have recruited him in 2021 to join us. As you know, seltorexant had a positive Phase III recently. He came here to advance the enrichment approach using a scalable biomarker.
Jessica, who will speak after him, Jessica Powell, our Chief Development Officer, comes with an enormous amount of experience at CROs, working with a variety of notable neuroscience names, as well as leading trials at a biotech, most recently Alkahest. She has learned very directly in all of those experiences what it takes to create a successful trial: what you need to know about your patients, what you need to know about your sites, and what you need to ensure about your execution to make sure you're doing the right work to ask the right question. That goes back to designing the right trial, powered and positioned to really determine efficacy in the right population. And with that, I'll invite Adam to walk us through the Phase IIb. Thank you.
Thank you, Amit. I'm excited to share the design and some of the baseline data for our IIb study. Before we start, I want to remind everyone that the data remain blinded, the data are provisional, and we are not yet at database lock for this study. The design of the ALTO-100-2B study is based on the FDA's enrichment designs, so the primary endpoint population for the study is the biomarker-positive patients, the poor cognition patients. The middle part of the design slide shows a fairly traditional double-blind, placebo-controlled trial in MDD. We use a traditional endpoint, the MADRS, which is well accepted by the FDA. We randomize one-to-one between placebo and active, stratified first on biomarker status, poor versus good cognition, and then on monotherapy versus adjunctive.
Even though patients primarily have MDD, as in the IIa study we include people with anxiety disorders and at least mild PTSD. We collect the biomarkers during screening, which is how we determine the enrichment, but also at the end of the double-blind and open-label periods. This overall design would be similar to a Phase III study in an enriched population. We've thought carefully about the design of this study, based on my experience with enrichment and best practices around placebo. We include people with both poor cognition, the primary endpoint population, and good cognition, for two reasons. One is to show that we're actually seeing bigger outcomes in the people with poor cognition; the other is to reduce the expectation bias that people may have.
If you tell them, "You have this marker, and it may indicate you'll respond better," then you're gonna see better response, including bigger placebo response. So it's important to include both populations. For this reason, patients, sites, and the operations staff are blinded to whether someone has the enrichment marker, poor versus good cognition. In fact, the participants and the sites are blinded to what the enrichment marker even is, to prevent any selection bias in the patient population. Also, to be cognizant of the placebo response, we're randomizing equally between active and placebo, we have only two arms, and we use a third party, the MGH Clinical Trial Network, during screening to interview the patients and make sure that they have MDD and are appropriate patients for an MDD trial. As Amit mentioned, we're including both monotherapy and adjunctive.
Both populations had similar enrichment in the IIa study, and including both reflects more real-world practice, and we include an open-label extension. This is to reduce patient anxiety around receiving placebo, reduces some expectation bias, and also helps to recruit more real patients into the study.
One thing to really reinforce here is what I said about the placebo data and understanding what these patients are like. Most companies just hope that the historical ranges for placebo are what they'll see in their trials. We study placebo as a treatment. And then, going back to the slide before, we really sweat the details of execution and the kinds of things that reduce placebo rates. We've seen in recent trials of rapid-acting drugs that, on placebo alone, with no difference in the intervention, and no matter which trial you're looking at, if you simply ask patients more frequently about their symptoms, they will oblige by telling you they're feeling better, and you can see at three days the magnitude of response that in a traditional trial you see at six weeks.
So execution matters, design matters, and bringing it all together is absolutely critical. The issue of blinding is a really important one. At no point do even our clinical operations staff know patient biomarker status, and the message to patients through the sites is simply that they're in a trial of a drug. That's why we include an open-label extension afterwards, so everybody gets the option and the opportunity to see the benefit of the drug. They're not there because they have a biomarker; the last thing we want in somebody's mind is, "I got into the trial because I have the biomarker, therefore I'm likely to respond." Everybody's blind to the ratio. Today will, in fact, be the first time that you see what the ratio in the trial is of people with poor memory versus without.
Sites and patients have been blinded to that information all throughout. And all of this is really by design, trying to run the most rigorous study possible and one most directly informed by our data themselves. And in fact, the next slide will speak directly to that.
This shows data from the 2a study: the distribution of people by cognitive deficit and their response on the MADRS. The people on the far left have more cognitive deficits, and they also show a greater response on the MADRS; the graph shows this relationship is continuous and roughly linear. In the IIa study that Amit described, we used a cutoff for poor cognition of half a standard deviation below normal based on age- and sex-matched norms. That's a Z-score of minus 0.5, the left-hand line indicating poor cognition. We selected that cutoff for a couple of reasons. One, it reflects a meaningful decrease in memory. Two, it captures a reasonably sized population, about 30% to 50% across different populations. And it's also a cutoff that is understandable to clinicians.
To show a difference in the 2B study between poor cognition and good cognition, we selected the good cognition group as people with average memory or better, a Z-score of zero or above, and we actually screen-failed the people who were in between minus 0.5 and zero, because that band is within the standard error of the measurement. We undersampled the good cognition group to ensure we had enough of the poor memory people, because that is where the primary endpoint is, and we powered that group at approximately 200 people to detect an effect size of 0.4.
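The two design numbers just described, the Z-score bands and the roughly 200 patients powered for an effect size of 0.4, can be sketched as follows. This is our own illustrative normal-approximation code, not the study's actual statistical analysis plan, and the function names are hypothetical:

```python
# Sketch of the cutoff scheme and a standard two-arm sample-size
# calculation; thresholds follow the talk, the code itself is illustrative.
from statistics import NormalDist
import math

def classify_memory(z_score):
    """Z <= -0.5 is poor, Z >= 0 is good, in between is screen-failed."""
    if z_score <= -0.5:
        return "poor"        # biomarker-positive, primary analysis group
    if z_score >= 0:
        return "good"        # biomarker-negative comparison group
    return "screen_fail"     # within the measurement's standard error; excluded

def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for a two-arm comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

print(classify_memory(-0.8))  # "poor"
print(2 * n_per_arm(0.4))     # total N, roughly 200 for d = 0.4 at 80% power
```

Under these standard assumptions (two-sided alpha of 0.05, 80% power), detecting d = 0.4 works out to just under 100 patients per arm, which is consistent with the approximately 200-person figure quoted for the primary analysis population.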
So to speak to what Adam was just saying about the undersampling: the expectation from the enrichment guidance the FDA published in 2019 is that your primary outcome population, here the poor memory patients, is what's powered. That's what the label is. The contrast to people without the biomarker is qualitative. In other words, you don't need to power an interaction analysis; we don't need to show that the poor memory group is statistically significantly better than the good memory folks. There just needs to be a qualitative difference that supports the argument around enrichment, which is further supported by the mechanistic relationship between the drug and the biomarker itself. And that's really, as you'll see when we get to the analysis plan, what ties it all together.
So in terms of the protocol, we've had several interactions with the FDA. An early interaction around the protocol led us to increase the size of the study from 200 to 266, to ensure powering in a step-down approach for the monotherapy group. We had a recent Type C meeting with the FDA, with overall agreement on the enrichment design. Through that discussion, the FDA gained a better understanding of the connection between the enrichment marker, verbal memory, the mechanism of action around neuroplasticity, and the neuroplasticity deficits of MDD, and we discussed the importance of the Phase 2b data for the Phase 3 program. That program will be further aligned at the end-of-Phase-2 meeting.
It's clear from the experience with other enrichment programs, such as seltorexant, and from the discussions at the Type C meeting, that in Phase 3 we will need to include MDD patients across the range of cognitive severity, even though the primary endpoint will be in the same poor cognition group as in the Phase 2b study. This is in order to support a risk-benefit assessment for patients with different levels of cognition.
Just one small thing to add: the 266 has as its subset the poor memory population, and we'll get to numbers in a second with the baseline data. That target population size is what we hit with the 301 total people, which means the data we'll show you include a few more good memory folks than we'd anticipated, which is helpful for showing enrichment. The trial hit the specified population size after that update from the FDA.
Here is the flow of what we enrolled in the 2B study. We had our last patient enrolled a few weeks ago. Overall, the population was 301. The primary analysis set is actually the modified intent-to-treat, the mITT, which is 289: all patients who have a MADRS score of 20 or higher at day 1, which is visit 3. This is to ensure a certain level of severity in these patients. Sites were blinded to the threshold for the mITT, and site-based MADRS scores were not part of the inclusion or exclusion criteria, because that removes the incentive for sites to inflate baseline scores, which would then come down after initial dosing.
So from the 289, we have 197 in our poor cognition group, the primary analysis set, which then splits into monotherapy and adjunctive, and our adjunctive population is about the two-thirds we were targeting.
The monotherapy was two-thirds of what we were targeting. In terms of determining disease and severity during screening (and you'll see data for these values): at visit one, we determine diagnosis, and severity is determined by a patient self-report, the PHQ-9. After that visit, before people move on, they have a remote visit with the MGH Clinical Trial Network, where they do a SAFER interview as well as the MADRS; the MADRS score for inclusion there is 22. At visit two, we determine the biomarkers; that's when we determine whether someone has poor cognition, good cognition, or is in the screen-fail range. We also do the PHQ-9 self-report again to make sure they still maintain the minimum level of severity. Then at visit three, prior to dosing, they have a site-based MADRS and another PHQ-9 score.
During screening, we obtain a range of biomarkers. This is to further our platform, both for the active ALTO-100 and to increase our knowledge of placebo for future trials. So we're obtaining EEG, cognition, and a wearable for heart rate, activity, and sleep, and we have very high QC rates here. For the memory enrichment marker, the QC pass rate is 99%, while the other measures are well above industry averages.
So something to comment on there, and we'll hit it when we talk about execution as well. These are 34 sites across the United States, and just on that note, we'd expect our programs to continue to be US-based. That dramatically tightens processes. You don't have to worry about languages and MADRS in different languages and memory tests in different languages, and so forth. Obviously, when we get to commercialization, we'll consider that, and we'll consider a global program. But execution on this trial and execution on key Phase 3 trials would be US-based studies. But those sites don't have a whole lot of experience with EEG. Most of them have none at all. They don't really do a lot of cognition, often don't do a lot of wearables.
It's our clinical operations and engineering structure that had to be there to teach them, to monitor performance, and to make sure that our tools, our software tools, were positioned to allow them to get the data that we needed. 93% QC pass on EEG across 34 sites in the US, with many, many, many people here screened and then enrolled, that's really notable, and that's also true in our ALTO-300 study, which uses an EEG biomarker to select patients. There's a lot of nuance in the execution that is underneath these numbers, but if you compare them to industry norms, as Adam was saying, all of these are well above industry norms and critical for being able to execute on a commercialization program.
So in terms of the baseline data for the two arms, it's important that we have similar populations to be able to show enrichment, that we have similar demographics, similar severity between the poor cognition and the good cognition. And we're very excited about how well matched the populations are, both age and gender, race and ethnicity, as well as education. And also, the severity is very well matched for both the initial SAFER interview, the baseline for the MADRS, but also the PHQs at the different times that they were collected. And what you can see is that the day one MADRS rating is actually slightly lower than the MGH central rating. So this is showing that there is no baseline inflation that can lead to a higher placebo response, which is crucial and shows the execution we're getting from the sites.
So we often get questions in investor interactions: Is this a function of age? Age here is very similar. Is this a function of severity? The SAFER MADRS is literally the same to the decimal point. So not only, as Adam was saying, does that allow us to say that if there is differential enrichment, it's not due to baseline differences; it underlines the point that the biology is independent of clinical presentation, and yet predicts outcome with certain interventions in different ways. And on the point about no evidence of score inflation: the SAFER MADRS and the treatment baseline at visit three are very similar and very similarly matched, as is the PHQ-9 below. That's important because baseline inflation is a big determinant of placebo response.
So if you come in and the site's thinking, "Ah, I wanna get this person into this study," maybe that three is really rated a four on this measure, and they can just scoot it up so the patient makes it in. That's not a real score; it's going to regress back to the patient's actual mean once they're in the study, and that's part of your placebo response. The design and execution of the trial itself is part and parcel of the placebo response, not just the expectation around what the patient believes will happen in the trial. And there's no incentive in our design for sites to inflate scores, because they don't determine the MADRS for inclusion; that's the MGH third party, the SAFER interview, and you can see that reflected in the data.
Many of you have asked, what am I gonna learn during Investor Day? Hopefully, you've already started to learn a lot, but this is one of the things that we highlighted for people before.
Right, and this again reinforces that cognition is separate from the symptom severity here, as the MADRS scores are very similar between the two groups.
Oh, sorry, one last thing is, as just sort of a side note, if you look at the demographics of these participants, it's a pretty diverse population as well, which is also important for Phase Three.
Then taking the poor cognition group and comparing the monotherapy and adjunctive groups: again, these are well matched, particularly on severity. There is no difference in severity between the monotherapy and adjunctive patients, on the MADRS or on the PHQ at the different assessments, and gender and age are well matched. We are seeing a slight difference in racial composition, likely due to availability of insurance. Then looking at cognition, this reinforces the importance of patient selection, of not doing all-comers. In the 2A study, the percentage of people with poor cognition was 48%, while in the 2B study it was lower, 34%. Both are within the 30% to 50% expected range.
But if you were doing ALTO-100 with an all-comer population, where ALTO-100 works better in the poor cognition people, you would have less of a chance of seeing a signal in the people who were in the 2B study compared to the 2A study. But with the patient selection, with the primary endpoint, we then could compare poor cognition people in the 2A study with the poor cognition people in the 2B study using the same test, using the same Z-score of minus 0.5. And you can see that in both groups, they have substantial global cognitive deficits of minus 1.4.
That's what you would see in patients with schizophrenia. And the groups were matched from the 2a trial to the 2b trial, in the primary analysis set, for MADRS severity, the PHQ, global cognition, and the verbal memory marker.
Yeah. So this, I think, is actually one of the coolest slides because this is the stuff that's under the surface of all the trials you guys have looked at in the past. All these things are rarely ever measured, right? When you measure them, you start to realize you're sampling populations sometimes randomly differently. Sometimes there's a systematic difference between the people who come in for an early trial and a later trial, and we certainly know Phase 3 trials are different than phase 2. That's why we've designed our Phase 2b to be most like the execution and population and size of a Phase 3 trial, so there's less discontinuity potential there. But it's literally directly demonstrating that once you select patients, you're now talking about the same population over and over, and they're very impaired on cognition.
So we think of schizophrenia as a cognitive disorder. People don't think of depression as a whole as a cognitive disorder, but in these patients, it is hugely a cognitive disorder.
Those percentages are all coming through within the screening population. Once you find the right people, you then power for that targeted population. So the enrolled population is to the right and the screened population to the left.
So I'm now gonna pass off to our Chief Development Officer, Jessica Powell, to talk about the operational aspects.
Thank you. Okay, we're gonna shift gears a little bit and talk about how we conduct our clinical trials at Alto. We're unique in that we use a model that relies on an in-house approach, and we often get the question: Why don't we use CROs? The answer is simple: because we collect higher quality data by doing it in-house, and there are multiple reasons for that. One is that we have substantially reduced costs overall, which allows us to do more while spending less. And it's not just about volume; it's about spending on the right aspects of the study so that we're improving quality overall. We also have a highly trained team.
All of our CRAs worked at CROs before they came to Alto, and once at Alto, they receive ongoing additional training that's specific to Alto and specific to our trials in psychiatry. We have a really high retention rate with our CRAs and our entire team. It's actually the same team executing our Phase 2b trial that executed on our phase 2a, and you would never see that in a CRO. In fact, CROs have a notoriously high turnover rate of their CRAs, 25%-35% over the past few years. We at Alto have had a zero voluntary turnover rate over the past three years. This ensures that our team is highly trained, and they're really subject matter experts in psychiatry. The last one that I'll talk about today is represented here on the slide. Sorry, I did not advance it. There we go.
The last one I'll talk about today is our actual approach to monitoring. We do everything a CRO does in terms of traditional monitoring. CROs are really about reviewing individual case report forms, an example of which is shown on the upper right here. At Alto, we do things in our own unique way. We're a very data-driven company; I'll get to that in a little bit. But we really focus on direct data entry with our sites and real-time data review. What this means is our CRAs are picking up data entry errors in real time. We do things like eligibility review that prevent inadvertent enrollment, and this is really a shift from reactive monitoring, which is the industry norm, to a more proactive approach, so that we're picking up site issues earlier in the process.
We also use a mix of on-site and remote risk-based monitoring. This means assessing what type of risk a site may have and spending more time with them if they need it, and we use independent external raters to confirm MDD as well as monitor rater performance. Amit and Adam have both spoken to that a few times already, but we're using third parties to confirm the MDD diagnosis before patients are enrolled into the study, and we're using third parties to independently monitor rater performance over the study conduct. This ensures we have another layer of independent oversight in addition to what we're doing internally, and we go deeper. We do cumulative reviews, and there are two plots here. The middle one is actually looking at our own monitors' performance.
We are monitoring how our CRAs are actually looking at the site data, and we have many metrics to look at that. This is just one, so that if an individual CRA falls off, we can pick that up pretty quickly, within a week or two, and go in and intervene and support that. The bottom one here is a plot that shows the reasons for screen failure across sites. There are many reasons for screen failure. We're just checking trends here, and we're able to compare those to the study averages, to the expected averages, and if you see something abnormal in a site, you can go in and intervene. These are two examples.
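The kind of cross-site trend check described here, comparing each site's screen-failure rate against the study-wide average and flagging outliers for intervention, can be sketched in a few lines. This is a rough illustration only; the site names, counts, and flagging threshold are invented for the example and are not actual trial data or Alto's actual tooling.

```python
# Rough sketch of a cross-site screen-failure trend check.
# All numbers below are illustrative, not real trial data.
screen_fails = {
    "site_01": (12, 40),   # (screen failures, total screened)
    "site_02": (10, 35),
    "site_03": (25, 38),
}

# Study-wide screen-failure rate, pooled across all sites.
study_rate = (sum(f for f, _ in screen_fails.values())
              / sum(t for _, t in screen_fails.values()))

# Flag any site whose rate deviates from the study average by more than
# an (illustrative) 15-percentage-point margin.
flagged = [site for site, (f, t) in screen_fails.items()
           if abs(f / t - study_rate) > 0.15]
print(flagged)  # → ['site_03']
```

The same pattern extends naturally to other per-site metrics, such as protocol deviations or adverse event rates, by swapping in a different count per site.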
We're doing this across protocol deviations, concomitant medications, and adverse events, such that we get a really comprehensive view of how a site is performing and can intervene very quickly if we pick up nuances. Again, we're a data-driven company, really looking at metrics across sites in order to intervene. All of this is very scalable for Phase 3. I think that's really important, because we do this at the Phase 2b level, but scaling up to Phase 3 is going to be very simple for us. At Alto, we have over a hundred currently activated sites across our clinical trials, selected from three to four hundred sites so that we're working with the very best. On the Phase 2b study, we have 34 sites that have enrolled patients.
25 of these are professional CNS sites, 5 are academic VA sites, and 4 are decentralized sites. This gives us a site blend that includes professional, experienced sites that know how to execute psychiatry studies. It expands our access to thought leaders and to patient populations that we might not otherwise reach, and it expands our geographic reach such that we are including patients across the United States. All of these metrics set us up very well to pick up patient or site trends that we can intervene on and ensure they do not become an issue for the study. We know that we are enrolling real patients at sites that produce high-quality data, so at the end of the study, we are confident that we have a well-executed, well-balanced study.
So as you can tell, one line in Jessica's job description is, "Sweat the small stuff." And that's what she does every single day. That's what we do; it is really core to our approach. Just anecdotally, when we talk to pharma to update them on what we're doing, and they hear all this, they ask us: "Hey, can you guys run our trials?" And the answer, of course, is absolutely not. But that really tells you that the people who do these kinds of trials day in, day out appreciate these important facets. And as Jessica said, it's all very scalable. Across our two Phase 2bs with ALTO-100 and ALTO-300, we're already running what is essentially a Phase 3 program. That's already happening.
There's a lot of time and effort that went into building this, but it's now an engine that just can keep giving. Great. Thank you, Jessica.
Yeah.
Let me go through how we will analyze and report out the data from the Phase 2b. We talked about the primary outcome being in the poor-memory patients, 197 people, pooling monotherapy and adjunctive. That's powered to a Cohen's d of 0.4; we were targeting that sample size, and we hit it. We were also targeting about two-thirds of the patients being monotherapy, and that's exactly what we hit, because the goal was to power a little bit better than we had before, a Cohen's d of 0.5, for that population. We've also built in a step-down test so that if the top line is significant, you go to the step-down test without alpha spend and look at the monotherapy subgroup.
And that's really there because, while our goal is to go after a broader population, monotherapy is the best-represented clinical subpopulation, and knowing how past trials have separated monotherapy and adjunctive, it gives us that much more built-in information about that population; that was based on feedback from the FDA very early in the study. But it's important in looking at these to also understand, as I was outlining before, where the unmet clinical need is. This is a deeply, deeply underserved population that can be readily identified and doesn't benefit from existing treatments. And so as we think about the effect sizes here, remember I said anything above a 0.3 we think will be very commercially competitive. Our decisions to advance to Phase 3 are based on that. Our design and our powering are based on that.
So anything above a 0.3, in our mind, merits advancing, obviously looking at the totality of the data that we'll have in front of us, and we will report out in as clear a manner as possible. We're focusing here on the primary outcome, but there will be a lot of other outcomes we'll be looking at in this study. All will be reported out as quickly as we can at different medical conferences, but we're trying to be as clear as possible about what leads to the drug being advanced to Phase 3.
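As a rough sanity check on the powering figures mentioned (a Cohen's d of 0.4 with about 197 patients pooled across two arms), a textbook normal-approximation power calculation lands near the conventional 80% mark. This is an illustrative sketch only, not the study's actual statistical analysis plan:

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_arm, alpha=0.05):
    """Normal-approximation power for a two-arm comparison at Cohen's d.

    Illustrative sketch only; the trial's actual statistical analysis
    plan is not public at this level of detail.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value (~1.96)
    ncp = d * sqrt(n_per_arm / 2)       # noncentrality for equal arms
    return z.cdf(ncp - z_crit)

# ~197 patients split across two arms, targeting Cohen's d = 0.4
print(round(approx_power(0.4, 197 // 2), 2))  # → 0.8
```

The same function also shows why the monotherapy subgroup, at roughly two-thirds the sample, needs the larger targeted effect size of 0.5 to retain comparable power.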
One of the things that we'll also do is look at a single level of severity of cognition that is more severe than the inclusion threshold, that is, a Z-score of minus one and below, to try to understand whether, while you restrict the population somewhat by doing that, you can further enrich. That is certainly suggested by the continuous and linear relationship we showed in our Phase 2a between degree of memory impairment and degree of benefit. One of the things we will not do is play around with clinical severity. As we've talked about, that doesn't differ as a function of cognition. We're really focusing on the critical thing we're manipulating in this study, which is level of cognitive impairment. We will, as is also noted on ClinicalTrials.gov, be looking at the full-population MITT, so independent of the biomarker.
That is interpretable as the all-comer effect, only in the absence of enrichment. Needless to say, if the drug has clinical value and we're just wrong about the enrichment marker, and it hits on that outcome, that's also a reason for advancing it to Phase 3. That's powered at a Cohen's d of 0.33. The good-memory outcome, which is the non-enriched population following the enrichment guidelines, will be compared qualitatively and will give you a sense for what the enrichment magnitude is in the top-line data. Then any other measures, and there will be many, will be looked at in the fullness of time. One of the interesting things, and we get asked about this all the time, is: Has cognition changed? Has memory changed?
There was some evidence of that in the prior trials, and we'll get to that. Importantly, that's not going to be part of the efficacy package. The efficacy package is focused on the MADRS, which is a very clear path with the FDA. They've put out guidelines that say the MADRS is one of the few accepted primary outcomes in depression, and we listen closely to that. Going into the readout from the trial, what makes us confident that we've done the work and positioned this trial as best we can? A lot of things go into that, as you've seen. Understanding the target population: we started working on this population in my lab at Stanford well over a decade ago.
We've been studying them over and over, demonstrating the clinical need and the relationship to the mechanism of action of the drug around plasticity in the hippocampus, and identifying patients, at the individual patient level, with mild or greater impairment in memory, meaning evidence of deficient hippocampal plasticity, using a highly reliable measure. The prospective replications: we have the two prospective replications that I showed you, along with multiple retrospective analyses, all of which have been very consistent with each other across measures. Multiple external datasets looking at placebo: think here about how oncology or rare disease often work, where you have external control arms. We've implemented that here and looked at placebo to show similar or slightly worse response to placebo in poor-memory patients. The drug has been very well tolerated.
Over 400 people were dosed prior to the Phase 2b, so there are no surprising adverse events as we go into the Phase 2b. And then really sweating the details on both execution and designing the study to try to mitigate placebo in the first place. So I would ask a question, which we ask ourselves all the time: Can you think of any stone we have not turned over to see what's there? Is there anything that we've missed? Is there any thought that we didn't have that we should have had? We don't think so. We'd certainly love it if anybody has suggestions, but this really says that, short of doing the trial itself, there's not any more to do. And lucky for us, October is around the corner, and we're well on track to report results next month.
So you all took the Spectra battery, this sort of focused, commercial-like battery. So let's look at some of your results and see what we learned. This is a snapshot of our dashboard, all the things that we're tracking as data come in. You can see representation here across the world. Interestingly, if you look at the bottom right, you see that the time taken to do this battery has been pretty tight, right in the fifteen-to-twenty-minute range for the vast, vast majority of the population, so very doable, and hopefully you agree that it's scalable. Attach it to a direct-to-consumer ad, or have a patient take it in the office waiting room; it's something that could be done at a very large scale.
This is your data. What you see in the histograms are the Investor Day registrants' data, and the line plot is the normal curve to compare against. So it obviously fits the population norm. But you can see across these different measures, and memory is the same biomarker we used in our trial, that's you as a whole. But I think it merits taking this a step further and really asking whether, as a company, the banking syndicate that we worked with is the right banking syndicate, not in terms of ability to raise funds, but cognitively. So fortunately for us, the analysts involved in our IPO syndicate have agreed to unmask their data, so let's see how they did. On the memory test, they all came in entirely average. They all think that they're above average.
Let's reinforce that. That's fine. But this is great; this is exactly where we'd expect. And unfortunate news for you guys: all of you are biomarker negative, so ALTO-100, should you become depressed at the next biotech bear market, will not be the drug for you. However, we have two more antidepressants reading out in the first half of next year. Different biomarkers, different populations, so we'll get you before the next bear market. The other plot here, though, and this is less to do with the trial and more just the kind of insights you get about people, is the reward test. It's called the delay discounting test. It's basically the marshmallow test: it looks at how much you value a reward now versus a reward later.
On the low end here is, "I want a reward now." On the high end is, "I'm comfortable waiting for a reward later," and, left with no commentary, this is the performance of our analysts across the board. So as you read their notes, you can take it into consideration. But all fun aside, this hopefully gave everybody a very felt sense of what the battery is like and the kind of insights you get about yourself, and all of our measures, for any biomarker, are highly test-retest reliable. That's really why we do them and how we validated them. With that, I'd like to invite Mike Hanley to come and speak.
Mike has held multiple operational and commercial roles in a variety of biotechs, but most saliently, he was involved in launching vortioxetine when he was at Lundbeck, which is, as you know, a blockbuster antidepressant, but really differentiated in only minor ways from other SSRIs. There are a lot of learnings he took from that experience that he's applying here in his work as our COO, while also leading our commercial efforts at Alto. Mike?
All right. Thanks, Amit, and on behalf of the Alto team, I'm really excited to share our thinking around the commercial opportunity for ALTO-100, and more importantly, the potential impact that a product like ALTO-100 can have on the patients and families affected by depression. We view ALTO-100 as a very attractive commercial opportunity for a variety of reasons, from the size of the patient population, to the scalability of our patient identification approach, to the strong alignment between the severe unmet medical need, the biomarker approach that we're taking, and the product's mechanistic and clinical profile that we believe we can achieve. So let's talk first about the market size.
As Amit and Adam already referenced, we anticipate somewhere between 30% and 50% of patients with major depressive disorder also exhibit poor cognition. So of the 21 million patients that suffer from major depressive disorder in the US, this translates into a target patient population of potentially north of 10 million patients. To put that in perspective, the estimated prevalence of schizophrenia, another highly attractive commercial opportunity, is less than 3 million patients. So thinking about the broader depression population, there's a significant unmet need, and that's even more clear in the patients who also exhibit poor cognition. We recently went out and solicited input from prescribers across a range of specialties to learn more about how they view the unmet need in this patient population.
It's safe to say they're generally unsatisfied with what they have at their disposal. Interestingly, they suggested almost two-thirds of their patients with depression exhibit some type of impaired cognition, and they also indicated that almost one-half of their patients are currently not able to achieve their treatment goals. The unmet needs that they referenced, unaided, align very well with the approach that we're taking as a whole, our platform at Alto, but also with the potential clinical profile of ALTO-100. You know, thinking about the heterogeneity of the disease, physicians recognize the importance of targeted mechanisms, and the role that an objective biomarker-based approach can play in segmenting their patients and being able to select the optimal treatment.
These prescribers also identified some of the limitations of current treatment options in terms of balancing the treatment effect and the safety and tolerability profile of the therapy. One prescriber interviewed summed it up best by saying: "We need more effective treatment options. That may hinge on the idea that we're often not personalizing treatment and instead kind of prescribing hit-or-miss treatments." So our Spectra cognitive assessment, we feel very confident, represents a really effective way of addressing that hit-or-miss approach. As a short and straightforward online assessment, this test is 100% scalable and can be made available ubiquitously. And it fits within the current patient journey, very consistent with how patients are currently evaluated and managed. And we see patients being able to access this test through different channels.
One, and we'll discuss this more shortly, is that prescribers appear very open to integrating this test into how they currently evaluate and manage their patients. But two, we also view this as a critical component of patient education and direct-to-consumer efforts designed to help patients raise their hand and provide their physician with information that can drive the optimal treatment decision. That obviously relies heavily on broad acceptance of precision medicine among the targeted prescriber group, and it's safe to say that the clinicians we interviewed are very ripe for this type of approach. Prescribers are very open to a biomarker-based approach to managing major depressive disorder.
I mean, essentially, they're already trying to do this now, personalizing treatment, but they're not armed with the tools to do so. They're typically trying to match a patient and their subjective symptoms with a particular drug side effect profile and doing the best they can. But without an objective test and targeted therapies, this really just leads to more of the same, and patients typically cycle from one antidepressant to another, right? Taking several weeks to titrate up to a maximally tolerated dose, reporting back their subjective symptoms, and then starting that all over again, right? It's essentially doing the same thing over and over, hoping for a different result. The slide here references suboptimal outcomes, and I think that's putting it mildly.
It's frustrating, it's challenging, and it's heartbreaking for the patients who suffer. So in our interviews, the engagement we had with physicians, we went further than just asking broadly about precision medicine. Like many of you here, they had an opportunity to take our Spectra cognitive test, and not surprisingly, they found it to be a really compelling approach. They indicated they'd be very comfortable recommending it to their patients. We used a Likert scale, one through seven, and they rated this a 6.3; in fact, the average share of their MDD patients that they estimated they would refer for this cognitive test approached 80%.
So honestly, this validates our internal thinking around this approach, its scalability, and its relevance to clinical practice, and prescribers had a similarly positive perspective about ALTO-100 after seeing the potential clinical profile of the drug. In fact, this group of prescribers viewed the potential efficacy and safety of ALTO-100 a lot more favorably than they view current treatment options and standard of care. We asked them how they would compare this profile in terms of efficacy, meaning the onset of action, response rate, and treatment effect in both monotherapy and adjunctive use, and tolerability, relative to current standard of care. Again, this was a one-to-seven scale, with four being equivalent to standard of care.
The average rating for both efficacy and safety and tolerability was far superior to how they view their current treatment options. I think some of the quotes on the right side capture it best when you think about the biomarker test and the profile of the drug: "I would test most patients that are referred to me. I can't think of a reason not to." "I would probably do this on most or all of my patients as part of my initial workup, the same way I check for a thyroid and maybe do some additional tests." "If somebody's already taking a medication and not improving, this is the logical next step after treatment as usual." Then finally, "If a patient tests positive, why wouldn't I put them on ALTO-100?
The only reason I could anticipate is access." Right, so prescribers conveyed a really strong clinical conviction toward this approach, with the primary barrier that we heard multiple times being access through insurance. We talked to insurers as well; we got a nice range of payers across the spectrum, and that access piece is not really surprising considering that about 95% of antidepressants are generic. Payers have those treatment options available and manage this category pretty tightly, so we spoke with them and asked them to share their approach to managing the category and their view of precision medicine and the ALTO-100 profile.
While they're coming at it from a very different perspective than the clinicians who are in the throes of treating these patients, they found it similarly compelling. It wasn't hard for them to recognize the potential for this to change how depression can be managed, and they highlighted that this clinical profile would likely translate into a very strong economic value proposition as well. Again, some of the quotes speak for themselves: "There's a good clinical story to tell here." "A predictive tool like this could really have a significant positive impact." "I'm in favor of any test that can narrow the choices and have better success at treatment." And finally, "By determining a patient's likelihood to respond, you can avoid unneeded therapy. You can save money while still giving the patient the best chance at a response."
So I don't want to underestimate this. This isn't going to be easy, but it's really rare to have a product profile where the potential clinical value and the economic value are so obvious and compelling to the key stakeholders around the table. So to wrap up, we see a really strong commercial opportunity and a fairly straightforward path to commercializing ALTO-100. There's a highly prevalent patient population, combined with a strong and urgent desire to do better.
There's a mindset among physicians that lends itself to the rapid adoption of an objective biomarker approach like this, combined with a highly predictive, reliable, ubiquitous biomarker test that can identify treatment responders, and then, obviously, the potential for a highly differentiated clinical profile that's well aligned with the most significant unmet needs in the patient population. Taken together, we view this as positioning both Alto overall and this lead program to potentially reshape how depression can be managed and to positively affect the millions of patients who are unable to achieve a meaningful response with the treatments available today. At this point, I'll hand it back to Amit to talk about our pipeline and introduce Dr. Sanacora.
Perfect. Thank you. So as I was outlining earlier, this is one of five different Phase 2 programs going on. This is just to bring the pipeline back up and focus on the next, let's call it nine months, through the middle of next year, where we'll have three different antidepressants reading out. That's two Phase 2bs and one Phase 2 proof-of-concept for ALTO-203, the latter using a near-term, immediate clinical outcome after single dosing to look for effects in depression. So that's a lot of information coming.
That's a lot for us to be excited about, and if you take that depression lens and extend it, in terms of what it means as these drugs go through, it's a really unique opportunity to own, in different ways, different subpopulations of a very large sector of medicine and psychiatry, and we'll have news on that very soon. But first, let me invite Jerry Sanacora up here. Jerry is a professor at Yale. He's been involved in a lot of different clinical trials, with a huge focus on mechanism: relating the mechanism of the illness and of the medication, and then conducting the right kind of trials to get there. He's had exposure through years and years of different agents. Oh, and there's actually his bio here.
So he's gonna give a couple slides to outline his comments, and then we'll have a Q&A, a kind of fireside, between the two of us, and then we'll open it up to the crowd as a whole for Q&A.
Thank you.
Jerry.
Thanks, Amit. Thanks very much. I think I know several of you already, so it's nice to see you here. I really just want to give you a quick idea of where I see ALTO-100 and the rest of the pipeline fitting in. This is one of my favorite slides. This is cardiology, and I love starting with this. It's looking at the age-adjusted death rate due to heart disease, mainly myocardial infarction, over a period of years starting in the 1950s up until about 2005. But really, the main thing here... Is there a pointer?
No, I don't think there's a laser.
The main thing is that inflection point you see right here, coming in the late 1960s, 1970, where all of a sudden there was a major impact in reducing death due to cardiac disease. And that was the point where the science was starting to understand the actual pathophysiology. We started to understand hypercholesterolemia. We started to understand the effects of hypertension, metabolic disorders, and smoking. And you could target those with appropriate pharmacology or appropriate interventions, and you dramatically changed the course. Death rates had been gradually increasing over time, and then changed dramatically; you can see that dark, solid line is the actual age-adjusted death rate, and a tremendous impact. It was because we understood the basic pathophysiology. We developed treatments that targeted that basic pathophysiology, and we really saved lives.
If you look at where depression stands, we're nowhere near there right now. The number of people suffering from depression is growing. It has risen to the number one cause of disability on several measures from the World Health Organization. Suicide rates, while not one hundred percent aligned with depression, are closely aligned and seem to be increasing, at least in the US. So we're looking at things moving in the wrong direction, the exact opposite of what we saw in cardiology in the 1960s and 1970s. Over on the right is the STAR*D trial. I'm sure most of you are very familiar with STAR*D, which is now getting old, probably about 15 years, but it looked at the treatment algorithm, going from one treatment to the next.
The problem was that most of these were very similar treatments, and most of them targeted the monoaminergic system, so you were going from one drug to another that were very similar, and you can see the outcomes really weren't great. The first treatment gave about a 30% to 40% chance of remission, I should say. The second was not much different, but once you went beyond that, it dramatically dropped off. If you didn't have a response to your first two medicines, you dropped below a 15% chance of actually achieving remission on those later treatments. It really highlighted the limitations, and if you look all the way at the end, at the relapse rates, you see those get pretty high, especially once you go beyond that first or second level of treatment non-response.
So it really speaks to the unmet need. The area where I've had the most experience is this move from the general idea that there was some pathophysiology involving the monoaminergic system, that something was wrong with serotonin, the old chemical imbalance hypothesis that was out there from the 1960s and seventies, eighties, and into the nineties, to the idea that this is really more of a cortical or limbic pathophysiology. And really, this idea that Amit and Adam noted before, that plasticity, brain adaptation, the ability of the brain to adapt to changes, what we've been calling neuroplasticity, seems to be impaired in animal models of depression, but also in clinical cases of depression. And the idea is that if you could target this pathophysiologic mechanism, you could actually generate benefit.
And this is really where ketamine grew out of: this basic approach, this idea that you could rapidly target neuroplasticity. But if you look at what it's really like for patients going through the journey, we're not really targeting pathophysiology at all in our current state of care. For this slide, I actually just asked ChatGPT, "What's a patient journey like?" But you can see it really starts with symptom recognition. It's usually either the patient or a family member saying, "You know, I'm not feeling well," or now, more commonly, when you go to your primary care doc or any physician, using a screening tool like the PHQ-9 to actually diagnose. Then the next step is help-seeking. For the most part, everybody's first step is their primary care doc at this point.
Some people may go outside of the primary care doc and look for a mental health professional, usually not a psychiatrist, usually a psychologist or a licensed clinical social worker or somebody along those lines, for their first level of care. Then comes the diagnostic process, which I think is really important and where I want to spend time. We talk about the evaluation for depression; I can tell you, a primary care doc is not doing a very comprehensive evaluation. It's usually a pretty quick evaluation. "Tell me, what are your symptoms?" "I'm feeling sad. I'm not sleeping well." "All right, we're going to start an antidepressant," usually whichever one they feel most comfortable with. And the diagnosis itself remains a diagnosis of exclusion, and this is really important in the field. We don't have a positive test for depression.
We can identify the symptoms, and then the way it's actually diagnosed is by ruling out the other possibilities. There is no positive test, so the idea of having a test would be incredible at this point. Then there's treatment planning, and this is really key. For the most part, it's a collaborative discussion between the clinician and the patient, and the choices offered usually depend on what the clinician feels comfortable with. If you're originally seeing a psychologist, it's going to be whatever psychotherapeutic approach they use. If it's a primary care doc, it's whatever medicines they feel more comfortable with, and it really is based on the clinician's preference and the patient's preference, more than anything more specific than that.
So you wouldn't do that in the rest of medicine. You wouldn't choose a treatment just because, well, I'm used to using this drug or that drug; you'd try to target it better. If anything, as was mentioned, the side effect profile is usually what drives the choice. Could somebody tolerate the weight gain? Could somebody tolerate the sleep impairment or the excessive sleep? These are the things that tend to steer it. And then there's that initial treatment phase: medicine is given, and then it's usually just wait and see, and it becomes a trial-and-error game. That's really where the field is and has been for many years, where the follow-up may take two weeks, three weeks; in the real world, sometimes it's a month before the next visit.
And then, if they're getting minor benefit, they may say, "Well, give it another month." So now you're eight weeks out before there's really any assessment of where things are going. And then if things work, then there's some level of monitoring, but more commonly, it's an unsatisfactory response. As I said, usually above 50% have an unsatisfactory response. So even if they're having a response, and in our terms, a response means a 50% improvement, many people are still saying, "Yeah, but I'm not really getting that full response," and that's what we said. It's really a small percentage of people, maybe about 15% of people in these trials, that really get that big response, and that's what you want to do. You want to identify those people that really get those big responses. That's what clinicians are looking for.
But if you don't get that, what do you do? Commonly, they'll modify treatment, just increase the dose, decrease the dose, switch it from morning to night. If you're not getting any benefit, then you start to switch, and that becomes that game that patients hate, where you hear it so much. "Well, I went to the doctor, he said it wasn't working. Try another one. Try another one." That's really not what people are looking for. The other alternative, if there's some response, is to augment with another medication, and that's usually done, again, targeting the side effect profile more than any specific response. And then after that, usually now after three, maybe four trials of a medication, you get referred to high-level care, which are things like transcranial magnetic stimulation, ketamine, esketamine, ECT.
Now you're getting to these really pretty invasive and very costly, both in terms of dollars, but also in terms of people's time. And that's a typical course of treatment and the patient journey. Long-term management is another issue, and that's good for people if they're having a response. They get to be followed longer term. But not having a response, this could become a very repetitive system of just going and trying a new medicine over and over again, something that patients really despise, and I think clinicians really feel very unsatisfied with. So I'll just leave it like that. I think that's really covered the main ideas of the patient journey.
So, let me ask you just a couple questions here for your thoughts, that kind of color in some of the background that you gave, which is, first of all, how do you see the ALTO-100 kind of profile results and the cognitive marker fitting into clinical practice? What line of therapy might that be? And what kind of evidence-
Yeah
... would you want to see to get it there?
So I think the profile, what we have so far is great, and the basic mechanism. It was actually interesting to see your poll of clinicians ranking mechanism of action very highly, which I think is somewhat new in the field, but I think good clinicians really do want to understand what's happening, and they really do believe that they're having effects on the brain. So they're very interested in that mechanism of action, and obviously, the state of the mechanism of action right now, targeting neuroplasticity, I think is very favorable. I think among clinicians, that has become sort of the new pathophysiology or new biology that they can understand. You know, it's progressed from this chemical imbalance to a much deeper understanding; the average clinician now knows what neuroplasticity is.
Where I think twenty years ago, when we started in the field, that was kind of a word that really wasn't out there. Now, I think that's the thing people are targeting. They can understand it, they can explain it to their patients. The actual data looking at using a tool to get to a refinement or stratification of these patients, I think, is exactly what clinicians would like. The clinicians would love to be able to say, "You know, this test suggests this would work." Patients want to hear that. And once you can actually tell a patient, "This is something that's highly probable to have an effect specifically for you," you get so much benefit. One, from the direct, but also from the indirect, the contextual benefit of having that. Did I miss anything there?
Yeah, and what kind of evidence would you want to see as the programs evolve, and then what line of therapy do you see-
Yeah
... ALTO-100?
So definitely the most important evidence is gonna be the same, I'm sure, for all of you guys. We're gonna want to see efficacy data, or effectiveness data, even better, to really see that this is working, and working with a good safety profile. For clinicians, what we really think about is the number needed to treat and number needed to harm. That's really the main thing: what's that risk-benefit ratio? Where it fits in, mechanistically, I don't see any reason why it couldn't be a first-line treatment. In fact, it could be an ideal first-line treatment. I think the big issue, as was noted before, will really be access and affordability. I think if you have good pharmacoeconomic data, third-party payers would be very happy if they could get somebody...
More and more, I talk to these third-party payers, and they're really getting frustrated with the idea that so many of these people are just on the treadmill, and treating the small percentage of TRD patients is chewing up such a large part of their budget. And if they could actually refine and get them treated sooner and get them off this, you know, carousel of just new meds and, worst case, hospitalizations and intensive inpatient stays, they would really prefer that early on. But you're gonna need some pharmacoeconomic data to show that first line is going to be beneficial.
So we talked about some of the benefits of a precision approach and the clinician perspective. Outside of access reimbursement, what kind of headwinds do you anticipate in getting adoption for this approach?
So being in the field now, I started in very basic science. Most of my early work was in the lab and then progressed more to doing large clinical trials, and now implementation science studies. And the one thing that I've really become impressed with is how rapidly moving this healthcare system is, and things move through, and you need to have it not gum up the system. This has to be something that fits into the healthcare system relatively simply. That's where I see some of the headwinds. If this is gonna require clinician time beyond what's billable... I think some of the more recent treatments have run into that, where they didn't see it as a headwind, not realizing that providing the treatment was gonna cost a lot of clinical time. Clinical time now is budgeted to the minute.
So if you're not gonna be taking much of the clinician's time, that's a huge benefit, and if there's gonna be some easy way to access that data and have it interpretable, that's a huge benefit. In some ways, I think some of the genetic testing has been overly simplified, and that's actually backfired. So these, you know, red light, green light, this is the right thing to use, wrong thing to use, which I think is, in a way, what clinicians want, but they want it to be real. They want to know that that is actually gonna improve their care.
Yeah, data-driven, really supported.
Yeah.
... as an approved and cleared tool. So to close out from the kind of big picture to the small, would love your thoughts on trial execution, how you look at the Phase IIb and what we've presented from just the experience of running these trials.
There are three main things that I actually really like about ALTO-100 and the protocol. One is the mechanism. I think it really is truly unique. It may be coming to a confluence with some of these other treatments that are carrying much more baggage, like ketamine and some of these others, ECT, where you're actually having big effects on plasticity, but this seems to have a much reduced risk profile with it. The more important thing is the way the trials are being run. I think you're paying attention to both ends. It's all about the delta, at least in getting past that initial approval stage. That delta is from the placebo response to the active drug response, and there are two ways of improving that.
One is to improve the active drug, and the other is really to limit the placebo response, and I think you guys have done a great job of really paying attention to both ends of that. 'Cause you can run up against a ceiling effect. You could have a great drug, but if the placebo response is topping fifty, as it has in some recent studies, it's really hard to beat that. And I've seen some of your competitors almost get burned by really good Phase II data, where they've had good Phase II data, gotten overconfident, and felt like, "Well, we're such a good drug, we don't really need to worry about that placebo response." You guys haven't done that, and you've focused on that.
So I'm really excited to see that you have a mechanism that seems to be giving a very large specific effect, and you are paying all the attention to minimizing either the contextual effects or the non-specific effects, which really could kill trials, even with great drugs.
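[Editor's note] Jerry's point that "it's all about the delta" can be made concrete. The Cohen's d effect size referenced throughout this Q&A is the drug-placebo difference in mean change, divided by the pooled standard deviation. A minimal sketch, using made-up MADRS numbers for illustration only, not any trial data:

```python
import math

def cohens_d(mean_active, mean_placebo, sd_active, sd_placebo, n_active, n_placebo):
    """Standardized mean difference between two independent groups."""
    pooled_var = ((n_active - 1) * sd_active**2 + (n_placebo - 1) * sd_placebo**2) \
                 / (n_active + n_placebo - 2)
    return (mean_active - mean_placebo) / math.sqrt(pooled_var)

# Hypothetical numbers: 12 points of MADRS improvement on active vs. 9 on
# placebo, with a pooled SD of 10, gives the d = 0.3 discussed later.
d = cohens_d(mean_active=12.0, mean_placebo=9.0,
             sd_active=10.0, sd_placebo=10.0,
             n_active=100, n_placebo=100)
print(round(d, 2))  # → 0.3
```

This is why the same 3-point raw delta can look larger or smaller in effect-size terms depending on the variability of the study population.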
Right. Thank you very much, Jerry. What I'm gonna do now is invite everybody else to come up and bring the chairs forward here, and open this up to Q&A. As they come up here, we'll just go kind of front to back. Ritu?
Yeah, yeah, we'll bring the mic around. Here it is. Just as an FYI, for those in the room, they will be setting up lunch while we're transitioning to Q&A here. Feel free to grab something, but we'll jump in.
Great. Thank you, Nick. Ritu Baral, TD Cowen. Two questions. One is, you mentioned that when discussing the trial protocol with the FDA, the FDA wanted sort of, they wanted to know what the biomarker negative group did-
Mm-hmm.
And that you're going to report the data as a sort of qualitative assessment. Were there any specific conversations about what FDA was hoping to see? I mean, ideally, the biomarker negative group will look just like placebo, but can you talk to whether that is indeed the best-case scenario, and what other outcomes may impact your interpretation of therapeutic effect from that biomarker negative group? And then I've got a follow-up.
Yeah. So I think the way to think about it is, first, what do the guidelines say, and then how does the FDA in our interactions understand our data that we presented? So the guidelines say it's basically qualitative, that there has to be some degree of enrichment as related to the risk-benefit profile. So in a drug that's well-tolerated, that degree of enrichment probably doesn't have to be that big. If the drug is very poorly tolerated, then you really want to make sure you have the right people, and that's in part how the whole lens gets looked at.
So some degree of difference... No response at all compared to placebo in the good memory folks would be great, 'cause that really illustrates the strength of the finding, and then somewhere in between, there's sort of an in-between response. But that's not a bar that we see as needed, and from our FDA interactions, that doesn't seem to be needed either. So it's really some degree of enrichment, where there's no preset expectation for how big that needs to be, but understanding risk-benefits.
So were we to see that the good memory group just doesn't tolerate the drug at all, that's one thing. We don't expect that. We've never seen it before. But in a well-tolerated drug, it's really, is your hypothesis hit, and does the biomarker stratification help support that difference? Adam, I don't know if you wanna say any more about that.
I think it's very much a look at the data, a qualitative comparison between the enriched and non-enriched. It's clear that they accept having the two distinct populations in the IIb. They're gonna wanna see the in-between population to determine that in the Phase III, but that's, they're okay waiting to see that part of it.
And if there's an in-between effect, but no difference on safety and tolerability, are there label ramifications for that?
Yeah. So, you know, we'll get to that when we actually have data to speak to those in between people. But assuming that it is a continuous response, like what we've seen before, there's any number of label ramifications there, including the clinician's own judgment of, you know, "These guys have a little bit more of a benefit, even though the drug was targeting the poor memory patients. Maybe I'd take a, you know, a stab at those patients as well." One thing I have to say about the FDA interactions is that they understand, and made very clear, that we're doing things differently, and that they're really taking a collaborative approach, as are we, to figuring out the right way forward. So it's really a positive experience entirely in line with what our expectations were going in. Paul?
My follow-up was for Dr. Sanacora on your comment on first line. How amenable do you think, PCPs, just given that, patient journey that was elucidated, how amenable are PCPs to this idea of the biomarker and the screening tool that we all-
I think PCPs would love it if it's not gonna take any time from them, which it doesn't. You know, PCPs really will have about eight minutes to see them. But if they could tell them to go online and do this, or if they could have some other way of getting rapid feedback, I think a PCP would love that. It would be the equivalent of, you know, doing a simple ferritin test if somebody is anemic, to know, well, I should give you iron, or I should think of something else for your anemia. I think that's how PCPs think. It would be great if they could do it, as long as it wasn't gonna really suck any time out of them.
Mm-hmm. Paul?
Great, thanks. Paul Matteis from Stifel. You guys shared a lot of data, and thank you for that. I was curious, just from a powering perspective, what are you seeing on a blinded basis as it relates to variability in the study? Does that confirm your expectations? Do you think there's upside where you actually are powered for point three? And then on the FDA point again, is it as simple as just showing a difference in efficacy between the enriched and non-enriched, or how much does the FDA care about mechanistic rationale and other justifications, and where are you in that conversation?
Yeah. So on the first point about variability and variability with respect to what more-
Just like overall standard deviation as an input for powering.
Yeah. So let's really think about what powering here means, right? So the powering at 0.4 means that, given just trial-to-trial variability with an average effect of 0.4, we have an 80% chance of detecting that. Which really means that in an individual trial, you should be able to detect a level below that. And we think that there's a good alignment between the 0.3 that we think clinically is relevant for advancing the drug into Phase 3, and would make it a very competitive commercial product, and what we can detect significantly. But that's really... You know, that 0.3 is the guide here in terms of effect size that would merit taking it to Phase 3.
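[Editor's note] The relationship described here, powering at 0.4 while remaining able to detect effects below that, can be sketched with the standard normal approximation for a two-sample comparison of means. This is a back-of-the-envelope illustration, not the trial's actual statistical analysis plan, and the n it produces is only the textbook approximation:

```python
from statistics import NormalDist

def n_per_arm(d, alpha=0.05, power=0.8):
    """Approximate per-arm sample size to detect effect size d
    (two-sided alpha, normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return 2 * ((z_a + z_b) / d) ** 2

def min_detectable_d(n, alpha=0.05, power=0.8):
    """Smallest effect size detectable with 80% power at the given per-arm n."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return (z_a + z_b) * (2 / n) ** 0.5

def min_significant_d(n, alpha=0.05):
    """Smallest observed effect that would reach two-sided significance."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    return z_a * (2 / n) ** 0.5

n = n_per_arm(0.4)
print(round(n))                        # ≈ 98 subjects per arm at d = 0.4
print(round(min_significant_d(98), 2))  # ≈ 0.28: significance is reachable well below 0.4
```

The second number is the point being made: a study powered for 0.4 can still return a statistically significant result for an observed effect near 0.3.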
The issue with the FDA in terms of efficacy, is it really just showing some degree? You know, they, like everybody, are people, right? And everybody wants to understand the mechanism, and we provided a lot of that, and we will continue to provide more of that, and I do think that that is helpful. The enrichment guidelines themselves don't require that, and so it's really in interpreting the entire package, right? What do I understand about the people and the need and the mechanisms and the reliability and the clinical data and the evidence of efficacy and so forth that comes together? And, you know, we and they are working collaboratively here. It's clear that they don't have a preset answer. It's really working together towards that.
And just as a follow-up, have you had any conversations about the test itself and whether you'd actually be registering that formally as a diagnostic?
Yeah. So that's a great question, and that also came up in the Type C meeting. We have a process right now already ongoing in terms of understanding the mechanics of registering that on the device side. It's a software device. There are already a lot of predicates in that space, cognitive measures that are already 510(k)-cleared, that you can follow. And that'll be just part of the overall drug and efficacy package. So the drug and the device sides are talking to each other on this program at FDA, and we likewise are following along. We don't anticipate that that's an issue. It's really part of the process of getting the software in place for the Phase 3. Brian, then Laura.
Thank you. Brian Skorney from Baird. Just really happy to see that the Spectra cognitive battery did confirm what we all know. I'm a very stable genius.
So, Brian, I have to note that, you know, you did very well across the board in your test, so congratulations. We verified that indeed.
It's not all about you.
Yeah.
But I have two questions, both around sort of the point three number on the Cohen's d side. I guess just for the primary analysis, if you were to advance anything that's point three or above, why not power for point three? Why choose to power point four? If it hits like point three five, don't you get into the questions of like, is this statistically persuasive, that the effect size that you're seeing is a true effect and not random chance? And then on the all-comer side of things, I forget offhand what the Neuralstem Cohen's d was in that original study. I don't think it was that far off of point three.
But when you contextualize the results you have, if you wind up getting a really good result in the poor cognitive patients, is there a scenario where you move forward in all-comers? And I only ask because I know in some other areas, where biomarkers are used, sometimes they'll go after broad populations, even though it only works in a biomarker-specific population. Is there sort of a business calculation that you can do there to determine, do you get more money over a shorter period versus,
Yeah.
... maybe it goes into the reward question, right?
Yeah.
'Cause you have IP longer if you have a biomarker-specific patient population, but maybe a bigger launch-
Yeah
with an all-comer. Thanks.
So remember, as Adam was saying, we're gonna have to look at the all-comer population from a risk-benefit perspective. And so if you power that to be able to hit in the all-comers because you have the poor cognition segment driving it, you can hit in the all-comers, but it's still the poor cognition that's driving it. So those are label issues more so. You know, the efficacy in the prior Neuralstem study was around point two, right? So in our enriched population, we're only looking for just a little bit of a bump, actually, to make it really compelling in that population.
You could hit in the whole population, but by being driven by the subpopulation, because we're going in with an enrichment approach, that can get reflected in the label, even if you're approved for everybody. In other words, it could be directed at those people. And then it's a clinician's judgment, and maybe that's the best of all worlds in that kind of context. In terms of the powering, you know, the way we think about powering is not where your minimum statistical effect could be detected. That puts us around point three and above, right? It's how many people you'd have to get to be able to hit that target 80% of the time. If you're powering at point three, you may be overpowered: you may have statistical significance well into the low point twos in your targeted population, which is a statistical signal, but not a clinical one. And so we really set the bar for ourselves where the clinical signal is, and that's why it's powered at point four, and then able to detect at levels below that. But we certainly take your point, and, as I mentioned, we'll be looking at the entire data set guided by that point three, and it should mostly align with statistical significance cut points, but, you know, may not exactly perfectly.
Laura?
Thanks. Hi, Laura Chico, Wedbush. Could you talk a little bit further about the built-in step-down test, more about the rationale and the selection there on the powering and just kind of the commercial ramifications, the significance of the monotherapy, kind of the base case scenario there? Thank you.
Yeah. So as you know, people typically develop drugs specifically within monotherapy or adjunctive. There have been some programs that actually have included both. But that was really feedback from the FDA saying, "We don't yet know if the drug placebo difference is gonna be the same in monotherapy and adjunctive. So consider, as you think about how to advance the drug, that biggest subsample, and embed it and better power that subsample." Our focus is still obviously on the primary outcome population. But having that step down in there, which has no alpha spend because it's contingent on the primary outcome being hit, since you're basically looking at just a subset of the same test, is not a multiplicity issue. It's taking the same test and just walking it down to the smaller group.
That could lead us to think, depending on how we compare monotherapy and adjunctive, that we only advance one or we advance both. That's really where the data will guide us. Another way of saying that is the adjunctive side isn't powered. It's part of the whole overall, and so there we're looking at effect sizes rather than statistical significance. And, you know, you could always increase the power of a study to detect smaller subgroups and so forth. At some point, you know, you've overpowered your study, and you just need to get the result. Andrew?
Uh-
Or Miles.
I might just sneak in here.
Sure.
Miles Minter from William Blair. We get a lot of questions on the monotherapy and adjunctive populations being combined here. I know your defense is that the effect sizes were very similar, but if we just look at the delta, 'cause that's probably gonna be more important here, how do the deltas or the absolute magnitude of MADRS change compare between those populations, and is that a source of variation? I think if you go back and look at the Caplyta data, you look at the Vraylar data, they come in around 14 points. Auvelity's come in at 16 on a monotherapy. You know, on a mean basis, with the two-point difference there, when you put all those populations together, is that a source of variability that might make you concerned?
Then secondly, maybe on the opposite end of Brian's performance in the cognitive battery here, do you have safeguards for patients or participants that score really, really badly on cognition and might be, you know, indicative of they didn't even try on the test, and so is that part of an exclusion criteria as well? Thanks.
Yeah. So let me just take that last part first, 'cause that's really critical. We have a lot built in under the hood that looks at exactly those things, so people not trying, where people perform relative to population norms, and there are a lot of little catches there that we have. You know, that's just a ton of experience in doing this and having run thousands and thousands of people through this battery. So that we're not worried about. In fact, what you often will see, as sort of a sanity check, which we showed you in the slides as well, is that those poor memory patients are taking 12 other cognitive tests as part of the research battery, and their overall cognition, taking the whole picture into consideration, was at 1.4 standard deviations below healthy.
Performing poorly on the memory test predicts a very broad, poor cognitive profile. As to the source of variability, and I'd welcome Nick's and Mike's thoughts from a commercial perspective as well, comparing across trials on absolute change is tough, right? You cited Auvelity, but they had almost a 12-point placebo change, and we're aiming for less, you know. And certainly, that's why people also think about adjunctive: because it's such a high-need group, they typically have very little placebo response relative to a monotherapy. But I would love thoughts from maybe Mike, in terms of the magnitude of change and kind of thinking about that across these, and Nick?
Yeah, and I mean, I think there's also, so I would say from the commercial side, but also I'd ask Dr. Sanacora from the biological side, right? What's the actual difference between a monotherapy patient and an adjunctive patient biologically? And what we're trying to get at here is really that biological difference, Miles. But what we, you know, what we've seen over the, over kind of talking to clinicians and also with our history of launching different drugs in the space, is that what clinicians are really looking for is the patient actually getting better? So yes, they're looking at absolute magnitude of change, but in a monotherapy setting or an adjunctive setting, they're really looking at, does the patient come in, and are they actually better?
So yes, you launch with a label, you launch with a delta in your two lines. We didn't see differences in the absolute change in our actual Phase IIa trial. You can see the poster online, but what you actually see is that it's just a shifted curve, right? It's not a difference in the absolute change. So from our standpoint, they're similar patient populations, and those deltas commercially don't make that big of a difference if the patients are actually seeing response.
Yeah, and maybe before Jerry jumps in, just remember that the clinician and, you know, very much welcome Jerry's thoughts on this, has their idea of what the baseline response for that type of patient is. And in the more resistant adjunctive population, that's gonna be a lower baseline. So it's not just an absolute number outside of context on that patient.
Mm-hmm. Yeah. No, I agree. In terms of the biology, I don't think we have that down right now, but it's usually people that have some residual-type symptoms, so they aren't achieving that complete response, and that's typically who ends up on adjunctive therapy. And there are also the people that just don't want to take more medicine. They want to be on a monotherapy, and-
Yep.
You know, they both come into play in who actually ends up on adjunctive therapy.
Yeah. One thing we have heard, Miles, is we actually get more feedback on running the trial in monotherapy and adjunctive as a positive because it represents a more real-world population, right? When you go to launch a drug, the odds that you're gonna launch directly into a monotherapy population is not likely, right? Clinicians are, especially psychiatrists, as we have multiple of them up here, known to be tinkerers with drugs, adding different drugs to different patient populations. So getting that data in a monotherapy and an adjunctive setting is actually really important for these clinicians.
Yeah, I mean, to put a finer point on it, right? And I'm sure Jerry and Adam agree. If I, as a psychiatrist, get a patient, they're often already on a medication. I'm not going to take them off of the medication so I can start a new monotherapy. I'll add something on top, which means that for monotherapy-studied drugs that have no data in adjunctive, I'm just assuming that translates, but I have no knowledge. That's really why, in the real world, having this data is critical.
Andrew?
Andrew Tsai from Jefferies. Thanks for the questions. So, the first one is, was there an interim futility analysis that was built into the study? And did the DMC ever convene to look at the data on either efficacy or safety? And then secondly, the baseline MADRS, I think you shared was 31, 32. Are you happy with those ranges? Because we've seen scores north of forty, I think. So I'm just curious if you're happy with those scores.
Yeah. So to the first question, there is no interim analysis for any reason. We just generally don't find that that's a helpful thing in a tightly and quickly run study. And to the second, I would invite Adam to talk about how we think about severity. That's really exactly where we were shooting for severity to end up, so we really have the moderate to severe end represented. And as I mentioned before, there's really no evidence that increasing severity leads to better drug-placebo separation. That's why we focus on the biomarker. You want to ideally represent the wide breadth of patients who are out there so that your results generalize to the real world. Adam?
Yeah. To the first part, our blinded safety analyses show the drug continues to be safe and well-tolerated, with the expected adverse events. And in terms of the MADRS score, I actually think where we are, which is in the moderate to severe range, in the low thirties, is a reasonable number. It's one that we can match in future studies. It's not showing that baseline inflation you sometimes see in other studies. And as the data have shown, the belief that with more severe patients you're gonna see a better separation hasn't been borne out by actual data.
Yep. Yep, Sumant.
There you go. Thank you.
Sumant Kulkarni from Canaccord Genuity, and thanks for taking my questions. I have three. So first, is depression episodic, and is there a correlation between how people in a depressive episode perform in the memory test? Second, I know you're not looking at baseline severity, but do you think ALTO-100 could have a differential impact on patients who've had depression for a long time versus a shorter time? And third, is there any merit in using hippocampal volume as an outcome-based biomarker, and or is it cost prohibitive?
Yeah. So, one of the most interesting things in relation to your first question about cognition and depression is that they're really independent of each other. So cognitive impairment is seen before the person gets depressed, it's there during the episode, it predicts poor outcome, and even when they remit, it's often persistent. So, you know, you're really treating the pathophysiology of the disease per se by targeting this population with this kind of drug. On the issue of chronicity of depression, we have a wide range of chronicity represented. Our average time since first diagnosis, from memory, is around twenty years in this study. Again, I'd invite Adam to say a little bit more about the kind of historical milieu of those patients.
And then, you know, hippocampal volume, I think, down the road may be a very interesting thing to do. That requires a lot more infrastructure, MRIs standardized across dozens and dozens of sites. Jessica has actually done that kind of work in the past, so it's not entirely out of the realm of possibility. It would probably need a longer treatment period than six weeks; even then, the clinical treatment period in clinic is usually longer than six weeks. And, you know, maybe that's something that we or somebody else does down the line, but ultimately here, we're grounding on change in MADRS scores. So Adam, maybe you could talk a bit about the kind of historical background-
Yeah
- on our patients.
Going to the first question, also, people who maintain their cognitive deficit between episodes are more likely to relapse into another episode. So it's a factor for relapse. In terms of who these patients are: these are sort of the typical patients that Jerry or myself or Amit would see in clinic. They're in their mid-forties. They've had the illness for twenty years. They've had multiple episodes. Even the people who are on monotherapy, almost all of them, have been treated with an antidepressant in the past, a certain portion even in the current episode. So these are the patients that are out there needing help, that aren't getting the treatments and the response they want from the current treatments.
Great. Any last questions? Oh.
Thank you.
Let's just get the mic so that people online can hear you. Great.
Thanks. This is Athena Cohen, for Ritu Baral. We wanted to ask about the secondary efficacy analyses, what you are looking for, and I believe there is one analyzing the cognitive severity subpopulation. So if you could just provide more color on those. Thanks.
Yeah. So and maybe if we can get back to that SAP slide.
Yeah. Yeah, yeah.
So the key secondary is our monotherapy population, right? But there are these other elements called out. So one of them is simply looking at the same cutoff approach, but at a more severe level, using a Z-score of one standard deviation or more below healthy norms. Thank you.
Yep.
So you can actually see the slides here. And that's really to understand whether further enriching the population further magnifies response, which might guide what we do in Phase III. This is not powered per se; you can see it's powered only at an effect size of 0.55, but it is instructive. And that's on our, you know, primary way to differentiate patients; it's not on clinical severity. We wouldn't expect that to differ either. By clinical severity, you can see it represents a little over half of the MITT poor memory population. There will also be some secondary analyses, like, you know, secondary outcomes, response rates, CGI, and so forth, actually listed here on the left, that will be reported out. Those are really to give you the fuller picture of efficacy clinically.
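[Editor's note] The Z-score enrichment logic described here can be sketched simply: a patient's memory score is standardized against healthy-population norms, and the more severe secondary analysis takes patients at one standard deviation or more below those norms. The primary cutoff value and the norm parameters below are hypothetical placeholders; only the one-SD secondary cutoff is stated in the discussion:

```python
def memory_z(score, healthy_mean, healthy_sd):
    """Z-score of a memory test result against healthy-population norms."""
    return (score - healthy_mean) / healthy_sd

def stratify(z, primary_cutoff=-0.5):
    """Assign a patient to enrichment strata.
    The -0.5 primary cutoff is a made-up placeholder; the z <= -1.0
    severe cutoff is the secondary analysis level described above."""
    if z <= -1.0:
        return "poor memory (severe subgroup)"
    if z <= primary_cutoff:
        return "poor memory"
    return "good memory"

# Hypothetical norms (mean 50, SD 10) and a raw score of 38 -> z = -1.2.
z = memory_z(score=38.0, healthy_mean=50.0, healthy_sd=10.0)
print(stratify(z))  # → poor memory (severe subgroup)
```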
We saw CGI change in our Phase IIa in just the same manner as we saw MADRS change. MADRS is the approval endpoint. These would be supportive of efficacy.
Great. Well, thank you all very much. Look forward to subsequent discussions, and really look forward to October.
Thank you.