Todd Friedman, Director, Investor Relations.
All right. Good morning, everybody. Thank you for joining us at PacBio's first-ever Investor Day. Thrilled to be here in person in New York City with you all, and welcome to everybody joining in remotely as well. Hope you all had a chance to look at our latest sequencers out in the foyer, Revio and Onso. If you haven't, they'll be here all day, so during a break or lunch, feel free to get a demo. Real quick, the agenda. PacBio Day, we'll have our leadership talking through everything from our strategy, our innovation, our new products, the markets we serve, and some financial information as well. We'll have two Q&A sessions throughout the day, so plenty of time to ask questions.
At any point, feel free to use the QR code on your table to take a look at the agenda. Then real quick, flashing up our safe harbor forward-looking statements. We'll be making forward-looking statements. Please refer to our SEC filings for more information. With that out of the way, let's get the day started.
To pioneer the future through biology, you have to look beyond the status quo to what's possible. You have to see the whole picture. You need a partner who's devoted to empowering your vision, who can deliver trustworthy technologies and the highest accuracy data, so you can focus on finding answers. Who strives to miss nothing, so you can do anything. You need a co-conspirator, and that's us. See a whole new picture with a whole new PacBio.
Please welcome to the stage Christian Henry, President and Chief Executive Officer.
Thank you. Boy, this is a great day for us. I see so many old friends, new friends, and I just wanna say thank you for joining us at what I hope is a very informative and, quite frankly, fun day. This is a great moment in our history where we are at the precipice of shipping two major new products, and what we wanted to do was share some of that excitement with you and tell you a little bit more about who we are as a company and what we are trying to achieve. Before we begin, maybe I'll just give you two seconds on my background. I'm Christian Henry, the CEO.
I've been with the company since September of 2020, and before that, I've been a biotech executive for a long, long time in life science tools. I did retire and come back to PacBio. The reason why I retired and came back, and actually came back, much to the chagrin of my wife, by the way, is that I think there's so much more biology to uncover with DNA and DNA sequencing. I do believe that the tools that scientists have had over the last two decades, while they've been remarkable, are insufficient to power the future.
What I believe is that PacBio's technologies and capabilities put us in a position to really provide the insights and the biological resolution to drive the utility of the genome forward and create value for all of us, not just us, but also our children and their children. This is gonna be a fun day. I'm gonna run through a few key objectives. Hopefully, by the end of the day, you will feel that you've learned a little bit about really our strategy for addressing this multi-billion opportunity. You'll understand our new technologies and why we think they're poised to be so successful in the market. We also wanna showcase the team we've built.
You know, part of building a great company is creating a great team, and I'm very fortunate to have built a fantastic team, which you'll see many of those members today. Please take the time during the breaks to mingle with them and get to know them, because I do think they are the ones that are really powering this company. Also, we wanna demonstrate that we do have the business plan, the strategies and the capabilities to create a highly profitable company and create shareholder value. I wanna take just a quick second to introduce a couple of our board members that are here as well. Over here is Kathy Ordoñez and Marshall L. Mohr. If you get a chance, we're very happy that they've been able to make the trip. Thank you for coming. Okay.
Now, there's five core themes that we're going to discuss, and one thing you'll find out about me is I like to talk about things in threes and fives. First of all, our strategy is intended to drive growth and profitability. We believe we also have differentiated products that allow us to take on all comers in the market, both emerging companies and the large incumbent in the market. We think our technologies will allow us to acquire more customers and drive our growth. We still believe the sequencing opportunity is effectively unbounded at this point. We're still in the early innings of leveraging genomics in biology, and we do believe that PacBio technologies and our strategy can play a major role in the future. We have the leadership team, as I've already talked about, to win.
Finally, we do believe that we can achieve positive cash flows within the forecast horizon that we're talking about today. I think that's one of the most consistent questions that I get from investors. You know, many of you, I'm sure, are sitting out there going, "Well, what does the model look like? What are the numbers? I wanna get to the answers." You know, I'm gonna start with the answer first, and then we will spend the day talking about how will we achieve these numbers. These are our objectives. First, we believe through 2026 that we can grow between 40% and 50% in a CAGR or greater than $500 million of revenue. Now, for those of you that do the math, 50% would be well over $500 million.
One of the things I always talk about with our teams is that the road to $1 billion of revenue is through $500 million. We decided to peg our 2026 performance to $500 million or more. You don't see the equal sign, so it really is greater than $500 million. We expect that we have the opportunity to significantly improve our gross margins from where they are today. By improving those gross margins, we will drive to a strongly cash flow positive business. We expect to become cash flow positive during 2026, and so that's our expectation. Of course, once we achieve cash flow positive, we don't expect to go back.
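The compounding behind the 40% to 50% growth range can be sketched in a few lines. This is an illustration only: the starting revenue of 100 (arbitrary units) and the four-year compounding window are assumptions made for the example, not figures given in the presentation, and the function names are hypothetical.

```python
def revenue_after(base: float, cagr: float, years: int) -> float:
    """Compound a starting revenue figure forward at a constant annual growth rate."""
    return base * (1.0 + cagr) ** years


def required_cagr(base: float, target: float, years: int) -> float:
    """Constant annual growth rate needed to move from base to target in `years` years."""
    return (target / base) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    BASE = 100.0  # hypothetical starting revenue in arbitrary units (not a company figure)
    YEARS = 4     # assumed number of compounding years

    for rate in (0.40, 0.50):
        multiple = revenue_after(BASE, rate, YEARS) / BASE
        print(f"{rate:.0%} CAGR for {YEARS} years -> {multiple:.2f}x the starting revenue")

    needed = required_cagr(BASE, 5 * BASE, YEARS)
    print(f"CAGR needed to reach 5x in {YEARS} years: {needed:.1%}")
```

Under these assumptions, the gap between the low and high end of the range compounds into a large difference in the final multiple, which is why the two ends of the range bracket the revenue target rather than both landing on it.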
Those are the key three metrics that I want you to be thinking about as we go through the day. Susan has a lot more detail, so I don't wanna steal her thunder, but I thought it would be important for you guys to calibrate what we think our opportunity looks like today. Let's step back for a second and think about where we are in the world of biology and how our technology can really change the state of the art, change the mindset of researchers and clinicians about the utility of the genome. What if an oncologist could provide a reference grade genome to every single patient? What if researchers could build a cell atlas that represents all isoforms, fundamentally changing the way gene expression is thought of?
What if you could sequence 1 billion or more 1 kb reads in one run? What could researchers do with that information? Or what if you could get five omes in one run? All of us probably realize that the world is becoming more multiomic all of the time. Our expectation is that with our capabilities, we can actually look at the genome, the epigenome, chromatin, transcriptome, and the metagenome all in one run. That's completely different than the paradigms that exist today. Our technology that is available today can enable all of this, and Revio will make it scaled and economic. At every presentation that I give, I start by talking about our mission.
The reason why I wanna spend a minute or two here is I wanna paint a picture for you of the framework of how we manage the company. When I joined the company, we needed to create something that would galvanize all of us to move in one direction. I mean all stakeholders, whether that's employees, board members, customers, investors. We needed to have a North Star from which all of our strategies could emanate, and our North Star is enabling the promise of genomics to better human health. We use that as the guiding principle from which we derive our strategies and which we make decisions. Having that, we also needed a strong set of values to define how the company works together.
This gives you some insight into how we think as a company, how we expect to work together to create groundbreaking new products, to drive innovation, and to do it in a way that's timely, customer-friendly, and ultimately creating value for all of our shareholders. Our five values, be curious, focus on innovation, delight our customers, take action, take risks, execution matters, and most importantly, working together. These are hallmarks of how we interact at PacBio to create value. What I wanna talk about is the notion that in order to move fast as a company, you have to have a framework with a strong mission, strong values, and a strong strategy, which I'll walk through.
If you have those three, with the employee in the center, we can focus on pushing decision-making down into the organization, which allows us to move faster, which allows us to inspire our employees, which then allows us to create great products. Without that framework, that employee may have a difficult time making a tough decision. That's why I start each conversation with this concept of what is our North Star? What do we believe in as a company? Then what are our strategies? Because it allows our employees to make decisions more quickly at the point where the expertise really is, so we can create great products and shareholder value. Who are we today? You know, we were founded in 2000.
Most of you know us at some level. We've been around for a very long time. We are publicly traded, of course, on Nasdaq. We have nearly 400+ patents at this point, and we have more. We have basically doubled the size of our organization since I joined the company, to 770 employees. Little more than that, actually. We have 8 global offices around the world. When you're thinking about evaluating sequencing companies, you know, you do have to assess whether the company has scale to meet the customers where they are. One of the things I'm really proud of is that we've created that scale just in the last 2 years.
Just to look at some of the timelines in the history of the company and how we got to today: the company started in 2000, as I said, and it took a decade to get the first products out. Then over successive years, the company has made improvements to its fundamental SMRT sequencing technology, which is, you know, our single molecule real-time sequencing technology. For those of you that don't know exactly what we do, and this is how I talk to my ten-year-old about it, what we do is we take a semiconductor, and we put single molecules of DNA on that semiconductor, and we take movies of them so that we can see what's happening, see the sequencing in real time.
When you think about that kind of in an abstract sort of way, that's pretty cool. When you think about it in the scale of what we're truly doing today, it's actually mind-blowing. With Revio, we're taking 100 million movies, simultaneously processing them and getting the highly accurate sequence data, the bases, as well as the epigenetic data. That's pretty astounding when you think about what that really is. In 2020, I joined the company, as we talked about, and over the course of the last couple years, we've been working on technology, and we've also been working on building an ecosystem. A sequencer in and of itself is valuable, but it's not nearly as valuable as building an ecosystem around your products, because those ecosystems are moats that provide competitive advantage.
They're also enablers to enable new applications, new sources of revenue, and new ways of working inside of the community. We spent a lot of time in 2021 doing that. In 2022, we've been focusing on launching the ability to call 5mC on Sequel IIe, and that will obviously carry over into Revio. Of course, three weeks ago today, actually, we announced the launch of Revio and our new product, Onso, as well. We've been pretty busy over the last couple of years, setting the stage really for this day. What are the core strategic drivers that are gonna propel us forward?
I'll spend a couple minutes on those so that you can get a sense of what the core strategies are on how we are going to take these incredible products and drive revenue growth into the future. First, it's obvious to us and apparent that we see human health applications driving a significant percentage of our growth over the next decade. Today we see a sequencing market that's about $7 billion. This is maybe a little bit different than some of our peers and what they say the sequencing market is, but what we're trying to represent is the actual dollars that are being spent today. You know, the money flowing through the universe, so to speak, through sequencing applications.
The biggest components of that are human genomics, which we estimate to be almost $2.5 billion, and oncology at nearly $3 billion. Those are the two biggest areas. We think those are the two biggest growth drivers over the next decade as well. In this time horizon, we actually expect the entire market to double to $14 billion a year, with human genomics driving that growth as well as oncology. Now that you can see how large the oncology growth is, you can probably cut to the chase on why acquiring a short-read technology for that space was so important. We'll carry that theme through the rest of the day.
The bottom line is we participate in big markets that are growing quickly, and the products that we have serve the two biggest portions of that growing market. The second core strategy, or philosophy, or way that we think, is that multi-omics data will become more and more important over this time horizon. Now, our base time horizon here is 2026, but I actually think, you know, over the next decade, you're gonna see the integration of multi-omics data in many different areas.
Whether that's genomics or epigenetics, as we've been talking about, or even things like proteomics and metabolomics, more and more clinicians will leverage these different omics data to create better decisions with respect to diagnosis and treatment of disease. With this as a core philosophy or fundamental strategic driver, we're creating strategies to create, in fact, a multi-omic company. I've talked about this a lot. My vision for the company is to create a multi-technology company with multiple technology stacks, with integration over the top of them, as you could kind of think about in a decision support software or decision support capability. This includes long reads, short reads, other omics technologies that we will look at over time.
The objective is to have multiple technologies in the portfolio so that we can leverage the best characteristics of those technologies to serve our customers. Then within a technology stack, of course, we would like to have multiple products so that we can reach customers where they are, whether that's throughput or budget or both, and also that we could create pathways so customers can start at our technology, maybe at a starting point, and then move up as their throughput requirements dictate for them. We see this as an incredible strategic advantage over the next several years as we become more multi-omic for several reasons.
One, there's like, if you look at long and short reads, for example, there's incredible synergy between the long read portfolio and the short read portfolio, not just from a technology perspective, but from a cross-selling perspective. We're already seeing that with our customers asking for quotes that would carry both Onso and Revio. You know, they are realizing they can get best of breed technologies from one company for prices that are actually more affordable than many of the competitors out there. We see creating a multi-omics portfolio an important part of our strategy to penetrate and create value in these markets. Leveraging these technologies gives us the ability to access the entire market.
As I said a second ago, you know, we expect the sequencing market to double in size by 2026, to $14 billion. We think the long read market is going to grow the fastest in terms of CAGR. We are expecting it to grow north of 40% per year, with revenues well over $1 billion in the market. We think that short read and other sequencing technologies will continue to grow in a mid-teen sort of way, but off of a much larger base, of course, and so that's still a very significant revenue opportunity for the company if we can bring highly differentiated products to the market. Hopefully, we'll convince you of that today. We also believe that a global scaled business model is essential to success in this market.
The life sciences market is truly a global market. Researchers collaborate all around the world, and to be successful, you have to meet them where they are. You cannot participate in the Chinese market from the United States, for example. That's a pretty obvious statement, but the reality is, if you're an emerging company, you don't have the infrastructure to meet the demands of the global marketplace. For PacBio, given that we've been commercial since 2011, we have the global scale, the global manufacturing and supply chain capability, as well as the ability to innovate. The combination of all those gives us the ability to accelerate revenue growth and serve these customers and create competitive advantage. Finally, I'm gonna emphasize this again and again and again.
In order to win against competitors with larger market share, you have to bring new technologies and differentiated ideas to our customers. It's a combination of not only differentiated products, but I say differentiated ideas. What I mean by that is that you have to be a company that is willing to work with your customers. One of our goals is to become the easiest company to work with in life sciences. I do believe that competitive advantage is created by the team that you have and how you actually work with your customers every single day. Not only do you need the products that are highly differentiated and desired by these customers, but you also need to work with them in such a way that they feel inspired to be using your technology.
They feel inspired to come talk to you at conferences because you're collaborating with them. I think one of the things I hope that you'll see today is that our team is an inspiring team, and they wanna work with customers. Jonas, at lunchtime, will really dive in to the science behind why we are winning. Just his infectious enthusiasm for the science, you know, is compelling when we go and talk to all of our customers around the world. That's kind of our high level strategy. Now let's just talk for a couple minutes about what happens in 2023. First, one of our core objectives has to be driving adoption of Revio.
You know, one of the core strategies that I came to the company with is, how do we accelerate the Sequel IIe install base as quickly as possible, even at the expense of some gross margin, so that we can get those instruments in, so that people can be using long read technology and get used to the data and used to seeing the power of HiFi. From there, once we've launched new products, that would be an incredibly fertile ground to go after and drive quick, rapid adoption of the new products. Well, today we have nearly 500 Sequel IIes in our install base, and so those are 500 places where I do believe Revio is immediately applicable. We've already seen an overwhelming response from our existing customers, as well as an incredible response from our new customers.
Jeff Eidel, our CCO, he'll talk a little bit about that today. That's gotta be front and center of what we do. Accelerate, accelerate. We also have to continue focusing on Onso, our short read platform, and demonstrating how this accuracy matters. You know, Mark will talk a little bit about Onso a little bit later, and Jen will talk about the markets around Onso a little bit. We think that it is a product that is gonna transform the way people think about needle in a haystack applications. Our second priority here has to be driving that notion and getting that product to market. That's not all. We're not stopping with Revio and Onso. This is just the beginning.
In fact, we have development programs in flight right now inside the company, developing ultra-high throughput sequencers that will make Revio look like mid-throughput, quite frankly. We have lower throughput long read sequencers in development that will make long read sequencing accessible to everyone at high accuracy and with HiFi capability. Driving those R&D programs forward so that we can rapidly launch products and build out this portfolio is central to our strategy. Mark will talk a little bit about our product roadmap and where we're going with respect to that. We also know we need to get on that path to becoming a sustainable company.
We've been very fortunate to raise a significant amount of capital over the last couple of years, and we've been aggressive in deployment of that capital to get to this day, so that we could have products that would be fundamentally better than others in the market and make us much more competitive so that we could accelerate our growth. Now is the time to continue investing in the new products, but really start driving, and getting leverage out of those investments we've made. Thinking about leveraging our infrastructure so that we generate, we get to that positive cash flow is critical. Finally, expanding our partnerships, continuing to show why HiFi is just fundamentally better, why short-read whole genomes, quite frankly, are insufficient for the biology and the resolution you need to truly understand the genome.
Continuing with partnerships such as our partnership with Children's Mercy Kansas City to continually demonstrate the power of long read sequencing, and in that case, in rare and undiagnosed disease, that still has to be a front and center message. Because now we have the product, we have applications, and we have the marketplace that's inspired to use those products, but we need to keep pushing so that we accelerate into the future. Those are the core elements of next year, what you'll hear me talking about quite a bit. Last thing I wanna cover before I start moving on to everyone else is how does the strategy get executed in practice?
I find when I talk to employees and talk to other stakeholders, that it's not enough to just say, "Well, here are our strategies." You have to show people the context of how those strategies get unpacked over time and what are the implications. I use this chart, as I said, I like to do things in threes. The first part of my strategy from joining the company was scaling the company. The company did not have a commercial presence sufficient to drive this incredible technology. We started when I joined the company in 2020 with creating scale. We started demonstrating the capabilities of HiFi and driving new product development. We had to get to Revio in order to make the company successful. That was really the first part of my strategy, scaling.
The second part we're starting to enter into in 2023, and that's the acceleration phase. For those of you that have been part of companies that have gone from $100 million in revenue to $1 billion, I think most executives will say that is the most fun part of being in a company because you're seeing the adoption, you're delighting your customers and you're accelerating. We're hitting that right now. I think that's really this acceleration phase. The next three years, I call it, is really all about that, and I think it's gonna be a lot of fun. You're gonna see us continue to drive elasticity in this period, consumable growth. You're gonna see us drive deeper into the clinic than we ever have in clinic translational research.
You'll see us over time start to really demonstrate to you how PacBio's technologies become clinical grade and get in. Mike Goloubef in his presentation in operations will talk about how we're already setting the foundations to be a truly clinical company. Then finally, once you get through these first two phases, you know, for now I'm calling it durability or durable growth, where we have a consistent growth story, we have expanding profits and cash flows, and we've created a sustainable company that is really helping to change the world. That's how the strategies that I've outlined really manifest themselves over the next several years. Of course, I can't do this without an incredible team. We have a team with deep experience and, you know, you can see the level of understanding of this industry.
I think that's what separates us from many others, is that we have a deep understanding and appreciation of the industry, the customer relationships, the investor relationships, so that we can create a company that has significant competitive advantage, creates moats, and drives growth. I'm really proud to be working with all of these people. We also have a board of directors that has deep tenure inside of PacBio, so they can help understand not only the strategic path, but how that path relates to the future. Excuse me. I'm really excited to be working with this board, and as I said, we have a couple of our board members here this morning. Finally, we've created a scientific advisory board, Euan Ashley, Joseph Puglisi, and Jay Shendure.
We had our first SAB meeting in the last couple of weeks. What's amazing is that we wanted to put an SAB in place so that they could help us see over the horizon. I think what was really amazing to me was the level of expertise these folks have in all of the areas that we're interested in and how, particularly, you know, you look at Euan. Euan's focused on how do we get our technologies deeper into the clinic right now. Joe's focused on kind of thinking about the technology in new ways. Jay speaks for himself. He has so much deep expertise in the application area.
The combination has created some really spirited discussion, and you know, we're gonna leverage them as we drive our programs forward. Very excited about that. Now what I'd like to do is thank everyone, and we're gonna have David Miller join us here in a second to start talking about the products.
Please welcome David Miller, Vice President of Global Marketing to the stage.
Well, thank you everyone for coming today. It's a true pleasure to be here and speak to you about our new long-read portfolio. As mentioned, my name is David Miller. I lead our product marketing group. You know, I wanted to give you a little background as well as to how I ended up at PacBio. I joined the beginning of last year, and I'd spent about a decade running NGS labs, a similar sort of amount of time running high throughput marketing or product marketing at a short-read sequencing company. After all that time of working in labs and looking at the technologies and looking at the data, I joined because I fundamentally believe that PacBio is the future of NGS.
The technology that's coming out, the sequencing data that is being created is truly where the industry is going. It's amazing to be part of that. What I wanted to do in this session was really go a little deeper on some of the things that Christian touched on. Firstly, I will go into our new Revio system and give you some background around our SMRT technology and HiFi and why the industry, I think, is so excited and what's bringing so many of us together to work on this amazing platform. I'll also talk a little bit about how we think about the ecosystem, why we think that's differentiated, how it's helping us get this platform out and really scale up, and why it's gonna continue to drive growth in our space.
Then finally, I'll touch a little bit on why I think that Revio and HiFi especially are differentiated relative to some of the other long read technologies that are out there, and hopefully give you a sense of how we think about the success that is really gonna be enabled by Revio. With that, I wanted to go and start, like I said, coming back to sort of the fundamentals of what we do and how the technology works, because as Christian said, it's truly amazing to believe that we can watch 100 million single molecules of DNA replicate in real-time. That is really our single molecule real-time technology, our SMRT technology, if you will. It all happens on our semiconductor chips that contain our zero-mode waveguides. Within those, that's where all the sequencing is happening.
That's where we're able to attach a single polymerase and watch that single polymerase extend DNA in real-time and really capture that sequence information. When you hear us talking about SMRT cells and ZMWs, that is fundamentally where the sequencing is happening. A little bit of context, the on-market product today, the Sequel IIe, that uses about 8 million ZMWs on an individual run. We're watching 8 million molecules at a time. As we said, with Revio, we're scaling that up to 100 million. That's four chips with 25 million ZMWs each. I'll go into those details shortly and why that's so important and fundamentally enabling for our customers. Within those ZMWs, what is actually happening is that we are watching that polymerase extend and incorporate nucleotides that have a dye on them.
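The scale-up described here is simple arithmetic, sketched below using only the figures quoted in the talk (about 8 million ZMWs per Sequel IIe run, four Revio chips of 25 million ZMWs). The constant names are made up for the illustration.

```python
# Figures as quoted in the presentation; the names are illustrative only.
SEQUEL_IIE_ZMWS_PER_RUN = 8_000_000   # "about 8 million ZMWs on an individual run"
REVIO_CHIPS_PER_RUN = 4               # "four chips"
REVIO_ZMWS_PER_CHIP = 25_000_000      # "25 million ZMWs"

revio_zmws_per_run = REVIO_CHIPS_PER_RUN * REVIO_ZMWS_PER_CHIP
scale_up = revio_zmws_per_run / SEQUEL_IIE_ZMWS_PER_RUN

if __name__ == "__main__":
    print(f"Revio ZMWs per run: {revio_zmws_per_run:,}")
    print(f"Scale-up over Sequel IIe: {scale_up:.1f}x")
```

Note that this compares the number of molecules observed at once, not delivered data per run, which also depends on factors the talk does not quantify here.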
Each of those dyes we can detect and determine whether it's an A, a C, a G, or a T. That's sort of shown in that middle panel there. Because we're doing this in real time, because we're watching this extend, what we're able to do is actually capture the time it takes for that nucleotide to incorporate and how long between those incorporations. That's how we capture the epigenetic information, adding that extra arm to the data. When you think and hear about other people doing methylation, they're leveraging the same signal. We're using two different ways to capture both the sequence and the methylation information. We think that's really important and gives us an advantage when it comes to determining those two arms simultaneously. The final piece to the puzzle is really the HiFi technology.
That is fundamentally our understanding of leveraging the informatics and the fact that we are sequencing circular molecules. Because we're going round and round that circle multiple times, we can take multiple looks at that DNA molecule, that original single DNA molecule, and start to really understand what is going on there. That factor, coupled with a highly random error rate that we have, allows us to polish out all the errors, and that gets us to long and highly accurate reads, something that no one else can really do. That is what has captured the imagination of so much of our genomics community. What can you do with HiFi? Well, you know, the first thing to recognize is that it gives you a read length that's about 100x longer than your standard short-read sequencer.
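The idea of polishing out random errors by reading the same circular molecule several times can be illustrated with a toy majority vote. This is a deliberately simplified sketch and not PacBio's consensus algorithm: it assumes the passes are already aligned to equal length and contain only substitution errors, and the sequences below are invented for the example.

```python
from collections import Counter


def consensus(passes: list[str]) -> str:
    """Majority vote per position across equal-length reads of one molecule."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("this toy example assumes passes are already aligned")
    # zip(*passes) yields one column of bases per position in the molecule.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


if __name__ == "__main__":
    # Hypothetical passes over the same 8-base molecule; errors fall at different positions.
    passes = [
        "ACGTTGCA",
        "ACGTTGGA",  # substitution at position 6
        "ACCTTGCA",  # substitution at position 2
        "TCGTTGCA",  # substitution at position 0
        "ACGTTGCA",
    ]
    print(consensus(passes))
```

Because the errors in this example land at different positions in different passes, each one is outvoted by the other passes. An error that recurred at the same position in most passes would survive the vote, which is why the randomness of the errors matters to the argument.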
Where a short-read sequencer is generating about 150 base pairs of data, we're generating 15,000-20,000 base pairs of data. Couple that with the fact that, as I said, we're able to get to incredible accuracies. On the right, what you can see is a study from the precisionFDA that came out early last year, really showing that HiFi in green gives you the highest data quality. Now you've got a technology that gives you a 100-fold longer read length than the short-read technologies and 10x the accuracy of the other long-read technologies. It's this combination that has enabled so much. I think it's important to know that HiFi is a lot more than just long and accurate sequencing.
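The "100-fold" read-length comparison follows directly from the lengths quoted above. A minimal check, using only those quoted figures (the variable names are illustrative):

```python
SHORT_READ_BP = 150                    # typical short-read length quoted in the talk
HIFI_READ_BP_RANGE = (15_000, 20_000)  # HiFi read-length range quoted in the talk

# Ratio of HiFi read length to short-read length at each end of the range.
fold_longer = [hifi_bp / SHORT_READ_BP for hifi_bp in HIFI_READ_BP_RANGE]

if __name__ == "__main__":
    low, high = fold_longer
    print(f"HiFi reads are roughly {low:.0f}x to {high:.0f}x longer than short reads")
```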
This is where it starts to become really important when we look at the other attributes of the technology. We get incredible evenness of coverage. There's no GC bias in our sequencing data. It allows us to sequence all throughout the genome and really detect all the variation that exists. That includes the methylation, as I spoke about, so an incredibly important arm that we're starting to add, and one that everyone gets off a PacBio sequencing run for free. Yes, we have that extraordinary accuracy, so you can really understand what's going on, and those long reads really help you do things like allele resolution phasing. Now you can really start to get a full picture of the genomics and start to understand all the variants that are occurring in that sample.
What we're showing here in the IGV plot, for those that are familiar on the left, is really looking at three different technologies, short reads, HiFi, and noisy long reads. Each of these, you can see the trade-offs that are being made relative to HiFi. In the short-read space, you're seeing those dropouts, those regions of high GC that you can't sequence through. Importantly, in this diagram, those regions have variants, so you're missing variation there right off the bat. In the noisy long read down the bottom, you're seeing all the errors coming through that's preventing you from calling variants accurately and truly understanding what's going on or getting that phasing information that's so important. If you look at HiFi in the middle, you've got that incredibly clean, accurate data.
You've got evenness of coverage, you've got all the variants called, and you've phased the region, this gene, STRC, into the two haplotypes. This really, to me, summarizes why HiFi is so exciting and why so many people are interested in what we're doing with it. It's data like this that has driven, you know, so many great studies to come out in recent years. Certainly, Jonas will go into much greater detail about the science over lunch. I just thought if we think about genomes, transcriptomes, and epigenomes, there are three seminal papers, each demonstrating a first that came out in the last year or so. On the genome front, you know, HiFi was right there when we sequenced the first telomere-to-telomere assemblies. HiFi underpinned the ability to really get that complete view of the genome.
On the transcriptome front, our colleagues at Broad, and we'll talk a little bit about some of the advances we're seeing in the transcriptome space, were the first to publish a single-cell isoform catalog using HiFi. Then finally, with our new epigenetic technology, our colleagues at Children's Mercy were really able to use the technology to look at phased methylomes and truly understand what is going on in some of the rare and undiagnosed genetic disease cases they have. HiFi has captured everyone's imagination, and while we've had a decade of bringing products to market, as Christian said, and really scaling them up, you know, we've gone, you know, from the original RS, we've increased throughput 10,000-fold, we've increased read length over 100-fold.
The two things people kept asking for were, "How can I get higher throughput for HiFi?" and, "How can you improve the sample economics?" I'm excited to say that that's exactly what Revio does. Revio gives us a 15-fold increase over Sequel IIe and ushers in an era of the $1,000 HiFi genome. You're getting all those benefits of epigenetics and phasing for $1,000. Jen will do a really amazing job of walking through why that is so valuable. Our community is telling us this is something that they truly believe is gonna change genomics. I think, though, it's important to note that the Sequel IIe platform remains on market. It is still a fantastic product, and people will continue to use it for key applications.
We've already seen folks express interest in keeping their Sequel IIes for running things like AAV, microbial genomics, targeted resequencing. There is a long list of applications where the Sequel IIe remains incredibly powerful. That idea that we now have a multi-product portfolio is truly exciting for us as we head out into the market this year. Let's go a little deeper on Revio. It's an exciting and amazing product, and hopefully, everyone's had a chance to see it out in the foyer and actually go hands-on. But as we said, this product is really designed to deliver HiFi at scale. To us, you're in a new era of sample economics when it comes to HiFi sequencing. Every Revio run is capable of generating 100 million ZMWs of sequencing data.
Watching, as we said, 100 million single molecules extend in real-time. We're doing that in 24 hours. That gives us enormous benefits when it comes to lab operations and really achieving that 15-fold throughput increase that we mentioned relative to the Sequel IIe. Importantly, each run is putting out about 360 gigabases of HiFi data. Four whole human genomes at 30x coverage every 24 hours. Really delivering the scale that folks are looking for on this platform. It's not just the scale. We wanted to build an incredible product that's really gonna enable people to take advantage of it. Yes, we have that scale of 1,300 genomes per year, driven by 25 million ZMW SMRT Cells, four independent stages that I'll talk about, and a 24-hour cycle time.
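The throughput figures quoted above can be cross-checked with some back-of-the-envelope arithmetic. The sketch below uses only the numbers from the talk, plus one outside assumption, a human genome size of roughly 3.1 gigabases. The "implied utilization" is simply what the quoted 1,300 genomes per year works out to against a theoretical year of back-to-back runs; it is not a stated specification.

```python
# Figures quoted in the talk
ZMWS_PER_SMRT_CELL = 25_000_000
STAGES = 4                       # four independent stages, one cell each
HIFI_YIELD_PER_RUN_GB = 360      # gigabases of HiFi data per run
COVERAGE = 30                    # 30x human genome
GENOMES_PER_RUN_QUOTED = 4
RUN_HOURS = 24
GENOMES_PER_YEAR_QUOTED = 1300

# Assumption (not from the talk): approximate human genome size in gigabases
HUMAN_GENOME_GB = 3.1

zmws_per_run = ZMWS_PER_SMRT_CELL * STAGES
gb_per_genome = HUMAN_GENOME_GB * COVERAGE
genomes_per_run = HIFI_YIELD_PER_RUN_GB / gb_per_genome

runs_per_year = 365 * 24 / RUN_HOURS
max_genomes_per_year = GENOMES_PER_RUN_QUOTED * runs_per_year
implied_utilization = GENOMES_PER_YEAR_QUOTED / max_genomes_per_year

print(f"ZMWs per run: {zmws_per_run:,}")
print(f"Data needed per 30x genome: {gb_per_genome:.0f} Gb")
print(f"Genomes per run from yield: {genomes_per_run:.2f}")
print(f"Theoretical maximum genomes per year: {max_genomes_per_year:.0f}")
print(f"Utilization implied by 1,300 per year: {implied_utilization:.0%}")
```

Under these assumptions, four cells of 25 million ZMWs give the 100 million figure, 360 gigabases supports just under four 30x genomes, and 1,300 genomes per year corresponds to running at somewhat less than full-time utilization.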
It was important for us to make sure that we delivered a platform that was easy to use. It's one thing to have the capability to do something, but being able to actually make it happen in real life in the lab is incredibly important, especially as someone who's spent a decade running labs. We've reduced the number of consumables people need to interact with and touch to load a sequencing run. We've implemented what we call a load in advance capability. For the first time on our platforms, you can come up while it is sequencing, open the door, and put on the next sequencing run, really getting to continuous operation. Again, really improving lab operations. For folks that have heard, we've actually removed nitrogen, something that we've used on our systems for the past decade.
That is no longer required and really opening up ease of installation into a number of labs. We also thought about the affordability beyond just the consumable cost. We've made the loading as simple as possible, so your hands-on time, your labor costs are down to just a minute per run. We've decreased the file size that comes off the system. All your computational costs are decreased. Importantly, just like on the Sequel IIe, all the compute you need to generate HiFi reads with methylation is on board. We scaled the compute about 20-fold relative to the Sequel IIe. Let me go into that in a bit more detail because I'm incredibly proud of what the team was able to achieve in getting so much into the same footprint as the Sequel IIe.
As I said, we've scaled the throughput or the performance of the compute about 20-fold relative to Sequel IIe, and therefore about 40-fold relative to the original Sequel II. This is the first platform where we've been able to implement GPUs on instrument. We're taking advantage of the latest NVIDIA GPUs, and that is really powering what we've partnered with Google Health on DeepConsensus. It's DeepConsensus that is helping us with our HiFi read generation, and as you can see in the middle plot, is even improving the data quality relative to Sequel IIe.
By partnering with Google, which we announced at the beginning of the year at J.P. Morgan, we've now been able to move that software on instrument, reduce the run time down from days to hours, and have it all happen on the instrument seamless to the user so they get that optimized BAM out at the end. Smaller file sizes to move around, smaller files to keep. All in all, it makes for an incredible user experience to get to the data that they really want, the HiFi reads. We really believe that Revio will be transformational for HiFi sequencing. The ability to sequence 1,300 genomes per year, the ability to get more reads out of every SMRT cell, to run multiple cells at a time.
All of this is coming together to build a product that I think the community is incredibly excited about. I think the numbers that the team will walk through will speak to just how excited they've been. I think the quote down the bottom, though, captures it incredibly well. As someone who actually worked on the GAIIx, to have someone say that this is the most important sequencer launch since the GAIIx is incredibly powerful. Brian Krueger, Vice President at Everly Health, really believes that Revio is gonna change the game when it comes to next-generation sequencing. I wanted to go a little deeper on the four stages that we have on the system and how customers can use it.
As we said, this is the first time we've enabled a feature like this on a PacBio system. I think if you think about the flexibility it brings, it really will start to open up some exciting applications for folks. One of those big markets we spoke about was oncology. You could imagine if you really wanted to understand a tumor of a patient, you could on one cell sequence their normal genome, understand what the baseline is, understand their inherited risk. You could sequence on a couple of cells their tumor, really understand somatic variation, see what's driving that cancer. Finally, with our MAS-Seq kit that I'll talk about at the end, we can look at all the isoforms that are coming through in that tumor and understand the functional impact of it.
In one run on Revio, you can get a complete genome, epigenome, and transcriptome profile of a cancer sample, as well as the patient's normal sample in 24 hours, which is pretty amazing. The data, though, I think, speaks for itself. We're still generating that incredibly high quality, high accuracy, HiFi data. We have done runs and published some of these on our website showing the HG002 pedigree. You can go in and dig in. Customers are already doing this, showing that there's no compromise moving from Sequel IIe to Revio. If anything, like I said, we're seeing slightly better performance on the new platform. Importantly, though, we wanna make sure that the platform is still gonna work for all those amazing applications that we've had over the years that have been real strengths for PacBio.
The ability to sequence not only humans but also plants and animals. On this run, we actually included a combination of oak and mistletoe, and we're able to get the same great performance out. For customers that run core labs, work in mixed samples, or even have large research institutes, you can really look in both the human, the non-human space. Continuing to hold up that amazing pedigree that PacBio has of enabling science. In the human space, it is incredibly important because we believe this is the platform that will unlock HiFi genomes. When we go and compare, as I said, we're actually seeing slightly better performance in some cases relative to what we were doing in the precisionFDA Challenge.
You can see on the right that our accuracies for both SVs and indels are slightly improved, and this is coming out of the gate. We think there's some opportunities for us to continue to work on the variant calling algorithms and push this further. On the right, you can see that the methylation that we just launched at the beginning of the year carries through and works just as well on Revio. All the features and benefits that people love about HiFi, but now with the scale and the price they were chasing all along. As we said, it's one thing to build a fantastic sequencer, but we really wanna make sure that we have a workflow and an ecosystem that supports its use. Again, this is really important to me.
The way that we think about this is really we're gonna focus our development efforts around sort of the core of the workflow, the sequencing, some of the library prep, the on-instrument analysis. That's where we're gonna invest. We're gonna start to look at partnering both upstream and downstream to build out workflows, to enable further applications, and really drive growth in HiFi sequencing. We have a number of partners upstream that are focused on bringing automation to the platform, allowing people to scale new workflows, new applications like targeted resequencing I'll talk about. Then on the downstream, you know, we've partnered with folks like Google and Sentieon to make sure that the variant calling algorithms are there to support digging into the data and extracting the most valuable information from it.
All of this is designed to make sure that we have an increasing number of applications that we support on the platform. When it comes to those partners, you know, as I said, we've partnered with folks like Hamilton upfront, not only on our extractions with the Circulomics products that we acquired last year, but also on the library preps and instrument setup. That is fully automated now. You can really run the instruments at scale. We've partnered with folks like Twist to bring targeted resequencing to the platform, just announced recently as a fully supported workflow, looking at sort of the 400 or so genes that aren't available in short reads or don't sequence well with short reads. Also looking at things that people really care about, like pharmacogenomics and HLA. All of that is available through targeted resequencing products with Twist.
On the back end, you know, really working with Google, not only on DeepConsensus, but also on DeepVariant, so we can really call all the variants coming out of the platform, make it as seamless and easy as possible, and then start thinking, how do we take that to the cloud and deploy these workflows. This ecosystem we're starting to build, and we've been spending the last two years building, I think really differentiates us and helps get the platform up and running as quickly as possible, knowing that we now have 15-fold the throughput of what we were offering last year.
You know, the question we keep getting is, "Well, how does that compare to other long-read technologies in the space?" I want to sort of give our perspective on where Revio sits relative to on-market products. Revio is certainly changing the game when it comes to how we compare to other technologies. The most obvious comparison that we get is to high-throughput nanopore sequencing. What you'll see is that with Revio across the board, there's a number of key areas that we're gonna actually exceed what you're able to do on a nanopore sequencer. Firstly, the price, importantly, the price per genome is actually lower on Revio than it is on a high-throughput nanopore sequencer. That read quality that we spoke about, all those benefits of HiFi, that's an order of magnitude better than it is on nanopore.
You're getting that full complete view, able to call SNVs, indels, and SVs all off one run, as well as the methylation, and you're doing it in 24 hours. While it all comes down to about 1,300 genomes per year, you know, the fact of the matter is that because of the workflow, because of all the investments we've made in simplifying and streamlining and building out that ecosystem, that 1,300 number is readily achievable to all labs. Something that I think we hear from many Nanopore users is challenging to get to. We're really excited for what Revio is gonna offer relative to Nanopore, and we've already seen amazing interest from a number of folks that haven't previously been invested in PacBio. The other technology that keeps coming up is the synthetic long reads or CLR, as it's called.
I think we need to be clear that from everything we understand and what we've heard, this is synthetic. You know, the quote that came out at ASHG and in GenomeWeb really said, "What we are doing when we have a long read, we have a number of short reads with marks, and we're merging them together using those marks." That is the definition of a synthetic long read. They are taking short reads and merging them. The way they seem to be doing that is by adding marks or errors into those reads so that they can reassemble them. What that means is that you need to over-sequence to get around those errors and to polish them out. The numbers that we're seeing are 5-fold to 7-fold.
It means that your synthetic long-read genome is probably somewhere in the order of $1,400-$4,200 per sample. Again, more expensive than Revio. It means that you still can't get access because it's amplified to certain regions of the genome, and your read length is even shorter. You're not gonna cover some of those genes that you really want end-to-end with phasing. You have a more complex workflow. You need to use informatics to do all of this. On a number of fronts, these synthetic applications are really not gonna deliver what a true native long read can do. Importantly, there's no data out there that we've been able to get access to, no peer-reviewed publications. As I said, Jonas is gonna walk through some of the 9,000 publications we have demonstrating the value of HiFi and PacBio sequencing.
The final thing I'll say on this one is that those synthetic long reads seem to be limited in the applications that they take on. One thing that we're excited about, and we also launched at ASHG, was the new MAS-Seq kit. The 10x Chromium three-prime single-cell kit is able to now be sequenced on PacBio, generating full-length single-cell isoforms. This is one of those applications that you can do with native long reads and with this new kit that you can't do with synthetic. When we look at this, what we hear from customers is the amount of excitement they get from truly understanding the biology of looking at isoforms, not just gene expression. You can see the quotes from customers there. "Already using it, it's a game changer." "We don't know enough about isoforms."
"This is gonna allow us to do and uncover that diversity." "My dream is to do single-cell spatial transcriptomics with long reads, and now I can." We are incredibly excited about this kit, not only because of what it enables, but because of how we got here. This is a kit that we partnered with Broad on. It came out of a publication, a homebrew method, and through partnership, we were able to improve it. Relative to the method that was published, you know, we're now generating sort of 100 million reads, 100 million individual isoform reads off a Revio chip. About a 5x increase in what the homebrew method was able to do. Absolutely impressive and truly amazing and really something that we're seeing a lot of interest in.
I thought I'd start to wrap up here and transition because, you know, this is one of those applications that we really think is gonna move from short reads to long reads and really fuel a lot of the growth in Revio. You know, it's clear that short reads cannot sequence the whole transcripts in single cell. All you're looking at is that three-prime end and getting a view of the gene expression, not the transcriptome. As we look at this quote from a neurology PI at a major research institute, to them, it's clear that in the next few years, long-read sequencing will be the de facto standard in transcriptomics research. We get such a comprehensive view of the gene regulation. It's already enabling us to uncover novel mechanisms that are associated with developmental disorders and disease.
We really believe that it's actually the isoforms that are driving so much in the transcriptome, not just the genes. This is the way that we can look at that, and one of those areas that we really think will transition, as I said. To close, I would just say that we understand, and obviously, we acquired Omniome to look at short reads because we do believe there are areas where short-read sequencing is the right tool for the job. A lot of the oncology research, looking at liquid biopsy, those are the areas where we really think that short-read sequencing will be the go-to technology. What's important is that you need high levels of accuracy, and that's why we settled on Omniome. With that, I'm gonna transition over and introduce Omniome. Thank you.
Please welcome to the stage Chief Operating Officer, Mark Van Oene.
Thanks for joining us today. I really appreciate all your time here. Dave, thanks for that presentation. He gets me so fired up when he talks about what we can do with Revio. It does remind me why I joined PacBio. Yes, I wanted to build a company, and I love building organizations. I'd been on that ride with Christian from $100 million to $1 billion to $3 billion. It was way more fun, that $100 million to $1 billion. With that in the backdrop of wanting to grow the company and then knowing what this technology in long reads could do with scale and affordability, I had to be a part of it.
Revio is the culmination of that vision because I fundamentally believe that a whole human genome should be done with HiFi. You should not accept a short-read genome without structural variation, without methylation, without really understanding the impact of tandem repeats and phasing. Fundamentally, the limitations were scale and throughput. Dave, thanks for reminding me what scale and throughput's gonna do for us. But that doesn't mean the short reads shouldn't also be accurate. When looking at the short read market, we found a chemistry in this SBB chemistry that was completely differentiated from anything I'd seen in close to 15 years.
We had to be a part of that accuracy and see how we could bring that to market to change some more paradigms for sequencing and address the entirety of the opportunity ahead of us. I'm gonna talk a little bit this morning about Onso and some of the accuracy that we're seeing. I'll share a little bit more data on how we see that playing out in some oncology applications. I'll give a little bit more about our longer-term roadmap because as mentioned earlier today, you know, Revio and Onso's launch next year are just the beginning of how we revolutionize our long-read opportunity and our short-read opportunity going forward over the next several years of the horizon.
We talk about accuracy and, you know, it should be a part of everything that's being done in development. But for the last decade, there's been very little improvement in accuracy in short reads, which is why this SBB chemistry caught our attention. What we're highlighting here is for you to think about the strategy of us having both technologies. You know, none of these markets are best served by any one particular technology. This combination of short and long reads with accuracy and scale and affordability lets us really think about how we can enter short-read markets as they transition to long reads or support long reads markets with the combination of short and long read data types.
Great examples, you know, Dave. Dave highlighted how HiFi sequencing and complex disease research is primarily always gonna be a long read application going forward now that scale and affordability is there. But things like cancer, you know, is much more suited for a short read technology if you think about early detection or monitoring of residual disease, because the biology of cancer is very low frequency events, which requires a lot of depth of sequencing. The samples and the access to samples is typically liquid biopsy in which the DNA fragments are chopped up and relatively short. You think about the size of some of these market opportunities and how we can best serve those with the technology, this combination lets us play very differently.
It also lets us enter into complex disease research where people are still doing panels and exomes and things that short reads are better for. As those markets transition, and around the world, those will transition at different rates, but as those markets transition from targeted short-read panels to affordable long-read genomes with all the extra information they're gonna get. This combination, we do believe, will help us drive more Revio success as well as more Onso success because of the way we can support these customers. If you dig into Onso a little bit more specifically, and I do hope you guys get a chance to look at it outside, this is a mid-throughput sequencer. You know, we're launching it with something around 500 million clusters, and a couple of different kits.
We see applications like a liquid biopsy, where most DNA fragments are 180-190 base pairs long. You know, we do see application for single and 200 base pair read kits. We also think of applications like 2x100 reads. You know, a lot of exomes today are 2x75s, and so a 2x100 kit is very attractive for people that are doing targeted approaches with short reads. Traditionally, people use paired-end 150s, and so we're gonna have this combination of read lengths and kits available to our customers on that launch. The differentiator is it's no longer 80% or 85% Q30. Right?
We're gonna be coming out with a spec at 90% Q40 and have confidence that we're gonna be able to deliver on that accuracy of an order of magnitude improvement. As Dave mentioned, not just for the long-read sequencer, it's also really important that we have an ecosystem in and around our short-read sequencer. We've built up the partnerships, we've built up the kit structures. We're gonna enable people to take their workflows and their library preps and adapt those to get onto our Onso sequencer. That entire ecosystem is a really important part of what we're also building out for the short read platform. As we said a few weeks ago, we are still on track.
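For readers less familiar with the Q30 and Q40 figures quoted above, quality scores on the standard Phred scale map to a per-base error probability of 10^(-Q/10). The minimal sketch below shows that conversion and why moving from Q30 to Q40 is described as an order-of-magnitude improvement. It assumes the quoted scores follow the standard Phred definition; the function name is made up for illustration.

```python
def phred_to_error_probability(q):
    """Standard Phred definition: error probability = 10^(-Q/10)."""
    return 10 ** (-q / 10)

for q in (30, 40, 50):
    p = phred_to_error_probability(q)
    print(f"Q{q}: about 1 error in {round(1 / p):,} bases")

# Ratio of error probabilities between Q30 and Q40
improvement = phred_to_error_probability(30) / phred_to_error_probability(40)
print(f"Q30 to Q40 reduces the error probability by a factor of {improvement:.0f}")
```

Each step of 10 on the Phred scale is a tenfold reduction in the chance that a given base call is wrong.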
You know, we continue to be on track to have first customer ship in the first half of 2023. As we finalize some of the pricing in and around our reagents, we do expect them to be priced where mid-throughput sequencers price their reagents. We'll start taking orders for the platform in the beginning of the year. I will say early feedback, you know, we launched this with a bundle on the instrument, so a Revio Onso bundle for $849,000. Much more interest in that than I even expected.
It's that combination of technologies for these researchers and these core labs to think about how they can use accuracy and differentiated technologies at a very affordable price for the two platforms to go and drive their research and discoveries forward. A lot of the questions have been around the accuracy and some people will say that accuracy doesn't really matter that much. I just fundamentally disagree. I don't think the community has had access to Q40 and Q50 data to really understand what they can benefit from by this level of sensitivity. I'm gonna walk through an experiment we've done and apologize for those of you that this is gonna be a little bit too technical.
We're getting into the technical section of the morning here for people between Jen and myself, but we'll try to keep it fun for you. This is an experiment we did, and this is a SeraCare circulating tumor DNA sample. It's an industry standard. It's a great way to prove technologies and compare technologies. We run this with paired-end 100 reads. We worked with Agilent on this. Agilent has a comprehensive cancer panel that's an enrichment panel that we could use to look at a variety of cancer genes to see what sort of frequency of variation could we go and detect.
If you start to look at the data, what we're comparing here in magenta, go figure, is the SBB data that we generated on Onso, and in orange is a not-to-be-named SBS set of data. You can see, you know, this is really looking at the sensitivity at which you can detect the expected variant percentages. You know, a good spot to just anchor on is around here at around 0.1%. You know, 90% of the time, we can detect that. This is using equal coverage. This is 6,000-fold coverage for SBB and 6,000-fold coverage for SBS.
We're at about 90% sensitivity at a 0.1% frequency, and we continue to be able to go lower than that and still pick these rare variants. You know, that's great, but what does that really mean when you're starting to look deeper at the data? Does that sensitivity matter, and does that accuracy really make a difference? This is the same experiment. Now we're just looking at one of those IGV plots. And you can see that, for the most part, you know, these are Ts, and you would expect that. You're looking for that rare cancer variant that exists in this sample at a very low frequency of 0.1%.
You start to see at the top there, and we've blown it up, that there are some different nucleotides called on the SBS data. The easiest way to look is at the counts. You can see that, you know, there were three A's, there were nine C's, and there were five G's. For the most part, it was wild type T, which you'd expect. So as a researcher looking at that, you're trying to figure out what happened. Is that an A, C, or G? Have you detected a variation? You know, the reality is that because of the mismatch error rate on SBS, you really don't know. This is that same SBB data from that sample in that same region. There were zero A's, six C's, and zero G's.
That sensitivity makes a difference, and it gives you the confidence that you now know what that variant was in this, you know, NRAS gene that's causing the cancer at a very low frequency. What the community is doing is they're trying to come up with different ways to overcome the mismatch error rate on SBS sequencing. They're using what they call unique molecular indices. We thought it would be more appropriate then to think about how would the community use this to overcome the sensitivity challenges. Now we've done 6,000-fold coverage with SBB, no UMIs. We're just looking at our native SBB chemistry and 24,000-fold coverage on SBS. You've got about a four-fold extra cost to get to that coverage.
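To put the counts above in context, the following sketch works out how many reads you would expect to carry a true variant at a 0.1% frequency and 6,000-fold coverage, and how many reads you would expect to carry a sequencing error at the same position. It is only illustrative: it assumes every base sits at a nominal Q30 or Q40 on the standard Phred scale, which real data will not do, and the function names are hypothetical.

```python
DEPTH = 6000                 # 6,000-fold coverage, as quoted
VARIANT_FREQUENCY = 0.001    # 0.1% variant frequency, as quoted

def expected_variant_reads(depth, variant_frequency):
    """Expected number of reads carrying the true variant at one position."""
    return depth * variant_frequency

def expected_error_reads(depth, phred_q):
    """Expected number of reads with a miscalled base at one position,
    assuming every base call has the same nominal Phred quality."""
    return depth * 10 ** (-phred_q / 10)

variant_reads = expected_variant_reads(DEPTH, VARIANT_FREQUENCY)
errors_q30 = expected_error_reads(DEPTH, 30)
errors_q40 = expected_error_reads(DEPTH, 40)

print(f"Expected reads carrying the variant: {variant_reads:.1f}")
print(f"Expected erroneous reads at nominal Q30: {errors_q30:.1f}")
print(f"Expected erroneous reads at nominal Q40: {errors_q40:.1f}")
```

Under these assumptions, about six reads carry the real variant, which matches the six C's in the SBB data. At a nominal Q30 the expected number of erroneous calls at that position is of the same size as the signal, whereas at Q40 it is roughly a tenth of it.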
If you look at the data here, again, this is now a different gene we're looking at, you can see that on the SBB, there's 3,272 C's and nothing else. On the SBS, you've got mixed in there 11 A's and four G's and three T's. Is there a variant there or not? Does this patient have cancer or not? You wouldn't know. Even doing four times more sequencing with using UMIs and the complexities of the library prep and deconvoluting that, you still don't know the answer. Now you have to go and investigate because there may be a variant there. In fact, there's not. This is a perfectly wild-type sample, and there should be nothing else detected there, which is what we've identified.
These average plots can be misleading when you're an actual scientist looking at counts at these very low levels of sensitivity. This accuracy is going to matter. It's gonna cost people a lot more to oversample to try to make up for it, and they're still not gonna have the answers that they truly need in this market. We feel really good about where we stand now with differentiated long and short-read technologies. You know, as Dave went through in some detail, you know, the scale and the throughput limitations on PacBio have been overcome. You know, a $1,000 clinical genome is the genome that will be adopted going forward for clinical applications. We have a huge opportunity to drive into transcriptomics and epigenetics in ways that we didn't have two years ago.
Onso is to be proven, right? We see the evidence. We're gaining confidence in what that accuracy is gonna do. We will see what the uptake of that is. It's really important for us to support the customers with these two platforms. We can't stop here, and it's really important for us to meet the customers where they are. Christian talks about how we have to always be just passionate about the customers. Not every customer wants exactly what we're delivering with Revio, and not every customer wants the mid-throughput version of an SBB sequencer. This is a graph, and don't try to interpret the schematics. That's what our SMRT cells actually look like.
As you can see over time, they've become much more integrated SMRT cells as we include optics and, you know, continue to push the density. The point of this is for you to really look at where we are in our roadmap of SMRT cell development with Revio. You know, we just made a huge leap from the Sequel IIe, going from eight million ZMWs to Revio with 25 million ZMWs. But we're still all on eight-inch or what they call 200 millimeter fabs. That eight-inch wafer technology is still years behind where that semiconductor industry is. They've all moved on to the 12-inch 300 millimeter wafer technology. What Revio does, and the big change we've done here, is we've now moved to this backside illumination.
That's really important because that's what the 300-millimeter industry is now built upon. It gets the light source closer to our polymerases. It reduces our crosstalk and our noise. It's gonna let us go and further push on that pixel density for ZMWs so that we can go beyond that 25 million. Huge leap forward for us. You know, Revio is gonna be critical for us to go and prove to the world what you can do at scale and affordability, but it's not where we're gonna stop. You know, we talked a lot about the Invitae collaboration a couple of years ago and how we're working on the N+1 chip. That's what I'm indicating here with that ultra-high throughput schematic.
We're moving now to the 300-millimeter wafers, maintaining backside illumination, putting in some different stacked components so that we can further improve that density and drive cost. If you think about the cost profile and the margin profile changing over the next 3 to 4 years, going to 300-millimeter fabs is a really important component of that. These both get us away from a sole supplier, and they are ways for us to think about manufacturing with different providers at more affordable costs, because that's where the semiconductor world has moved.
We're really excited about the progress we're making on this next, even higher density SMRT Cell, and we continue to push to get into that platform that will enable us to do tens of thousands of genomes per year. Not hundreds or thousands of genomes a year, but tens of thousands of genomes a year. Within the timeline that Susan's gonna walk you through, we will have Revio, and we will have this ultra-high throughput sequencer. We also think it's really important to make long reads and HiFi sequencing accessible. It's not always just pushing to the high output. You know, some of the batching effects and challenges people get into are real, and so you don't always wanna just push to the highest possible throughput, lowest cost per gigabase.
You also need to have things that are accessible to drive utilization of HiFi, entry into HiFi technologies, and expansion beyond that as the demand grows, through the different throughput levels of the system. We did announce a partnership with Berry Genomics, which is really built around how you make a smaller-footprint, low-CapEx long-read sequencer. Really important for things like transcriptomics, targeted DNA panels, microbiology applications. You know, people don't always wanna spend, you know, the $779,000 to get into higher throughput, you know, lower cost things. It helps us segment the customers, it helps us become more global, and it's a very high priority project.
Going from Revio to benchtop, I will say, is an easier thing than going from Revio to a higher throughput. In the same time horizon, we will also be launching this partnership with Berry Genomics and getting into that benchtop long-read market. Now on Onso, you know, that accuracy matters, and I don't think of it as 500 million reads. You know, if it at least requires four times more sequencing on SBS, you know, think of that mentally as closer to two billion reads, right? Because that's how much more sequencing that they need to do. I think of it as a mid throughput, but a really high-end mid throughput sequencer as it is. Let's face it, a lot of these different needle and haystack applications are really sequencing greedy.
Some people will sequence 50-, 60-, 70,000-fold coverage, not 6,000 or 10,000. You know, we had a great talk from Max from Stanford last week. You know, he's pushing on how he gets his error rates down to 1 in a million, not 1 in 100,000. He's really thinking about what that sensitivity is gonna mean for picking up residual disease sooner. Jen will talk you through some of that application. We do need to push Onso into higher density flow cells, for COGS as well as for throughput and for market opportunity. Also within this timeframe, expect us to expand our short-read SBB platform into higher throughput technologies and applications for enablement of those.
We really do feel like we're setting ourselves up for a wonderful ride, and it's gonna be fun. This path to greater than $500 million is where I enjoy being. I wanna remind you of some of the different market opportunities we have. Jen's gonna go through great details, but human genomics, I do believe is where we're gonna win with HiFi. Oncology is where I believe we're gonna get some great beachheads and start to prove out the value of our SBB chemistry. Now, neither of these are solely long read or short read. You know, in oncology, transcriptomics is gonna matter. RNA variation matters. Isoforms definitely matter. They're looking at less than half of the transcriptome today with short reads.
That oncology market that is big and fast-growing is gonna continue to be a focus for us on both technologies. I truly believe that things should be done in a completely unbiased fashion. You know, hypothesis-free science is the science that wins. We're enabling that with genomics and transcriptomics. We're also enabling that now with microbiology applications. You know, you don't have to just do 16S sequencing anymore. You can look at what that metagenomic information really is. In plant and animal, you know, I don't think it's restricted to de novo assembly of you know, your rare species or organism or plant. You know, more and more you're seeing the genotyping world move from arrays to sequencing.
These genotyping-by-sequencing applications are a big driver of our partnership with Corteva. That's where long reads will help with all of the polyploid genomes, and where the accuracy of long reads and the accuracy of SBB are gonna help them with GBS, with genotyping by sequencing, because you require much less depth if you have confidence in the genotype call that you're making. Look for us to continue to also push in plant and animal. We talk about bettering human health. Part of that is supply chain of food. It is feeding the world. It is making sure we're free of infectious diseases and surveillance. Just because we talk about bettering human health, I don't want you to think that we're leaving the plant and animal and microbial world behind. We're not.
Some of these new emerging technologies, you know, AAV has been a great new thought around how we can use longer reads to look at the CRISPR editing and some of the different therapeutics. What excites me is decentralizing HiFi and decentralizing SBB so that the brilliant scientists around the world can tell us what great new applications and methods they've come up with. You know, this MAS-Seq collaboration with the Broad Institute is a great example of that. You know, we can start to think about transitioning gene expression to transcriptomics. You know, it required some help from them.
You know, somebody will come up with that next great use of HiFi or that next great use of SBB, and we're gonna be ready there to partner and to catch them and be the best partners for the scientific community that we can be. With that, I'm gonna turn it over to Jen, and thank you for your time.
Good morning, everybody. It is so exciting to be here in New York City and be here live with you all to share the exciting time in our space and particularly here at PacBio. I have the pleasure this morning in my section to share with you more details about the multi-billion dollar market opportunity that PacBio is going after. Before I launch into the slides, if you'll indulge me for just a minute, I'd like to share a little bit about my own story, about how I came to be here at PacBio, and why I'm so excited to be part of this organization. I'm a geneticist by training. I've been in this space for a couple decades now and have had the opportunity to see the incredible impact that technology has had on this space.
It's always fascinated me how genes, how genetic information and multi-omic information now plays a role in who we are as individuals, that makes us all different. Also plays an incredibly important role in susceptibility to disease. Really understanding, teasing apart that puzzle has fascinated me from the beginning. Every major milestone in this field has been fueled by a technology revolution. 20 years ago, we saw that with the commercialization of microarrays and what was uncovered, what was discovered with the use of that technology. About 10 years ago, we really saw short-read sequencing come to the forefront, opened up tremendous insights into biology, and it's been a wild ride over the last 10 years. We are at the inflection point for this entire industry. What is gonna drive that inflection? It's gonna come on two fronts.
You've heard about long-read sequencing and the amount of information that is captured in long-read, highly accurate native sequencing is just off the charts, and we'll talk a little bit about that in this presentation. The second front that's gonna catapult this industry into sort of the next step change is breaking through accuracy barriers. With Onso, with SBB chemistry, finally, researchers and entrepreneurs have the tools in their toolbox to break through what's been limiting them with SBS chemistry. It is an incredibly exciting time. It's an exciting time here at PacBio. It's an exciting time in our industry, and we're so thankful you decided to spend your morning learning about us and our space today. Thank you. All right, let's dive into the markets. Well, I have two jobs to do this morning.
Number one is to talk about our market. Christian already shared the punchline to it for you, but we will explore that in more depth, the segments of that market, and what those drivers are. Part two of my mission up here is to explain to you and help you understand how PacBio is seeking to play in those markets and how we're going to win. Christian already shared this with you. This is our sizing of the market. Now, you've probably heard other organizations. There's many ways to size a market. For us, the way we approached it is by looking at publicly available information from all of the major players in the space. We looked at their public statements about their growth trajectories over this time horizon. We accessed commercially available market reports, and we leveraged the collective wisdom of our leadership team.
You saw when Christian put up our leadership team, that almost all of us, many of us, have decades of experience looking at genomic technologies, and all of that collective input is how we got to our $7 billion closing out 2022 this year. Now, the growth rate for the industry as a whole is reasonably healthy. We have it pegged at about 18% CAGR over this time horizon, closing out 2026 at about $14 billion. Healthy growth across the board. The way we think about segments here at PacBio is we have five that we focus on, human genomics, oncology, plant and animal, microbiology and infectious disease, and then, quite frankly, a catch-all bucket, which we call emerging. Over this time horizon, we see growth, healthy growth, double-digit growth in every one of these market segments.
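As a quick arithmetic check on the sizing figures just quoted (about $7 billion in 2022, about $14 billion in 2026, roughly 18% CAGR), here is a minimal Python sketch. The function and variable names are illustrative assumptions, not anything from PacBio's own model; only the three quoted numbers come from the talk.

```python
# Sanity check of the market-sizing figures quoted in the talk.
# Only the dollar figures and the 18% rate come from the talk;
# the helper names below are hypothetical and for illustration only.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, returned as a fraction (0.18 == 18%)."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `start` at `rate` for `years` years."""
    return start * (1 + rate) ** years


MARKET_2022_B = 7.0    # quoted market size closing out 2022, in $ billions
MARKET_2026_B = 14.0   # quoted market size closing out 2026, in $ billions
QUOTED_CAGR = 0.18     # quoted growth rate
YEARS = 4              # 2022 -> 2026

# Growth rate implied by the two endpoint figures.
implied_cagr = cagr(MARKET_2022_B, MARKET_2026_B, YEARS)

# Endpoint implied by compounding the quoted rate from the 2022 figure.
projected_2026 = project(MARKET_2022_B, QUOTED_CAGR, YEARS)

print(f"Implied CAGR from $7B to $14B: {implied_cagr:.1%}")
print(f"2026 market at 18% CAGR: ${projected_2026:.1f}B")
```

Doubling over four years implies a rate just under 19%, and compounding 18% from $7 billion lands a little below $14 billion, so the rounded figures in the talk are mutually consistent.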
The real take-home story, the headliner story, is in human biomedical research, which, in the way we've segmented the market, is human genomics and oncology primarily. We see those market segments really taking off over this time period, and we'll talk a little bit about why in the coming slides. Now, PacBio's growth story is a very similar story. Our growth story is primarily fueled by human genomic applications and oncology applications as well. We'll start with human genomics. Today, our revenue from this market segment is about 40% of our overall revenues. Over this time horizon, we anticipate very healthy growth at about a 50% CAGR from now until 2026. We're starting this journey at less than 5% market share in this segment.
We expect to grow to somewhere less than about 10% market share over this time horizon. Our oncology story is a little bit different. We are at the starting line in oncology. Oncology represents less than 10% of our revenues currently, and I think it's fair to say that we are a nascent player in the oncology space. Over this time horizon, we anticipate very healthy growth at about 75% CAGR out to 2026, but our starting line is less than 1% market share. Even with that very aggressive growth, we're only at sub 5%, low single digits market share by the 2026 time period.
For me, the most striking piece of this particular slide is that despite aggressive growth, despite our models that have us penetrating key markets, even as we exit 2026, we are barely scratching the surface of what is this genomics revolution. The amount of headroom beyond this time horizon is substantial and we're excited. We're excited for the next four years, but we're even more excited about the long-term play as well. The take-home message is that PacBio's growth over the next four years is gonna be fueled primarily by taking market share in key applications in the two markets we just described. Whole human genome sequencing, RNA analysis, and targeted sequencing in both the human genomics space as well as the oncology space.
For whole human genomes, whether it's in the rare disease context or the complex disease context, pop gen or cohorts, clinical or research, honestly, it doesn't matter in this time period. Fundamentally, PacBio whole genome sequencing delivers more value for money, and you've started to see that story emerge in Christian's talk, Dave's talk, Mark's talk, and now again in mine. Now scalable on Revio, so accessible, the economics work out and the throughput is there to catalyze this entire market to sort of the next major step change. The future of RNA sequencing is isoform sequencing, whether it's in single cell or bulk tissue, whether it's in oncology or human genomics research, it is essential for truly starting to understand the functional implications of all of these variants that researchers have been studying for decades now.
It has increasing utility in a clinical setting, primarily in rare disease for our story, and we will show feedback from customers that basically paint the picture of why would anybody do short read RNA sequencing again, once they can get that full isoform information. Then targeted sequencing. Targeted sequencing is a big bucket of activities, many use cases across many markets. For us, our story is really twofold. It's leveraging SBB and the accuracy that it brings to the table in the context of oncology applications, primarily needle in a haystack use cases like liquid biopsy, but also other markets as well, where that exquisite sensitivity is so important. The second part to our targeted story is on germline panels.
Germline panels as a relatively cost-effective, easy entry point for more and more customers to embrace HiFi technology to address their research question and to go after some of their clinically challenging genes. Oops. The other point I'll make before we move on is that each one of these markets by themselves in 2026 is a multi-billion-dollar market. Our market share, again, is low single digits at best in each one of these today. We have aggressive growth targets, but as we exit 2026, again, we are barely scratching the surface of our market opportunity. Let's focus on whole human genomes. What drives the market in this time period? Four things. The first thing is expansion of clinical genomes.
Expansion of clinical genomes for rare disease, and increasingly so in routine genetic disease, more general genetic disease diagnoses. The second major driver is consolidation of legacy molecular technology onto the genome backbone. Why would you go through five or six different orthogonal technologies to try to get to an answer when you can knock it all out in one shot, in one genome? The third driver simply is just improved tools, improved workflows, improved analysis solutions, more understanding of the data, making genome sequencing more accessible to a larger swath of customers. The fourth major driver for genomes over this time horizon is the conversion of these large population or cohort research projects, biobanks, etc., now to start doing their analyses off of a genome backbone.
No longer do they really need to make that choice of do I go with a couple hundred-dollar microarray or do I start to move my analyses onto genomes and get much more information? The conversion of those projects into genomes over this time horizon is something we also expect to fuel growth in genomes. What does that look like? In our analysis, we pegged the market for genomes at about $1 billion today. That equates to close to 1 million individual genomes being done, and PacBio's market share today is approximately 1% by volume. What we see over this time horizon, again, is a healthy growth rate of genomes and money being spent on genomes, to the tune of about 5 million-7 million samples being done by 2026.
PacBio's market share at that time by volume, approximately 5%. The CAGR that you see in terms of money being spent growing at 50%-60% overall. Now that's short read genomes, primarily. There will still be a lot of short read genomes done in this time horizon. The interesting growth story, and the growth story that's incredibly important to PacBio, is the growth of long read genomes over this time horizon. We estimate about 120%-130% CAGR by revenue by dollars spent on genomes in the whole genome space for PacBio over this time horizon. We are outpacing the market. Customers are starting to preferentially choose HiFi as their genome platform. Well, you'll ask the question, why? Why is a customer gonna choose HiFi over a couple hundred dollar short read genome?
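The genome-volume figures above (close to 1 million genomes today at about 1% PacBio share by volume, 5 million to 7 million by 2026 at about 5% share) can be turned into implied growth rates. The Python sketch below is a rough, volume-only approximation under those quoted inputs. The talk states PacBio's 120%-130% CAGR by revenue rather than by volume, so the numbers here are not expected to match it exactly, and all names are illustrative.

```python
# Volume-based approximation of the whole-genome growth figures in the talk.
# Assumption: growth is computed on genome counts only; the talk quotes
# PacBio's CAGR by revenue, which depends on pricing not modeled here.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, returned as a fraction."""
    return (end / start) ** (1 / years) - 1


YEARS = 4                                       # 2022 -> 2026
GENOMES_2022 = 1_000_000                        # "close to 1 million" today
GENOMES_2026_RANGE = (5_000_000, 7_000_000)     # quoted 2026 range
SHARE_2022 = 0.01                               # PacBio share by volume today
SHARE_2026 = 0.05                               # PacBio share by volume in 2026

results = []
for genomes_2026 in GENOMES_2026_RANGE:
    pacbio_2022 = GENOMES_2022 * SHARE_2022
    pacbio_2026 = genomes_2026 * SHARE_2026
    results.append({
        "genomes_2026": genomes_2026,
        "market_cagr": cagr(GENOMES_2022, genomes_2026, YEARS),
        "pacbio_genomes_2026": pacbio_2026,
        "pacbio_cagr": cagr(pacbio_2022, pacbio_2026, YEARS),
    })

for row in results:
    print(
        f"{row['genomes_2026']:>9,} genomes in 2026: "
        f"market volume CAGR {row['market_cagr']:.0%}, "
        f"PacBio ~{row['pacbio_genomes_2026']:,.0f} genomes, "
        f"PacBio volume CAGR {row['pacbio_cagr']:.0%}"
    )
```

Under these assumptions, overall volume grows at roughly 50% to 63% a year, and PacBio's implied genome count grows from about 10,000 to somewhere between 250,000 and 350,000, which is more than a doubling each year.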
Well, the answer is value for money. Not all genomes are created equal. A gigabase is not a gigabase. It's not a gigabase when you really are looking for the answer, and you need to get that answer. You don't wanna waste your time, whether certainly in a clinical context, but in a research context too. Why waste your time with something that's going to give you a half answer or a quarter answer? PacBio is delivering the new class of genomes and the new class of genomic data. The graphic on the right is meant to illustrate that point. Short-read genomes, they are great at looking at SNVs and small indels, and they've been doing it now for a decade. HiFi can do that too, but we can do a whole lot more.
Stacking up the value, large indels, getting your haplotype and phasing information, getting methylation, getting structural variation, all in one shot, all in one go, so that you miss nothing. Incredibly valuable, incredibly important, and as I was saying before, really ushering in that step change for the entire community. Now, the story here is two parts. There's a clinical part and there's a research part. On the clinical part, we'll start with rare disease. Rare disease was the tip of the spear for going into clinical genomes for short-read sequencing. It is for long-read sequencing, too. There's a very clear market need. There's a very clear genetic value component here. For those of you that have been in this space or have heard about rare disease, rare disease, in fact, is not all that rare. It's about 30 million Americans suffer from a rare disease.
Somewhere between 300 million-400 million individuals globally suffer from a rare disease. Collectively, it's not all that rare. What we've seen from all the work that's been done, all of the exome and now genome studies that are coming online, all done on short-read sequencing, is that the diagnostic yield for these patients is plateauing at about 50%. One out of two children that come in seeking a diagnosis walks away without a diagnosis today. It's great for the one that did get a diagnosis. It's good for their family, for closure, to move on, but that's 50% of the population that's still searching for answers, and that's just not acceptable. Especially not acceptable when there's an alternative that can deliver more.
What we've seen with some of our partners in this space, and there was a paper that came out earlier this year from Cohen et al., from Children's Mercy Hospital, that basically started to show the increased diagnostic yield, the increased clinical power of a HiFi genome in their clinic. What they showed is that they detected four times more coding structural variation leveraging HiFi genomes, and that information translated to over a 13% increased diagnostic yield in patients that previously had been sequenced on short-read sequencing, but still walked away with no diagnosis. Now, at least for that 13% of those patients, they were able to have some information that could help them in their journey going forward.
Now, Children's Mercy Kansas City is not the only place that's starting to see this potential and is seeking for answers for those 50% of patients that they're trying to serve and have not been able to serve with short-read sequencing. Around the world, leading children's research centers, leading academic hospitals, leading children's hospitals, have started to come to us. We've started to work with them more deeply to bring technology, to bring the HiFi technology in particular, to bear on some of their more challenging cases. Rather than have a dozen centers around the world reinvent the wheel over and over again, PacBio has partnered with these organizations and established a consortium of these like-minded, forward-thinking pediatric centers to bring the power of HiFi more rapidly, accelerate the technology to have impact in their clinics.
That's from things like sharing best practices, setting standards for the community, and eventually sharing data so that interpretation of genomes becomes easier and more accessible. Another thing about Revio in particular and the HiFi technology is that it alleviates one of the, I would say, more insidious pain points for a clinical lab. Especially in the acute setting, you've got a sick kid who's come in and needs answers fast. You've got a family that's tearing their hair out trying to understand what's going on. It is not okay to sit on that sample for weeks while you aggregate dozens, hundreds, even thousands of samples so that you can get the best economics on your sequencing run. That's the decision that clinical labs have to face today. Do I run this? Do I get answers fast?
Do I get back to that patient or that physician that's banging on my door? Or do I try to get the best cost economics by batching as many samples as I can on my flow cell? With Revio, it's $1,000 per genome. Whether you run one proband, just the sample that you have in hand today, whether you've got a trio, mom, dad, and kid, or you've got four probands, every single genome, no matter how you slice it, no matter what your study design, is $1,000. It eliminates that need for the clinical lab director, for the clinical lab workflow, to choose: do I batch and wait, or do I move fast and deliver the best service to my customers?
Now, the alternative on a short-read platform probably is somewhere on the order of $1,300-ish, maybe all the way up to $4,200 or $5,200. It really has to do with, is that lab willing to burn that entire flow cell to get an answer for that child quickly, or are they gonna wait and batch? Even with batching on the lower-end flow cells, it's still above a $1,000 genome. Ultimately, from a clinical lab workflow perspective, from an economics perspective, there's benefits here as well. We've talked a little bit about the rare disease use case. We've talked about sort of the acute use case in a pediatric setting. What about more general, a genomically informed lifetime medical record. That's the future.
That's, I think, where we all in this industry believe the space is going. How do we get there? Well, it's about building value. It's about building insights. It's about building information across more and more variant classes, more and more information that you can extract from the genome. Now, this graph here is meant to show that it's not just SNVs and indels. Yes, that's an important part, and there's been a tremendous amount of work done in that space by previous technologies, by microarrays, by short-read sequencing, but that's only part of the story. When you start to stack up the component parts that actually drive disease and drive health and drive each of us to be different, it's structural variation. It's the dark regions of the genome. It's understanding phasing.
Did you get that copy of the chromosome from your mom, or did you get that copy of the chromosome from your dad? It makes the difference. Epigenetics. All of this stacks up to add more and more value to understanding who we are as individuals, and in the disease setting, what our response to a therapy might be or what our prognosis is. Lining that all up now, PacBio delivers all of those classes of variation with confidence, highest accuracy, the most information possible from a genome. Alternatively, short-read genomes. They'll give you good SNPs and indels, moderate structural variation. Yes, you can detect some. You will get some phasing, but mediocre. You will absolutely not get the dark regions. You will not get methylation.
Now, as we've all heard, and as Dave pointed out, there are some organizations that are turning to their short-read technology to try to stitch together synthetic long reads. Well, we don't know what those are gonna look like. I put question marks here because the truth is nobody's actually seen the data. No customers have used those kits in their own hands. Jury's still out, but you definitely won't be getting methylation. You definitely won't be getting dark regions because those short reads that are being stitched together suffer from the same biases that the original short reads did. There's no getting around that. When we start to stack up the value, when we start to stack up the information content, and we actually look at the prices, there's a tremendous amount of value for money.
Again, why waste your time getting half answers? Don't take my word for it. This is Brian Krueger. He's been in the industry for a while. He likes to post on social media. I'm sure many of you have seen some of his many posts. He was at Labcorp for about six years as a senior director of some of their technology development there. Recently moved to Everly Health in a leadership role. I'm not gonna read you this entire LinkedIn post, but draw your eye to the bottom part of this post, if you can see it from where you're sitting. Brian basically starts to do the math for us.
He's stacking up a list of all of the different orthogonal molecular techniques that a clinical lab would have to use to get to an answer for an individual patient. That includes array CGH technology. It includes MLPA assays. It includes methylation profiling, and you start to stack all those up so that that lab director can then deliver a fulsome analysis of that particular genome. We're talking thousands of dollars, not to mention having to maintain all these different assays in the clinical lab, drawing down on precious sample. There's a huge host of workflow efficiencies. If you could knock all that out in one assay, that's a tremendous benefit. Benefit for the patient, benefit for the laboratory. That was the clinical story. I wanna shift now to the research space.
If we think about what do research studies look like in human genomics? In general, I would say the last decade or so has been marked by investment in large-scale cohort studies doing statistical analysis. You've got a group of patients that suffer from a common phenotype or have a common phenotype, and you have a group of folks that ostensibly don't have that phenotype, and looking for statistical differences in one group's genomic profile versus another's. As we think about the future of these studies, the future of biomarker discovery, what does that look like? Well, if you'll bear with me, I have a few postulates up here that I wanna run through. The first one is that mutations of all variant classes underlie complex phenotypes. Complex phenotypes could be autism.
Complex phenotypes could be response to a particular therapeutic. Power to detect associations, because this is a statistical experiment here, is a function of two things. One is your ability to even detect that variant in the first place. You can't detect statistical association if you can't even measure the variant. Number two is sample size. Power is driven by sample size. What we did here is put forth for you all today a thought experiment. Our researcher has a million-dollar budget for consumables, and that researcher needs to make a decision about what technology are they going to use. They're gonna seek to use a technology that delivers the maximum power for discovery. Hypothetically, I'm an autism researcher, for example. I got my million-dollar grant. What am I gonna do with that?
How do I set myself up for success to make that next big discovery that unlocks new biology, that unlocks the missing heritability of autism, in this example? Well, I could choose short-read whole genome sequencing at $200 a pop. That seems like a great deal. I'm gonna go with that. Okay, with my $1 million budget, I can sequence 5,000 samples. I'm gonna have reasonably good power, very good power to detect associations and SNPs and indels, 'cause that's what my technology delivers. I'll also see some structural variation. That'll be helpful. Hmm, but can I get more? Do I see dark regions? No. So can I detect associations there? No. Missing, you know, close to 10% of the genome right off the bat, and then methylation, absolutely blind. Well, I could do PacBio HiFi at 30x.
For my budget at about $1,000 a pop, that's 1,000 samples. Well, this is gonna be challenging because 1,000 samples, my ability to detect association is a function of seeing the variant with confidence and the number of samples in my study. Because I've had to reduce my samples, my power to detect across all these variations, well, across SNPs and indels has come down. Structural variation is actually a little bit better because you can see those much better on a long-read platform than you can on a short-read platform. Now I start to get some information about dark regions and some information about methylation. Still, I don't know, I don't know. Am I short-changing myself?
Well, there's a middle ground that actually delivers more information content than either of those two approaches, and that's looking at a lower depth HiFi genome. In this example, we just used a 10x depth genome. It's a nice round number, and it equates to about $300-ish per sample. In this model, that gives you about 3,000 samples. What we see is now you've got information content across the four classes of variation that we're looking at here. You've been able to boost your sample size and thereby increase power. You've increased power in SNPs and indels, but you've increased power significantly across the other classes of variation, which in the first example, on a short-read genome, you would have zero power to explore.
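The budget side of this thought experiment is simple division, and it can be laid out explicitly. The Python sketch below uses only the per-sample prices and the $1 million budget quoted in the talk. The variant-class flags are a qualitative restatement of what the speaker says each design can see, not a statistical power calculation, and every name in the code is an illustrative assumption.

```python
# Sample counts for the $1M thought experiment, using the per-sample prices
# quoted in the talk. The "sees" flags restate the speaker's qualitative
# claims; they are NOT a power calculation. All names are hypothetical.

BUDGET_USD = 1_000_000

# Per-sample consumable cost (USD) and the variant classes each design
# can observe at all, as described in the talk.
designs = {
    "short-read WGS": {
        "cost": 200,
        "sees": {"snv_indel": True, "structural": True,
                 "dark_regions": False, "methylation": False},
    },
    "HiFi 30x": {
        "cost": 1_000,
        "sees": {"snv_indel": True, "structural": True,
                 "dark_regions": True, "methylation": True},
    },
    "HiFi 10x": {
        "cost": 300,  # "about $300-ish" in the talk
        "sees": {"snv_indel": True, "structural": True,
                 "dark_regions": True, "methylation": True},
    },
}

# Whole samples affordable under the budget for each design.
samples = {name: BUDGET_USD // d["cost"] for name, d in designs.items()}

# Number of the four variant classes each design can observe at all.
classes_seen = {name: sum(d["sees"].values()) for name, d in designs.items()}

for name in designs:
    print(f"{name:>15}: {samples[name]:>5,} samples, "
          f"{classes_seen[name]} of 4 variant classes observable")
```

At exactly $300 per sample the budget covers 3,333 genomes, which the talk rounds to about 3,000. The sketch shows the trade-off the speaker describes: the 10x HiFi design sits between the other two on sample count while observing all four variant classes.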
We think this is gonna be incredibly compelling for those research studies that are looking to really understand complex phenotypes. If you are a researcher and 100% convinced that the story of understanding the biology that you're interested in totally and uniquely only sits in SNPs and indels, great. Good for you. There's a tremendous amount of biology. Going back to my first statement, these mutations exist in the population, and whether it's a SNP or a ten KB deletion or a methylation signature, it's all contributing to biology, and it's up to this community to figure out what it's really doing so that we can use it to understand, to help better human health, and that's why we're all here.
That's the whole genome story, and I appreciate you bearing with me while I tell it because it's a good one, it's rich, and it is going to really just change the game in terms of what we know about the role of DNA variation and epigenetics in the context of human genomics and the context of many different use cases going forward. We're going to abruptly change gears now and talk about RNA. RNA is a large market as well. There's a theme here. These are all large markets.
For 2022, we estimated it at about $1.5 billion in this space, in the human genomics and oncology context, growing at a pretty healthy clip, somewhere between 15%-20% over this time horizon, bringing us to somewhere around $2.5 billion-$3 billion by 2026. What's driving this growth? Well, there again, there's four things really driving this growth. The first one is the expansion, the continued expansion of single-cell technologies. Now as those technologies tend to push into the spatial concept, that will also fuel growth. The second is a funding trend. Government funding agencies are showing an uptick in spending on multi-omics studies. Once you've done the DNA, the next obvious layer to put on in your multi-omics study is an RNA approach.
The third, and this is specifically talking about where PacBio is coming into the market, is that long-read approaches and really moving from short-read, kind of, gene expression counting, which is RNA-seq, to getting that full complexity of isoforms, it is a totally different beast. It is just exponentially more information and gonna catalyze investment over this time horizon. Then the fourth one is transition of some of these RNA-based findings, some of these discoveries into more routine clinical use. When I talk about RNA, I basically say, look, the future of RNA is isoforms, full-length isoforms. Full stop. Period. Drop the mic.
Why, when there's so much inherent biology in how genes are spliced and show up in different cells and tissues, would you keep using such a blunt approach to not see that exquisite complexity with short-read sequencing? Moving to long-read sequencing to me is just a no-brainer here. What we've seen with some of the earlier studies that look at Iso-Seq, which is doing full-length isoform detection on bulk tissues, is that you can discover about 2.5x more novel transcripts than you can with short-read sequencing. Moving to the single-cell space and the MAS-Seq protocol, which is concatenating these isoform species so that you can leverage the power of long-read sequencing, that delivers 30-fold more discovery power than short-read sequencing.
I think that's nicely illustrated by this graphic here, which was taken from the Alkhafaji paper from late last year that looks at CD45 cells and is looking at different species of CD45 cells by their isoform information. On the top, with short-read sequencing, you get some differentiation, some exploring of the different cell types. The resolution and the added information content that comes through when looking at that with long reads on the single-cell level, and that's Mark Russer, the rainbow view of the same population of CD45 cells, is just incredible.
As the research community, as the world starts to truly explore the incredible biology of cell type to cell type, what constitutes different tissues and needs to really understand that at the transcriptional level, this is the only way to go. Layer that on with some of the error rates that are being published. Researchers are trying to leverage short-read sequencing to do the same kind of work. But also from this Alkhafaji paper, the error rate in reconstructing full-length isoforms from short-read sequencing is about 43%. 43% is awfully close to 50% error rate, and last time I checked, that's pretty close to a coin flip. That's really questionable to me. If you look at it on HiFi, though, it's 0.4%.
Two orders of magnitude reduction in the error rate as you go from short read to just actually directly measuring the isoform off of a long-read platform. Then looking at the price, we have sort of a model price worked out here at the bottom. You'll notice that short-read sequencing is a few hundred dollars less expensive, but really they are very close, in the same ballpark. And with the added information content and just the exquisite amount of new insights that are gonna come out of looking at all these different isoforms, it is a reasonably competitive point that we're at there.
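The "two orders of magnitude" remark can be checked directly from the two error rates quoted (about 43% for short-read isoform reconstruction versus 0.4% on HiFi). This is only an arithmetic sketch using the numbers as stated in the talk; the variable names are illustrative.

```python
import math

# Error rates as quoted in the talk for full-length isoform reconstruction.
short_read_error = 0.43   # about 43%
hifi_error = 0.004        # about 0.4%

fold_reduction = short_read_error / hifi_error
orders_of_magnitude = math.log10(fold_reduction)

print(f"fold reduction: about {fold_reduction:.0f}x")
print(f"orders of magnitude: about {orders_of_magnitude:.1f}")
```

A ratio a little over 100 corresponds to roughly two orders of magnitude, which matches the claim.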
Sums it up in this last quote from a customer, who's a core lab director and a professor here in New York City, and basically said, "Look, like, if you can get isoform sequences for about the same price," which is exactly what we're talking about here, "why in the world would anyone still use short reads?" Which I think sums it up perfectly, and I'm very, very excited to see what this field will bring, and what PacBio will do to catalyze this field over the next few years. Now, it's not just about research, and so the second slide that I wanted to share about RNA is starting to look towards the clinic. This is a study by Veiga et al. from earlier this year, and they used long-read sequencing.
In this experiment, it was bulk tissue to study a number of different breast cancer samples. What they found in this paper, and it is a fantastic paper that really shows, I think, the unknowns of what transcripts are and what splicing does. Basically, they looked at a number of breast cancer samples, and they had a few findings that I think are interesting here. Number one is they looked at HER2. I think most of you have heard of the gene HER2. It's incredibly important in breast cancer. It's been studied for probably decades now. In this study, they found that about 70% or two-thirds of the isoforms that they found with long-read sequencing were brand-new, not in the catalogs, never seen before.
What that does is just sort of a data point, shows that there's a tremendous amount of biology to be uncovered, even for genes that have been studied ad nauseam, we thought. Now there's new things to explore. The second piece to this is that they started to look at maybe the clinical impact of it. They started to look at, okay, we've got these different splicing isoforms. Do they correlate to anything, or is it just a kind of interesting research finding? In fact, they did find that a signature of about 35 of these isoforms were correlated with survival statistics. It starts to connect the dots and lay the breadcrumbs for how these findings can start to be used in the clinic down the road.
Then the last piece I'll pull from this paper, just to whet your appetite a little bit, is they start to talk about how isoform sequencing is going to be critical for them to really understand cell surface proteins and which isoforms are potentially being expressed on the surface of those cancer cells. For those of you that cover oncology or have been in this space for a while, you'll know that immunotherapies, and being able to design targets towards those mutated proteins, are a critical, very important and, at this point, very successful therapeutic class that continues to be invested in and explored. We're excited to see what role full-length isoform sequencing can play in the therapeutic space as well. The last driver of growth for us is targeted sequencing.
Targeted sequencing also is a large market, about $4 billion today, growing to somewhere between $6 billion and $8 billion in the 2026 time horizon. Targeted sequencing is a very big bucket. It's being used in many, many different contexts and different forms and flavors all over the world. The drivers are really just expanded use cases, expanded penetration into different customer types. As we think about the oncology piece of this, it's really exploding into those needle-in-a-haystack applications like liquid biopsy. We'll talk through some of this. Our story, excuse me, our story of growth in targeted sequencing is in two flavors. One is in the germline panels. We see targeted sequencing, quite frankly, as a stepping stone to genomes.
It's a reasonably low cost, a bite-size way to start to leverage HiFi data and get onto our platform, get familiar with the data, and deliver the insights that are most important to individual labs. On this slide, I have a few examples of that. On the left-hand side, perhaps the most researchy, and it's dark regions. We partnered with Twist, and Dave spoke to this a little bit. We partnered with Twist over the past year or so to develop panels, hybrid capture panels for our technology, for our long-read technology. One of the first products to come off the shelf of that is what we're calling the dark regions panel. We all know there's parts of the genome that short-read genomes just can't touch because they're too complex and whatever data you do get is suspect.
Now, HiFi can read right through that, no problem. For all of the samples that have been run on short-read genomes, for some of these studies where they've amassed, you know, tons of exomes or tons of genomes, and really all they want is the dark regions, we provide a way for them to augment their studies, to add on to their data banks the dark regions, specifically on our platform. We think that will be a really exciting top-up approach that some researchers will embrace. The second one is single gene disorders, and this really falls also with carrier screening.
There are some incredibly important genes out there, things like the thalassemia genes or SMA or GBA, all of which have very clear clinical roles to play and have sort of been the bane of existence in trying to get them to work on short-read sequencing. They're very complex genes with pseudogenes and repeat regions, and they're very challenging. Clinical labs have had to use other technologies to go after them. Now they can do it all on HiFi, and we see some labs trying to adopt targeted panels. Most notably, Berry Genomics, who's developing assays for thalassemia for the Southeast Asian population on our platform, is having tremendous success there. Then, of course, pharmacogenomics has been important and growing and also a very challenging class of genes.
Now, lastly, I'll wrap up with Onso. So as we think about our targeted sequencing story, a big component of that is targeted assays on our short-read platform. I won't go through this graph in detail, but basically suffice it to say this is a patient's experience with cancer, so a hypothetical patient, and their experience with cancer. From the growth of the tumor, that's the blue line, and the relative proportion of ctDNA, tumor DNA in their bloodstream, changing over the course of their therapeutic experience. After surgery, tumor load drops, your cell-free DNA goes down very low. There's always a question, is the cancer gone or is it still there? Did the surgeon get it all? Is the therapy working?
You really don't know until the tumor has now come back up, and is high enough, the tumor load is high enough that you can start to see it with SBS. If we change the paradigm and actually lower that accuracy floor, now we can see tumor molecules potentially much earlier, much more sensitively and specifically at different stages in this hypothetical cancer patient's life, and that's gonna open up especially monitoring and MRD applications. If you think about monitoring and MRD, you are measuring time points over that patient for multiple sequential measurements. From a market size perspective, it's much bigger than using liquid biopsy for therapy selection, and it is a fast-growing market with a tremendous amount of investment.
We are very excited to see what this technology can do in the hands of laboratories, clinical laboratories and research laboratories around the world to really explore, are there new ways to detect cancer sooner? Now, I'm out of time. They're gonna give me the hook, I think. I'll just wrap up with some of the other markets. We didn't have time to talk about them today. Each one of them is incredibly exciting, and I could spend another hour on each one of these, but I'll spare you that. They're growing. They're large markets. HiFi, Revio and Onso are delivering competitive advantages for different applications that are relevant in each one of these markets. I'll leave you with this. This is the same slide that Mark showed. We've talked about our markets. We've talked about the segments.
We've talked about some key applications and the applications where PacBio is playing to win and delivering tremendous value to the customer. The question mark is exactly that. What is gonna come out of this? These technologies are exploring biology at a level that's never been seen before. What we've all experienced from these technology step changes is that those discoveries then catalyze new markets. I couldn't tell you what that is. I have some suspicions, but we don't know what that is, but we are very excited to see what it will be. With that, I am going to end and leave it there, and we are going into a Q&A session. I think it's a 25-minute Q&A session. I'd like to call Christian, Dave, and Mark up to the stage with me. Yeah.
All right. Hello? Yeah, a 25-minute Q&A session and then, a much-needed break, I promise, right after that. I'll hand the mic over. Just raise your hand, and I'll run you the mic. Just state your name and your firm, and we'll go from there.
Great. Thanks. Dan Brennan from Cowen. Thanks for the presentations, obviously. I think the expansion of the market is something that resonates today, obviously, in terms of all the new applications and kind of customers you could approach. Can you just give us a sense maybe of, say, the existing number of customers that are on the Sequel II today? And when we think about all these new kind of opportunities, I'm sure you've done a lot of the math already, but, like, how much does this open up from, like, a new customer standpoint? Like, you know, when you think about, say, 2023 and 2024, you've given the kinda three-year or five-year growth rates. Like, how many new labs, how many new customers do you expect will be potential for the Revio?
You know, I'm sure we'll learn more from Susan later in the presentation, but when you think about the capacity to serve those customers, kind of, you know, where does that stand today? Thanks.
Sure. Well, first of all, I wanna thank everyone for hanging in with us. I know that first session was extensive. I also wanna thank Jen for standing up here for 45 minutes and then getting into questions. Dan, to answer your question, you know, today we have roughly 400 or so Sequel IIe customers, give or take. I think Jeff, in his presentation, will talk a little bit more about that. When you think about Revio, it really opens up the entirety of the market, and so we would expect to see hundreds and hundreds of new customers come to us with respect to the value that's created through Revio.
You know, having the capabilities to deal with more than double the customers we have today over this time horizon, you know, we've already built a lot of that infrastructure and put it in place, and it's really, as I said in my remarks, all about getting the leverage out of the commercial organization, out of the operations group. Mike, Michael Goloubef, will tell you about how we're scaled in operations before the end so that you can get a picture of that. But at the end of the day, you know, Revio, we fully expect to generate hundreds of new customers. Jeff will give you some statistics just from our launch event, which will give you some confidence that that's in fact the case.
Dan, I think it's not just the numbers of new customers. Yes, I do think Christian's right. There's gonna be a lot of new customers. I think we're gonna get them because of the exposure that they'll have to HiFi data. If you think about our install base, you know, and in comparison to Revio, you know, with 15x the output, we only need 30 or 40 systems out there to generate the world's current capacity for HiFi. The access to this and the utility of the data is gonna really give people that opportunity to understand what they would get from being a part of our ecosystem.
I think that's a really good point, Mark. You know, this is the inflection point because now what happens is, as more and more HiFi data gets out into the world, the flywheel starts, and people start to see the power of HiFi and the applications. Oh, and by the way, it's affordable for me, where I can actually take those projects on. We've seen the same thing, albeit it was when I had a lot less gray hair. You know, as we dramatically scaled short-read sequencing platforms, we saw that massive adoption occur because that flywheel of data was generated. This is the same moment in time.
In fact, we have some advantages now because we have a full suite of applications that we can take to the market, so we can make that flywheel move even faster and grow with all of the samples coming into HiFi.
Kyle Mikson with Canaccord Genuity. Thanks, guys, for today. Just wanted to say, it's been remarkable to see the transformation of the company the past few years, Christian, so congratulations on that. It's been great. I guess I'll just ask one here. I guess, you know, everyone kind of wonders what the giant killer is gonna be, right? I think Revio could be that. You know, we have a short-read incumbent out there. Is it gonna be Revio, or do you think it has to be that next sequencer, the next kind of like ultra-throughput platform? Then kind of jumping off that, how much lower on cost and higher in throughput could this next sequencer be? You know, be meeting agreement, I guess.
You know, I guess Revio is like a, you know, 2.5-fold improvement in cost from the 8M chip. I mean, can we see another type of improvement again?
You know, I'll pass that to my head of R&D right here, Mark. You wanna take that one?
Yeah. I think Revio is the platform that's gonna be wildly successful for us. When you think about that ultra-high throughput platform, you know, it is another order of magnitude. We've been talking about, you know, how do we get a sequencer that can do tens of thousands of genomes a year, not thousands of genomes a year? Think of that as another order of magnitude of output gains. Moving to that 300 millimeter wafer for that is a really important part for the COGS, which will give us that ability to price the genomes where we need to. Now, I will say from the last three weeks, we're not getting pushback on a $1,000 long read genome, right?
No one's coming to me saying, "Oh, it's still just way too expensive." They're like, "Wow," like, "that's really enabling, and that's what I need." What I would say is, you know, we'll price it at that point in time when we're ready to, but, you know, if we're not getting a lot of pushback and the value's there, and we can demonstrate the clinical utility of a long read genome, then we're probably close to the right price. It's just at that point, it's just how do we bring in more people with even higher scale to make it a more routine application?
I think part of that is-
It could be more margin, right? We have to decide at what point do we share those cost benefits and those, you know, those gains with the customer versus maintaining some margin. I think we have to strike that right balance.
That's the most important component of innovation is, not only is it enablement, it's also figuring out how do you share the benefit of that innovation with your customers to drive elasticity of samples in the market. If you can get into the fortunate position where you have innovation that allows you to have flexibility in pricing, then you actually have much more control over how fast the market can expand, and how much your gross margin can expand. Which, you know, quite frankly, we're playing for both.
Hey, guys, Tejas Savant from Morgan Stanley. Christian, maybe one for you, just following up on that earlier question. You know, you've typically talked about sort of double throughput and half the cost on an annual basis in the past. Given the step function improvements on the Revio, should we sort of think of the specs as essentially locked down for the next couple of years, and then that gets you to sort of line of sight to the ultra-high throughput platform? Is that the right way to think about it?
I wouldn't think about it that way, Tejas. I would think about it as, we're at the beginning of our journey, even with Revio. The specs that we've created and the product, the way it's performing now, is nothing short of remarkable relative to where we've been. It's still the beginning of that journey, on the platform. We have lots of opportunity to continue to improve throughput. You know, we're continuing to focus on driving accuracy up, continuing to focus on driving runtimes down even further. This is a combination of chemistry and software, principally, so we're not talking about major hardware changes to the Revio platform to get those benefits.
What you could imagine is where we get the product out and make it robust in the market, and then over time, continue to improve and provide more value to our customers. Matter of fact, we have. The truth is, we already have active programs ongoing to take us, you know, to Revio 1.1 and 1.5 and 2.0 or whatever, however you wanna think about it. At the right time, actually, before I go there, Revio itself will be a very long-lived product because we will be continuing to improve it. What Mark's talking about and what we're talking about is actually going even beyond Revio with another type of product that delivers ultra-high throughput for those customers that absolutely need it.
Another product in the portfolio that has low throughput. By the time we get through this time horizon of 2026, we have multiple products in the sequencing market, meeting the customers where they are with respect to capital, with respect to throughput requirements, all with increasing flexibility to manage our pricing such that we can optimize gross margin, with growth and develop the company that we all believe we can do.
The labs need to get used to running thousands of long-read genomes a year, right? They haven't had that opportunity up until now. You want them to take a couple years to scale and grow to get ready for something that's even higher throughput. We've got Revio there to help them scale, to take on all of the infrastructure to do thousands of genomes a year. You know, on the Onso side, I think that's where your question was. The optics, the design of Onso will enable higher output flow cells. That's not just this one flow cell. Expect higher density flow cells to follow on for Onso, because we've engineered it that way.
What I'm talking about for getting into high throughput for short reads is, at some point, you want billions of reads per flow cell, not just 500,000 or 500 million to 1 billion. The higher throughput there would be a different technology in the future to really enable much higher throughput levels of SBB chemistry.
Of course, as you pointed out, you know, billions of Onso reads are very different than billions of other SBS reads in terms of what it really means, what you can do with the technology.
Got it. Just one quick follow-up. I wanna talk about sort of the relative delta versus Illumina, right? You've gone from about 80 whole genomes to 1,300. They've gone from about 8,000 to 20,000 on their NovaSeq X Plus. The delta, you know, leaving aside the content and the methylation and all of the other advantages, has shrunk from about, you know, 100x on throughput to about 15x today. To Jen's point, potentially even 10x if you find users who are willing to do a 15x sort of long-read genome. Can you talk a little bit about how big that market segment could be?
Is there an element of, you know, customer education that's required, or do you think, people are already there in terms of, viewing a 15x long-read genome as an effective compromise that gets that throughput delta to, you know, within line of sight to where Illumina is?
Yeah. I think what we're seeing is that scientists and researchers do understand the differences in different coverage models for the experiments that they're trying to do. When we're looking at whole genome population scale research, you know, people are very comfortable down at 8x-10x. When you're looking at, you know, oncology samples, perhaps 30x isn't even enough. We have spent a lot of time in 2022 focusing on calibrating the world about what a PacBio genome is versus another kind of genome, and looking at that fold coverage, because it is different.
In fact, you know, at 15x, you know, now you're at, if you just run the math straight up of what we talked about, 2,600 genomes a year, which, you know, let's face it, in the vast majority of labs, that's a very nice number to be at. Jen, do you wanna add anything to that?
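The "run the math straight up" step can be sketched as follows. It assumes that annual genome output scales inversely with coverage depth, starting from the 1,300 genomes per year at 30x quoted for Revio, and it ignores any fixed per-run overhead. The function name and the linear-scaling assumption are illustrative, not a published specification.

```python
def genomes_per_year(coverage: float,
                     base_genomes: float = 1300,
                     base_coverage: float = 30) -> float:
    """Scale annual genome output inversely with coverage depth.

    Assumes a fixed yearly sequencing yield, so halving the coverage
    doubles the number of genomes.
    """
    return base_genomes * base_coverage / coverage


at_30x = genomes_per_year(30)
at_15x = genomes_per_year(15)
at_10x = genomes_per_year(10)

# Throughput gap versus the 20,000 genomes per year cited in the question.
gap_at_30x = 20000 / at_30x
gap_at_15x = 20000 / at_15x

print(f"30x: {at_30x:.0f}  15x: {at_15x:.0f}  10x: {at_10x:.0f} genomes/year")
print(f"gap: about {gap_at_30x:.0f}x at 30x, about {gap_at_15x:.0f}x at 15x")
```

With these inputs, 15x coverage gives the 2,600 genomes a year mentioned in the answer, and the gap to the cited short-read figure narrows from roughly 15x to under 10x.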
I think you summed it up nicely. I mean, really it depends on what that particular individual is trying to accomplish. Are they doing a large research study, so they're comparing different groups of individuals against each other to understand associations? That's one use case that we talked about. The other one is in a truly clinical setting, where you need to have extreme confidence that you're covering everything, that you're getting the fullest picture of what's going on in that particular individual's genome. That's a very different use case. You know, what we tried to paint today with some of the vignettes is the tunability of a HiFi genome, depending on what that use case is, what the cost constraints are, et cetera.
Fundamentally, no matter what coverage you're at, you're getting significantly more information across all the classes of variation than you ever possibly could with a short read alternative. I think that's really the take home message.
That's great. That's very helpful.
Thank you very much. I'm David Westenberg from Piper Sandler. Two questions I'll just ask upfront. First one is on the synergies between the short read and the long read platform. I definitely, you know, understand that the institutions do tend to be the same with the long and the short read platforms, but I do find that the users tend to be a little bit different. Can you talk about the synergies in having both platforms, and if it's maybe a function of convergence of projects that might be developing over the next few years that's driving it, or if there's some sort of operational thing that you're gonna be providing in order to really capture that synergy of long read, short read. Then my second question is for Jennifer.
I think that was really interesting how you talked about how, no matter how big the size of the project, you can give them a $1,000 genome on that. Can you talk about the mechanics of that? 'Cause usually when I think of, I think the user has to buy, you know, on Illumina, it's a flow cell, and whether you use it or not, you burned it. Can you talk about the pricing model and how that lets you do that? Thank you.
Yeah, let's start with Dave. Dave, can you give some perspective on how people are gonna use these technologies and the synergy between the two platforms?
Yeah, look, I think one of the most exciting things we're seeing, and Mark put the slide up, is that across all these different applications and markets, there is a place for both. When you look at some of these researchers, we're seeing folks that wanna go out, do the discovery in long read, and then maybe you're gonna follow it up in a different way with the short read. I think the oncology example's a really great one. You can get that deep, complete view of what is going on in the cancer. Maybe you're able to get a primary resection or a tissue sample. That makes sense for the long read.
Then as you follow that person up, now maybe you switch across to, you know, liquid biopsies, or you've got an FFPE sample you wanna go and look at where the DNA is degraded. There is always gonna be that sort of synergy of how you bring these two together based on the biology that you're looking at. I think you're right, though, especially as we look at sort of our traditional customers around core labs and even service providers, that you tend to have customers that gravitate one way or the other. Importantly, they wanna have both technologies, and that's what we really saw coming out of ASHG, was a number of folks walking up and saying, "Well, look, the CapEx market is opening up next year, and people are looking at new technologies that are emerging.
With you guys, I can buy both, get that, you know, fantastic experience working with one company, get both products with high accuracy and really understand how to bring them together. That's something we're really interested in." In terms of the longer term view, look, I think the first thing is around the ecosystem and how we build out those partnerships so that, you know, if we're using, you know, Hamilton Automation or Miroculus or whoever it might be, that it works on both platforms. That's really how we started thinking about the ecosystem, is can they serve both the short and the long technologies that we have? Over time, though, as Christian said, we really wanna build out the capabilities to merge those data sets and bring them together so we can get the deepest insights into the biology.
Jen, thinking about the flexibility and the economics around why we get $1,000 or less every time.
Yeah, absolutely. It's pretty straightforward. I mean, the individual SMRT cell, the 25M SMRT cell is priced at $995. That's the price of the SMRT cell. Now that SMRT cell is gonna deliver you enough data to do a whole human genome. It's $1,000 for the genome.
At least at 30x.
At least at 30x, exactly. When we think about Revio, it's been designed with these four independent stages, in essence. Whether you run one SMRT cell, that's $1,000, or you run four and get four genomes, each one of those is $1,000. It's sort of independently linked there, or it's not linked, it's independent. Whereas on Illumina or other short read technologies, I should say, you have to batch samples. In order to get those better economies of scale, you have to put multiple samples on an individual flow cell. That yokes them together and, as I was saying in the example, forces the laboratory sometimes to have to choose whether they try to strive for the best economics by batching as many samples as they possibly can on that flow cell or, in some cases, move forward with the samples that they have to get a more expedient answer.
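The contrast between per-cell pricing and batched flow cells can be illustrated with a small sketch. The $995 SMRT cell price and the one-genome-per-cell yield come from the answer above. The batched flow cell price and capacity below are purely hypothetical placeholders chosen for illustration and are not figures for any real product.

```python
SMRT_CELL_PRICE = 995  # 25M SMRT cell price quoted in the talk, one genome per cell


def hifi_cost_per_genome(genomes_run: int) -> float:
    """Each genome uses its own SMRT cell, so cost per genome does not depend on batch size."""
    return SMRT_CELL_PRICE * genomes_run / genomes_run


def batched_cost_per_sample(flow_cell_price: float, samples_loaded: int) -> float:
    """A shared flow cell is paid for in full however many samples are loaded."""
    return flow_cell_price / samples_loaded


# HYPOTHETICAL values, for illustration only.
HYPOTHETICAL_FLOW_CELL_PRICE = 8000
HYPOTHETICAL_FLOW_CELL_CAPACITY = 16

for n in (1, 4):
    print(f"HiFi, {n} genome(s): ${hifi_cost_per_genome(n):.0f} each")

for n in (HYPOTHETICAL_FLOW_CELL_CAPACITY, 8, 1):
    cost = batched_cost_per_sample(HYPOTHETICAL_FLOW_CELL_PRICE, n)
    print(f"batched flow cell, {n} sample(s): ${cost:.0f} each")
```

The point of the sketch is the shape of the curves rather than the specific dollar amounts: the per-cell model stays flat, while the batched model only reaches its best price when the flow cell is full.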
Hi, Julia Qin from JP Morgan. Thanks for taking the question. I want to start with the broader strategy, Christian. You highlighted that, you know, obviously, with the Revio order of magnitude improvement in throughput and the ultra-high throughput coming, there's supposedly a lot of price elasticity to be expected. At the same time, you're also expecting the short read market to grow at a much higher rate, even off of a larger base. In light of that, how would you think about the relative prioritization between the long read and the short read platforms? And what kind of considerations go into your decision in terms of resource allocation between the two platforms?
Thank you for the question. That's a critical question because it's part of the way we're driving to our path to cash flow breakeven and positive cash flows. The way I think about it is that both Revio and Onso, short read and long read, are both highly differentiated, and so we're gonna keep fully investing in both platforms. We think our fastest growth opportunity is Revio in the short term, because the reality is there's no product like it. There's never been a product like it. With that product, we think we'll be able to achieve significant market penetration, take market share from existing short read projects and move them over to Revio. That has to be, you know, kind of our number one priority.
1A really is Onso, and I really do mean it in the 1A sense. This is because we do see Revio attacking a large market and attacking it with a highly differentiated product that's perfectly suited for the market we're going after. That's the key. That's actually the fundamental thing and fundamental takeaway, is that we're focusing Onso on specific markets. We don't want to make Onso the whole genome sequencer, for example. It will do whole genome sequencing. It will do it for under $1,000, but that's not the point. The point is, by having this broad portfolio, that we can put the best-of-breed technology into the hands of our customers to help them answer their biologically relevant questions that they want answered.
By having that capability, we're able to, you know, really push Revio into the broad markets and push Onso into the market that, you know, that Revio doesn't serve as well. Now we cover the complete market with best-of-breed products. You'll see us in 2023, of course, get both products to the market and ship those, and you'll see us through in this time horizon, continue to improve Revio and to continue to improve scale of Onso, probably in that order. But quite frankly, you know, a lot of the development programs we have, basically all of them, are in parallel at some level. We'll continue. Because through this time horizon through 2026, right, it's all about accelerating.
It's all about multiple products into the market, best of breed, leveraging our commercial infrastructure, driving consumable pull-through, which drives mix, which drives cross-margin expansion, which allows us to, you know, with expense discipline, of course, drive leverage over our whole P&L in order to get to cash flow positive. I know I said a lot there, but does that make sense? Okay, great.
The only thing, if I could add, is, you know, getting 5%-10% of the short read market is great, and it's meaningful for us, but the goal is to make sure that we're the dominant long read player as well. How do we make sure that we have majority market share in long reads? That's where taking Revio out fast and driving success of Revio is so important for us commercially.
Yeah. That's a good point.
Great. A follow-up. I appreciate the cost per sample comparisons you guys showed, which are very compelling. I'm curious, on the instrument pricing side, how do customers factor in the upfront instrument CapEx, especially given that some of your competitors are selling, you know, a no-instrument-CapEx or low-instrument-CapEx proposition? Do you perceive, you know, different receptions to the instrument CapEx requirement among your different target customer segments? Thank you.
Dave, you wanna take a stab at this one?
Yeah. I'll take a shot. Look, I think what we've come out with has been incredibly exciting and people are really interested. I think it's not so much about that upfront capital cost that we're being compared to, it's the hidden costs that are coming later. When you think about some of those platforms that give you very low cost initial capital instrument, you have to follow that up with huge amounts of compute, data processing, storage, et cetera. Quickly, once you start looking at it, those costs rise, and that's why I think we see a lot of people not really scale up on those technologies.
When you compare that back to us, when you really look at what you're getting and what you're able to do, generating 1,300 HiFi genomes with methylation every year, all that compute built in, it really ends up being, when you amortize the costs out across a fully loaded genome, highly comparable, if not better, on Revio. I think as we start to have those conversations, people are becoming acutely aware of all those additional costs and why this is such a powerful platform and is priced so aggressively in the market, I would say, for what you get.
I think that if you compare that to some of the other short read technologies that are coming out, you know, and again, using a slide that Jen put together that unfortunately we didn't have time to go into today, if you think about a hypothetical $1 million project budget, if you're spending $985K on a short read sequencer or a high throughput sequencer, that leaves you $15K for the actual sequencing. It's like one run. If you look at what we're doing here at $779K, that's leaving you over $200K to really go in and do 200 genomes or more.
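The budget comparison above is simple subtraction, and a short sketch makes it easy to check. The instrument prices are the figures quoted in the remarks. The per-genome cost of $1,000 is an assumption, chosen only because it is consistent with "200 genomes or more", and is not a quoted price.

```python
# Illustrative arithmetic for the hypothetical $1 million project budget.
# Instrument prices come from the remarks; the per-genome cost is an
# ASSUMPTION used only to turn leftover dollars into a genome count.

PROJECT_BUDGET = 1_000_000
ASSUMED_COST_PER_GENOME = 1_000  # assumption, not a quoted price


def sequencing_budget_left(project_budget: int, instrument_price: int) -> int:
    """Dollars left for sequencing after buying the instrument."""
    return project_budget - instrument_price


def genomes_affordable(remaining: int, cost_per_genome: int) -> int:
    """Whole genomes that fit in the remaining budget."""
    return remaining // cost_per_genome


for label, price in [("high-throughput short read", 985_000), ("Revio", 779_000)]:
    left = sequencing_budget_left(PROJECT_BUDGET, price)
    genomes = genomes_affordable(left, ASSUMED_COST_PER_GENOME)
    print(f"{label}: ${left:,} left, about {genomes} genomes at the assumed cost")
```

Under these assumptions the $985K instrument leaves $15,000 and the $779K instrument leaves $221,000, which matches the "over $200K" in the remarks.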
As customers start looking at it, both on the low or free CapEx on the long read side and the additional costs, it doesn't end up making as much sense as you may initially think. On the short read side, that higher CapEx price is really limiting what you can do to the point that you get so much more out of these genomes. Again, it makes sense to really look at Revio as the game-changing option for whole genomes, especially.
Yeah. I mean, I think even to shrink it, let's face it, in life, you know, the truth is you get what you pay for. The reliability of Revio, the history of PacBio delivering, the consistency from run to run, you know, later Jeff will show how well our instruments actually perform in the field every single day, how we've put all the compute on board, as Dave pointed out. There are a lot more hidden costs in those so-called low CapEx or, you know, free CapEx instruments that aren't fully appreciated until the customer gets into it. I think with Revio, we now have the scale and the cost and the credibility, I believe, to really, you know, be successful against that argument.
Great. Well, I think that's an excellent segue into the break. So we'll have about 20 minutes or so, stretch your legs, and we'll be back with more presentations, and there will be another Q&A session later as well. Thank you.
Thank you.
Thanks.
Thank you.
We'll be starting back up momentarily. Can you please find your seats and silence your devices? Thank you.
I'm Nathan Hammond from Stanford Health Care. We have begun looking at PacBio as a possible basis for our clinical testing in the future. We're very excited about what long reads can do for illuminating structural variation. That's very hard to call with short reads. We were concerned at first that while we might be getting better performance on structural variant calls, we might suffer a little bit in our small variant calling, which is a very important part of our clinical service. What I've been presenting on here at ASHG is that when we looked at the performance of small variant calls, actually PacBio has really closed the gap, and they look every bit as good as their competitors on the small variant calling.
We can use a single source of sequencing, get long read data and do small variant calls as well as structural variants together.
Please welcome to the stage Jeff Eidel, Chief Commercial Officer.
Hi, everyone. Good morning. I'm Jeff Eidel, PacBio's Chief Commercial Officer, and although I've been in the genomics space for about 17 years, both as a long-time employee of a short read sequencing company as well as a customer of that sequencing company, I'm the new person on the block here at PacBio, and I'm absolutely thrilled to be on the team. I've been with the company now for just about three months, and it's been an amazing ride. I came to PacBio because I love our mission. I enjoy the challenge of scaling the company, as Mark and Christian talked about scaling from $100 million up to $1 billion and beyond. I love our strategy of providing customers with choice. Frankly, choice that customers haven't had in too long. Our products are gonna give them that choice, as the only company to be able to offer native long read and short read sequencing platforms.
I'm thrilled about this opportunity and to bring these products to all of our customers around the world. Today I'm gonna talk about three main things. Christian did a great job of going through the history of PacBio and reiterating the strategy of the company over the last several years. Mark and Dave talked about the revolutionary capabilities of our new technologies, and Jen did an amazing job of convincing you about the market opportunities we have ahead of us.
I'm gonna talk about the team that's gonna market, sell, and support these products and really help us capitalize on these huge markets in front of us. I'm also gonna talk about our global service and support teams and the capabilities we have to enable a great customer experience for our customers around the world. Lastly, I'm gonna give you some great examples of how this commercial engine and infrastructure that we've built have translated into an amazing experience for our customers and real tangible revenue and order opportunity. When I first started talking to Christian back in the summer about joining PacBio, he walked me through the strategy that he talked about earlier. He gave me a non-confidential view of the roadmap ahead of us from a technology standpoint, and then he started talking about the team that we've assembled on the commercial side.
I could instantly see the strategy and how it was coming to bear. Having been at the company now for three months and having the privilege to work with this team of 225 commercial employees, I'm absolutely thrilled to be a part of it and couldn't think of a better team to be a part of. Now drilling into that team a little bit more, you know, at the tip of the spear is our sales team, and we've invested heavily in the sales team over the last couple of years. We have nearly 70 commissioned sales reps around the world across all three regions, and we also have 25 distributors who are authorized to sell our products in over 125 countries around the world. It's not just about sales.
We have an almost equally big marketing group that's responsible for taking the customer requirements and translating them into the great products you've heard about today. They've worked on the great branding that you've seen over the last year or so, and they work tirelessly to, you know, plan and execute events like the one you saw a couple weeks ago at ASHG. Transitioning back to the sales side, you know, as I said, we've grown this team about four times since the start of 2021. What really resonates with me is when I look at the experience of this team, you know, across those 70 or so reps, they have about 11 years on average of experience selling in the genomics and sequencing space.
It really gives us a competitive advantage in terms of those reps' abilities to talk about our products with our customers, to understand the market and quickly reach those customers at a global scale. When you look at the companies those reps have been a part of, it's not surprising, you can see them all in the word cloud here, all companies that you're probably familiar with. Not surprising also that many of them came from companies like Thermo Fisher and Illumina. In fact, our three regional GMs all held senior leadership positions in the commercial organization at Illumina before joining PacBio over the last year and a half or so.
There was a question earlier today from the audience about, you know, what does new customer growth look like? What's that gonna look like going forward? Well, I can tell you that this investment that we've made in our commercial organization infrastructure, certainly with the products that we've had up until this point with the Sequel platform, we've done an amazing job of growing that customer base. Since the start of 2021, we've added 91 new customers in terms of those customers who have purchased new systems from us, and certainly diversified that customer base across the world. Switching gears, I wanna talk a little bit about what I think is kind of our unsung heroes of our commercial organization.
It's our team that's out there in the field, working with our customers, you know, making sure that our systems, you know, stay up and running, and if they, you know, do happen to slow down or have an issue, get out there quickly to fix them. One of the things I'll give PacBio a lot of credit for is having the foresight back in 2015 or 2016 to put together as part of the Sequel launch, something that we call Sequel Insight. Of course, we're gonna have to rebrand this platform going forward with these new products that we've talked about today. What Sequel Insight does today is give us the capability to remotely access and diagnose issues that may come up in the field.
Let me be clear, we don't have access to customer data, we don't have access to anything but basic run metrics and information about how our instruments are doing in the field in terms of uptime. Of course, this tool isn't valuable to us unless our customers actually use it. What I'm happy to say here is that we've got almost 90% adoption of this platform across our install base. With that tool, as well as the robustness that we have across our instrument base, especially with the Sequel platform, you know, we've increased our mean time between failure or MTBF to greater than 300 days over the last three or four years.
Many companies don't publish what this MTBF metric is, which is defined as the number of days between failures that cause us to have to dispatch an engineer to the field. But I can confidently say that this MTBF of greater than 300 days is among the industry leaders and something that PacBio is certainly proud of. Lastly, on the service support team, you know, I'm struck by the amazing tenure that we have within this team. On average, across those 70 or so people, we have about four years. They've been with the company for about four years.
All three of these things, the Sequel Insight platform, the robustness of our systems, as well as the experience and the tenure of our service support organization, have led us to not have to scale the size of this support organization by nearly as much as our installed base over the last couple of years. That installed base has grown over 200% since the start of 2021, with only a modest increase in the size of this team, and without sacrificing any of the service and support and the customer experience that I think our customers deserve and expect from PacBio.
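For readers who want to see how a figure like this is typically computed, here is a minimal sketch. It uses one common fleet-level estimate, total instrument-days in service divided by the number of failures that required a field dispatch. The fleet size, observation window, and dispatch count below are hypothetical and are not PacBio data.

```python
def mtbf_days(instrument_days: float, dispatch_failures: int) -> float:
    """Mean time between failures, in days.

    instrument_days: total days of service summed across the fleet.
    dispatch_failures: failures that required sending an engineer on site.
    """
    if dispatch_failures <= 0:
        raise ValueError("need at least one failure to compute MTBF")
    return instrument_days / dispatch_failures


# Hypothetical fleet: 100 instruments observed for 365 days with 120 dispatches.
fleet_days = 100 * 365
print(f"MTBF: {mtbf_days(fleet_days, 120):.1f} days")
```

With these made-up inputs the estimate lands just above 300 days. Fewer dispatches over the same number of instrument-days would raise it.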
Lastly, I'm gonna give you some examples of, again, how this commercial infrastructure has really started to pay off for PacBio over the last couple of years and in the very near term. Before I do that, I've talked about sales, I've talked about marketing, I've talked about our service and support organization, but I also wanna give one shout out to our commercial operations group who works behind the scenes to make sure that we have the systems and the tools and the training, and also, you know, enable our sales team and our marketers to be able to get the word out and, ultimately, you know, give quotes and collect orders from customers.
Three weeks ago, as you all know, we launched Revio and Onso at ASHG, and there was a ton of work that went into that launch event, you know, more than all of you will probably ever know. I'm extremely proud of everything that happened, and it culminated in a wonderful week for us at ASHG. We had a sold-out launch event on the Tuesday night, on the twenty-fifth. Absolutely stunning event that, you know, I think customers had a great time at. From that Tuesday on, we really stole the attention and stole the week, I think, at ASHG. So much so that throughout the course of those four or five days, we generated over 1,000 new customer leads.
Those are 1,000 new researchers, 1,000 new potential users of our platforms that we didn't already track in our marketing lead database. On top of that, we did an amazing job, I think, on the social media side of things. When you look across the different platforms that we track, we generated over 13 million social media impressions over the course of those four or so days. When you compare those 13 million impressions to those generated by two of our main competitors, this was about 15x more impressions than those competitors generated during the course of that week. We clearly stole the show by design and made a huge impact.
At the end of the day, what it really translates into or what it really comes down to is, okay, how's all this buzz? How's all this excitement, this marketing push that we made over the course of the week and since then, how does that translate into new orders for Revio and Onso? I'm here to tell you today that I can safely say that our sales funnel today has never been in better shape. We have the most robust sales funnel that we've ever had here at PacBio. I'm gonna give you a couple examples of that.
At the launch event a couple weeks ago, Christian talked about our launch partners, Broad Institute and Macrogen and Berry Genomics, and we're incredibly honored to have those three as our initial launch partners. Since that night, three weeks ago, the amount of customer enthusiasm, the amount of customer outreach to us in terms of wanting to get more information about Revio and Onso and generate quotes for them and ultimately collect sales has been truly amazing. I'm really happy to announce some additional launch partners today. We've got, you know, companies like GrandOmics, JMDNA, HudsonAlpha, Huarui Genomics, and Annoroad as well in Asia. I'm gonna leave you with one last story here.
In the middle of the picture here, you've got the CEO, Dr. Lee from DNA Link. As many of us did throughout the course of the week at ASHG, we had lots of customer meetings, lots of customer dinners. On the Thursday night of ASHG, our GM, Jason, and our distributor in Korea took Dr. Lee and his team out to dinner, and we had a wonderful meal, spent a couple hours with him talking about the products and his business.
At the end of the dinner, he was so excited, he grabbed a pen and he reached over and he took Jason's hand, our GM, and he wrote on it the number of Revio systems that he wanted to order, and he signed it and he said, "You know, that's my order. That's my order for Revio." That piece of paper that he's got in his hands there is the actual PO. We've moved forward from a palm PO to an actual PO, but we're absolutely thrilled to have DNA Link as well as the rest of these companies, as well as, you know, several others around the world from all three regions who have now ordered this product. We're off to a great start and more great things to come as well.
With that, I'm gonna hand it over to Mike, somebody who I'm gonna keep very busy in terms of building systems that our team will hopefully sell in the coming years. Thank you very much.
Good morning, everybody. Thanks so much for coming. I'm Mike Goloubef. I'm the Senior Vice President of Manufacturing, Quality and Supply Chain. I've been with the company for two, just about two and a half years. And before that, I spent 25+ years at a number of other life science companies including Thermo Fisher, Danaher, Applied Biosystems and MDS SCIEX. Why did I join the company? Like Christian, I came out of retirement to join PacBio. Why did I do that? You know, look, the story is compelling. The products are terrific.
We sell into an incredible customer base. Honestly, for me, it's about the employees, right? It's about the team that we've built, including the leadership team here today, but also the team back home in the manufacturing sites and the people I work with every day. Look, I've been around a bit, and I can honestly tell you that the team we built in PacBio is bar none the best in the business, and I think it truly gives us a competitive advantage, and it makes the future look incredibly bright for us. I'm gonna talk about a few things today. First, I'm gonna touch on our mission. As Christian said, the mission is the guiding star or the North Star of our company. Similarly, our mission in manufacturing is our North Star.
It's something that my leadership team and I have spent the past year developing, and I'm really happy to share that with you. I'm also gonna touch on our current state capabilities, let you know who we are and what we do. I'm gonna talk about the areas of focus as we continue to scale the business. Finally, I'm gonna end up with just highlighting our future state and giving you a sense of what we're building out.
For us, it's all about the customer, it's about quality, and it's about cost. Our customers tell us that over the past year, they've experienced the highest run performance they've ever seen. Really, this is, again, a testament to the products that we're developing and delivering to our customers, but it's about the team of people behind those. Our mission is to deliver an exceptional customer experience through superior product quality at a cost and scale that maximizes profitability. Who are we, and what do we do? PacBio manufacturing is a network. It's a combination of our factories and our suppliers. On the factory side, we have three factories in the United States. We have two in California, and we have one in Baltimore, Maryland.
Of the factories in California, at our site in Menlo Park, which is also our headquarters, we manufacture instruments, SMRT cells, and SMRTbell reagents, principally for our Revio, Onso, and Sequel IIe products. In San Diego, which came to us through our acquisition of Omniome, we manufacture our flow cell technology and our large fill reagents. Baltimore, Maryland, which came to us through our Circulomics acquisition, is the home of our sample prep, and there we manufacture our NanoBind discs and associated reagents. As important as our factories are to us, our suppliers are equally important. You'll notice on the left side of the slide here, we have a quiver of key suppliers and contract manufacturers that come from all parts of the world, particularly the USA, Mexico, and APAC.
These are suppliers that provide unique capabilities, best in the business for what they do, and they provide us the capability and capacity that we need to scale the business. This network has allowed us to produce 1,000 instruments to date, of which 500 are Sequel IIes. We've also produced, in the last 2.5 years, 150,000 SMRT cells. I'm happy to report that in the 2.5 years since I joined, we have never missed a shipment. You know, that really is a clear competitive advantage for us. Look, I can't take the credit for that. This is truly a team effort, and I've got, again, like I mentioned at the beginning, bar none, some of the best in the business.
I also wanted to touch, on this page, on some of the manufacturing and quality systems that truly separate us from our competitors: our ISO 9001 and 13485. These quality systems, particularly the 13485, allow us to consider manufacturing products for clinical markets. We've established that, as well as establishing our Menlo Park facility, and it provides the backbone for the processes and systems that we use to not only develop our products but manufacture and ship our products. This will shorten our timeline as we start to consider the transition from RUO products to the clinical market with IVD products. You heard, you know, Jen talk about that and Dave talk about that.
What other areas am I looking at as we continue to build out scale in the business? You heard me talk about the network of key suppliers and contract manufacturers, but also our factories. We're gonna continue to develop those factories so they become technology-focused. Rather than one factory that does it all, those factories will become very technology-centric, and they'll be located closer to an R&D center. As for the key suppliers and contract manufacturers, we feel we've selected the right folks, but there's an opportunity to continue to leverage those folks for their capability and scale, particularly as we make the transition to IVD products. I'll talk a bit about that on the next slide. Business continuity planning. We've done a great job of risk management in the business.
You've heard all the stories about chips and chip supply. You've heard the stories about the ability to move goods globally. We've been able to manage that successfully despite these incredible headwinds. When you think about a company our size, what's really compelling to our suppliers is our story, the kinds of products we manufacture and the customers we serve and what they do with that. Our business is of the right size that we can fit in well within their capacity models, and we get the allocations that we need. I'm happy to announce that, as of this quarter, we've signed a long-term agreement with Silterra. If you think about the SMRT cell manufacturing, three critical suppliers, Silterra is the middleman in that supply.
We've managed through the long-term agreement to lock in four years of capacity and supply. That, you know, takes us through the ATOM and 25M right through to the 2026 timeframe. Excited about that. The network that we built takes us through to 2026. Finally, in this slide, I'd like to talk a little bit about cost. Cost is an important element of what we do. We continually try to drive costs down, and we're trying to do a better job managing our cash. Those are imperative for us, and we're starting to build programs around that, and I'm gonna talk to that on the next slide. What does PacBio manufacturing look like in the future?
We'll have a factory network that'll be technology-focused, lower cost, and closer to the customer. These manufacturing sites will now start to develop rapid prototyping and pilot manufacturing capability. Why is that important? That's important because they partner with the development teams to pull those new products faster through the development into manufacturing and steady-state manufacturing with a design for manufacturing position, a cost position, a quality position that's required to scale that business. Our contract manufacturers, again, we feel we're positioned correctly with those, but they have access to lower cost. As we continue to grow our business, and there's a need to continue to take our cost out to improve our gross margins, we have that advantage with the supply network that we have. IVD and clinical.
We will select suppliers that will provide the opportunity for us to manufacture our products in their sites. Suppliers like Benchmark and Sanmina, for instance, have GMP suites, right? They're well-positioned to be able to manufacture and provide those products to our customers, and to manage the rigor of audits and ongoing service of those parts and products. Cost and efficiency and yield. We have launched a gross margin initiative, and we've identified 140 items of opportunity for us to improve our cost position, improve our cash position, eliminate waste. It's truly a business partnership, so it's not only manufacturing, but it's in partnership with development and commercial.
Through that, we'll be focusing on and tackling areas like yield improvement, utilization, efficiency, elimination of scrap. It'll also look at high levels of design for manufacture for newly released products, so we hit the ground running. It'll look at value engineering for products that are already in the field that are under cost pressure. It'll work with Jeff and the commercial team to look at price and other commercial strategies. If we do that correctly, and if we embed that as part of our DNA, we'll continually drive a better cost position, and you'll hear Susan talk a bit about this as we continue to help out on the margin side. Finally, on the business continuity planning and supplier risk. You heard me talk about how we've done a really good job of managing supplier risk.
We need to take that up to the next level, which is around business continuity planning. What is business continuity planning? It takes into consideration supplier risk, but it starts to think about things like redundancy. What happens if one of our factories is impacted by an event? What happens if one of our suppliers is impacted by an event? We have to be able to quickly pivot to ensure that we are able to manage that supply seamlessly. We feel we are starting to build that plan, and we have a network of suppliers that certainly will allow us to do that. Look, I'm absolutely excited. We have the right team. We're well positioned through to 2026. We're starting to think of 2026 and beyond.
It truly is a partnership between ourselves and R&D as we move through this journey together, and I'm excited and confident that we'll deliver what the business needs. With that, I'd like to turn it over to Susan Kim, our CFO.
I got introduced to PacBio during the middle of the COVID pandemic, and having come from the tech industry, I got to learn about the company, I got to learn about the technology. I met Christian, I met members of the board, I met members of the management team, and I got very excited by the potential for our technology and the impact we can have on human health. That's why I'm here today at PacBio. One of the takeaways this morning that all of you have seen is just how much progress we've made across the company. In many respects, we're just at the beginning, and the journey forward is very exciting. Before I dive into the long-term financial targets, what I wanted to do is I wanted to take a moment to dive into a snapshot of where we are today.
During Christian's first year at the company, you heard him talk a lot about needing to invest in the commercial organization, and that investment has paid off in spades. What you see is that over the last 12 months, of the instruments that we shipped over that 12-month period, 40% of those shipments went to new customers. That has helped to grow our overall customer base to 400 today. We've also grown our Sequel II, IIe install base to 494, and growing that install base has helped us to grow our consumable revenue. If you look over the trailing 12 months, that consumable revenue has grown 23% year-over-year, despite COVID lockdowns and macro pressures. We continue to develop our execution engine.
We have launched new products, we have grown our install base, and we continue to grow as a company, such that in Q3, we crossed the thousandth instrument shipment mark. We shipped our thousandth sequencer in Q3, and so that is also very exciting. I'm very proud of the fact that we continue to be well-capitalized with $834 million of cash on our balance sheet as of the end of Q3. Now, in terms of the financial targets, you've had Christian talk about our long-term strategy. You heard Mark and Dave talk about our product roadmap. You heard Jen review our addressable market opportunity, and the combination of all of that will contribute to growing our revenues at a 40%-50% CAGR out to 2026, with a minimum revenue of $500 million.
Our product roadmap is also what's gonna contribute to our gross margin expansion to achieve gross margins in the 55%-60%+ range in the 2026 timeframe. Because of the investments that we've already made in the business, we can be very disciplined going forward with respect to our OpEx investments, such that we can deliver sustained operating leverage as we continue to grow. The combination of this will allow us to achieve cash flow positive in 2026. Before I talk about the revenue drivers, I thought I'd step back and take a moment to talk about how launching a new platform, especially a platform at higher throughput, can expand our revenues. What you see here is the Revio platform compared to previous generation platforms.
Revio, with its higher throughput, and therefore lower cost to sequence, and the relatively short CapEx payback period, will increase the accessibility of HiFi for more labs across the world. You heard Jen share a lot about that. Therefore, we expect the install base for Revio to be larger than our previous generation platforms. We will also continue to innovate and add enhancements on that platform to continue to make it a very attractive platform for more and more customers, so the life of that platform will be longer. With its higher throughput, we also have the ability to increase the consumable pull-through. The consumable pull-through, after a couple quarters of launch and shipment, will be higher than the Sequel II. The combination of both will expand our revenues and help us grow in 2023 and beyond.
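To make the growth target concrete, the sketch below works the compound annual growth rate arithmetic backwards from the $500 million minimum. It assumes a 2022 baseline and four compounding years to 2026. That baseline year is an assumption for illustration, and the function names are hypothetical.

```python
def project_revenue(base: float, cagr: float, years: int) -> float:
    """Revenue after compounding `base` at `cagr` for `years` years."""
    return base * (1 + cagr) ** years


def implied_base(target: float, cagr: float, years: int) -> float:
    """Starting revenue that reaches `target` after `years` at `cagr`."""
    return target / (1 + cagr) ** years


TARGET_2026_MILLIONS = 500.0  # minimum revenue target from the remarks
YEARS = 4                     # ASSUMPTION: 2022 baseline compounding to 2026

for cagr in (0.40, 0.50):
    base = implied_base(TARGET_2026_MILLIONS, cagr, YEARS)
    print(f"CAGR {cagr:.0%}: implied starting revenue of about ${base:.1f}M")
```

Under that four-year assumption, a 40%-50% CAGR reaching $500 million implies a starting point of roughly $99 million to $130 million.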
Now, executing on Revio, and of course Onso, is critical for us to be successful against our plan, but it doesn't stop there. We are also transitioning, you've heard us talk about this before, we are transitioning from a single platform to a multi-platform company, launching new products at different throughput capacity and also launching products at shorter frequencies than we have in the past. More frequent product launches will help us to gain the momentum to have more new customer acquisitions and to enhance the upgrade cycle with existing customers. We will be launching more higher throughput platforms, which will help to increase the pull-through, as you see on Revio.
The way that we're going to do that is by reducing the runtime, increasing the number of chips that can be run at the same time, and increasing the density of the flow cell, similar to what we demonstrated with the Revio platform. We will also continue to develop more end-to-end workflows, more solutions for different applications, which will also help to drive demand for our technology and help to grow our revenues. Showing and demonstrating the value of HiFi long reads for more and more applications, as you heard from Jen, and dramatically reducing the cost to sequence, as you see on the Revio platform, is going to increase our addressable market opportunity and help us to increase our market share, and that is what's going to support our revenue growth target.
I talked a lot about launching new products, expanding our product portfolio, and so what you see here is that in 2026, with $500 million minimum revenue, that the mix of revenue is more diversified than it is in 2022. We're gonna have revenues coming from our long read platform and revenues coming from our short read platform. What you don't see on here is also the fact that we're going to expand the platform in terms of different throughput capacities. We're gonna have a lower throughput that you heard Dave share. We're gonna have our mid-throughput platform. We're also gonna have higher throughput platforms. If we were to break that out, that diversification of sources of revenue that help drive our growth is even greater than what is shown here.
The other element of this graph that I wanted to point out is that purple slice, which is our consumable revenue. Today, roughly about 40%, mid-40% of our revenues is consumable revenues. As we launch higher throughput platforms into the market, that percentage is going to grow. In other words, our consumable revenue is gonna grow faster than our overall revenues. We're gonna also have a short read platform, and so we're gonna have consumable revenue coming from our short read platform. Therefore, the total consumable revenue mix out in the 2026 timeframe is gonna be over half our revenues. Consumable revenues are higher margin than our instrument revenues, and so that's gonna be the starting point in terms of improving our gross margins, and will actually be the largest driver to increasing gross margins over this timeframe.
That's what I've labeled here as product mix. Expanding and growing our consumable revenue is going to contribute to us improving our gross margins for the company. Growing our overall revenues will also increase our gross margins and expand our gross margins through volume. Let me take a moment to dive deeper into how volume improves both our instrument gross margins and our consumable gross margins. What you see here is the breakdown of our cost of goods sold for instruments and consumables, and you see it broken down by materials, overhead, and labor. With higher volume, you get to spread the fixed costs over more units, so therefore the per unit cost goes down. Higher volume also allows you to have volume pricing, which also helps to improve your costs.
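The volume point can be made concrete with a small sketch. The dollar figures below are hypothetical placeholders, not PacBio numbers; only the relationship (per-unit cost falls as fixed overhead is spread over more units) comes from the remarks above.

```python
def per_unit_cost(fixed_cost: float, variable_cost_per_unit: float, units: int) -> float:
    """Per-unit cost when fixed overhead is spread evenly over `units`."""
    if units <= 0:
        raise ValueError("units must be positive")
    return variable_cost_per_unit + fixed_cost / units


# Hypothetical inputs for illustration only (not company figures).
FIXED_COST = 1_000_000.0   # labor and overhead that do not scale with volume
VARIABLE_COST = 500.0      # materials cost per unit

for units in (1_000, 2_000, 4_000):
    # Doubling volume halves the fixed-cost share carried by each unit.
    print(units, per_unit_cost(FIXED_COST, VARIABLE_COST, units))
```

Under these assumed inputs, each doubling of volume halves the fixed-cost share per unit, while the materials cost per unit stays flat unless volume pricing lowers it as well.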
One of the things that you'll see on the instrument side, and we've already demonstrated this with Revio, is that we are gonna continue to have common components across platforms, which helps to further get the economies of scale from the benefit of having higher volume. The other aspect is that it helps to mitigate inventory risk as demand moves from one platform to the other. You've heard Mike talk about contract manufacturers and the benefit of contract manufacturers in terms of supporting burst capacity of demand. Well, the other benefit of contract manufacturers is that as our revenue grows, we do not have to make the commensurate investment in our fixed cost infrastructure, that we can keep that where it is and still be able to grow our revenues from where we are today.
Now, on the consumable side, what you'll notice is that the mix is more diversified, i.e., more labor and overhead as a % of the total COGS mix. What that means is that as volumes increase, you get more leverage in terms of gross margin expansion with higher volumes of consumable revenue. The gross margin on consumables is going to expand even faster. There are opportunities to improve our gross margins on the consumable side, such as introducing automation, especially with respect to reagent manufacturing, but also as our volumes grow, to manufacture in larger batches, again, to spread the labor and overhead costs over more units. You've also heard Mark talk about moving from 8-inch to 12-inch wafers, which has the same benefit of spreading those costs over more units.
There are quite a few opportunities, a number of different opportunities to expand our gross margin, and it starts with product mix and then, of course, volume. There are other contributions to expanding our gross margins, and before I go into it, I did wanna highlight that having Mike as our leader in the manufacturing organization and as a partner for me, we share a passion in terms of driving efficiencies to reduce our cost structure. Manufacturing efficiencies are also gonna contribute to our gross margins. You heard Mike talk about 140 different opportunities. Well, we'll see. We'll see about that. Over this timeframe, we are gonna have manufacturing efficiencies expand our gross margins.
There is a very tight, close early integration and partnership between manufacturing and R&D when it comes to the technology development, which helps to improve the cycles of learning and to be able to benefit from manufacturing efficiencies in the early quarters after a product launch. You heard Mike talk a lot about quality and testing and improving our yields. Therefore, that helps to reduce the costs that would otherwise hit our P&L. Efficiencies are gonna help expand our margins. There's also an opportunity as we launch new products and we demonstrate more and more value for our products, especially for our customers, that we can price to that value. Having said that, pricing is not going to be a big contribution to our gross margin expansion in this forecast horizon.
Now, of course, the last element of the P&L in terms of getting to a point where we turn the business cash flow positive is OpEx investment. We are going to be very disciplined with our OpEx investment and prioritize the investments we make against our product roadmap, which is going to fuel our growth going forward. One of the other things that we did this past year was actually reorganized the R&D organization around centers of excellence to deploy key capabilities such as biochemistry, bioinformatics, instrument engineering to the programs that need that talent the most at the point in time when that talent is needed. That helps to drive efficiency.
There are opportunities to outsource components of our R&D, but where we get really excited is continuing to partner with others in the sequencing ecosystem and with our customers to continue to innovate, develop new applications, which will help to continue to drive demand for our technologies. A prime example of that is the Broad Institute and the MAS-Seq Kit that we launched last month. We've made quite a bit of investments in our commercial organization, which has helped to generate more awareness for PacBio and our technology than ever before. You heard Jeff talk about all of the momentum and buzz and excitement for PacBio at ASHG. What we've done is we started to get that flywheel of demand moving, and so therefore, to accelerate our revenue growth going forward, we don't have to make the commensurate level of investment going forward.
We've been investing in the business, which has helped to enhance employee productivity across the company, and so therefore, to support this revenue growth, we believe that our headcount growth will be limited. Let me just take a step back and walk through kind of the breakdown of our OpEx, and it's an estimate for 2022. What you see here is that 40% of our spend in OpEx is headcount-related. Another 20% is non-cash stock-based compensation expense. That's something to factor in when you think about us turning cash flow positive. Another 40% is non-headcount-related spend. This spend has grown for us in 2022, and that's because of where Revio and Onso were in their relative cycle of the development program because we were nearing the platform launch.
Towards the end of a development cycle, close to a platform launch, that expense base rises. Going forward, by us staggering our platform launches, non-headcount-related expenses will moderate and are well within our control in terms of how much we spend in each year. As you can see, our expectation is that our OpEx will grow roughly 5% CAGR out to 2026. The combination of all of that is how we get to cash flow positive. It is very exciting because we have come so far, but we are also just getting started. We've expanded the foundation to scale the business. We are now focused on accelerating our growth and then moving into durable growth in the outer years. It's a very exciting time for PacBio.
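As a rough sketch of the OpEx arithmetic above: the 40% / 20% / 40% split and the roughly 5% CAGR are from the remarks. The 2022 base year and four compounding periods are assumptions, as is treating everything other than stock-based compensation as cash spend.

```python
def compound(base: float, annual_rate: float, years: int) -> float:
    """Value of `base` after compounding at `annual_rate` for `years` periods."""
    return base * (1.0 + annual_rate) ** years


# Shares of 2022 OpEx as described in the talk.
HEADCOUNT_SHARE = 0.40
STOCK_COMP_SHARE = 0.20       # non-cash
NON_HEADCOUNT_SHARE = 0.40

# Assumption: 2022 is the base year, so 2026 is four compounding periods out.
YEARS = 2026 - 2022
growth_multiple = compound(1.0, 0.05, YEARS)

# Simplifying assumption: all spend other than stock-based compensation is cash.
cash_share = HEADCOUNT_SHARE + NON_HEADCOUNT_SHARE

print(f"OpEx multiple by 2026: {growth_multiple:.4f}")
print(f"Approximate cash share of OpEx: {cash_share:.0%}")
```

Under these assumptions, a 5% CAGR compounds to about a 22% total increase in OpEx over four years, and about 80% of the 2022 spend is cash.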
With that, I'd like to call Christian, Jeff, and Mike, and we can do Q&A.
Hanging in with us, Kyle? You good?
Sorry.
It's live. You're okay.
It's going down.
Maybe I'll ask, it's Dan Brennan from Cowen. Thanks. Maybe I'll ask, a multi-parter since last time I asked one. I guess the first one would be,
This isn't an earnings call.
You've kinda got to the point in not giving the order numbers, so I'm wondering if you'd be comfortable giving a sense of where orders are today, number one. Number two, when we look at the past product cycles, when we think about how long it took for the next gen product to exceed the installed base of the prior. For the S1, it took six quarters after the RS launch. For the Sequel II, it took 11 quarters after the Sequel I launch. I'm just wondering, you know, given the fact that you're pointing to certainly a bigger opportunity and an installed base that'll exceed it, is kinda three years the right zip code for that? Then probably the most important one would just be on the pull-through.
Obviously, that's potentially the most exciting part of the revenue model, the fact that you've got a theoretical $1 million, you know, $1.3 million pull-through on this product, which is five times greater than the Sequel IIe. Like, how do we think about on a per platform basis, like what you think the customer utilization will look like? You know, on the prior platforms, it's been around 50% of the theoretical max. I'm wondering, is that the right zip code to think about ultimately where this settles out? Thank you.
That's a mouthful. Let's see if we can unpack that. Maybe start with, you know, start with our thoughts on install base and pull-through. We do believe that we have more scale than we've ever had so that we can accelerate, you know, so that we can accelerate our install base, and my expectation is that we could do better than the Sequel II and IIe install base. Our expectation is, you know, we have more capability, we have more reach, and as a result, I think we can move faster. That's the first thing. With respect to pull-through, pull-through is tricky, right? We haven't really given specific pull-through for Revio yet because the simple fact is we just don't know yet.
We will unpack that over the next several quarters. What's exciting is that, of course, you pointed it out exactly. At 1,300 genomes at $1,000 a genome, it's $1.3 million a year, which is massive. Now, most systems don't run that high, and most customers won't pay $1,300 a genome. Or I'm sorry, $1,000 a genome. Most customers will actually pay less than that. What we will do is balance the elasticity gain from lowering, you know, the price with, you know, kind of our pull-through expectations. Typical systems, you know, historically have run anywhere from 25%-40%, 50% utilization, give or take.
What's interesting about the question is actually the users that get the best prices are the ones that probably go first and will set the table. Their pull-through, although they may be more higher utilization, you know, may in fact not, you know, it may not be straight math, right? 50% times $1,000 a genome. Oh, it's a half a million dollars a pull-through. That's probably not the way to think of it. However, we do fully expect it to be a multiple of Sequel IIe. I mean, there's no question about that. That's really the point, right?
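The "straight math" being cautioned against can be sketched as follows. The 1,300 genomes per year, the $1,000 per genome, and the 25% to 50% utilization range are from the discussion. The discounted price is a hypothetical placeholder, since no actual customer pricing was given.

```python
def annual_pull_through(genomes_per_year: int, price_per_genome: float, utilization: float) -> float:
    """Consumable revenue per instrument per year at a given utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return genomes_per_year * utilization * price_per_genome


THEORETICAL_GENOMES = 1_300   # genomes per year at full utilization (from the talk)
LIST_PRICE = 1_000.0          # dollars per genome (from the talk)

# Theoretical maximum: 1,300 genomes x $1,000 = $1.3 million per year.
print(annual_pull_through(THEORETICAL_GENOMES, LIST_PRICE, 1.0))

# Historical utilization range mentioned in the answer, at list price.
for utilization in (0.25, 0.40, 0.50):
    print(utilization, round(annual_pull_through(THEORETICAL_GENOMES, LIST_PRICE, utilization)))

# Hypothetical discounted price: high-volume users may run more but pay less per genome.
HYPOTHETICAL_DISCOUNTED_PRICE = 800.0  # illustrative only, not a quoted price
print(round(annual_pull_through(THEORETICAL_GENOMES, HYPOTHETICAL_DISCOUNTED_PRICE, 0.50)))
```

At list price, 50% utilization works out to $650,000, which is in the neighborhood of the "half a million" mentioned. The last line shows why that figure can mislead: a discounted price at the same utilization yields less, so utilization and price per genome have to be considered together.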
If we can create systems with higher pull-through and with Susan pointing out that, you know, not only do we have the opportunity to grow gross margin just because the product mix changes between consumables and instruments, but we have a lot of opportunity to grow the consumable gross margin. We're gonna get a double benefit there. I think I kind of addressed-
Orders.
Sorry?
Orders.
You wanna, why don't you comment on that?
Sure.
Did I just punt on that one? No.
Answer your question directly. We're not giving any commentary directly on the number of orders we've received. I think what I will say is you can see on my last slide the tremendous amount of enthusiasm in terms of customer reaction to these products. You know, each of those customers that are shown on the slide have each ordered at least one and several of them, you know, multiple orders. That's not an all-inclusive list of the customers who have ordered across every single region as well.
I think customers have been incredibly enthusiastic about our customer loyalty programs and think that we've treated them very fairly in terms of, you know, compensating them from a discount perspective on Revio, you know, in terms of, you know, some of the more recent purchasers of Sequel IIe. That certainly helped as well. I think from a regional perspective, as you can see by the names on the slide, we've had a really strong start in APAC, so the team's done a great job there. You know, obviously, the European market is driven more by tenders, and so, you know, those orders will come, but they have to go through that process. I expect AMR to really accelerate here throughout the course of Q4 and into Q1.
I expect that we'll give a little bit more color on order volume, you know, sometime in early to mid Q1, I think, is the plan.
Yeah. I mean, I think at the end of the day, the vast majority of those you saw on the list are multi orders and significant orders. You know, obviously that doesn't give you the answer, but we're thrilled. I mean, it's been remarkable. It's just, it really has been remarkable about how fast, you know, it's a $779,000 piece of gear, and it just blows my mind.
Watch.
It actually blows my mind because we're all sitting still in this economic backdrop of uncertainty, right? We talked about that on our Q3 call, talked about it in Q2. We'll probably keep talking about it because I do think the world's in a really interesting place. When you have new technology that fundamentally changes the game, it actually helps to start to insulate you from that. Given the fact that the order book has been so strong, it really starts to help us plan out our factory capabilities over the first half of next year.
The only other comment I'll add, given you asked a couple of questions there, but with respect to consumable pull-through, the early conversations for Revio, certainly customers have been very excited by the platform. What's interesting to me was that some customers were so excited that they were gonna place a multi-instrument order, they don't necessarily have the samples to fill all that capacity, but they are, by historical nature, going to fill that capacity. You have some labs such as that. Then in other cases, you have labs where they have multiple instruments, they have, you know, multiple Sequel IIs, and they decide to start with just one Revio, and then they plan to build the business case to purchase their second Revio.
Building that business case and the financial case becomes much easier, again, because of the short payback period on that CapEx investment. They're gonna have some of that variability in the early quarters. I wouldn't use that first couple of quarters of us shipping the platform and the consumable pull-through as indicative of what the steady-state pull-through will be because of that dynamic.
I'm sure on our calls, we will, just like in the old days, you know, as we launched a new platform, you know, we would really pause for several quarters to figure out truly what we think the consistent pull-through would be. Nonetheless, it's gonna be a multiple of what we do today and will completely change the look of our P&L.
Christian, one for you on just 1,000 new customer leads at ASHG that you highlighted. Is there any color you can share on how, you know, staging or how you've sort of gone about qualifying those leads, either by sort of customer type or by customer budget or an intent to purchase a bundle versus a single platform?
Jeff, you wanna-
Yeah. We're still in the process of you know, reaching out to all of those customers. You know, it's obviously a big number. You know, I think the interest is gonna be spread across the markets and the applications that the team has talked about. You know, like I said, I think the bundle has been a really interesting package for customers as well, and we've seen a lot of uptake from that. In fact, while we were sitting here earlier in the talk, you know, I checked my email, and we got an order for one of those bundles just as we were speaking here as well.
Those orders, I think, in terms of, you know, Sequel IIe and Revio bundles or Revio and Onso bundles are starting to come in. It's a slightly higher price point. When you think about it, it's actually a great deal for customers, especially on the long read side, that wanna get started, you know, with Sequel IIe now and then grow into that capacity that Revio will deliver for them as well.
Yeah. I think to be clear, we have two kinds of bundles. We have the bundle where you can buy a Sequel IIe now and take Revio later. We also have a bundle where you can buy Revio and Onso. It's nice to see you're doing some actual work while we're up here on stage.
Keeping up with the volume.
Got it. That's helpful. Susan, one for you on gross margins. You know, nicely ahead of at least where we were in 2026 in our model. Is there any color you can share on just steady-state gross margins on the IIe versus Revio, both on the instrument side as well as on the consumable side?
Yeah. The way that I'll answer that right now, because it is early for us to guide 2023, is that when we launch a new platform, in the early quarters, it generally does tend to be very expensive, and so the margins can be lower. I also talked about the fact that we do improve the cost, we reduce that cost, and that helps to improve gross margins, especially because of that early integration between manufacturing and R&D early in the technology life cycle. I think what you're going to see is that Revio probably follows that same pattern in terms of having some of the margins maybe be lighter than what the steady-state gross margins will be, and then it'll improve over time.
Of course, the consumable revenue and having that grow faster will help to pull our overall gross margins up.
I do think it's fair to say, Tej, just that, you know, the Revio platform as a whole will be a significantly higher gross margin than where we are today with the Sequel IIe, if for nothing other than the incremental pull-through and the mix change. You know, relative to the capability, we expect improvements on both sides, instrument and consumable.
Thank you.
Actually impressed with Tejas having a 2026 gross margin target.
Yeah.
Can you talk about the percent of your current customers that are running at full capacity? Should we think about them as being the first ones to adopt Revio? And then, you know, just in terms of the pace of the upgrade cycle, you gave so much information that I hate to, like, try to squeeze even more detail out of this, Susan, but it does imply probably a front-loaded instrument cycle because you have all those consumables coming in 2026. Just the way to think about that. I have one more question on OpEx, 'cause I won't, you know, belabor on too much for the questions. Go ahead.
Yeah, I think, what was the first part of the question again?
Higher throughput customers.
Oh, higher throughput customers. That's right.
Percent that are running at full capacity.
Let me answer that one first, and then you can go on the gross margin. But I think. Look, very few of our customers, if any, are running at full capacity. However, that's actually not the driver for buying a Revio. Buying a Revio is two components. One, it's economically better, even if you're not running at full capacity. Yeah, you have to come up with the capital, but we've already seen that capital is available, and that hasn't been a gate. But it's also, at some level, an aspirational purchase that customers we see already are looking at one-for-one replacements of. And remember, one Revio is 15 x more powerful than one Sequel IIe.
You have to think about the expansion of Revio in both for the high throughput labs that are already running, you know, very effectively as just a, you know, economic juggernaut and allows them to create the scale they want and actually take share from short reads, because as Jen pointed out, it just does more. You have that aspect. For the lower throughput customers, they're still Revio buyers, and the reason for that is because it gives them the ability to expand over time, and the per run costs are cheaper. Coming up with the barrier, the capital barrier isn't that significant. I would say at some level, there's a bit of emotion. You know, scientists are people, too, and they have these.
You know, there's really probably no need for your iPhone 14 if you have an iPhone 8, right? But people go through those upgrade cycles. It's a similar kind of thing, albeit, you know, in a little bit of a different world. You know, Susan, if you wanna comment on the front loading of instrumentation, that'd be great.
To answer your question, with respect to Revio, let's start with Revio and then go over kind of the timeframe out to 2026. The Revio platform is so transformational that we expect a lot of our Sequel II customers to upgrade to Revio. That is going to have the effect of initially increasing the relative mix towards instruments, and then as our customers start to use their boxes, that's going to grow our consumable revenues, and so then, therefore, the mix of consumable revenue will grow over time relative to our total. In the initial periods, probably as we're shipping a bunch of boxes and then our customers are ramping on those boxes, the consumable pull-through may be a little bit lighter.
Of course, as we ship more boxes and we expand the install base, then it'll kind of settle into that bimodal distribution that we've, you've probably heard us talk about in the past, in terms of having the high users really using their box at a high rate, which helps to increase the consumable pull-through. We're also going to be selling the Revio box to customers that have maybe a lower utilization, and so you'll end up seeing kind of this bimodal distribution is our expectation, and then the average will settle out where it is. The other thing that also helps to improve consumable pull-through is the fact that a dynamic that you often see with our platform launches in terms of the time between one platform launch to the next has been relatively long.
You go through this cycle where you increase consumable pull-through, and then maybe it settles back down as the market expects another platform to come. What we're going to do going forward is we're going to be focused on shortening the timeframe between one platform launch to another. It's not gonna be a replacement of our existing platform, but it's going to expand our product portfolio at different throughput capability. In particular, as we launch higher throughput platforms, that's further gonna help accelerate our consumable revenue and therefore help our consumable pull-through. We're gonna do that both on the long read side and the short read side. That's how our overall consumable revenue mix grows over time.
Yeah, I think that composition is really a big component of this is, you know, by the time you get out to 2026, our Revio customers are operating at full tilt, but between here and there, we will have launched other platforms that will be generating, you know, generating lots of consumables, too. On balance, the layering effect, so to speak, of instrument placements with respect to consumables in a year by year makes it so that you get to a steady state where consumables are a very significant part of the business. You know, whether we achieve that steady state in 2026 or not is still up for, you know, up for debate.
You'll see that impact our P&L, you know, even as early probably as 2024, 2025, but you'll see that improve through 2026 and then out into the future horizon as well.
I got a really long answer to that, so I'll pass the next one. I'll just make it really short then. Just as we think of 2026 P&L, you know, as we think about this mature sequencing P&L, 20% R&D, 10% G&A, 20% SG&A, maybe 25% sales and marketing, I mean, and that's the cash flow breakeven. I'm trying to ask a quick one because you gave me a really nice detailed, long answer, Mike.
We have a tendency to like detail. You know, that is a good question. The composition of the total P&L in 2026, I would argue that we're probably not still fully mature, quote-unquote, in steady state. I think we'll still be in a heavy growth mode. As Susan pointed out, through this horizon, we think our CAGR with respect to operating expenses can be pretty moderate, and she pointed out 5% on the slides. And the reason for that, of course, is the opportunity to stagger our product launches a little more effectively so that we don't have the bolus of non-headcount related expense. In terms of, you know, the traditional metrics of, you know, how much R&D, et cetera, that will actually really be dependent on a couple of different things.
One, how fast do we get towards the clinical opportunity? 'Cause I expect if the clinical opportunity is accelerated, then we probably will be spending a lot more, you know, more resources on preparing to be clinical, so to speak, and that affects the whole company. Two, on the SG&A side, you know, our commercial organization, the reality is we're in a pretty good scaled spot today. That was the plan, right? To scale it well in front of where the demand, you know, where the actual revenues would come, and we're in that spot now.
There may be opportunities for us to specialize more, and that might require, you know, a different, a little bit different composition, still fitting within that spend envelope, but allocating the dollars in different ways. I think for today, it's probably, I mean, obviously we have the models that put it all together, but today it's probably a little bit too early to tell, you know, is it gonna be 20%? Is it gonna be 22%? Is it gonna be 31% or 18%? It's still a bit early for us to tell 2026 explicitly.
Great. I think we have time for one or two more questions if.
Kyle Mikson from Canaccord Genuity. I'll probably be quick here. Just on the 2023 kind of order book, a lot of like, several multi-unit orders, it seems like. That's quite a few. That could be over, like, 14 units, $11 million in revenue for instruments. Are all those going to be shipped and installed in the first quarter of 2023, hopefully? Maybe just talk about, like, backlog conversion throughout 2023. Like, just, you know, a lot going on macro-wise. Just could be interesting to hear.
Sure. Jeff, you wanna talk about how we might feather out shipments?
Yeah. You know, I think some of the units that we have ordered will ship in Q1. We've set the expectation with customers that, you know, it will happen throughout the course of 2023. You know, we have a lot of customer demand as we've talked about here, and so, you know, we wanna be fair in how we allocate those first couple quarters of instruments. Certainly the customers that have ordered now are gonna get, you know, the single units, they're gonna get those. The multi-units, we'll phase those out, but I would expect the early customers to receive one or two, you know, in the first quarter as well.
Yeah. I think, you know, the strategy here for having a great product launch is all about picking the customers and being thoughtful about how you do your shipments. What that means in our context is that, you know, some of our highest throughput, high using customers will get their instruments first, likely. That way they have the most experience, and so we can manage them through the transition to Revio, and then we'll ship a bolus of orders, and then we'll stop for a while. We'll stop for a while, and by a while, I mean weeks. You know, we'll stop for weeks so that we make sure that everyone has a super smooth experience and we understand how successful the launch is.
We've done this a few times before in our careers, so you know, we know we have a playbook that we think works. For those with significant multi-unit orders, as Jeff pointed out, we'll feather those out over several quarters probably, because we wanna make sure that all of our customers get the opportunity to enjoy the power of Revio. That will be an important part of how we think about this. It also gives us the opportunity to perhaps be in a bit of a backlog situation, which is always good for being able to predict your revenues a little bit better, right? Today, nearly all of our revenue comes from demand generated within the quarter.
If you can get to a steady state where you always have a little bit of backlog carrying forward, it really helps as you manage and grow your business. Of course, you don't want that to be too long because it's a razor blade model and you wanna get those consumables flowing as fast as possible.
That was great. Thanks, guys. Then Susan, just on gross margin, yeah, like compared to today, it's like a 20%, 20 percentage points higher, I guess. It is a little bit, you know, higher than what we were expecting by that time, by 2026. I guess what year would margins kind of reflect in your opinion based on the mix of the product revenue? And, how do you kinda think about all these new instruments being launched and how that's sort of just like a headwind to margins? And maybe like without that, I mean, could you get to, you know, 70% in the near term?
What percentage in the near term did you say?
Well, I said, like, without all these like, you know, new products launching on the instrument side, like what could a gross margin number maybe be, mid-60s%, possibly?
Certainly over time. It is true that, you know, as I mentioned, consumable margins are higher than instrument margins. As we launch a new product, and if more of that revenue is on the instrument side, then it can have the effect of diluting our gross margins. Having said that, we also have opportunities to find ways to improve our instrument gross margins. Of course, that pull to lift our overall company gross margins is happening because of the consumable revenues. A lot of it is around new applications to drive demand for our instruments and have those instruments running, but also launching higher throughput platforms. That is what's going to help to improve our gross margins as a company.
We also have additional opportunities that we are thinking through in terms of improving gross margins, some of which we alluded to in the presentation, both between Mike and myself, to further drive efficiencies to improve gross margins. Can our gross margins be in the mid-60% over the long term? Most certainly. Our consumable margins can have a long runway in terms of expanding and improving, and so that's mostly gonna be what lifts our gross margins, but also so do our instrument margins as well.
Kyle, the one thing I would wanna add is that, you know, this is a new paradigm. Today starts a new paradigm for PacBio in the sense that we are going to be launching new products more and more often, and that will, you know, those platforms. You know, when we talk about launching a new platform, it's really only a one or two quarter kind of cost, so to speak, as you ramp up to speed of capability. As our revenue base gets bigger, that impact becomes less and less.
You can expect from us to be much more focused on more platforms into the market, driving that strategy that I started the day with, you know, multiple technologies across a stack, multiple products within that technology, which are generally platforms and ecosystems around those platforms, and then ultimately creating a layer of integration that, you know, quite frankly, companies haven't done yet. That's our challenge to make that happen and do it successfully. I fully believe that that's the future of genomics, and that's why, you know, that's why I do think PacBio is, you know, the future of sequencing.
Great. Well, I think that's a good transition now into lunch. We'll take the next 20 minutes or so, and we'll have another course coming out. We could finish up, and then Jonas will come on to wrap up the day with some of those paradigm shifts that Christian just mentioned. Thank you.
PacBio sequencer. Depending on the subgroup of pediatric cohort, we see an increase in diagnosis between 10%-30% with the new sequencer and the new solution that are being offered. It's gonna change so much the scalability of doing research to bring it to another level of thinking about extending since the cost is going lower. Now we wanna expand the use of long-read sequencing for different question and diagnosis. Our main thing that we've been wanting to do is the clinical validation of using long-read sequencing. It's gonna make a big difference in the validation process and also how we can bring it now to a clinical test and not just using for the research cases.
Please welcome to the stage, Co-founder and Chief Scientific Officer, Jonas Korlach.
Thank you very much. Hello, everyone. It's a real privilege to speak to you today. I will try to enhance your lunch experience by telling you about some of the fundamental paradigm shifts that our customers have achieved through the use of our technology. I would also like to give you my perspective of what it means for our business. By way of introduction, Jeff told you about the Sequel Insight system. Here is the graph of the cumulative data output by our global customer base. You can see that this is an exponentially growing shape, so the output of data is accelerating, and we can certainly expect that with the Revio system, this acceleration will be even steeper in the future. We are very pleased that last month, our customers surpassed 25 petabases.
That's 25 million billion bases. Truly remarkable. Now, in the research space, what our customers do with these data, they publish papers. You see the graph of the peer-reviewed publications is a very similar shape. More and more papers are coming out. You notice there's no Y-axis on this graph. The reason for that is because it's getting harder and harder to get an exact count of the number of publications. That's mainly because a few years ago, PacBio was always in the title or the abstract. Now, as it becomes mainstream, sometimes it's buried in the supplementary methods somewhere, and so it's very difficult to get. The conservative estimate is that there are now over 9,000 peer-reviewed publications speaking about and demonstrating the power of HiFi sequencing. What are these customers saying?
I've listed here a few recent publications, the last few months, and I'm not gonna obviously read all of these, but I hope it gives you a little bit of a sentiment of what our customers are saying about HiFi being the current gold standard, being helpful for understanding regions that previously you couldn't. Drastic improvements in variant detection. Heralding a new era. Overcoming limitations, the accuracy and the high reproducibility. And then, autism that remained a mystery for several years, potentially having great benefits in the clinic and so forth. Our customers in these publications are really highlighting the value and the power of HiFi sequencing. Also, our customers, through these research publications, compare HiFi sequencing to other technologies. Whether it's Illumina sequencing, ONT, Sanger sequencing, you see similar statements.
Artifacts in short read data otherwise missed through Illumina sequencing. More effectively characterizing genetic variation in smaller sets of high-quality long read sequencing data. Accurate long reads outperforming those that did not. Our customers are calling for greater uptake of highly accurate long reads in the future and then, overcoming things that are misdiagnosed by conventional method and so forth. Through those statements and through their work, I'd like to now describe to you 10 fundamental paradigm shifts that change the way that researchers look at biology, that they look at genomics that have been powered by HiFi sequencing. The first one is, and this has been highly publicized, and we mentioned that a human genome was completed for the very first time, the Telomere-to-Telomere Consortium.
As you know, the assembly was directly built from the HiFi reads. I'm not gonna spend more time on it because you've all heard this in late March. What I would say is what does it mean for business? Well, in the course of this project, the researchers used all the available technologies in genomics, and that allowed them to evaluate the relative contributions of each sequencing technology to generating a T2T genome. In this process, and now with other human and also plant and animal genomes, in my interactions with the researchers, they have communicated to me that they believe that HiFi sequencing contributes 98% of what it takes to make a T2T genome.
I expect that we're gonna do this a few more times with the kitchen sink approach and really get to true T2T. Then in a production setting, on a mass production scale, HiFi sequencing will be the workhorse and will be good enough because you get almost all of the information that there is to get. Now, the second paradigm shift is that the human genome doubled in size this year. That's pretty cool. That doesn't happen every year. I can tell you it won't happen again because 6 gigabases is the right answer. For the longest time, when you asked researchers how big the human genome is, they would say it's 3 gigabases. That was because of a technological limitation of not being able to segregate the two copies that each of you have in every single cell.
You have a copy of 3 gigabases from your mother and a copy of 3 gigabases from your father. This is a series of preprints by the Human Pangenome Reference Consortium stating, "We no longer consider collapsed 3 gigabase genome assemblies as state-of-the-art, but instead consider two genomes. For every diploid genome assembled, that is 6 gigabases versus 3 gigabases, where the parental haplotypes, that's the parental copies from the mother and from the father, are phased and fully resolved." That's a fundamental paradigm shift. What does it mean for our business? It means that now, the researchers can fully resolve both copies of the genome, and HiFi sequencing is the only technology that is capable of doing so.
In fact, this preprint, which has now been published in Nature about three weeks ago, shows that PacBio HiFi sequencing can phase and separate the haplotypes at over 95% and competing long-read technologies only about 50%. There's a big difference between what HiFi sequencing can do and what other technologies can do. The third paradigm shift is one of these preprints, and that's a human pangenome reference. Until this paper, as you know, the human reference genome was a single linear genome. Here, for the first time, the HPRC, the consortium, has built a new human genome reference out of many individuals. This is completely and solely based on PacBio HiFi sequencing, 47 genetically distinct individuals. Remember the two copies, so that's 94 parental copies. They have excellent qualities, completeness, accuracy, base pair accuracy.
This preprint immediately shows the dramatic benefits that you get from moving from a single linear reference to a reference that captures much more of the diverse, the genetic diversity across the human population with different ethnicities. They added about 5% of the genome. They added over 1,500 gene duplications and immediately show drastic improvements in variant calling, resolving complex regions, tandem repeats, improving RNA-seq mapping, ChIP-seq mapping, and so forth. What does this mean for business? It means that now we're seeing these 5%, we're seeing these gene duplications, but they can only be seen with HiFi sequencing, so it can no longer be justified to take short-read genomes and try to do true whole genome sequencing.
Now this leads me into the transition of the next paradigm shift, which is a big one, and that's what I call whole genome sequencing with the W. Let me explain to you where this is coming from. A little less than a year ago, we lost one of the giants in human genetics, Debbie Nickerson from the University of Washington. In 2014, there was an NHGRI workshop on the future of genomics, and I wanna play you a little clip. She's in the audience, so you will hear her voice, and she said this.
Putting the W back in whole genome is really important, okay?
Typical Debbie, if you knew her, very direct. She said this in 2014, recognizing that short-read whole genome sequencing doesn't sequence the whole genome. We're so pleased to see that her vision and the request that she made 8 years ago are now becoming realized. It leads to certain breakthroughs, such as the paper that we mentioned already before, with more solves of diagnostics. The 13% was mentioned by Jen. I think it's worth noting that if you read the paper, Tomi Pastinen and the researchers at Children's Mercy actually point out that for most of the cases, they found pathogenic variants. The diagnostic yield was much higher, but they couldn't yet count it into the 13% because this was seen for the very first time.
They call this the N of one problem, that we need to catch up with the databases, we need to catch up with understanding what is causing. I've had so many conversations with him where he said, "I'm sure that that is causing the disease, but I can't yet count it in the 13%." Jen also mentioned this as a reminder, the difference between a true HiFi whole genome sequence and what you can do with short reads. You get all the information.
With the Revio system now, it was so exciting to have conversations with researchers who now are changing their thinking and say, "Now we can consider for every individual and every patient coming into our cancer center or into a medical center, giving them a high-quality, a true whole genome that we can then use, that has all the information, and we can use that genome as a platform, and we can mine that with bioinformatics for the types of information that we wanna look at." Now, as was also mentioned, the short-read technologies have tried to convert some of these X's into check marks.
I just wanna mention one thing that in my opinion, so a lot has been written about Infinity CLR, but one aspect that I think has not received as much attention as it should is the fact from the Illumina presentations showing that in many cases you get the wrong answer. Researchers will get the wrong answer. This is a screenshot from the Illumina presentations that have been shown multiple times. This is a short region, and if you look at this region and compare it to the gold standard HiFi, you recognize that there are three errors. In this short region, there are two incorrect SNPs, small variants. Over here, the phasing is wrong.
First of all, you see that out of the 14 reads, because the reads are so short, only three of them are informative, and three out of the four give you the wrong answer. In contrast, HiFi sequencing, all 14 reads give you the correct answer and give you the two haplotypes. I think it's really, you know, data are not available, as was mentioned. Even from the presentations that Illumina has given, it's very clear that researchers will get the wrong answer because of the errors that are introduced, which I think is very significant. With regard to the short reads, from the screenshots, you can measure the fragment size distributions, and you can see most of the reads are 2 kb or 4 kb.
That's not long enough to resolve genes like SMN1, SMN2, PMS2 and so forth. On this graph, HiFi sequencing isn't even on the graph. It's somewhere around here. We've written several blog posts commenting on the various stages throughout the years about the difference, the HiFi difference that clearly differentiates true PacBio long reads from synthetic long reads. Coming back to what researchers can do with actual true accurate long reads, the next paradigm shift is out-of-the-box five-base sequencing. We mentioned the methylation capability. The paradigm shift here is that this is taking us from methylation analysis, which until now has always been at the population scale because of the low resolution, can now go to the individual level.
This paper presents us with a scoreboard, 117 hypermethylation events detected with HiFi versus eight with short read bisulfite sequencing. That's the hallmark of a paradigm shift. I couldn't see it before at all. Now I can see all the epigenetics that is there. For the first time, at the individual level, it allows you to link the genetic variants to the epigenetic information. Here is a structural variant in one of the copies, and you can immediately see in the reads how that changes. You can see these are red when the structural variation is here, when it's not there, it's not red. You can link at the individual patient level. What does this mean for a company? This is like a blue ocean opportunity because that was not possible before.
Children's Mercy again is leading the way to add the epigenetics layer at the patient level. The next paradigm shift is, you see from the icon, it's about RNA and isoform sequencing, and I wanna give a bit of an introduction. This is, you all learned this in high school, the central dogma of molecular biology, where you have a gene that is transcribed into a message that's RNA, and that's then translated into a protein. Well, that's grossly incomplete because the actual central dogma of molecular biology looks more something like this, where from a gene through a process called splicing, the RNA message is being processed into multiple isoforms, and these isoforms then give rise and are translated to protein isoforms. These protein isoforms from the same gene have different functions. Sometimes they can have the opposite function.
From the same gene, you make two protein isoforms, and they have the opposite antagonistic biological function. If you're just counting gene expression, you will not understand what is happening. If gene expression goes up, is it the plus isoform or the minus isoform? And you can even have something like this, where your gene expression profile does not change, but there are clear changes between the isoforms. A few more things that we know about this process. It's highly abundant. Over 95% of all human genes that have more than one coding region, or exon, are subject to this process and alternatively spliced. The average number of isoforms per gene is estimated to be seven or more, and we already know that 15% of inherited diseases and cancers are associated with this alternative splicing.
There are many papers that now represent the paradigm shift that goes from gene expression analysis to isoform expression analysis. That's a phrase that I borrowed from this paper from the Broad Institute, and there are many more papers. I just put them on until I ran out of space. I did want to put in two very recent contributions that were shown at ASHG at the bottom, highlighting how we are now starting to understand the role of these different differential isoforms in disease. You can see here, isoform usage differences in schizophrenia. This is about Alzheimer's disease and so forth. There are many learnings about the complexity of the transcriptome that PacBio Iso-Seq has provided. Just a couple more examples. 2,900, that's about 12% of all brain-expressed human genes are heritable at the isoform level.
What does that mean? It means you have two copies, one from the mother, one from the father. They're different, so they make different isoforms. It matters which of those copies is being transmitted to your offspring and will change the heritability, GWAS studies, and so forth. You will not understand GWAS studies if you don't look at the isoform level. PacBio has also shown, now that we have the full catalog, we can compare what was possible before. These are two cancer studies showing that 20%-40% of everything that's there can be resolved with short read sequencing, so you're missing 60%-80% of the biology that's there that can now be highlighted with PacBio sequencing.
On the plant and animal side, the paradigm shift that we've seen is that no plant or animal has a genome too complex or too large that would prevent it from getting a high-quality HiFi genome. Just two examples. This is a locust. It's causing terrible devastation in Africa, three times the size of the human genome. This researcher said, "It's like an 18-wheeler next to a compact car," which would be like the fruit fly and so forth. They got a high-quality genome using HiFi. This, I think, is the current record holder. This was presented about six weeks ago by the Sanger Institute. It's the mistletoe genome, so perfect timing for the season coming up, the holiday season. 100 billion bases. That's 30x the size of the human genome.
In fact, in this graph, one of those squares is the size of the parental copies of a human genome. That's the human genome, and that's the mistletoe genome done with HiFi sequencing, very neatly arranged. Each of those crosses is a chromosome. What does that mean for business? We certainly fully expect that we will continue this leadership position where PacBio's been the core technology in biodiversity studies, in conservation genomic studies. I don't even have time to talk about agrogenomics. Every single species, every single strain of crops or livestock can get the HiFi genome, can get a high-quality genome, and now affordable with the Revio system. With regard to microbiomes, our customers have demonstrated unprecedented resolution of microbiomes.
This paper alone resolves to completeness more members of a microbial community than all of the studies that had been done before. That's the hallmark of a paradigm shift. We've seen several papers after that showing that the resolution is now unmatched by what could be done before. This has been described in various publications and then also in GenomeWeb and so forth. Another paradigm shift in the area of gene therapy. These AAV vectors that are used for research with the ultimate purpose to be injected into people for fixing their genetic defects. Until PacBio sequencing, it was not possible to properly QC and properly sequence those materials.
Now that we are able to do this with HiFi sequencing, we see that there are quite a number of unwanted molecules that I think you wanna know about before you go on with your clinical trial or before you go on thinking about ultimately using it for therapy. We've gotten really great interest from the industry that is engaged in AAV and gene therapy to now use PacBio HiFi sequencing as a new paradigm shifting tool that will really allow them to accelerate their research and to have much safer reagents that are considered ultimately for gene therapy. Before I talk about the last paradigm shift, I just wanna mention that I've been really pleased to see increased adoption by clinical labs.
I've listed here a few publications and genes and diseases that have now transitioned from the research into the clinical lab environment. This was from Nationwide Children's Hospital, a new pipeline, and in this paper, the authors commented that, quote, "The ability to generate long, accurate reads uniquely poises long read sequencing to revolutionize clinical NGS applications." We're seeing the start of that, and I can't be more excited to see what the Revio system will translate to in this particular area. We're seeing growing innovation by the scientific community. Mark mentioned this. As more data, as more instruments go out there, scientists do what they do best. They're being creative, they're being brilliant in developing new methods, new applications. We're leveraging this, and we're seeing this.
Some of them make it into our product. This is the principal method behind our MAS-Seq kit that we just released. DeepConsensus is already incorporated in the Revio system. Some are a little bit more experimental and proof of principle, like ZMWs on the bottom. Then we have Fiber-seq or completely new ways of simplifying the template preparation. We're very privileged to work in close partnership and close collaboration with these groups. Then we can evaluate them, whether we wanna take them internally, put them into products, or whether we want to have those as customer-supported applications. It's pretty remarkable that the Onso system hasn't even been on the market, and we already have one paradigm shift. This is extraordinary accuracy and sensitivity for short reads.
This is what we presented at the AGBT conference, showing unprecedented limits of sensitivity. This is 0.001%. That's 100 molecules with a resistance mutation being detected confidently in the presence of 10 million molecules without it. This task of finding the needle in the haystack is being made much easier. Mark talked about this, and I'm so excited by all the paradigm shifts that will come to pass as the high-quality short read sequencing will get into the hands of our customers. With that, I'd like to close, and if you allow me, I'd like to do so with a personal note. It so happens that Thursday of next week, which turns out to be Thanksgiving this year, marks the 25th anniversary of the inception of the original idea of single-molecule real-time sequencing.
This is a scanned image of the drawing that I put in my lab notebook. Idea, watch DNA polymerase, make DNA, thereby sequence. The very next day, I met Stephen Turner, our CTO, and that was the beginning of the long partnership and friendship that Stephen and I have enjoyed over all these years. Of course, it was Stephen who was really the heart and soul of this journey and then also the company he founded. As you all know, he founded the company. He was the driver behind getting the first round of investment.
On the technical side, he not only invented the zero-mode waveguides, one of the principal pillars of the technology, and figured out how to make them, but he had so many other inventions and developments that led to where we are today. I'd like to ask Steve to stand, and I'd like to ask all of you to recognize his immeasurable contribution to our technology. Now, personally, if you ask me what the top two time periods over a quarter of a century with the technology, with PacBio were, I would say one of them is when we got the technology to work. That's the dream of a method developer. When we saw the first signs of life, the first pulse of the polymerase, which happened in this laboratory.
This is the basement of Clark Hall at Cornell University, where I did my PhD work, and Steve was a graduate student postdoc there as well. This is where the first pulses of polymerase incorporating DNA were seen. The second time is this time. It's right now. That is because we're now producing real products that fulfill the needs of our customers. I think the community has understood for a few years that HiFi sequencing gives you the highest quality, but there have been limitations with regard to throughput, with regard to cost. What's so exciting now is we're standing on the foundation of all this paradigm-shifting performance and these publications that can now be scaled up.
I couldn't be more excited by the potential and by what's going to come to pass next year and the year after that, as the researchers really can scale this paradigm-shifting sequencing performance. I'm also just as excited about the extraordinary accuracy of the Onso system, because that will allow us to talk to any researcher who's got any sequencing project, no matter whether it's a long-read project or a short-read project, or what kind of samples they have. I think that'll be very powerful, that we're the one-stop shop for giving you the best quality at the appropriate scale. All the speakers before me told you why they're at PacBio, so I'm gonna do that at the end.
I'm at PacBio because I couldn't imagine any other place in the life sciences where this type of paradigm-changing development happens. The leadership team, with their experience, with their passion and energy, the entire team at PacBio, who work so tirelessly on bringing these products to market, working with our customers, providing excellent support. Then the scientific community. It's just so motivating, so inspiring to see what they can do with the technology, the enthusiasm that they have, telling me that for a few years they weren't able to solve this problem, now they finally can. It's been the thrill of my professional career. I couldn't imagine being anywhere else. With that, I thank you very much for your attention. Thank you.
Please welcome back to the stage Christian Henry.
It's pretty hard to follow that presentation. You know, you think about a science presentation over lunch, and you're trying to say, "Well, hopefully everyone will, you know, kind of, follow the science, but hopefully the food's really good so that everyone's kind of happy and satiated." The reality is, you saw that enthusiasm out of Jonas and his passion for the science and what we're trying to get done, and that's really what this is all about. You know, this is all about changing a paradigm that's existed for a very long time, and now we're at the precipice of the ability to fundamentally turn the entire industry on its ear. I thank Jonas for his passion. I really appreciate it. I thank Steve for his commitment as well.
It's so great to see, you know, the founders of the company. I thought it was really important to have them here at this day as we kind of move forward on our new journey. A couple of other housekeeping things before I get to the final takeaways. I'd like to personally thank, and on behalf of the management team, I'd like to thank Todd Friedman and Tanya Boyaniwsky in particular. Todd, as you know, is our head of investor relations, and when I told him, I think I told him probably in August, I said, "I wanna have an investor day, and I want it to be awesome." He delivered, and hopefully you guys really enjoyed the day and you learned something. The reality is also behind the scenes, Tanya. Tanya, can you stand up, please?
Every single person on the management team knows that this day wouldn't have happened without Tanya's persistence, her commitment, and her understanding of how to bring us all together. We just are really appreciative. Of course, the organizers too, Jane Main and her crew. We thank you for that. As we kind of wrap up, you know, I think today we tried to hit really four areas, and I think we were able to spend some really good time talking about the power of our technology and our ability to go after these really big markets with great technology and a great strategy.
We shared Revio and Onso with you as the beginning of our roadmap to the future, and hopefully everyone got a chance outside in the foyer to take a look at the instruments. I know the team out there was thrilled to spend time with all of you because of all of your questions and your excitement for the instruments as well. Thank you for that. The other thing, it's very clear that this is a very capable team. We have expertise across the board, from product development and research, to operations, to marketing. You heard from Jen and Dave, really two of the premier life sciences marketers, and we're very fortunate to have them on the team.
Of course, I'm also extremely fortunate to have Mark and Susan and Jeff as partners here as we kind of push forward. Hopefully, you come away from today understanding that this is a complete team, we have deep experience, and we see the future really quite well. Finally, hopefully, Susan convinced you that we have a very strong path to creating a highly successful business with positive cash flows, growing revenue streams, durable value creation. This is really the opportunity for us to, you know, transform the way the market thinks about PacBio, because we definitely have the products, the market opportunity, the team, and the ability to execute.
When I started at the company and at JP Morgan in 2021, I said it would take us two years to get to this day, and turns out I was right. With that, I wanna thank all of you for your attention today. I wanna thank the entire leadership team for all of your support and for helping to build the next PacBio. With that, have a great afternoon. We'll be hanging around for questions if people have some individual questions. Enjoy your week, and thank you for your interest in PacBio. Cheers.