We are excited about today's discussion and the future of Pure Storage. Let me remind you that we will be making forward-looking statements today that are subject to assumptions, risks, and uncertainties. Actual results could differ materially from those anticipated due to a number of factors, including those referenced in the detailed disclaimer at the beginning of our presentation slide deck, and in our public filings with the SEC, which we encourage you to review. The presentation slides discussed today will be available on our investor relations website at investor.purestorage.com.
Okay, well, welcome, everybody. Thank you for making time with us today. We appreciate you coming to our product and technology-focused financial analyst meeting here at Accelerate. We have two hours today. The first hour will be a series of short talks, and the second hour will be devoted entirely to Q&A. Charlie will start us out with our strategy, and after that, Prakash will explain why Evergreen//One is the only true storage as a service. Rob will cover Pure's three AI opportunities, Bill will speak to our hyperscale bulk storage opportunity, and then Coz will conclude with the next era of storage. After that, Kevan will join us on stage for Q&A. So with that, let's get started with Charlie.
Thank you. Thanks, Paul. Hi, everyone, it's really good to see you all here today. Thank you for coming. We know it's a lot of travel, and it takes you away from home and family, so we really appreciate the time you'll be spending with us. So, as Paul said, that's the agenda. I'll just flash this up, but all of you are very familiar with our most recent statistics. Subscription revenue is now a very substantial fraction of our total revenue. Of course, Q1 is our seasonally slowest quarter, so subscription is at its seasonally highest percentage.
We have a large number of customers, and the number we're most proud of is the 82 Net Promoter Score. If you're familiar with Net Promoter Score, 82 is the highest that we know of. It's the highest in the tech industry, and it is a rating of how many customers will actually promote you to other customers. Not that they're merely satisfied, they're actually promoting you to other customers. We do consider ourselves the one company in storage that treats storage like high technology rather than a commodity. Ergo, we've been rated the most innovative in the Gartner Magic Quadrant 10 times, and we are making good progress on our penetration of the Fortune 500. So why do we get a Net Promoter Score of 82?
We think it's because of the difference between customers' expectations and the reality that we provide them. That is to say, their expectations of data storage are low: it's difficult, it's complex. And yet we provide them a solution that's 10 times more reliable than what they're used to, that saves them power and space, and that requires dramatically less labor to operate the same amount of storage. And as you might imagine, that saves them a lot of money and a lot of trouble. The fact that our products never become obsolete is something customers are just not used to. Regardless of whether you're buying servers, or switches, or routers, or power supplies, or storage, they're used to replacing everything every five to six years, right?
When they replace it, it's disruptive, meaning that they have to take their application environment down. You've noticed it: on weekends, you're not able to get access to your banking information for a number of hours on a Saturday night, and it's because they're doing some type of transition. It's called scheduled downtime. We don't have scheduled downtime. We consistently upgrade our products, not just software but hardware, without disrupting the application. No change-related downtime. What that also means, though, is that year after year, the products just keep getting better. We add more capability, we add new software, more capacity, more performance, so the products never become obsolete. And this is the core of our Evergreen program.
We have products that have been in the field at a customer for over 10 years, where the customer has never given us another dollar of capital, and if you were to visit that product today, it looks like a product we sold last year. Every part of it has been replaced over time under our subscription. You'll hear more about that later. That results in lower TCO. And based on that combination of capabilities that we bring to our customers, they've rewarded us with very high market share in all-flash. So this is what it was at the end of 2023, just shy of 20%, and we have the Q1 numbers.
Remember, Q1 is our seasonally slowest quarter, and it's already gone up. Our market share has increased yet again, and as you see, we are within spitting distance of first place in all-flash. And remember, we only compete in all-flash. We don't compete in all storage, right? Now, if you were to look at our all-storage market share, it's less, of course, because we don't compete in the hard disk space. But as you know, we've been talking a little bit about that space over the last year. So what if we could compete in the hard disk space? And so I'm going to share something with you that we haven't shared before; it'll be the first time we've shared this.
I invite you all to do your own analysis. But as you know, the hard disk manufacturers are saying many things, including: "Well, we're gonna make it cheaper. We've got a next generation of technology. We've got HAMR. Disk has a long life." And I already have my bet placed. I told you last year that disk systems weren't gonna be sold in five years. And then you've come back and said, "Well, disks are gonna get cheaper. So what do you say to that?" Well, what we have to say to that is that they could give away disks for free, and we'd still be cheaper. They could give them away for free, and we'd still be cheaper.
The cost of their common equipment, the cost of the space, power, and cooling that they take, the cost of the labor, including the failure rate of the disks that have to be replaced, all add up to more than what our arrays cost at this point for the high-capacity, low-price, archive-level arrays. So I don't know that I have to argue anymore about disk being cheaper. They can't be cheap enough. All right. The reason we didn't bring this up at the conference is we thought it would make a lot of headlines that took away from the main messages we have at this conference. And the main message we have at this conference is that we are now providing the most consistent data storage platform in the industry.
So what do we mean by this? Well, alone in the industry, we have one operating environment that covers the full range of data storage needs for our customers: block, file, and object, the three most popular ways to store data, which today, even from other vendors, are usually supported with different operating systems, different operating environments, and different management environments. And we support it all with our DirectFlash modules. What that allows us to do is use the same architecture all the way from AI to backup, meaning from the world's highest performance down to the low price levels customers need for things like backup and archive, and from terabyte scale all the way up to exabyte scale. We have this on only two architectures: scale-up and scale-out.
If we could do it on one architecture, believe me, we'd be there. But scale-up and scale-out are very fundamental architectures that customers need and demand, so we have it on two architectures, but they all use the same components. The same software operates as Cloud Block Store on two of the three hyperscalers, with exactly the same interfaces to applications in those environments. And for customers that wanna go so-called cloud native, Kubernetes and containers, we have our Portworx software that works natively in the cloud as well as on our storage. All of this is supported by one management system called Pure1, all of it Evergreen, which I spoke about at the beginning, and, as announced at this conference, now all supported by Pure Fusion. So the question is: What is Pure Fusion?
I personally think this is the biggest change that's occurred in storage in 30 years. So storage has traditionally been very fragmented, specific storage systems for specific workloads, and because of that, each of those storage systems is isolated to its individual application stack. Storage is not networked. Yes, you can copy data, but it doesn't look like a cloud of storage. It looks like an external hard drive for your computer, not like OneDrive, not like Google Drive, not like Dropbox, right? It looks like something that's captive to an individual set of application workloads. Pure Fusion changes that. It turns it into a cloud of storage from the perspective of the application.
It also turns it into a unified fleet of capability for the IT manager, who now manages the entire fleet as one environment, right? It makes things easier for the IT manager and makes it look like a cloud of storage to the user. It also allows IT to create storage classes that meet the compliance and the governance requirements that their corporation wants and offer them through APIs to their developers. So it allows for corporate compliance and governance that is much more regimented, much more policy-based, rather than paper and phone calls. So it's a huge change. I think of it as virtualizing data storage for the first time. So what are the benefits of this? Effortless management at scale. You're not managing individual arrays, array by array; you're managing a fleet.
So, simple fleet management. It's a networked cloud of storage to the developer environment. Provisioning can be automatic. And the services are all based on policies that the organization can set. So they get labor savings. Frankly, we will sell less storage, because today you might have one array that's full and another array on the other side of the data center that's only half full or a quarter full, but you can't take advantage of it because each array is tied to its specific application stack. In the case of Fusion, you can take advantage of it. It'll load balance across the fleet, so you really take advantage of the pool of storage rather than individual arrays.
So you have much greater automation and much greater simplicity, so you have cost management. And again, it's all API-based, and it changes storage from a world where the interaction between developers and IT is phone calls, trouble tickets, and emails to one where it's self-service for the developers. So a big, big change in the way it's done. All right, so that's what we think of as a platform now: consistent, based on the same set of software and management, API-based, managed as an integrated fleet rather than individual arrays, and networked, which is really important.
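To make that policy-based, API-driven, self-service model concrete, here is a minimal sketch of what provisioning against fleet-level storage classes could look like. The class names, fields, and placement logic below are hypothetical illustrations for this discussion, not the actual Fusion API.

```python
# Illustrative sketch only: hypothetical names, not the actual Fusion API.
from dataclasses import dataclass

@dataclass
class StorageClass:
    """A policy bundle IT publishes to developers (protocol, tier, governance)."""
    name: str
    protocol: str        # "block" | "file" | "object"
    min_iops: int
    encryption_required: bool = True

@dataclass
class Array:
    name: str
    capacity_tb: float
    used_tb: float
    max_iops: int

    def utilization(self) -> float:
        return self.used_tb / self.capacity_tb

class Fleet:
    """Treats many arrays as one pool and places volumes by policy, not by box."""
    def __init__(self, arrays: list[Array]):
        self.arrays = arrays

    def provision(self, sc: StorageClass, size_tb: float) -> str:
        # Filter to arrays that can satisfy the class, then load-balance by utilization.
        candidates = [a for a in self.arrays
                      if a.max_iops >= sc.min_iops
                      and a.capacity_tb - a.used_tb >= size_tb]
        if not candidates:
            raise RuntimeError("no capacity matching this storage class")
        target = min(candidates, key=lambda a: a.utilization())
        target.used_tb += size_tb
        return f"{sc.protocol}://{target.name}/{sc.name}-vol"

# Developer self-service: ask for an outcome, not a specific array.
fleet = Fleet([Array("fa-01", 100, 80, 200_000), Array("fa-02", 100, 25, 200_000)])
gold = StorageClass(name="gold-block", protocol="block", min_iops=100_000)
print(fleet.provision(gold, size_tb=5))   # lands on the least-utilized matching array
```

The point of the sketch is the shape of the interaction: the developer requests a class and a size, and placement, balancing, and policy enforcement happen at the fleet level rather than array by array.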
And we believe as well that as companies start going down the AI path and want to use not just historical data but real-time data for real-time queries, they're gonna want to be able to get access to data at work, to real-time data. That doesn't exist today, because data is not networked; it's captive to specific applications. With Fusion, the data will be networked and, of course, under policy control, role-based access control, et cetera, available for real-time AI analysis. So this is why we believe now is the right time for a platform. It solves the new needs that customers are looking at. It's AI ready. It allows for AI access to their most important data, the real-time data.
Cyber resilience, we talked a lot about that, and you'll hear more about it today. It allows for application modernization because it's API-based, where developers can get access to APIs rather than having just a physical connection to a storage array. And because we have the same software on-prem as we have in the cloud, it also helps in the hybrid cloud arena. Both, remember, with traditional applications using Fusion and Cloud Block Store, but also with new cloud-native applications that are Kubernetes and container-based, using our Portworx software. So, I want to just recap a couple of the big announcements that occurred over the last two days. One is our expectation that DGX SuperPOD certification will come by the end of this year.
We had Charlie Boyle, who's in charge of all of their DGX systems, BasePOD and SuperPOD, confirm that on stage with us yesterday, right? We are working with them very closely on a number of their RAG and inference vertical solutions. In fact, I believe we're at the forefront of all of that. The benefit we provide in this area as well is that, because of our consistency, we can run inference and training on the same platform, and we can run RAG by using Fusion to get access to real-time information rather than just historical information that's been replicated and copied.
So it's a really big change in the way that a customer will think about their current data storage environment and architecture, which tends to be highly siloed today. Now, I do wanna mention this, the other major announcement was Fusion. I skipped over this part. Fusion was announced as being backward compatible to all of our arrays that are currently in the field. It will be available in a set of stages, the first one coming out around September, stage two around the end of the year, stage three in the first part of next year.
So with that, I'm gonna turn the stage over to a series of our team to give you more information on these three areas: storage as a service, artificial intelligence, and hyperscalers, as Paul mentioned. Starting with storage as a service, I'd like to introduce the general manager of our digital experience group, Prakash Darji.
All right. Well, thanks, Charlie. So a little bit about me. I joined Pure six years ago, and my background doesn't come from infrastructure; it actually comes from building SaaS applications and SaaS platforms. I've built financial planning applications in my past, all SaaS-based financial planning, as well as platforms for developers to build on, platform as a service. And we find that storage as a platform can enable this, but only if you don't have a sunk cost. You can't have a data migration if you're gonna deliver a SaaS service. Our Evergreen architecture gives us the ability to deliver a SaaS service that gets better over time. A single operating system, where you can provision block, file, object, or whatever you need, is also required, so you can actually deliver an outcome versus having to deal with a fragmented landscape.
These are enabling technologies. Fusion is a management plane that now allows a consistent management experience, and finally, that means you can build storage as a service. So what I posit to you all today is that Pure is the only company with a technology stack that can build storage as a service, and we're gonna get into why. There are three changes in what happens when you need to deliver a service. One, you have to change how people buy. A lot of people think about this as a cash or credit problem, like, "Am I paying for consumption?" Well, yes, you should be consumption-oriented. You should pay for what you use, you should avoid over-provisioning, et cetera. Secondly, though, and a lot of people miss this, it's a change in how you operate.
When you look at Salesforce.com, it's not just, "Hey, am I paying on consumption or not?" It's a completely different experience where you're getting continual updates in CRM capabilities, and people aren't worried about operating a CRM system. In the same way, to deliver storage as a service, you can't be worried about operating storage. And finally, it's a change in how you experience it. So if you think about the experience, you need an in-product experience that actually says, "This is what you need to do to run, based on policies." Pure1, part of our platform, is where you set your policies for how you run. Now, the easiest way to show this is to think about watching movies at home. I think we've all lived through this transition before.
There was a company, Best Buy, where you could actually go and buy DVDs, right? And VHS tapes, for those of us who remember that time. And you owned the DVD. You have it, but most of us don't typically watch the same movie over and over, unless you're my children. And then Blockbuster came around, and you could rent the DVDs. It solved the problem of ownership. And Redbox changed that to a subscription as well. You could go to that Redbox kiosk, get that physical media, and put it in your DVD player, and then you can see where this is going. Netflix changed the game, right? I think we all know what the Netflix experience is. It's a change in how you experience movies.
Now, there's a recommended-for-you approach. So if you think about our storage, think Netflix. What is storage as a service? In product, it should be recommended for you. "Hey, you might have this new workload. This is what you need to do for it." That workload could be AI, that workload could be a VM. You might need more capacity, you might need more performance. So our experience as we define storage as a service is this Netflix experience. So the attributes to deliver storage as a service, first, you need to deliver flexibility. If you have a fixed appliance, you cannot do this. The Evergreen architecture, as Charlie mentioned, allows you to improve performance, improve capacity as you need. So when we deliver a service to a customer, they get a service definition, they don't get hardware.
We choose whatever hardware we need to meet that service obligation because our hardware is flexible. From an operating standpoint, that means the definition of the product is an SLA. So what is your performance SLA? What is your capacity SLA, efficiency SLA? And a service then assumes that you're moving the risk from the customer to the vendor, so it needs to be self-protecting. You have resiliency SLAs to secure your environment. And finally, it's the most efficient way of running, because in Netflix, you don't think about who's storing the movies, who's getting the movies, upgrading the content to 8K from 4K. Like, that's all the vendor's problem. And in storage as a service, it is the most capital-efficient way of running. You're not laying out capital. It's the most energy-efficient way of running. Pure has energy efficiency and labor efficiency as well.
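To make concrete the idea that "the definition of the product is an SLA," here is a minimal sketch of what an SLA-as-product-definition could look like. The field names, values, and breach logic are hypothetical illustrations, not Pure's actual service catalog or telemetry schema.

```python
# Illustrative sketch: hypothetical SLA fields, not Pure's actual service definitions.
from dataclasses import dataclass

@dataclass
class StorageSLA:
    """The SLA *is* the product; the hardware behind it is the vendor's choice."""
    performance_gbps: float      # guaranteed throughput
    latency_ms_p99: float        # guaranteed tail latency
    availability_pct: float      # uptime commitment, e.g. 99.9999
    data_reduction_ratio: float  # efficiency guarantee
    recovery_hours: int          # resiliency / restore-time commitment

def sla_breaches(sla: StorageSLA, observed: dict) -> list[str]:
    """Compare telemetry to commitments; breaches are the vendor's risk, not the customer's."""
    breaches = []
    if observed["gbps"] < sla.performance_gbps:
        breaches.append("performance")
    if observed["latency_ms_p99"] > sla.latency_ms_p99:
        breaches.append("latency")
    if observed["availability_pct"] < sla.availability_pct:
        breaches.append("availability")
    return breaches

premium = StorageSLA(8.0, 1.0, 99.9999, 4.0, 48)
print(sla_breaches(premium, {"gbps": 7.2, "latency_ms_p99": 0.8, "availability_pct": 100.0}))
# -> ['performance']
```

The design point this illustrates is the shift of risk: the customer buys the commitments at the top, and whatever hardware, upgrades, or rebalancing are needed to keep the breach list empty are the vendor's problem.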
You're not putting people on running and operating these systems. One of our customers, Options, started with us back in 2019. We've been in market now for five years, leading the way in this area. They started with one small subscription at one site, and now they've scaled to 18 PB across multiple sites around the world. And they've gotten continuous SaaS-based delivery to improve their resiliency as well as their innovation. Our unique approach is built on this Evergreen architecture, which came 10 years ago. While we introduced Evergreen//One five years ago, the ability to improve or change performance or capacity was built into our product DNA 10 years ago, and that's what lets us deliver a service.
You can't do it otherwise. Just layering a subscription model on a box will not allow you to deliver a service. With most other vendors we see, the service definition happens box by box, site by site: "Okay, it's a CapEx or OpEx question. Here's the box. How do you want to finance it?" That is not a true service. Now, the next one's interesting because we innovate in a lot of ways, and I thank Coz for this. He had a good idea. This is our contract, right? Simple, easy-to-use SLAs. This is HPE's GreenLake contract, right? So I'll leave that there. We'll pick it up later. But it's like a drug commercial with a ton of exclusions about everything that's gonna go wrong with you.
And the reason it's like that is because when you buy this Alletra, it's got this operating system, and when you buy that Alletra, it's got another operating system. And then you need terms and conditions for how this one operates and how that one operates. Without a single operating system and a single management experience, you end up doing this. And I wrote a blog about it once, and it was interesting. They used to have a 300-page PDF on their website, and apparently they took it down and disaggregated it. I tried to download those documents, and they'd been broken up into a bunch of 10-page links, with a contract lifecycle management system to go ahead and assemble the papers. So it's amazing that that's what counts as innovation nowadays. We have real SLAs.
We tell you what we do, and we do it, and because we're running a service in a customer data center, we pay for the power, rack space, and consumption we use. Whatever power we consume, whatever rack space we need, we're basically using space in a customer's data center, and we pay for it. Now, a SaaS service gets better over time, and we're continuing at this conference to introduce new capabilities. We introduced new cyber resiliency capabilities and our paid power and rack program. It turns out we initially used to give people a check. Some customers couldn't figure out how to account for a check, so we now also give them a service credit. They can choose which way they want to get the money for the energy and power we use.
But we're continually making our offering better, like a SaaS solution, and one way in which it's getting better is that it's supporting new workloads. Traditionally, we supported VMware, SQL Server, production databases, then unstructured data and backups. We've supported traditional workloads, but there's a new workload in town called AI. Our Evergreen//One for AI workload type is a new service tier that had to be designed very differently. The product manager who designed it modeled it after the water bill for his home. He said, "Hey, there's a problem. I get guaranteed throughput of water to my home.
It's based on the pipe size, one-inch, two-inch, three-inch, four-inch, and it guarantees water throughput to my home, and then I pay for the water I use." Well, if GPUs are the most expensive asset in your data center, you need an SLA to keep them utilized, in a world where people are still trying to figure out: how much do I need for training? How much do I need for inference? What's gonna change? When are my projects gonna change? So this tier is aligned to that: we give you a provisioned throughput guarantee to keep your GPUs maximized, and you pay a cold storage rate for the data you use.
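Here is a minimal sketch of that "water bill" billing model: a fixed charge for provisioned throughput (the pipe) plus a cold-storage rate for the capacity actually used (the water). The rates and the example numbers are assumptions for illustration, not published Evergreen//One pricing.

```python
# Illustrative sketch of the "water bill" model; rates below are hypothetical.

def ai_tier_monthly_cost(provisioned_gbps: float, used_tib: float,
                         rate_per_gbps: float = 400.0,    # assumed $/(GB/s)/month
                         cold_rate_per_tib: float = 10.0) -> float:
    throughput_charge = provisioned_gbps * rate_per_gbps   # keeps GPUs fed regardless of use
    capacity_charge = used_tib * cold_rate_per_tib         # pay only for data you keep
    return throughput_charge + capacity_charge

# A team unsure how the training/inference split will evolve can resize the "pipe"
# month to month instead of repurchasing hardware:
print(ai_tier_monthly_cost(provisioned_gbps=50, used_tib=800))   # training-heavy month
print(ai_tier_monthly_cost(provisioned_gbps=20, used_tib=1500))  # inference-heavy month
```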
So storage as a service allows you to choose block, file, or object, deployed on premises, in a colo, or in the public cloud with our Cloud Block Store offering, available in 30 countries and based on SLAs. The SLAs are the definition of the product: management SLAs, efficiency SLAs, resiliency SLAs, and we pay for the power and rack space we use. So with that, we've talked about traditional workloads and the new AI workloads we now support, and Rob's gonna come up and talk to you about AI workloads while I clean up this contract.
I was gonna say, All right. Well. Yes, okay. So actually, I've had now 10 people comment on the T-shirt, so I'll open the jacket. This is actually not a new shirt. I think this is circa 2014. It says, "Got 99 problems, and disk ain't one of them," and you know, Bill and Coz will come up later and tell you a little bit more about that. But what I wanna do today is spend a few minutes kind of unpacking the AI opportunity and really the three opportunities that we see developing for Pure. We've had a bit of a dialogue about this over the last several quarters.
You know, I spoke about it this morning at the keynote, but I want to go a bit deeper with this group as to how we're thinking about it. And so just to... Oop, that's me. Just to recap, we really see three areas of market opportunity created by AI for Pure. Number one, and probably the piece that gets the most focus and the most talk, is the area of storage directly attached to AI training projects: the storage that's directly feeding data into the GPUs that are working on the models. Clearly high performance, very much what we all have been talking about.
But just as important, and what we believe will become actually more and more important over time is the other two categories of our opportunity. Number two is the broader set of enterprise data we see sitting out there in, you know, in firms, in their proprietary datasets, that they'll want to go and connect into inferencing systems, into AI deployments over time. So we look at that as number two, the inference opportunity.
And then number three, just as meaningfully: as customers, as enterprises are starting to plan for the deployment of AI, as they're starting to plan for AI-powered systems, one of the most common conversations I have with customers when I go and talk to them is I'll go in and say, "Hey, so what are your AI plans?" They'll describe, "Hey, we want to go do all these great things. We want to improve our customer support, improve efficiency." And I'll ask them, "Okay, great, where does all your data sit today?" And, more often than not, I'll be told, "Well, you know, we've got about 15 different databases. Finance runs the system over here. HR has got the system.
You know, we've got three systems over here, and there's probably a whole ton of stuff I don't know about." And so what customers are realizing is that step zero is really just getting their data house in order, figuring out, "How do I bring all these disparate datasets together?" And we're seeing that AI, as a workload, as a spending priority, is now driving demand to accelerate infrastructure upgrades and modernize. So we really think about these three areas of opportunity. And so if we look within those three areas, what are we seeing? Well, number one, the training space. This is the most developed, right? As I said this morning, and as we've said before, we support hundreds of customers in their AI projects.
The vast majority would sit in the top part of that pyramid. These are customers small and large. Meta RSC is one environment of particular size we've spoken about with you before. But as well, customers of all sizes that, again, are doing direct training on these GPU environments. We are seeing over time that a lot of this training work is becoming a bit more specialized, a bit more, shall we say, concentrated in some of the more advanced enterprises, the more advanced technology firms, and certainly in the hyperscalers. What we are starting to see is that as these training projects mature, as the models mature, we're starting to see the developing opportunity and deployments in the inference space as well, and it's kind of natural.
If you think about training, that's kind of the R&D phase, right? "Hey, we've got to go build these models. They've got to mature a bit." And now we're starting to see the enterprise figure out, "Hey, what are the right ways to go connect these models to all of my firm's proprietary data, and what does that look like?" So if we step back from it, there's a strong segment up top where we serve these training projects, and we expect that to continue to grow. But we're really equally, if not more, excited about the growing opportunity in the enterprise that we see as firms start to deploy this technology within their own four digital walls, so to speak.
You know, so I wanted to take a bit of a second here and really unpack why we have this conviction, why we see these trends evolving. And it really comes down to the maturity of approaches that people have in terms of developing and deploying AI. Where it started was foundational model training. "Hey, I'm going to go take a bunch of general purpose data, I'm going to throw it at GPUs, I'm going to develop a general purpose model." You know, I liken this to you know, raising and educating a new college grad, right? They're, you know, conversant, they're highly overconfident, shall we say, in some ways, but not terribly specialized in, say, doing you know, all of your jobs, right?
The next step in this, that people kind of looked at is, "Okay, well, how do we go fine-tune those general purpose models? How do we specialize them a little bit? How do we take that new college grad and, you know, maybe we send them to a business school or, you know, train them as a, you know, an economist in grad school." They're certainly gonna do a better job of what you all do on a day-to-day basis. They're gonna have more understanding of your space.
The third approach is to say, "Look, what if we take that new college grad or somewhere in between, and we actually just give them the right context they need to do their jobs?" The models that you guys develop, the research notes that you have, all of the firm proprietary data that you have, within your four walls, well, they're gonna do the best job of all, right?
And so what we've seen is that, as these techniques have developed, there's been a bit of a shift within the broader enterprise to say, "Hey, we can get just as good, if not better, results in terms of specialization, in terms of retaining security controls, in terms of the feedback loops, the ability to integrate and adapt to new data as it's coming in, by moving more into the inference space." That said, if we look across this range of strategies, it's pretty clear that data is at the heart of this, right? Data is being used in different ways in all of these strategies, and there's a tremendous amount of data.
Whether you're working with general purpose data, whether you're specializing that to a domain, or whether you're connecting it to the you know, the huge sums of private data that's sitting within the enterprise, data is really at the heart of this. And so if data is at the heart of this, then, you know, performance is clearly key, and I think this is an area that is maybe the most, shall we say, misunderstood or mischaracterized in the AI space, right?
You know, you can go out into the market and look at the competitive set, and everybody will hold up a benchmark and say, "Hey, we're really, really good at this type of performance." When I spoke this morning, I likened this to a drag racing car, right? A drag racer is really fast off the line, but it turns out that to win a real race in real life, other things become important. You've got to turn left, turn right, you've got to brake, you have to have cornering. AI is kind of like that as well, right? If you look at the various strategies that I just walked through, performance isn't all one thing, right?
As you look at the various stages, not just feeding the training data into the GPU servers, but all the work that has to go into preparing the data, how you're gonna manage the development process of that model, saving the checkpoints away, how you're gonna manage connecting the finished model into your large sums of proprietary data, you've got a whole bunch of different types of performance you've got to work with: really high write bandwidth, really high read bandwidth, really good metadata performance, and latency becomes important. And more importantly, it's the ability to drive that high level of performance in a sustained way.
Not just to post a benchmark and crow about a number; if these systems are gonna be mainstream, they're gonna be online 24/7, and the performance has got to be sustained. And this is where Pure is really set apart, right? The fact that we can deliver this wide variety of performance at a very sustained level. We have competitors in the space that have architectural limitations. They've built their architectures on limited caches that allow them to post really good numbers for a short period of time, until you overwhelm the cache, and then you fall off a cliff. We've seen this time and time again in the storage industry, and this time is no different.
If we look at performance, it's a key element of, you know, a successful data storage platform in these AI environments. You know, there's a couple others, right? I would also call out, you know, the ability to drive that performance without saddling a customer with a ton of different disparate solutions to go manage. Charlie talks about having, you know, a single unified operating system able to be managed by a single control plane with Fusion. You know, if we think about where the customers sit today, and we think about the tremendous amount of change that's happening in the AI space, that's a huge advantage for the customer, and to not have to go manage, you know, another dozen special purpose tools just to get the job done. But also enterprise readiness.
Things like security, reliability, availability are extremely important. If you look at the amount of capital, the investment that's going into AI projects, the idea that you can run this on a shoestring budget and science projects, on unproven technology, just isn't going to work. We have other competitors in the space, as an example, that take extreme shortcuts. If you power off the data center, power off the rack, they're gonna lose data. If your whole goal is to post a hero benchmark, that'll work. If your goal is to support mission-critical environments, it's not gonna cut it, right?
When we step back from it, it's our ability to drive balanced, sustained, high performance across the board. To do it without, you know, a potpourri of different disparate solutions that have to, you know, be managed with a great deal of complexity. To do it with enterprise reliability that drives the NPS score that Charlie's talking about. To do it with a full container stack. Also, the flexibility to adapt over time, right? I talked a little bit about this this morning. This is a part of technology that's moving extremely quickly. You know, if an IT customer has a hard enough time predicting their capacity growth in a VM environment, how in the world are they gonna perfectly predict what their AI infrastructure stack should look like three, four, five years down the line?
You can't. And so the way that you ensure against that, the way you mitigate that, is you invest in a technology set that's gonna allow you to adapt, to evolve. We do that with Evergreen. We bolster it with the business models, Evergreen for AI, and the SLAs that allow you to grow and flex and build that into the consumption model. So, I wanted to close with a couple examples of customers that fit into each of the three categories of opportunity: the training, the inference, and just the general accelerant to upgrading. So let's start with training. We've talked to you all about Meta enough. I want to highlight the large GPU cloud win that we spoke about several quarters ago.
This is another large-scale GPU farm environment. We're serving a ton of training data into this array of GPU servers to be trained on. The customer, in this case, is also foreseeing a mix over time of training and data preparation, and perhaps even some inference down the line. And so they're laying out a general-purpose infrastructure that can provide that extremely high level of balanced performance, and do it 24/7 in a CSP environment. This is also an Evergreen//One customer, right? This customer saw the value that Evergreen//One provided on top of the core platform to give them that flexibility, the flexibility to shift their workloads over time, the flexibility of paying by consumption. This is a CSP.
They don't want to lay out a ton of capital ahead of revenues. Evergreen//One gives them that flexibility. All right, so let's take a look at an inference example. We have another customer, a Fortune 100 retailer. In this case, they're looking to deploy multiple AI use cases, inference, some RAG architectures, across a variety of different data types: LLMs, natural language, even some image and video as well. In this particular case, the customer is actually using the entire portfolio. They're using a little FlashArray, FlashBlade, as well as Portworx, really to serve the entire set of the environment, even beyond the GPU-connected part of the environment that we tend to focus on.
When we look at Portworx, when we look at FlashArray, that's serving all of the background services that make the inference stack work: the logging, the data preparation environments, the notebooks, all of the data pipelines that feed these environments and plumb the operational real-time data into the inference models. Also of note here is that we were chosen for a number of reasons, but one of them was our reliability. We displaced an HPC solution that was previously there because the customer found it too hard to use, unreliable, and having significant issues maintaining data integrity.
All right, I'll close with a third example, which is a customer who, as a result of embarking on an AI project, looked at their entire data estate and saw significant opportunities to modernize everything they were doing. Everything from user file shares all the way into backups. And in this case, the customer, you know, even before deciding on one or more of those strategies I showed you, realized that if they were going to be successful, they had to speed up the environments that their, you know, researchers and analysts were working in.
They realized they had to create more networked connectivity to their archives, to their backup, to their historical data, and that's where they started, by just, you know, modernizing those systems and getting their data house in order. So, with that, hopefully, this gives you a little bit of color as to how we're looking at the various pieces of the AI opportunity. Certainly, the training space we do so well in today, but also the nascent, but fast-growing, space that we see taking over long term in the enterprise. Great.
Thanks, Rob.
Yeah.
So, I'd like to introduce the next speaker, but before I do, a couple of things. One is, this is a very new space for us all to be covering, so we wanted to really provide you some deep background here. But secondly, I want to introduce someone that I don't believe you've met or spoken to before, Bill Cerreta. Bill has been with Pure for 11 years. He has a very deep background in technology, including, believe it or not, in InfiniBand. And Bill has been in charge of both the hardware and the core portions of our DirectFlash technology, really from the beginning. He's now the general manager of our hyperscaler line of business. With no further ado, Bill.
All right. Thank you, Charlie. Okay, so I wanted to give you a quick overview, and I wanted to start with the opportunity, okay? So our main push here is hard drive replacement. That's our first foray into the hyperscale. TrendFocus did an analysis and looked at the total capacity of hard drives consumed worldwide. They're projecting that in 2024, 855 exabytes of hard drives will be consumed in the world, total. That's all the hard drives in the world consumed, and 68% of them will go to the hyperscalers. That's a huge opportunity that, as of right now, is completely untapped by flash. There is no company going into this opportunity and offering a solution that can displace those hard drives.
Another thing to note is that almost all of that 68% is consumed by a very small number of companies. The lion's share is consumed by four hyperscalers based in North America. So it's a very, very fragile market, and there are only two suppliers, Western Digital and Seagate, servicing it. So not only do we feel that the opportunity is really large and that we're uniquely positioned to go after it, it's also a very fragile and brittle market: very few customers with very few suppliers. So what is our advantage? Well, we offer advantages in power, space, and waste, which are some of the biggest things that the hyperscalers are worried about today.
In power, just one year's consumption of hard drives, not the entire install base, just one year's install of hard drives, consumes over 3,000 GWh, enough to power 270,000 homes. And with our solution, we lower that consumption by 80%. Hard drives are also very bulky. They take a lot of space, and space is at a premium right now. Just those hard drives from that one year would consume 60,000 square meters of space, or the equivalent of 7 entire data centers. And on the waste side, just in failures, the annual failures from a one-year install create over 200 tons of waste. And that's not counting the refresh and recycle cycle, which is quicker with a hard drive, right? They might have to do that every five years.
With a flash solution, it might go as long as 10 years. So just on the failure count, we can lower that by 99%, and we can do even better when you take into account the refresh cycle. All right, so what are we doing? Well, the solution is Purity plus DirectFlash modules. It's not all of Purity. It's the flash management piece of Purity that's really a differentiator, and when attached to our DirectFlash modules and integrated into the hyperscale, offers this really strong, total cost of ownership advantage. As you know, we're scaling the DirectFlash modules very quickly. Our 75-TB module, which we launched last year, is already our mainstream module. It's not some esoteric, halo product. It is our main module today for our QLC opportunities.
At the end of this year, we're releasing the 150 TB, and Coz hinted at a 300 coming in the future. So we continue to innovate. And by the way, we already have line of sight to what the design will be for 600. So with this combination at the hyperscale, we can improve density and efficiency and lower power, but it's not just a matter of throwing it over the wall and they take it. We have to go through co-development with the hyperscalers to integrate into their architectures, and we're doing that now. Now, they could just use off-the-shelf SSDs, and if you're following the SSD market, you can see that those drives are getting larger. But as they get larger, they're getting harder and harder to use. A DirectFlash module does not have all the encumbrances of an off-the-shelf SSD.
All of the complicated functions of the DirectFlash module have been elevated to the host, centralized, and managed across the entire fleet within a node through Purity, which really lowers the burden on the complexity of the drive itself. We're seeing that large drives from off-the-shelf vendors now need much larger read and write sizes, and they're harder to manage. They have higher write amplification, and so we do not believe they'll last as long, and right now, they're also single sourced. So we think that we have a strong competitive advantage against that. All right. So let's talk about our advantages versus hard drives. Okay, the hard drive was first invented in the 1950s. We are at the end of the innovation cycle, okay?
If you look at the recent history of hard drives, as they've continued to scale, there's been a projection every year about when they're going to get to the next size, when they're going to get larger and larger. They're missing these forecasts by years, and the misses are compounding. And so now the rate at which they're growing in capacity has shrunk to almost nothing. By the time they get to 40 TB, we could be at 300 or 600 TB per module, using less power with higher performance. They're also getting more and more exotic, with technologies like HAMR, and the hard drive offerings from one vendor are now no longer compatible with the other vendor's.
So you have two hard drive companies that are making incompatible solutions, so there's no longer a second source. Of course, with our solution, we offer savings in space and power, and we believe that adds up to 85% lower total operating cost in a hyperscale environment, which is already quite optimized for low operating costs, and we still believe we can go much further. All right.
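As a rough sanity check on the power figures cited a moment ago, the arithmetic below reproduces the homes-powered number from the ~3,000 GWh annual figure and applies the claimed 80% reduction. The per-home consumption of roughly 11 MWh per year is an assumption (approximately a U.S. average), not a number from the presentation.

```python
# Back-of-the-envelope check of the figures cited above; per-home usage is assumed.

hdd_energy_gwh_per_year = 3_000          # one year's installed hard drives (from the talk)
home_mwh_per_year = 11                   # assumed average household consumption
homes_powered = hdd_energy_gwh_per_year * 1_000 / home_mwh_per_year
print(f"~{homes_powered:,.0f} homes")    # ~273,000, consistent with the ~270,000 cited

flash_energy_gwh = hdd_energy_gwh_per_year * (1 - 0.80)   # claimed 80% power reduction
print(f"flash equivalent: ~{flash_energy_gwh:,.0f} GWh/year")
```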
So Charlie's talked about a design win this year, and we are working closely with our lead customer, co-developing engineering, business development, working directly with them to get this technology integrated into their environments. That's going on today. And the value proposition that we're offering is resonating with multiple hyperscale customers. We're really only addressing hyperscalers that are based in North America, but the message is really resonating. They're under the largest pressure to lower power that we've ever seen.
We've always talked about lowering power, being more efficient, and helping all of our customers, but right now the power problem is more acute. Because of AI, space is at a premium, power is at a premium, and it's a bigger problem now than it's ever been before. So as I said before, we acknowledge that this isn't like creating an enterprise storage array, declaring general availability, and selling to the market that we're very familiar with. This is a completely new model. It's co-development and integration, and we've formed an entire group that focuses only on this, and we're doing that today. We believe that once we get these wins, we'll be sticky, it'll be a defensible position, and our position will get better over time because we'll continue to innovate, okay? We have innovated.
We're the world leader in QLC today, we believe, and we've learned so much about managing NAND at scale, and as we get more and more involved with hyperscale, we'll be able to tailor to their needs and leverage the knowledge that we're gaining from our standard fleet to this problem. So with that, that's the hyperscale opportunity. We really feel like we have an innovation advantage, and that we can get into the hyperscale and have stickiness and a defensible position. All right? Thank you.
Well, our next speaker needs no introduction. Coz? Coz.
You don't need a clicker. Yeah. Yeah, I also need no clicker. So, thanks for coming today. I'm gonna talk to you, try to give you a little better perspective even on a couple of things that people were chatting about earlier today. So to start with, let's just keep going from the hyperscale. As Bill mentioned, last year, we talked about 75 terabytes, 150 terabytes, and, you know, I wouldn't be complete if I didn't have them all. So this was the 75 from last year. You know, as Bill mentioned, it's our mainstream, so nothing particularly special. But last year, I only had a 3D-printed mock-up of this, and this is a 150-terabyte. Our hardware team came to me at the beginning of the year with one of these.
I said, "Great, thanks! I need to take it with me next week on a business trip." They said, "We only have three of them right now, and if you take it, it's gonna set back our GA date by a month." So I let them keep the one they showed me and had them make a new one for me the next week. I will pass this around so you can see it, but one of the things to note as well, the size of the flash chips on these two are different. It's because they come from two different manufacturers. One of the key things is we are multi-sourced on our flash supply, and it's something that if we're gonna go capture the hyperscalers, we have to continue to be multi-sourced.
We have to develop yet more sources, because that's a lot of disks we gotta get rid of, and that's what we're gonna do. So I will... You can pass this around, and you don't need that; that's just a protective cover. And I will say that this, which last year was a 3D mock-up of a 150, you'll see I stuck a 300-terabyte label on it. Now it's a 3D mock-up of next year's 300, which will look exactly like the 150. You know, we talk about these density increases, and they sound far out, they sound amazing, but you have to recognize that, for example, the flash vendors are already talking about 2-terabit dies. Western Digital talked about it a few days ago.
I saw the other day a new paper presented by Kioxia that talks about getting to 100 gigabits per square millimeter by 2027, which is roughly 4 times the density of where they are now with the 2-terabit die. The 2-terabit die is what we're gonna build the 150 on, I mean, the 300 on. So we have the visibility into the flash roadmap. We're not depending on new science or fantastic new developments the way the hard drive vendors are. So, yes, next year at this time, I expect to be showing you a 300-terabyte module, which will ship towards the end of next year.
And every time we do that, that day of flash just wiping out all hard drives comes closer and closer, and it's something we're really committed to. We have to scale up our software. As I pointed out to some people earlier today, so you go from where we shipped 10 years ago to what we're shipping later this year, and an individual array has gotten about 420 times larger. Now, I only give the developers 3x as much memory, so they have to keep getting more efficient, 'cause memory costs money, memory is something that can fail, memory, you know, has bugs when you use it. So it's something that we keep pushing on that software scale.
Every generation of the NAND, it gets harder to deal with, and so as Bill pointed out, we've been doing this for over 10 years now. We have a lot of engineers that are really good at dealing with the NAND, at understanding it, and at getting the maximum value out of it and the maximum efficiency. So that is why we're so committed, not just in the hyperscalers, but in the enterprise, to scaling that, to increasing the efficiency, improving the economics, getting rid of all the hard drives. That is a key thrust, you know, of the company. So that's enough about the NAND. Another thing I wanted to touch on is Fusion. From the beginning, when we started the company 15 years ago, one of the key things we set out to do was to build a product that was simple and easy to use.
Now, I'm gonna go out on a limb here and guess none of you have ever used a product and said to yourself, "God, I wish this was harder to use. I wish it was more complex." Well, everybody in the industry built really complex stuff, and we set out to build something incredibly simple and easy to use, and we made a giant leap forward. You know, the business card manual for running storage, instead of having, you know, something that thick, was a giant leap forward. But as people deploy fleets of arrays, and as they try to run them better, that's not easy. That is hard. And you see the cloud-based services, they, you know, deploy thousands and thousands, hundreds of thousands of systems and tons of storage, and they had to learn to do it more efficiently.
What we're delivering with Fusion is a quantum leap forward in the simplicity of managing a fleet. So, you know, you think about Fusion and what it delivers. Today, somebody has a fleet of arrays. They've got an array for their finance application, an array for their engineering, and they've got some arrays for marketing, and they have unused capacity in all that, and they have different performance needs and unused performance. They buy those arrays year after year after year, and all those arrays have slightly different features and slightly different performance levels and different configurations, and it's a big mess. You know, so you go to some large IT department, and they don't understand what they've deployed.
It makes it really hard when they say, "Oh, I want to deploy this great new cybersecurity feature," and they have 27 different storage configurations that they have to deploy it across. What Fusion does is it lets them define a much simpler storage environment. It lets them reason about it much more simply, and it lets them balance across their fleet, right? So, if you think of all the arrays in their fleet, I want to take the utilization with Fusion up from 30%, 40%, or 50% to 60%, 70%, or 80%. Yes, we'll sell them less storage. They'll get more value out of it, and then they'll turn around and buy more because they get more value out of it.
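The arithmetic behind that utilization argument is simple to illustrate. The numbers below are hypothetical, but they show how pooling a fleet changes how much raw capacity is needed to hold the same data.

```python
# Illustrative arithmetic only: how pooling raises utilization. Numbers are hypothetical.

siloed_arrays_tb = [100, 100, 100, 100]          # each array bought for one app stack
siloed_used_tb   = [80, 45, 30, 25]              # usage is uneven because stacks are isolated

siloed_util = sum(siloed_used_tb) / sum(siloed_arrays_tb)
print(f"siloed utilization: {siloed_util:.0%}")  # 45%: over half the purchased flash sits idle

# With a load-balanced pool, the same 180 TB of data plus a 20% headroom target
# needs far less raw capacity:
pooled_needed_tb = sum(siloed_used_tb) / 0.80
print(f"pooled capacity needed: {pooled_needed_tb:.0f} TB vs {sum(siloed_arrays_tb)} TB today")
```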
You think about, you know, you're trying to deploy an AI infrastructure, and what do you need? You need high performance. You don't need to get high performance out of one box that's ultra-high performance. You need to get it out of your entire fleet, and Fusion is the key to enabling that. It's the key to making that simple. It's the key to making that obtainable by people, and that's why I really view what we're doing with Fusion as a quantum leap forward in how you're gonna manage this, how you're gonna enable it. It really takes us into a whole new era of usability and of the applications that we can serve. So, you know, that's why we're really excited about that announcement.
You know, and just to talk to you about, you know, the other thing, you know, Rob was talking about AI. So again, with AI, the enterprises, they have tons of data. They do not want to make a copy of that data for AI. They want to use that same data because every time they make a copy, like, oh, you're a bank. Here's account information of my customers. Here's all this private data. I make a copy, I suddenly have to secure it. I suddenly have to control it. I have all these regulations about it. So again, what Fusion does, what we're doing, you know, with the rest of our architecture, we're enabling you to take all of the data in your enterprise, make it accessible for your AI use cases, as well as the rest of your use cases.
... So that's the groundwork of what we're trying to do to the storage industry. And then the one other thing that I want to touch on briefly is agility. Everybody underestimates the value of agility in their business. So Fusion increases agility. Another thing that increases agility is Evergreen//One, right? So you think about an Evergreen architecture. It means you can change what you've deployed. It means that when you buy it, it meets your needs, and when it doesn't meet your needs, you change it non-disruptively, seamlessly. You only pay for what you need. Now, for years, Evergreen//Forever let you do that, and then we came along with Evergreen//One, which takes that to the next level, because you stop purchasing arrays, you stop purchasing storage, and you start purchasing outcomes.
And it's a mindset that we have to keep working to change in the industry because too many people just keep viewing it as, you know, I'm buying these arrays. I'm deploying these arrays. No, with Evergreen//One, you're buying an outcome. And again, it's simpler, it's easier, it's changing the industry, and that's what we're focused on. It's what we've been focused on since we started as a company, and it's what we're going to continue to focus on: changing the industry in every one of those ways, simplicity, flexibility, agility, driving disks out. That's our goal. That's what our goal is for the next five years, and I think that's really all I had to say today. I'd like to do some Q&A now.
All right, we're going to start some Q&A while people are coming up to the stage. They're going to bring some chairs up, but in the meantime, you know, we can, we can start, and all of us are available. Wow! All right. Oh, do you want to?
Let's start with Mike here, okay?
Mike.
Great, um-
Let's get you a microphone, Mike.
Get Mike a mic.
Sorry.
Hopefully, software and SaaS is resonating. That's key to our values. I'll vamp while Mike gets his mic.
Thanks.
Go ahead, Mike.
Mike Cikos with Needham. And I guess a two-parter on Fusion. First, I know that you guys have had this out there in the market. I believe it went GA a couple of years ago for FlashArray. So could you just walk us through how this is different now, as far as how it's deployed and offered to customers?
Yeah.
And then the second piece, I think Coz might have hinted at it, but what is the thought process as far as monetization for Fusion? Is it really as simple as customers get more value, and then they buy more storage, or is there a separate uplift in spend associated with Fusion?
Yeah, why don't I start, and others can chime in? So when we introduced Fusion two years ago, for a variety of reasons having to do with tough technical problems, we decided that we would develop the core capability in the cloud alongside Pure1. And the core technical problem at the time was that we offered it only for greenfield deployment, meaning, you know, new arrays in new environments, because one of the big technical challenges was getting it to work in existing arrays where they already have a methodology for provisioning workloads, and Fusion is a new methodology for provisioning workloads, and we couldn't figure out how to avoid conflict. Well, we have figured that out.
So the second version now is that it will operate on the arrays themselves, and it'll be a software upgrade to those arrays. Our value proposition through Evergreen has always been that any software operating on the arrays is just part of the Evergreen subscription. So now it's part of the Evergreen subscription. So as Coz said, you know, there are really two value propositions for us. One is Fusion works on our products. Today, we probably have, on average, less than 10% wallet share in the Fortune 500.
We may be at 60%, but if you took the average, we're in a, you know, relatively small footprint. So this works best when you have a much larger fraction of your storage based on Fusion. So we think it's going to allow us to expand our market share, in addition to we think customers will become accustomed to now operating as a fleet, and therefore buy more.
Carol, let's go to,
I'm not in charge of-
... Howard, and then Aaron. And then I'll—we'll work our way back from there.
Great, thank you. Howard Ma with Guggenheim Securities. There are a lot of things I can ask about, but I want to ask about the AI opportunity. I think you guys do a really good job in talking about the sequence of AI, 'cause I feel like you have to understand that first in order to size the opportunity, right? So if it's everything from model training to, you know, where I guess data prep first, you know, fine-tuning RAG. And I mean, thanks for sizing the billion-dollar storage opportunity for training. I'm not going to ask you guys to try to size the inferencing and, you know, and RAG opportunity, which I'm sure is on everyone's minds.
But maybe you can give us a glimpse of, you know, through your customer conversations. So really, how are those evolving? And, you know, examples would obviously help. And how are you evolving your go-to-market, both direct sales, educating the partner ecosystem, and, you know, and maybe through those conversations, you can give us an indication of, is the inference opportunity, is it maybe 10 times or, you know, 100 times, you know, for every, you know, for every piece of data trained, how, you know, I guess-
Sure
... how many times does, you know, how many copies of data do you need, you know, to, on the other end? Otherwise, why spend all this money on the training side?
... Yeah, thanks. Absolutely. I'm glad this isn't an earnings call, so we can get three questions in at once. So a couple of pieces in there, and let me try to hit them. So I think that, you know, as we're having more conversations with customers, I would say that the maturity of both the technology and customers' realization of what is possible is moving very quickly. I'll give you, you know, just a personal example. You know, I was out in New York meeting with the CTO of a large financial services firm, probably about a year ago, and, you know, on that spectrum of options I showed you, he was saying: Hey, our firm has access to all kinds of proprietary data nobody else has.
We're going to go build out a bunch of GPU farms, and we're going to go fine-tune or train our own models, and here's all the great results we expect. I met with this person again, probably four to six months ago, and I was, you know, just checking in, "Hey, how is it going?" And he said: "Well, you know, we're kind of reevaluating our options. You know, we were in the process of building out that training program, but we had another team, you know, play around with RAG, take a look at what could we do with off-the-shelf models that were used in the right way, connected to our data in the right way.
And not only did we find that it answered a lot of the, you know, data governance, data security, role-based access control open questions they had, but number two, more importantly, it was showing better results than they had expected, right? And so I think that even for advanced practitioners in the field, the realization of what's possible has advanced very considerably in the last six months. As far as the second part of your question, what are we doing in terms of our field, our go-to-market, you know, activities to go capture this? Number one, you know, we're making this a central focus for our solutions development effort.
I think with the AI training opportunity, we focus very heavily on the core platform, our partnerships with NVIDIA, at the infrastructure level, to really meet the performance needs. I walked you through the reliability, all of that stuff at the platform level. When we start thinking about inference and RAG, I think the value of solutions and integration with ecosystem partners becomes super important, and we're making that a centralized focus, not just to go and build solutions and white papers, but really focus them to key verticals and vertical-specific use cases.
And then thirdly, you know, I would say that, you know, we're gonna couple that with, you know, dedicated expertise in our go-to-market, both our direct touch sellers and technical specialists, but also working with key partners in the space, partners that have advanced AI practices and analytics practices. So really hitting on each of those elements.
Yeah. I don't expect the bulk of our sales team to really be experts in the AI area. We do have very focused small teams, largely headquarters-based, actually, not specifically sales-based, more product experts, technology experts to work with the field in opportunities, in those opportunities, to be able to help and provide guidance for our customers.
One additional data point, you know, we are using this technology internally. We've introduced at this conference an AI Copilot. When we trained on a data set we had, which is the telemetry coming off our arrays, and we needed to move it to this copilot approach to create the vector database, one bit translated to 10 bits. Meaning, to your question about how many additional copies you need, the vector database version of that data led to a 10x increase in storage usage for our internal use, in terms of what was required to make it accessible to an NLP model.
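As a rough sketch of why a vector index can land in that roughly 10x ballpark, here is an illustrative calculation; the chunk size, overlap, embedding dimension, and metadata overhead are assumptions made up for the example, not the parameters of the Copilot pipeline described above.

```python
# Illustrative only: how chunking and embedding a corpus can inflate storage.
source_bytes = 25 * 10**12      # ~25 TB of source telemetry/text (figure from the discussion)
chunk_chars = 1_000             # assumed characters (~bytes) per chunk
overlap = 0.2                   # assumed 20% overlap between adjacent chunks
embedding_dim = 1_536           # assumed embedding dimensionality
bytes_per_float = 4             # float32 embeddings

chunks = source_bytes / (chunk_chars * (1 - overlap))
vector_bytes = chunks * embedding_dim * bytes_per_float
metadata_bytes = chunks * 1_000  # assumed per-chunk metadata plus a stored copy of the chunk text

expansion = (vector_bytes + metadata_bytes) / source_bytes
print(f"Estimated expansion: {expansion:.1f}x the source data")  # ~8.9x under these assumptions
```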
I really hesitate to try to size the market right now, to be honest with you. There are just so many unknowns. I do believe that it'll be larger than the training market. Now, by the way, the $1 billion is what we estimate was sold last year, total, you know, across the market. So, you know, whether it increases this year by 50% or 100%, I don't have a good view on that now, but I don't feel it's gonna be a $10 billion market, just to be very, very, very clear with that. I do feel that AI is going to uplevel, whether it's inference and RAG or whether it's just the upleveling of the overall data storage environment in customers. I do believe that that'll be significant.
Can I just clarify one point that Prakash made? The 10x, that's the size of storing the vector embeddings, right?
Yeah.
But then, at inferencing, running inferencing on-
You're just checking data. You just want to read data at rest.
Yeah.
And it
Will you repeat the question real quick? You mind repeating the question?
Yeah, sure, sure. I just want to clarify that the 10 times that Prakash is talking about, that's the storage size for storing the vector embeddings.
Correct.
Right? But running the inference, so running inference on the vector database could be, you know, or should be many multiples more, and that would require additional data storage. Or is that not the right idea?
So you're thinking about the right way, and, and look, I think if we step back from... I'm with Charlie. I think I would hesitate to try to size the market, but if we look at just the tons of data that enterprises have in their own four walls, it dwarfs what is out there in the public that's driving these training environments, and I think that's what we see as a larger opportunity. Because over time, a lot of that is going to be connected into AI-powered systems, but it's going to take time.
Okay, thank you.
To make this easy, Aaron, after you're done, why don't you pass to Tom, Tom to Simon, Simon to Asiya, Asiya to Pinjalim, Pinjalim to Wamsi? We'll just go right down the front row.
Go back. All right, so I have 12 questions. Joking. So I want to go into the hyperscale opportunity, as you would imagine, right? So it seems like that's a big deal. Three real quick questions. One, what has to be done yet, right, to start to see the volume deployment? Two, is this hyperscale in the context of, hey, we're the backend storage for a big cloud organization, their external service offerings or their social media platform, whatever it might be, versus their internal out-
Yeah
operational workloads? And then the third is, how do we think about this disaggregation? Are you licensing Purity? Are you selling DFM? I'm just curious if that's got to be figured out yet. How does that business model attribute to this play out?
Yeah. Let me start, and then I'll pass it. I think it's going to go to both, Bill and Kevan. First thing has to happen is we have to have a signed contract. A signed contract, you know, so, you know, we're in testing, we're in commercial discussions, but, you know, until you actually have a signed contract, you don't really have anything to talk about.
So, you know, we are expecting, well, yeah, I don't have any wood to knock on right now, but, you know, don't want to jinx it. But, you know, I think given the momentum we have, the pace of the conversations, the quality of the conversations, you know, this year is what we would expect for that. Bill will comment in a moment, you know, on what else needs to be done, but I would say there's no rocket science in what needs to be done. And then, so Bill, you may want to just-
Yeah. So, we are in the development phase, and we're going through testing. We are in the lab at the customer. We're working directly with them to integrate, which really means that, you know, we're going to them. We're making the solution easy for them to consume and integrate into their architecture without making a lot of changes. So that's the part we're in. Now, something like this still has a qualification phase that has to go on at the customer. So once we pass the current phase, we'll go into a qualification phase, that will take some time.
And then let's hit. You want to hit the business model, Bill? Yeah.
Do you want to hit that first?
Where at in this organization are you?
With the-
Oh, just to be really clear.
Yeah.
Okay, what we're looking to do is replace all of the hard drives that run all the external stuff they have. The internal IT infrastructure, that's where you'd sell FlashArrays or FlashBlades.
Right.
We're talking all of the many exabytes in all those applications that are facing outward-
Their production environment.
... in all their production environment. And to that end, you know, just to reemphasize something that Bill and Charlie have both been saying, we have to make it seamless for them to consume it. And so-
Yeah
... you know, I think if you take, let's say, the five biggest hyperscalers, the product for them would be 99% the same, maybe 99.9% the same, but a tiny bit customized for each one, because each one, if you want to go in there, you have to make it completely seamless to them.
Also, I want to be clear, you know, we're trying to be very consistent with the way we use the term hyperscaler and AI, you know, and even enterprise. And that is, if we talk about AI, it could be sold to almost any customer, even a hyperscaler, but then it would be storage that is dedicated specifically to, you know, large-scale GPU environment, dedicated to that, right? If we're talking about a hyperscaler, we are specifically talking about not things like Cloud Block Store, or if we even sold into an IT environment that was in the hyperscaler, we'd consider that just an enterprise sale. When we talk about hyperscaler, we're talking about what we would do in their production environment. So when you hear us say those words, you know, I hope you, we're trying to be consistent with those, so now you understand what they mean.
So then, from a business model perspective. Well, first of all, who we're engaging with, we're engaging with engineering and procurement, to give you a sense from that perspective. There will be two contracts, as we would expect. We would have a licensing of the software, as well as potentially the hardware, in terms of how that model might work, as well as a support agreement for the software, a long-term support agreement. So those would be the two agreements that we would potentially move forward with, in connection with this opportunity.
Thomas?
Maybe as a follow-up to that, just a clarification, rather, on the hyperscaler. I was going to ask a question about Evergreen with these hyperscalers. Will there be a services and support component to a potential hyperscaler deal?
There will be. And Bill, do you want to hit that first, and I can hit... But absolutely, it's not going to be the same Evergreen that we offer with our solutions, and maybe you can-
Right.
Go a little more in depth.
So, of course, there's a support obligation, and we will be supporting them directly, but it's not a subscription, if that's what you're asking. So it's not an ongoing subscription.
There will be some revenue in the services and support line.
There will be, and we've got to be careful in the terminology of subscription because it is a subscription. It's just different than what we have today with an Evergreen//Foundation or Evergreen//Forever. But there will still be a subscription element associated with the software support.
Got it.
Yeah.
Then maybe back to Fusion. I mean, Charlie, you mentioned it's storage virtualization. I remember hearing these terms 15, 20 years ago. This is a dream for the industry from my perspective. But you mentioned a number of times model disruptions. I think we'll maybe get some double clicks on that going forward. So maybe just talk about those stages of rollout, what you'd expect in terms of potential, you know, model disruptions. And maybe embed in the answer, we've always talked about Pure shops, right? This is the one code base, and the fact that you have this extension to now attack hard disk drives gives you a better opportunity for firms to become Pure shops.
Yeah.
Is that who maybe we're talking about? You know, just want to understand, you know, this model disruptions we might see.
Yeah. We're breaking several models, and part of the challenge is that these models have been successful models for 30 years. And so you get mindsets that are put into place. I asked the audience yesterday to suspend their beliefs, to question their assumptions. Because there are a lot of assumptions made in storage, and in particular, it's been bought and sold the same way for 30 years, meaning that it's been sold to specific use case environments and chosen by subject matter experts in those use cases.
The assumption is, yeah, I have this unique type of storage system for this use case, and I have a different unique storage system for another use case, and storage is only providing value in that use case, not across the enterprise at a higher level. Any good storage salesperson has been brought up to sell directly into that subject matter expert. Same with the channels. By the way, same with the customers. Senior-level people in the IT stack, they never want to hear about storage because when they do, it's about it being broken. And, you know, they know for a fact, and that's the problem, you know what Mark Twain said about facts, right?
You know, he said that, you know, people are, you know, very convinced about facts, you know, even those that just ain't so, right? And they are very convinced that, no, it's unique to that subject matter expert, so I have nothing to add, so they push it down there. So we have to raise it up. You know, Fusion means now that you, you know, have, in many different ways, enterprise value. It's lower cost, it's easier to manage, it's a self-service environment. Everybody wants to be a cloud-like environment, you know, for their own developers. It creates that cloud-like development environment. It's lower space, power, and cooling when power is at a premium. So, and the subject matter expert's not necessarily gonna care about those things.
It'll be a more senior... So that's a model of, you know, it's not just a technology model change, it's a mindset model change. And that's probably, right now, now that we've developed the technology or at least well on the path, you know, with clear roadmaps on this technology, I think that's the bigger challenge for us.
Yeah, and if we step back from it, I mean, just to simplify, it's like, you know, we do a really good job of making individual arrays easy to manage, but if you have 100 arrays, you buy another one, that's like incrementally more work for you to go do. With Fusion, your life actually gets easier as you buy that 101st array, right? And so it's really flipping that mindset and that model on its head.
Thanks a lot. Simon Leopold with Raymond James. I've got two. The first one is reflecting back on your discussion on power savings. One of your competitors has become, I'd say, more passionate, more vocal in arguing that those claims that you've posted in your ESG are no longer true, that their all-flash systems, even though they're solid-state drives, they're claiming lower power consumption than yours. It's more than just sort of their statements; they're blogging it and so forth, so we get this he said, she said. So anything you can do to help us understand where that is today?
Well, I'm glad to have them, you know, put it down on paper and show it. I mean, if you just look at their data sheets, it's more power. I mean, it's that simple. If you look at their data sheets, it's more power. But, you know, I'd invite, you know, any third party industry analyst to do their own testing. Be glad to... Yeah.
And then-
Just a couple of-
Go ahead, Coz
No, so a couple of things on that, and Bill might actually chime in on this one, too. So when you buy an off-the-shelf SSD, and you look at it, and it's got these specs, and it's like, "Oh, it can do like 1.5 million IOPS." The way it does that is by being higher powered. So if you look at SSDs, and you look at the spec sheets for SSDs over the last 10, 15 years, their power consumption per SSD has been getting higher. With our DirectFlash, we don't try to do 1.5 million IOPS out of one unit. We try to build for efficiency, and so our DirectFlash modules are actually using less power than SSDs per module.
The other thing, though, that's really big about it, we'll put in larger DirectFlash modules than the competitors will SSDs, because with the DirectFlash, we get far better, more consistent performance. And, you know, with an SSD, it's like, "Gee, I'm gonna go do garbage collection. Wake me up in a quarter of a second when I'm done and get your I/O." And in order to hide that bad behavior, they need more SSDs, so they actually have a lot more back-end I/O bandwidth. So generally speaking, forget like a hyperscale opportunity and a regular opportunity. If we're bidding 18-terabyte DirectFlash modules in a FlashArray//X, probably the competition is actually really bidding the 7-point, you know,
6.8
... 6.8 or the 3.84 TB SSDs. That means more SSDs, more power. It means more chassis, more power. It's a bunch of that. You get to the big DirectFlash, and then there's just no comparison. And so, you know, when you actually look at the configs that people are comparing, we're way ahead.
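A toy comparison makes the drive-count argument concrete; the capacities and per-drive wattages below are assumptions for illustration only, not any vendor's published specifications.

```python
import math

usable_tb = 360                             # assumed capacity target for the example

dfm_capacity_tb, dfm_watts = 18.0, 8.0      # assumed large DirectFlash-style module
ssd_capacity_tb, ssd_watts = 3.84, 7.0      # assumed small off-the-shelf SSD

dfm_count = math.ceil(usable_tb / dfm_capacity_tb)
ssd_count = math.ceil(usable_tb / ssd_capacity_tb)

print(f"Large modules: {dfm_count} drives, ~{dfm_count * dfm_watts:.0f} W at the drive level")
print(f"Small SSDs:    {ssd_count} drives, ~{ssd_count * ssd_watts:.0f} W at the drive level")
# Fewer drives also means fewer chassis, slots, and back-end links, so the
# system-level power gap is typically wider than the drive-level numbers alone.
```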
Yeah, and I will just add one more thing. I think it's also about validation points. You know, when we have, we've had significant customer wins, and those customer wins are in part due to the power savings that we're driving, that they're measuring themselves, right? And I do think, you know, in the event we get a design win with a hyperscaler, the driver, the primary driver for that is gonna be power and the differentiation on power. So it really comes down, in my view, in terms of the validation points we're seeing across the globe.
... Great, that, that's helpful. I appreciate those answers. The follow-up is regarding some of these newer companies that position themselves as software-defined storage, so like the WEKAs and the VASTs. How are you thinking about them as new competitors, and what is- what's different about competing with them versus the legacy players?
Yeah. So there's a real problem, in my opinion, on the term software-defined whatever. So I went through the software-defined networking wars, and at first it was defined as software on a white box. What defined it as software was that it was software, and that, you know, they'd put it on a white box, and it would work. Over time, software-defined networking came to mean something else, and I think much more valid, which is you could use software to define how you wanted your networking to operate. All right? You know, that's what Arista did. It's what other companies did. Software-defined storage today, when a customer uses it or when these other vendors use it, they generally mean, "Oh, we sell software.
Put it on your own box." It doesn't mean that you get to use your application software or even your management software to define how the storage actually works. I would argue Fusion is software-defined storage. It's not software that you buy and put on your own hardware. We don't force our customers to do that, which just adds risk and complexity and cost. We've created a system that allows the customer to write software to define how the storage works, how it works for them. So I'd argue that they're doing the same thing that, you know, Dell has done and others have done, which is separate software from the hardware and say, "Great, we get higher margins for a higher stock price. You, Mr. or Mrs. Customer, get to integrate that all on your own. Good luck."
Look, I, I'll also just say it a little stronger. You know, I think if you step back from it, they're taking that approach. At the end of the day, the question is, what are the values they're delivering to customers, right? We look at what sets us apart. They're not doing better on performance. You know, it's not an architecture that sets them up for success in terms of reliability, absolutely not in terms of efficiency. You can measure efficiency in cost dollar terms, you can measure efficiency in power and space, but it's not an architecture that's going to be successful there. And so you really have to ask yourself the question of, what problem are they solving?
I will quote the great Andy Bechtolsheim of Arista, who said, "Which one of you wants to use a software-defined elevator?"
Over to you, Asiya.
Thank you very much. Asiya, Citi Research. I know you don't want to put a TAM around the AI portion of training. What about on the hyperscaler side? Like, do you want to help us, like, figure out how big this TAM is, how it kind of... You know, I think I've asked Kevan before as well, like, how should we think about that-
Yeah
... as your revenue growth? And then, if I may, for Rob as well, on the AI journey, you know, maybe, I know you shared the anecdotes of your various customers. Just in general, how do we think about where people- where customers are in that journey? I mean, we've had a year of GPU compute, AI training. As we look into now AI inferencing and modernization, like, how, on average, where are folks in your customer base on that journey?
Yeah.
Thank you.
Let me start with the first part of the question, which is about AI in the hyperscale environment, and now I'm talking about the major hyperscalers, right? Not the GPU clouds or others. GPU clouds will buy, let's say, standard, although very, very high-powered, high-performance storage, but they'll buy storage arrays, you know, from vendors, right? What's interesting about the hyperscalers is they use their own infrastructure, because they've striped the data so wide that they can provide the kind of performance that GPUs or even large-scale GPU deployments need from their own infrastructure.
Well, if we start replacing the infrastructure of hyperscalers as Bill talked about, that will be the infrastructure not only for their core, but for their AI as well. So it's, you know, in the large-scale hyperscaler environment, we're actually, you know, we're competing for, as Bill said, the disk first, but if we replace their infrastructure, it will be in the AI environment as well.
Bill, would you go ahead and talk through this slide again? Because this directly addresses, I think, the question as well.
Yeah. So if you can imagine 855 exabytes and 68% of that, so let's call it something over 500-
Thank you
... exabytes in 2024, it's a little bit misleading to just look at the spend on hard drives. 'Cause we really need to look at the total cost of ownership. We know that the acquisition cost of these is lower than the acquisition cost of our solution, but we're winning on the total cost of ownership. The actual acquisition cost of this type of spend is on the order of $5-$8 billion annually for those hard drives, but that's only the acquisition cost.
So you're saying it's a multiple of that because the TCO is much bigger than the 5-8-
The solution cost to them is multiple times just the-
Right
... cost of the drives.
Right.
Yes.
Okay, and so that's kind of when you think about your TAM and your growth, you're thinking along multiples of that $5 billion-$8 billion.
For the total market.
For the total market.
You may want to comment on that, Kevan, 'cause the way we sell-
Well, I don't-
Yeah.
I actually want to talk about it in exabytes.
Yeah.
Yeah.
I want to talk about it in dollars.
That's why I put this slide up.
Yeah. I wanted it the other way around, and I was-
Yeah, no, I get that.
I know what the exabytes are.
Let's talk about it in exabytes.
Yeah.
That, that's what I want to talk about the opportunity, and I think that's the appropriate way to talk about it right now. Yeah, so I don't know if there's anything more you wanted to add, Charlie or Bill, on the-
No, that's right.
Okay.
I mean, until we come out with exactly what the commercial structure is, you know, which we'll announce when we can. But you know, obviously there'll be some confidential information we'll have to hold back. But, you know, in terms of the structure of the commercials, once we get there, that'll indicate what the specific opportunity is for us; it'll at least bound it in a way that's more explainable.
I also want to say that, although our initial thrust is hard drive replacement, we believe we're competitive against the SSD solutions that are being used in the hyperscalers as well. So we think that also makes it a larger opportunity.
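For readers following the arithmetic, here is a back-of-the-envelope reading of the figures quoted in this exchange; the TCO multiple is an assumption for illustration, since the speakers deliberately declined to size it.

```python
total_exabytes_2024 = 855
hyperscale_share = 0.68                                  # ~68% of capacity attributed to hyperscale bulk storage
hyperscale_eb = total_exabytes_2024 * hyperscale_share   # ~581 EB

hdd_acquisition_usd_b = (5, 8)    # quoted annual HDD acquisition spend, in $B
assumed_tco_multiple = 3          # assumed multiple of drive cost for the full solution cost

print(f"Hyperscale bulk capacity: ~{hyperscale_eb:.0f} EB in 2024")
for spend in hdd_acquisition_usd_b:
    print(f"${spend}B/yr in drives -> ~${spend * assumed_tco_multiple}B/yr total solution cost "
          f"at an assumed {assumed_tco_multiple}x TCO multiple")
```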
And then I say, I'll hit the other part of your question, which is, I think, hey, where are we in the adoption cycle in the enterprise side of AI as they're looking at inference and RAG and so on and so forth? You know, I think we're still very much early in cycle. You know, and I think it's going to take, you know, a couple of years, really for both the techniques, the application sets, the solutions. You know, this is still a very fast-paced part of the market, and, you know, all the software is still evolving very quickly.
I think the other thing I've seen, you know, pretty pronounced over the last year is, as we talk to the enterprise, you know, 9, 12 months ago, I'd go and talk to a customer, and there was a lot more just blind euphoria, right? "Hey, I've got these, you know, AI budgets. I have to go do AI, and you know, let's go figure out what to do." Now I'm having much more nuanced conversations with customers about, "Hey, how am I going to map that to, you know, a use case that's going to show value back to my organization?" You know, nine, 12 months ago, I would ask a customer,
"Okay, great. Got it, want to go do all this stuff. What are your KPIs?" And you kind of get blank looks. Now people are really starting to think through, Hey, what are the use cases I'm going to go focus on, to either drive efficiency or to drive, you know, revenue for them, or so on and so forth. But I think we're still early in cycle.
Hey, everyone, Pinjalim Bora, J.P. Morgan. Staying on the hyperscaler point, I'm trying to understand how important is it for you to hit the milestones around the DFM module, 75, 150, 300, to drive that cost curve down for a hyperscaler opportunity? Is it dependent on that? Are they already seeing some of the cost-
So far, everything we're talking about is based on the 75s. 150s just make it better.
150 even, even more compelling for the hyperscalers?
Yeah. 300, even more than that when we get to it.
Understand.
Yeah, it probably goes without saying, hyperscalers also have multiple levels of performance and multiple levels of price performance that they want to be able to make use of. So we'll probably see that layer over time. You know, they'll have some. There are different price-performance levels between 300, 150, and 75, and frankly, even smaller drives for much, much higher performance environments, and you'll probably see a range within a hyperscaler.
The key point is that as the drive gets larger, all the ancillary equipment around it, you need less of it, basically. Right? Brings down the networking needs and the server needs around the storage, so it does lower their cost.
Yeah, understood. One question, there's a lot of positivity, obviously, so I'll ask a little bit on the other side of things. I mean, I think yesterday you talked about how you were collecting 24 TB of metadata, I believe, in a year?
That was over the last year.
For the last year.
25 TB of telemetry over the last year.
So you probably have a good idea of the utilization of the fleet that you have already sold, at this point, right? And you're releasing Fusion now, and maybe this is a good question for Kevan. But as we think about the next, I don't know, next few quarters, as Fusion gets into the hands of people, and you have said a couple of times in this conversation that maybe there will be less revenue. So I'm trying to think, will there be a period of, you know, digestion as the utilization rates go up?
Actually, let me start that.
Yeah, please. I wouldn't have had an offering be released if it was a reduction to revenue. But anyway, please.
So, it's probably not well known, but almost every new release that we have of Purity itself, including, you know, we're going to be going through Release 5 of our controllers over the next six months. Almost every new release... Six months?
No, probably the R5s are more like nine to 12.
Okay, nine to 12 months. But every new release we have, we actually improve our data reduction ratios, which means that customers, again, without buying anything extra, all of a sudden get another 10%, another 15% storage on their existing arrays, and we've been doing this for years. So, you know, we, this is something that actually is already part of, A, our value proposition, but B, you know, part of our revenue profile, you know, for many years.
Yeah, yeah. I mean, the better we make the product, you know, the more we sell, because we're still not the majority in the market. And, you know, and it's not even just when we do new controllers, or, you know, new generations of DFMs or anything. Oh, we figured out how to make the software better. We improved the data reduction. We improved the efficiency.
So FlashBlade, a year ago to now, we get about 25% more usable bytes out of the larger configurations because we've improved the erasure coding, we've improved the garbage collection algorithm, so we can reduce the holdback. And we've done a bunch of things. That's value to the customer. The customers then say, "Wow, this is even better than it was when I bought it," and they want to buy more of it because they're constantly replacing old stuff that they bought from other people, and they want it. And with Evergreen, they're not replacing old stuff they bought from us, so they keep that-
So that revenue is not there either, right?
Yeah, they keep that on our subscription, and then they go say, "This is great, and I'm buying more."
Yeah. It just adds to the TCO story in a very compelling way. So we already have a really compelling TCO story with our customers, and you just think about Fusion adds another significant layer on top of that, and that's a revenue driver. And that's what, you know, when you think about our success and our market share gains, TCO has been a large piece of that, and I think Fusion just adds to that.
And by the way, Evergreen actually indicates to you that our market share is underreported because we don't have refresh cycles in the sense of decommission that array, and we're just going to give you another one, pretty much the same, but you're going to pay for it all over again. That, you know, Evergreen//Forever is not included in our market share.
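To make the earlier point about erasure coding and holdback concrete, here is a minimal sketch of how those two knobs change usable capacity on fixed raw hardware; the percentages are assumptions chosen to show the mechanism, not FlashBlade's actual internals.

```python
def usable_tb(raw_tb: float, ec_efficiency: float, holdback: float) -> float:
    """Usable capacity after erasure-coding overhead and reserved holdback."""
    return raw_tb * ec_efficiency * (1 - holdback)

raw_tb = 1_000                                                   # assumed raw capacity
before = usable_tb(raw_tb, ec_efficiency=0.70, holdback=0.15)    # assumed older release
after = usable_tb(raw_tb, ec_efficiency=0.80, holdback=0.07)     # assumed improved release

print(f"Before: {before:.0f} TB usable")
print(f"After:  {after:.0f} TB usable ({after / before - 1:.0%} more on the same hardware)")
```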
Thank you, Wamsi Mohan, Bank of America. Thanks for doing this. I guess I have one on hyperscale as well, right? So if we look at the chart you guys showed on the TCO relative to HDDs, as you're having these discussions with hyperscalers, in a hypothetical scenario, if they were to replace a large chunk of their install base of HDDs with your solution, you get these benefits. Generally speaking, you know, what is the way in which hyperscalers are going to approach this?
Because is it in an existing data center, a piece of it that will be taken away? Because otherwise, you've got this issue of a lot of this is power cooling, overhead, labor costs, which kind of don't just go away, right? Like, you replace a section of the array or a piece of the storage. The hyperscaler is actually still incurring most of that in the rest of it, so it doesn't give the full benefit of that cost. So I'm just wondering how the customer is thinking about it, and how do you think that adoption takes place?
You want me to start?
Yeah, I think-
Start.
Okay, so a few things. First of all, remember, that chart we showed is annual. It's not the install base, okay? Everything that we're talking about is forward thinking. So they're always thinking about and mapping new installs, new installs, new installs. Now, their turnover on the hard drive part of their estate is five years, right? I don't anticipate that there'll be a lot of brownfield-type installations. They'll be mostly greenfield, and... But the greenfield advantage is gonna be felt fairly quickly if you think about it. In just a few years, a large portion of the estate will have flipped over, and they'll start to see the benefits.
And I'll just add, if you think about the scale that these guys are operating at, I mean, honestly, this wouldn't be worth their effort if their end goal was to bring us in to replace 2% of their annual deployments. It just, you know, they get economies of scale and standardization and being able to do all this co-design, do the qualification, tightly integrated into their architectures, and then go take advantage of that by stamping it out over time. And so to Bill's point, right, you know, the way that hyperscalers work is very different than the enterprise, right? They're not going to say, "Hey, I built this rack design a couple of years ago. I'm gonna go refresh individual components." They're gonna go build a new generation, right? When that thing needs to be replaced.
Or even a new data center.
Or even a new data center.
Yeah, they mainly build new data centers.
Or fully decommission an existing data center.
And-
You know, a clean sweep of the entire thing. New servers, new switches, new power supplies, everything, right?
But when they do that, it's gonna be a highly, you know, tends to be a highly regular, highly repeatable design because that's where they get economies of scale from.
So, if I could follow up, I mean, if you think about that, you're gonna see these things in some massive, big chunks, right? Like, I mean, we're talking about replacing at data center scale. If we're tracking, sort of, the CapEx of these hyperscalers, Kevan, would you say this is a good way to think about, you know, how we should map over time, right, as new data centers start to pick up? Like, should we be mapping your revenue roadmap based on sort of the build-out of new data centers?
That will be-
For this opportunity, specifically.
Yeah, that will be one data point. Now, the other thing we talked about is what is the construct of the commercials gonna look like? And that's why I alluded to the fact that, you know, one avenue, and a likely avenue, would be licensing both the hardware and the software. And so you're gonna have some trade-off on the top line, if you'd go down that type of construct.
But obviously, the build-out of data centers, once you get the win, which is absolutely the first gate we've got to go clear, that would be an important data point. And in fairness, once we get a win and start getting some momentum, we will lean in, with a view on the market in terms of dollars. But I wanna wait till we get a win, we get some ramp, and then we'll come in. I just think we'll be more sophisticated in terms of what that looks like.
... And if I could just ask one more, Kevan, just on the margin profile, right? Like, generally, when people are selling anything to hyperscalers, the margin profile is usually dilutive to rest of their business. How are you thinking about the margin profile for you as it pertains to hyperscalers? Generally, for most other applications, there seems to be this margin issue, but curious how you're thinking about it.
You're talking operating margin, obviously.
Yeah, operating margin.
Yeah. So I think a couple things to think about. You know, once we get through the gate of a hyperscaler win, there will be some investments we'll need to make. And we're being, you know, very discretionary in terms of leaning in with those investments until we get the win. But there will be some investments for a period of time. Once we get through that, we absolutely expect it to be accretive to our operating margin.
Could you also say about gross margins?
Well, gross margins, you know, it would be beneficial for us. It just depends how the commercial constructs go. If we end up licensing the hardware and the software, the gross margin profile will be at, or maybe slightly improved from, what you're seeing today.
Thank you.
Yeah.
So we've made it through the first row. Are there any folks in the second row?
They were feeling left out, I'm sure.
Gonna be here for a while.
Yes.
Sorry, folks. It just was easier to walk down the line.
I think just start at the second row, pass it along.
Okay, let's go back to Aaron then.
Let's do second row.
I'm watching nobody-
Skydeck. That was where I was sat, Aaron.
I will keep going. Aaron Rakers at Wells. So, first off on the AI thing, you know, the relationship with NVIDIA, I've seen a lot of your competitors also have SuperPOD-
Sure
... relationship, DGX relationships. So can you talk a little bit about, is there any go-to-market relationship with NVIDIA, or what differentiates when you get that certification with SuperPOD, versus some of these others? And then, not to go back to the hyperscale thing, but the licensing idea is kind of an interesting business model attribute. Is it licensing on capacity deployed? Is it... How does that-
Let me start.
How is that licensed?
Remains to be seen, but I think the point that I want to make sure folks are understanding is that there, you know, depending on how the commercial constructs go, we have the opportunity, based on that, the commercials and how we're designing it, to have less top-line revenue and better gross margins, right? And licensing, I'm just using licensing to give you a flavor of that. How that ends up in the commercial construct remains to be seen. Yeah.
You know, in terms of the NVIDIA relationship, and, you know, Rob is leading our discussions there, but let me start. In terms of our relationship with NVIDIA, I would say we have a premier relationship with NVIDIA for the enterprise activities, you know, RAG, inference, packaged industry-vertical solutions, and so forth. I think we're probably the most advanced in that area. We've been working with them for six years, after all. We probably have the broadest set of verticals that are using a combination of DGX and our product. So that continues really strongly.
I will say that, for a variety of reasons, largely having to do with the fact that most of the very large-scale GPU deployments were being conducted or led by ex-HPC professionals. You know, we were not, and are not, an HPC-oriented company. So some of the HPC-oriented vendors, which were typically pretty small, got out of the gate first in terms of SuperPOD certification, etc. We still, as far as I know, have, in fact, the largest infrastructure in a giant-scale GPU deployment in the form of the Meta RSC. That's 24,000 GPUs, still the largest in the industry.
It's 100% based on our infrastructure, and we are in other Meta AI environments as well, so we have a lot of experience there. With SuperPOD certification, as you might imagine, a lot of companies, a lot of customers are doing exactly what NVIDIA advises, and not having the SuperPOD certification left us a bit out in the cold. So this is an important step for us.
Yeah, so I agree with Charlie, just to add on a few things. You know, I think as I walk through, you know, we believe we've got the best all-around storage platform. It, you know, if not the best, one of the very best, you know, out there. I think getting the SuperPOD certification is an important gate to get through with some customers who really are looking to NVIDIA, not just for validation, but also, you know, just kind of that white glove service that Charlie and I were discussing during the keynote. I would also point out that, you know, some of our competitors who, you know, are having this activity, Aaron, that you mentioned, it's really driven primarily by their GPU server business, which is an entirely different, you know, kind of ball of wax, if you will.
Thanks. Simon Leopold again, Raymond James. First, a clarification. Charlie, I think you just said you are in Meta's newest 24,000 GPU-
No, I said we're in the RSC, which is the 24,000-GPU cluster.
So you were in the original cluster, but the one they announced in March-
We're not in that one.
You're not in that one.
Correct.
Okay, that's what I wanted to clarify.
Yeah.
The question I had was, in terms of the potential commercial agreements with the hyperscalers, I'm trying to sort of understand the boundary conditions of what's possible. I think one of the scenarios that is in the realm of possible is essentially a license of the operating system, which would allow the hyperscaler to buy the NAND on its own, which would be less revenue to you, but more margin. I just want to understand if that's on the table. Appreciating that nothing's a done deal, but I just want to make sure I understand the range of scenarios. Thanks.
Yeah, yeah, maybe I'll just hit this.
Yeah.
I mean, I think as a starting point, in terms of where we are with the potential opportunity in front of us, near term, we'll be taking care of the NAND and taking care of the DFMs. The solution we have, because of Purity and the DFM, is we can go all the way to allowing a hyperscaler to control the NAND purchases, control the DFM purchases over time, and just license the software. That could easily be a solution over time, but that's over time. I think in the near term, we'll stay focused right now on us controlling the NAND, as well as the DFMs.
Over here, please, for the next question. Wamsi?
Yeah, thank you. I was curious, if you're going to let hyperscalers kind of go down that route, like, would you be willing to do that for enterprises as well? And why or why not? Because, I know, Charlie, in the past, you said you want it to be sort of one throat to choke, like, you know, performance-wise, there's a lot to be lost maybe in disaggregating the two.
Yeah.
But why does that seem to be okay for hyperscalers?
Well, there are several reasons for that. First of all, as you might imagine, hyperscalers have their own file environment, their own block environment, their own object environment, all of which will be operating above our storage, okay? That's not what enterprises want to buy. What enterprises want to buy is, you know, complete file systems, object, and block systems, right? Secondly, hyperscalers spend a lot of time architecting their environment to be reliable, to be robust; they really understand the failure conditions, right, and have software to be able to deal with all those failure conditions. Enterprises don't have that infrastructure. They want the systems to be robust, highly reliable, and so forth.
Why do we build, you know, an integrated system to begin with? How do we get 10 times higher reliability? It's because we are not old-fashioned software-defined storage, which half the time fails because a cable is loose, okay? You know, we design a system to be, you know, upgradable in place in less than an hour, and this is a highly integrated system. So we think that, in general, that's best for enterprises. Now, you'll probably point out, which I think is probably correct, there's this in-between world, okay? Between, you know, the hyperscaler 10 and, you know, the enterprise 50 or 100. It may open up some opportunities there, but it won't be the same thing.
And I'll just add on to that. I think Charlie made a good point in terms of what the hyperscalers want from a software point of view versus a component point of view. You know, I think the other thing I'd point out is a big contributor to the reliability that we can deliver is the sophistication of our supply chain. And I don't mean our supply chain just from a procurement point of view, I mean, from a long-term tracking understanding, you know, what were the lot codes of individual components that we purchased that went into this particular unit? If something fails, what are we replacing it with? Are we replacing it with something that's of the same generation, has the same firmware? A lot of that is, you know, we do that every day.
We do that every day for our enterprise customers. That's what drives the reliability for our main product lines. The hyperscalers, when we talk about the top ten, they all have very sophisticated supply chain units who can handle this. When you get into those nether regions, that's where you've got to be a little bit, you know, discriminating.
Okay, go ahead. Great, Mary.
Hi, I'm Mary Lenox with Morgan Stanley. So for the tier twos or the GPU startups, how are you evaluating whether to make the Evergreen deals? And then do you see that as becoming more popular with that set of customers?
That's a great question.
Yeah.
So, you know, obviously, when you go into an Evergreen//One deal, if it's a startup, you have to have some understanding of the prospects of the startup. You don't want to be left standing with a bankrupt company, you know, after a year. But, you know, so we do the standard, you know, analysis of companies. But for the most part, you know, we have high confidence in the companies that we extend Evergreen//One to, you know, that they'll have multi-year staying power.
Well, and we're leaning-
You know, I'd say-
... we're leaning into Evergreen//One, and maybe, Prakash, you can kinda hit the-
Yeah
... offering as well. I mean, we wouldn't have come out with an AI offering, if we weren't leaning in and thinking that Evergreen//One would be a compelling model for customers on a consumption basis, that allows them the highest level of performance in connection with their GPU usage, but at the same time, giving them cold storage rates in terms of storing the data. And I don't know if you wanted to add anything more to that, Prakash.
We've grown up as a company in our early days with some SaaS startups in and around Silicon Valley that now happen to be larger companies like Workday and ServiceNow, and LinkedIn, et cetera. Like, in the early days of our company, you know, the model of Evergreen was built and designed for SaaS-type companies that obviously need to run and operate a service in a non-disruptive, continuous operating way, right? And then we brought that kind of advent of value into the enterprise.
I believe that with a lot of these SaaS startups in a vertical orientation, with, you know, domain experts and application developers, whether it be healthcare and genomics or, you know, specific AI functions, you're going to see specific SaaS-based applications that are tied to solving domain-specific problems. I think the startup community that, you know, will emerge there needs a way of getting up and running with the same attributes that Evergreen offers. Evergreen//One has to be uniquely designed to guarantee the throughput and utilization for solving those, you know, difficult challenges of training that change with inference over time.
So when we designed this offering, we designed it with an idea in mind that a lot of the real value in the early stages of this is going to come from people building vertical applications. And those vertical applications will have a period where your requirements will be training-oriented, where you need high-throughput pipes, and as you shift to inference, you're going to move to more data and less compute, kind of in that cycle. Just like in the early days of Pure, startups can't handle a lot of that infrastructure volatility the way enterprises with a lot of money can. You need to give them the flexibility to grow, and I think this is going to enable an economy. So we designed and defined it based on those attributes for that vertical orientation.
Okay, let's take the final here from Howard. We're about out of time. Okay, we'll take two more. All right, Kevan. Wamsi, you'll get to wrap this up. Go ahead, Howard.
Great, thanks. This is Howard Ma again with Guggenheim Securities. I want to ask a follow-up, and it actually dovetails off of Prakash, the last statement you made. But the question is really, I want to take a step back and just ask higher level about IT budgets and the selling environment. And I want to posit two trends that I was hoping you guys could either, you know, confirm or deny. And the first is, I guess 2023 was peak IT optimization. So maybe, you know, there was more pent-up IT demand in general, but specifically storage demand, than in prior years.
And then if you couple that with, as enterprises are starting to scope out their AI needs, and this is where it relates to Prakash's comment, you know, you have these new SaaS apps, startups. They're gonna introduce a whole new set of applications that are, you know, going to be compute and storage intensive. So I guess, in short, is there pent-up demand, plus all this additional need, that could result in all this coming online, and, you know, we're already in the back half of the year, so maybe 2025? Yeah, I'd love to get your collective thoughts on that. Thank you.
Do you want to hit some macro thoughts?
Yeah, I would say that, let's see. I can't say that I see a lot of pent-up demand. I think demand is good, not great, you know, not a whirlwind, but I think it's... I would say on the order of the second half of last year, it's kind of continued. You know, we started to see a steady increase towards the second half of last year, and we were hoping that that would continue to increase at the same rate. We're not really seeing that. We're seeing, you know, just a modest improvement in overall demand, I would say.
I think part of it might just be, because with, you know, the Broadcom situation and AI, it's possibly putting pressure on other parts of the budget, or at the very least, creating a little bit of uncertainty in IT teams. And so, you know, we're probably in a period where uncertainty causes a bit of hesitation. Hopefully, we'll get through that uncertainty in the next quarter or two, and we'll see a resurgence. But right now, I'd say, you know, it's good, but we're not seeing a big resurgence.
I think where our demand is coming from, just to follow up on that, is from our entire portfolio. And we're seeing really good traction across really our platform that we've talked about. It's actually even more amplified in enterprise. We're having conversations with enterprise companies today of orders of magnitude that we haven't had before, and I think that comes back to really what we've done in complementing the entire data storage portfolio and platform with the introduction of our E-family, which really completed that for us.
Okay, last question. Wamsi?
Yeah, thanks, Paul. Kevan, this one's for you, on margins again. If I go back in time, right, and just look at the history of how you've really expanded your margins, you got a lot of leverage from prior sales investments that translated to faster revenue growth, and then you were able to capitalize on operating margin. As you think about the hyperscaler opportunity, right, like, clearly, your go-to-market investment, you convince these hyperscalers, "Hey, this is a fantastic return, like, you got to go down this road."
" You don't need to keep convincing them. You've got four people you're trying to sell to. Like, you know, it just feels as though the operating leverage should be massive in this case. Are we thinking about this the right way in terms of incremental operating margins being much, much, much stronger because your go-to-market investments will be kind of minimal on a relative basis?
Yeah, I think, well, you're thinking about it right at scale. But I think, again, for a period of time, there'll be investment to scale. ... But once you get through that scale piece of it, yeah, absolutely. Yeah. Uh, absolutely.
This is a, you know, medium to long-term, you know, opportunity for us, right? I mean, it'll start, assuming we actually get a design win, revenue should start next year, but it could be modest compared to, you know, follow-on years.
Charlie, did you wanna have any concluding remarks?
Sure, we can sum up. I hope this has been useful for you all. You know, we've tried to be very frank, very open, with where we are in technology and our overall, both performance and, if you will, profile now as a company. What I would ask, if you don't mind, is this: we really believe that the changes that we're making, in particular with Fusion, are a fundamental shift in data storage.
You know, analysts that have been following, not just our company, but the data storage industry, you know, for years if not decades, have come to see it in a certain way, very much in a hardware-centric, you know, profile. And very much the way that the industry itself has seen itself, which is very use case dependent, with the expectation that the more platforms that you have, the better you are. You know, what we're doing is entirely different from that. I would say that we're doing the real software-defined storage, meaning that we're allowing our customers to define the way the storage operates based on software. They can define service classes, storage classes.
They can define policies whereby data is copied, it's managed, it's tiered, it's backed up, et cetera. And they can do that all by policy rather than through manual intervention. It's a fundamentally different way to... It really is a platform. It allows companies to start looking at their data as a cloud of data, rather than as data in individual silos managed with different types of environments. So my ask is that you be part of starting to think about data storage in a very different way. I want to thank you all for your time, for coming out. We know how valuable it is, and hope you had a great time here in Las Vegas. Stay out of the heat, and hope to see you again. Well, I know I'll see you all again in about two months.