Good morning, everyone. Welcome to our 2024 Analyst Day. As we begin the day, I'd like to remind you that during today's presentation, we may make forward-looking statements and refer to certain non-GAAP financial measures. Forward-looking statements include predictions and expectations about future events and our financial and operational performance, and are subject to risks and uncertainties, many of which are beyond our control and could cause actual results to differ materially from our expectations. For more information on such risks and uncertainties and other factors that could affect these expectations, please see our SEC filings.
Non-GAAP financial measures are supplemental and should not be considered as a substitute for or superior to GAAP financial measures. You can find reconciliations of the non-GAAP financial measures referred to today to the most recent directly comparable GAAP measures in the slide deck accompanying today's presentation. Thank you, and please welcome Mark Adams, CEO.
Well, good morning, and thank you all for coming this morning, both in person here in the room at our NASDAQ marketplace, as well as those online. We couldn't be more excited to present our company to you today, not just what we've achieved, but really where we're going as a company. Today's agenda is pretty targeted, and we have a whole set of team members from our company to take part in the agenda and tell our story to you. After I give a brief set of opening comments around our strategy going forward, we're gonna have Mark Seamans, who's our VP of Global Marketing, talk to the market opportunities we have, as well as start to define some of the complexity the market faces in deploying AI.
From there, our President of Intelligent Platform Solutions, Pete Manca, is really gonna talk about the engagement with our customers and how we're building those relationships and scaling that part of the company. We're also gonna talk a little bit about the strategic nature and the role of memory in AI. And today, we have Andy Mills, who's our Vice President of Product and Technology for Memory, to articulate how memory plays such a critical role in the performance of compute and, more broadly, AI. After that, Jack Pacheco, who all of you know, is our Chief Operating Officer and is serving in a transition role while Nate joins us. Jack's gonna really kinda present the financial numbers as they relate to our business and the long-term model that we're presenting today.
I'll get up and have a couple of closing comments, and then after that, we'll certainly open it up for a session of Q&A and be able to answer any questions you might have. So again, thank you very much for coming. When I started back at the end of August 2020, that was really the end of the fiscal year, and I got to know the company. Jack was kind of a key executive sponsor for me coming into the company and getting me oriented, and one of the things you might imagine is, how do you get as much information as you can? You know, the classic CEO playbook is the first 90 days.
So I talked to customers, I talked to suppliers, had a lot of internal meetings. This was actually during COVID. I don't think we had our first real in-person meeting for seven or eight months, if you can imagine that. One of the things we also did, after about four or five months, is I wanted to go out and see what folks like yourselves thought, both existing shareholders as well as the targeted shareholders that might be interested in the company at that time. And it was really valuable. I tell this to people because it's not a very expensive thing to go do, and yet we did this study, and it was like a playbook for a new CEO and, obviously, the team.
What I heard, again, wasn't shocking, but it was a great template for me. What I heard upfront was, "You know, you guys are close to 80% memory. You know, it's just, it's not gonna attract a lot of investors. There are some people who like it, but a lot of people don't wanna play the cyclicality in memory. It's just too tough to call." There was the risk of not only Brazil, but consumer memory, which is the most volatile part of memory in Brazil, and quite frankly, at the time, it was a very consolidated business. There were five or six major customers. So a lot of risk, a lot of uncertainty in that business. Also, I remember this, the direct quote I heard was, "We don't need another holding company.
We can diversify ourselves." That was the quote that I heard at the time, and it made a lot of sense. We're a small-cap company, and, you know, you do what you do, we do what we do, and diversifying isn't something you're calling on us to do. And so I heard that loud and clear. And finally, you know, between memory and then a couple acquisitions, Artesyn at the time, if some of you have been around the story for a long time, they bought Penguin, of course. The company also bought a smaller company called Inforce, but it was kinda cloudy where the company was and what the real strategy of the company was. And if you think about all that, I left thinking, "This is amazing.
This is exactly what we should be thinking about as a group." And that's really the roadmap we used. So if I flash forward a little bit to where we sit today, you know, we have a business today that's now 30% memory, not 80%. It was, like, 77%, 76%, but we have a business with dramatically lower exposure to the broad cyclicality of memory. Not zero, but a lot less. Of course, we were able to divest the Brazil business, and it was a great outcome for the shareholders. It was just incredible at the time.
I think we sold that for a multiple that was actually higher than the multiple of our company at the time, at a time when the business had been cut in half, with gross margins and operating income negative. But we found the right strategic buyer. I wanna also say one other thing. Not only was it great for the shareholders and great for our company, it was great for the employees in Brazil. And as a CEO, that's super important to me. We have a responsibility to all of our team members, and they found the right partner to go scale their business. As you all know, we have slightly less than 20% ownership in that business, 'cause we believe in the business; it's just not where we were going as a company. But again, good outcome for us there.
Quite frankly, we've stopped the whole, you know, holding company mentality in how we think about M&A and how we think about operating the company, and we're on a path to be more of a traditional operating company model, which is exciting for us because it gets us more focused on where we're going with this great opportunity you're about to hear about today. And then finally, I would say, why we're here today: we're here to articulate our go-forward strategy, and you can see it around the room, solving the complexity of AI. Before we get into the glitz and excitement around AI, and there's a lot of that, I really wanna emphasize one thing. We're an operating company first. You're gonna see a lot of exciting innovation today, you're gonna see a lot of customer engagement activity.
You're gonna see a lot of good stuff going on today. But as a management team, we're focused on execution. One of the best lines I've ever heard about strategy is: strategy without execution is a bad strategy. And I want you all, as shareholders, to know this is what drives us every day: execution. And so when I stood up here in April 2021, a little bit over three years ago today, we were baselined at $730 million. Now, that was normalized because we took the Brazil piece out of the business just to be able to compare apples to apples. Gross margin was a little bit below 21%, at 20.7%, and operating income was about 4.6%. And so we came in with those results, and we set a long-term model at the time.
We talked about high single digits growth in top line. We talked about a gross margin target of 26%, up from 20.7%, and we talked about an operating income target of 10%, up from 4.6%. If you go back and look at the presentation, or for those of you who have notes from the meeting that day, this is what we presented. Again, from an execution mindset, I just wanted to show you how we've done against that. Top line growth in the 12%-13% range, gross margin slightly above 32%, call it 32%, and operating income at 10.4%. Again, execution mindset.
Before all the excitement around AI and scaling and all that good stuff, we have been a responsible and prudent management team, and that's the way we think about the business. Our job is to deliver returns to our shareholders and to our stakeholders, and that's the highest priority we operate the company with. And if you look at our balance sheet, if you look at our operating results, that should be loud and clear as we think about where we're going as a company. We're gonna operate this company the same way, with a prudent mindset, running it in a financially and fiscally responsible way. So, maybe I'll inject my first attempt at humor today. For the last four years, I've gone into meetings with all of you, and we only had hour meetings, right?
55 minutes or what have you, and this thing comes up, and they're like: "Hey, what is SGH? Tell me what that means." And clearly, it was a company in transformation, evolving. And one of the big highlights today, to start our communications, is that we wanna announce that we intend to rebrand the company as Penguin Solutions. And there is a process, as a public company, that we will go through, and I think we're talking mid to late fall for this to be formalized. We have to go to our shareholders for approval. We don't anticipate any issues, but we don't know, so I can't say with 100% certainty, but I'm pretty confident that we will be becoming Penguin Solutions, a company that solves the complexity of AI.
For those of you who are already thinking, well, okay, what's the symbol gonna be? We've proactively taken steps to secure P-E-N-G, PENG, for our company's NASDAQ ticker, if we get the shareholder approval. How am I doing, Anne? Okay? Good. So, this is a major symbolic milestone for us 'cause it really does highlight this transformation. Now, for some of you who've been with this company a long time, it was public once under the symbol SMOD, SMART Modular, and then it went private, and then it went public again under a memory restart, and then it bought a couple of these assets I mentioned. This was the Silver Lake management team and Ajay, who is the founder of the company. And they put the company on a path to diversify away from memory.
What we're really talking about today is the next step in the transformation, and the next step in the transformation is Penguin Solutions, positioned as an AI innovator. That is the message today as we position this: solving the complexity of AI as an innovator, as a solutions provider, a much different company than we were less than four years ago. Here's a staggering fact, or statistic, I should say. For all the noise about GPU sales, incredible hardware sales, and the like, back in 2023 only 5% of global enterprises were deploying AI. By 2026, that number is supposed to be closer to 80%. And so, not news to you, AI is pretty exciting. Maybe slightly news to you is that the world of deploying AI is in front of us.
This is the early innings, and that's the game we're in. What's this complexity we talk of? Well, that's gonna be a large part of our message today, showing you what it's like. I can raise my hand to the question that I've never worked in a data center, but if you go to a data center and you take a look at what's going on in terms of the AI infrastructure environment, it's quite different. Of course, using someone else's quote is a little bit easier, but it's radically different to implement and deploy AI than it is to run a traditional data center compute environment. Why Penguin? Why us? Today, I can say that we don't know of many other solutions providers that have been in this business over two and a half decades.
Penguin's history started out primarily in hardware, 'cause that's where the game was being played at the time, you know, laboratories, government, education. 25 years of deploying these types of systems in a data center environment, you can't bottle that up. You can't just show up as a crypto company or something like that and say, "Hey, I wanna go to AI tonight." That's just not possible. That know-how from a systems and architecture standpoint is very powerful. Beyond that, the software layer to manage the infrastructure. What does management mean? Well, in a multiprocessor environment, there are systems resources that need to be allocated and scheduled. Then there's a diagnostics component that allows you to see what's breaking down.
You know, one of the most hidden facts of data centers today is the failure rate of GPUs, and being able to know that ahead of time, proactively, through your software platform, to be able to diagnose faults ahead of time, predictively. It's all about uptime. Uptime is money in AI. And so we have this software platform that we use to drive our infrastructure, and we're continuing to invest more and more in doing so. That enables us to proactively grow our solutions capabilities and be more valuable to our customers. Managed services. These data centers, as I said, are complex, and you're gonna hear about the complexity shortly. Again, 25 years of people going on site, working with customers, knowing what's gonna go wrong. By the way, you talk to anyone who runs these data centers, things go wrong.
It's not a question of whether things are gonna go wrong; it's what you do when they go wrong to get uptime back and be at the highest performance you can be, and that's a capability, again, that we have that's very unique in the marketplace. Innovative solutions. You're gonna hear bits and pieces today, but we are leading in terms of innovation inside the data center. Cooling systems, advanced processors beyond the traditional GPUs we know today, memory solutions. Now, if you think about this, without memory, there is no such thing as compute. The biggest bottleneck today in artificial intelligence compute solutions is memory. There's not enough bandwidth for the memory to respond quickly enough to the compute workloads. It's all directly linked.
You're gonna hear about technology today that allows us to improve on that over time, and some really radical new technology we've invested in. So, slightly different than what we probably talked about in 2021, the investments in innovation we're making are substantial. Finally, as we think about characteristics of all of our products and solutions and services: high availability, high performance. It has to be available, and it has to be at peak performance. That's how we design all of our products and solutions as we go to our customers. So in short, we're positioned as a trusted advisor for our customers, allowing them to harness the power of AI. We're technology agnostic. We don't come to you and tell you, as a customer, "You have to do this or this." Sometimes we use vendor A, sometimes we use vendor B.
We're great partners, but it's really the customer environment that's gonna dictate how we design our solutions. And so being a trusted advisor allows us to really be more sticky in terms of our long-term relationships with the customers, and we've had a lot of customers that have stayed with us, not just through HPC, but now into the AI generation. Before I hand it over to Mark, we had an announcement yesterday afternoon, and there are many components to it. But what I'm most proud of, separate from the financial piece and separate from the strategic value, which is amazing for us, what I'm most proud of after that is the endorsement of a massive conglomerate company, global in nature, and the value they see in doing business with Penguin. So Jack will talk a little bit about the structure of the announcement yesterday.
What I just want to share with you today is, sometimes in agreements and announcements like yesterday's, there's kind of this one aha moment, like, "Oh, that's going to happen." We're very excited about the potential areas of collaboration we've been discussing with our partner: on the compute side, on the AI infrastructure data center side, on the software side. By the way, they're very advanced in investing more heavily in AI, both in their own organic way, but also in terms of the investments they've made, including Bloom Energy, Anthropic, Lambda Labs, and today, SGH. And so, when we think about these types of assets, it's just an enabler. Continuing down this list, the global reach. Most of you who follow the company know that the majority of our business is U.S. based for Penguin Solutions.
We have the aspiration to expand beyond that over time. Having a partner like SKT will be invaluable to that objective. We've been talking a lot about the future of AI and the role it will play at the edge with potential inferencing platforms and the like. Again, another opportunity for us to collaborate with SKT and build winning products to put us ahead of our competition. And finally, as they think about processing and memory together, working together with them to enable the highest performance and highest availability platforms.
You know, so there's a lot here to unpack on the optionality of what we can do with SK. There's a lot of potential areas for collaboration that we're discussing with them, and again, I couldn't be more excited. It's just a great endorsement for our company, and it opens up many avenues for us to explore. So with that, what I'd like to do is hand it over to Mark Seamans, who's our Vice President of Global Marketing. Mark?
Hey, thanks very much, Mark, and good morning. I appreciate the opportunity to have a few minutes to share with you today information about our market opportunity and also about customer challenges. You saw that in the agenda slide, and the reason I bring both of those topics up is because they're really inextricably linked as we talk about this wave that we're seeing with AI. It is, in fact, as Mark started to mention and as we'll get into in more depth here, the challenges of these AI solutions that create a tremendous opportunity for Penguin, because of the expertise that we're able to bring in helping customers get to their objectives of deploying the technology.
At a really fundamental level, our Chief Technology Officer, Philip Pokorny, who you'll have a chance to meet in a few minutes, has talked about how the very nature of AI workloads is fundamentally different from what customers have been doing for decades in enterprise computing. You know, with the advent of virtualization technologies over the last few decades, right? We've seen where customers have tried to take compute nodes and slice them to run multiple workloads to get more utilization out of those nodes. What we see in AI is that the size of these jobs is so big, and it's so complex, that customers want to link together hundreds or thousands of nodes to work on a single job: the training and the creation of these large language models.
And so doing that is a different set of skills, it's a different set of technologies and processes, and we'll get into the details on that today. Now, obviously, as we all know, AI is expanding across a wide array of industries, and this is a nice laundry list of industries, but it really comes to life as you get down to the next level of the use cases and the stories. And by the way, this has been something at Penguin for the last two decades. I've been, you know, fascinated, as I've worked in high-performance computing, looking at how people are using the power of compute to drive competitive advantage, and AI is a completely new wave of that. You know, we hear stories about people, you know, using AI.
I was in a meeting the other day, where someone in the insurance industry was saying: Well, we're going to take drone footage, and we're going to use AI to assess the health of rooftops within a community in order to drive a more precise insurance quote. You know, likewise, even within our own business at SMART Modular, we use AI to automate and drive up the quality in manufacturing, using AI to automatically inspect the quality of parts moving along the line. So rather than one or two big bang innovations, it's going to be a whole series, hundreds and thousands of small innovations, that AI is gonna drive to move the needle on customer business. And what we really see is AI happening in waves. Mark talked about this. We're in the early innings.
So as we think about what's happened: really, we've been engaged in the development of large language model solutions and large-scale AI systems since before this whole wave came along, going back all the way to 2017 with some of our customers. But this early wave that we've seen has been the deployment of systems targeted at the creation of these massive models, things like GPT, things like the Llama model that we've heard about. But what we believe is coming is this next wave of enterprise deployment, which is gonna stand on the shoulders of this foundational work that's been done, right? We hear about this all the time, people doing fine-tuning, right? And retrieval-augmented generation, and pruning of these models for specific use cases that relate to those industries that we talk about.
We believe, fundamentally, it's our ability to help these enterprises accelerate their way through this challenging process of getting these solutions deployed that creates this opportunity for us, for both wave one and wave two. The wave beyond that is this wave of inference. It's actually the reason you're creating the models: to deploy these models to do whatever it was that you were trying to solve. It's this inference, whether it's in the data center using today's GPUs, or, as Mark mentioned earlier, a whole host of innovative GPU and accelerator technology that we're doing early work on, that creates an opportunity.
And then beyond that, it's our ability to start to push this processing to the edge with some of the edge technologies you'll hear about today, where we can move the compute out to the point where the data's actually being generated, in order to speed processing and reduce the requirement to move the data, by actually taking the compute to where the point of interaction is occurring. But as we think about deploying these systems, think about what a customer is trying to accomplish. They've kinda got two high-level things they're trying to solve, right? If we distill the whole world down, the first thing they need to do is to get a system to production readiness, and they wanna do that on a timely basis and under their established budget.
When they get to the conclusion of this, they hit this magical day that we work towards with a lot of our clients, which is day one production. We're ready to take the system confidently into use. But that's just the beginning. As you've got these large systems in place, there's this whole process you need to manage in terms of smooth operations. And the way we do this is actually through a combination of technology, right? Which is hardware and software, and also our professional services organization that gets heavily involved with these clients. And third, through a set of very refined processes we've developed just by having done it and gotten the reps to understand it. And so when you think about getting to production, it starts all the way back at the beginning. It's processes around designing, building these systems, and deploying them.
Build is interesting for us in that the process of building an AI solution, starting in the factory and working its way all the way to the customer, is a very specialized thing; we've got a specialized factory where this occurs, and specialized teams who work on this with the customer to make it happen very efficiently. Then, when you get to operations, it's about managing that system, but beyond that, it's about anticipating the fact that we're gonna need to expand the system and eventually evolve it as new technologies, new processors, and new software technologies come into play.
When we think about getting down to the pieces that have to happen, it really runs the gamut of what needs to occur, from design to rapid deployment, to working with customers on their workloads, to maintaining performance, and then, at the end of this process, to being able to put processes in place that allow that system to evolve over time. Well, what's the downside if you don't do this correctly? This is interesting as we engage with customers, right? We talk about this concept of achieving your percentage of potential.
It's kind of an interesting concept, and in fact, we have some patent-pending technology around software that we could run in order to quantify a value for a customer's AI system: to figure out how well you're doing in terms of the potential of the system you have, what it's supposed to deliver, and what you're really achieving. But if we simplify it, really what it comes down to is they bought a lot of hardware. And by the way, this is the most expensive hardware most of these customers will ever have purchased to perform IT-related activities. So in the worst case, none of your nodes work, and in the best case, every single node you bought is ready for full production use.
Then it's how you interconnect them, 'cause remember, every one of these nodes, as we talked about, is working on a common job, so it doesn't work in isolation. If one node works well and another doesn't, the whole system is dragged down to the speed of the slowest runner. So we've got cluster performance across the network. So if you get it just right, you've got 100% of your nodes working at a 100% perfect configuration, and you're achieving, by definition, optimum AI infrastructure value. For the money you spent, whether it was $20 million or $50 million, you're able to achieve that output from the system. But that's not always the case, and customers we're getting involved with midstream on their deployments are often seeing, "Hmm, you know what we're seeing?
We got a bunch of nodes that are not able to go to production." So maybe a customer has 70% of their nodes ready to run in production, and maybe the configuration they've got working is only running at 70% of the speed it could deliver. And while you might think, "Oh, well, I'm gonna get 70% of the value of the $30 million I spent," it's actually a multiplicative effect. What you're really gonna get is 70% of 70%, which means less than half of the value of this system you spent all this money on is actually gonna come out the end.
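To make that arithmetic concrete, here is a minimal sketch of the percentage-of-potential math, using the illustrative figures from this example rather than data from any real deployment:

```python
# Percentage-of-potential arithmetic: the two losses multiply, not add.
def effective_value(spend_usd: float,
                    node_availability: float,
                    cluster_efficiency: float) -> float:
    return spend_usd * node_availability * cluster_efficiency

spend = 30_000_000   # the $30 million system from the example
available = 0.70     # 70% of nodes production-ready
efficiency = 0.70    # cluster running at 70% of achievable speed

delivered = effective_value(spend, available, efficiency)
print(f"Delivered value: ${delivered:,.0f} "
      f"({available * efficiency:.0%} of potential)")
# Delivered value: $14,700,000 (49% of potential)
```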
So what we do with this process, designing, building, deploying, and managing, in combination with our services and the software that we put in place, is to help customers, whether we start working with them from the get-go or midstream, overcome this lost value and essentially reclaim it, right? And to be able to demonstrate that for what you've put in place, you're getting the full value of it. And the cost it takes to achieve that value, relative to how much money you're spending on this infrastructure, turns out to be a fantastic ROI for customers as they do this, especially given how unfamiliar this technology is to most people.
So as we think about these topics I brought up earlier: design, deployment, workload execution, performance. What you're gonna find is that for every one of these, and this list is actually much longer than this, there are challenges that sit behind every one. You know, you've got data center power limitations that are gonna govern what you can do in terms of your system design. You've got to be able to really know: am I ready to go to production, or am I gonna hit some false starts? From a workload execution perspective, you know, one of the systems, for example, that we put in place with a large customer had 40,000 network connections in that one network. 40,000. The whole thing's interconnected, and if you have a handful of those not working correctly, the entire performance of the system comes down.
But think about that: something that spans three data center halls, and it's running slowly, and you're trying to figure out, well, where's the problem? It's this idea of being able to drill in and find the needle in the haystack to fix that and get the whole system back up to production, which is extremely challenging. We've got a lot going on, as I mentioned, with the network, and in many cases it is the fabric that holds the whole solution together, and then there's managing this change process.
So a lot to take care of, and we've talked a little bit here generically about some of these pieces, but what we wanted to do to kinda bring this to life for the group here is to invite a couple of industry experts up to the stage, and they can come on up and grab their chairs. I'm gonna invite Phil Pokorny, who's the Chief Technology Officer for SGH, and I'm also inviting up [Fadi Jabara], who is an advisor and an AI industry executive. But the reason we have Fadi as well is that he's lived this in practicality, in real life, putting these systems together and delivering them. And so before we dive into some of the Q&A, to kind of hopefully illustrate what's going on, why don't I take a second and just let them introduce themselves so you can get to know them a little better. Phil?
Hello, my name is Phil Pokorny. I am the Chief Technical Officer for Penguin Solutions, and I've been with Penguin for 23 years and have been a significant contributor to many of these large clusters that we've deployed that we're talking about today.
Hello, everyone. I'm [Fadi Jabara]. I'm one of the advisors that they brought up here to talk about Penguin, and I've got a bit of an interesting background in supercomputing and a little bit about Penguin that I'll tell you about. My background in supercomputing is I started off actually at IBM, working on a number of the top 500 systems, many of them that hit number one. I spent some time at Meta, which is probably the experience that's most interesting and specific to Penguin. I was actually part of a team that was challenging Penguin internally within Meta. So it's pretty common for Meta to do this. They would hire multiple vendors.
They would have multiple teams within Meta working on similar projects to see which one would pan out to be the best. I was actually on one of the competing teams, and that's how I got to know Penguin Solutions and some of their technologies. One of the things that I did wanna highlight, if you don't mind, just right off the bat, because Mark talked about this and I think it's so important, is this concept of redesigning the systems, which I think was actually mentioned in the quote from Meta. It's not the same as traditional IT. And I think the part there that you wanna call out is that, you know, GPUs are really expensive.
Typically, one GPU can be more expensive than the cost of the entire computer, and where the fundamental shift is coming in is that, when you think about previously when you were designing IT infrastructure, you were trying to optimize for the memory, you were trying to optimize for the CPU, 'cause those are the bulk of the costs.
Well, just one GPU can cost more than the entire computer, and you're putting 16-32 of these GPUs per node, and that's creating such a cost concentration around those GPUs. If you can't extract the value, you're really gonna have a hard time. So leading into what Mark was saying around that 70% by 70%, I think you're being generous. It could be potentially much worse than that.
Thanks. Thanks, Fadi. So what we wanted to do was just to touch on a handful of topic areas. We'll introduce them, and we'll really just invite Phil and Fadi to kinda comment. And the first one we wanted to talk about is this topic of power within the data center. Within the industry, you probably hear some of these discussions about people trying to locate data center capacity to be able to deploy. It's a big deal right now, right? Because the power consumption characteristics of these systems are so fundamentally different.
As you kinda see in this chart, going from the left, which is the simplest or most traditional enterprise deployment, where we have these CPU racks with, you know, 40 nodes per rack, you can kinda see what's going on as we move to the right with the power requirements for a single rack of compute for these systems. So what I wanted to do is invite people that have lived through it a little bit to comment. So, Fadi, maybe you can give us a little bit of info.
Sure. This is probably one of the most critical and foundational decisions that you've got to make when you're thinking about your AI cluster. It's really how the power works in your data center that you've got or potentially one that you're gonna end up purchasing or getting into, renting in, hosting in. And I want to give a little context to this with respect to traditional computing. It's very common to have a 10 kW rack. That's very common in most data centers. In there, you can fit roughly 40 computers. That's not uncommon. You'll see that. Other places can get as high as 96, but 40 is a very common number.
So you've got 40 computing nodes that sit within a rack, and most of the infrastructure around that, the networking above it, the power density that comes to it, everything is based on that one decision of having roughly 40 computers and the power consumption of about 10 kW. When you think about a GPU, a single GPU could be 500 W, or more, actually. Quick math: if you put 16 GPUs in a single computer, that's 8 kW of GPUs alone, so you're gonna end up with just one computer in a rack. Once you've done that, now you've got this one rack that's got one computer in it. It starts to become very inefficient in the way that you're using your data center space.
It becomes very inefficient in terms of how you're delivering your power. So the obvious answer might be, "Well, just increase the power to a rack." It sounds like a great idea, but there are some challenges that come with that. First, obviously, the power and cooling that come with that are significantly more challenging to deal with, and this is probably the life that Phil lives every day. It's really difficult to deal with that. And it also turns out that if you keep increasing the power to the rack, the cost of it doesn't grow linearly. Two 20-kW racks, for example, are much cheaper to run than one 40-kW rack.
So you really wanna make this design decision, in terms of where you put your power and how you apportion it per rack and per GPU, in a way that's really gonna optimize that cost extraction from the GPUs. As I talked about earlier, getting the value of the GPUs, you really wanna make that decision carefully. It's a challenging decision, and a lot of companies have to deal with it, and if you're shoehorning into your existing data center, you really need some expertise to help you think it through and make the best decision.
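To put Fadi's quick math in one place, here is a toy rack-packing calculation. The 10 kW rack, 500 W GPU, and 16-GPU node are the illustrative figures from the discussion; the per-node overhead is an assumption added for illustration:

```python
# Why a dense GPU node fills a traditional rack all by itself.
RACK_POWER_KW = 10.0     # common traditional rack power budget
GPU_POWER_KW = 0.5       # 500 W per GPU, often more
GPUS_PER_NODE = 16       # GPUs packed into a single node
NODE_OVERHEAD_KW = 1.0   # assumed CPU/memory/fans per node

node_power_kw = GPUS_PER_NODE * GPU_POWER_KW + NODE_OVERHEAD_KW
nodes_per_rack = int(RACK_POWER_KW // node_power_kw)

print(f"One GPU node draws {node_power_kw:.1f} kW")
print(f"Nodes per {RACK_POWER_KW:.0f} kW rack: {nodes_per_rack}")
# One GPU node draws 9.0 kW
# Nodes per 10 kW rack: 1 (vs. roughly 40 traditional CPU nodes)
```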
That's great. Phil, do you wanna touch on that?
Yeah, exactly right. You know, a typical data center in the past might have only been 3-5 megawatts. You know, that was kind of a typical unit you could rent from the data center providers, but now we're seeing, you know, 15 megawatts, up to 20 megawatts, starting to be table stakes. And that's been driven by these ever-increasing power levels per rack. But working with Penguin, if you have an existing data center space, even though it may be slightly less efficient, it might be significantly cheaper for us to co-design your new AI cluster to fit in some existing data center space that you have, rather than having to go rent a new data center or outfit a new data center or upgrade your existing data center.
And that's where the engineering expertise and the design that we do in our design, build, deploy, manage sort of model can be a real benefit to customers. It's sometimes said, too, that every tech talk has to have an auto analogy. I don't know if any of you have Teslas and installed a home charger, and maybe had to upgrade the electricity in your garage so that you could actually power the Tesla charger, because the Tesla draws so much electricity, so much more than your dryer, your stove, your range, your air conditioner, any of those home appliances. That's a good analogy to what we're describing here.
Yeah, it's interesting, Phil. So in our Penguin business, you know, one of the things we've seen, as Phil mentioned, is advising customers on the actual data center design, but an area where we're also getting increasingly involved is just helping customers, even if they're moving to a colocation facility, to have our finger on the pulse of what's available and to be able to assist them in that process of locating an appropriate data center to place the system that we'll work on co-designing. You know, we talked a few moments ago about criticality and networking, and the next area really is networking within these systems.
It becomes the thing which governs the speed at which your entire system is gonna run, and that's why we put up here, "The network is the platform." So what I wanted to do is ask you, Fadi, to touch on, again, what your experience has been in terms of some of the network deployments and the criticality of that and the challenges.
Yeah, I think power might be, you know, one of the more primary considerations, and the network is a very close second, if not just as important, as you mentioned there. And I think, when you talk to a lot of people, they, you know, they say, "Hey, you know, we've got a data center that's full of computers. Why isn't it a supercomputer?" And I think the thing that most people don't really comprehend completely, unless you're really in the nuts and bolts of it, is that a supercomputer really is the network. That's what makes it a supercomputer. And, you know, to give you an analogy of how that works, and I think this can maybe best describe it, we've all used calculators. If you punch into your calculator, it gives you the answer immediately, right?
So how fast can your calculator go? Well, really, it's a function of how fast your fingers are. It's not the compute engine that's the limitation, it's how fast you can input. And really, in this analogy, your fingers are the network, and the compute engine is the GPU, the calculator is the GPU. So unless you can type in faster, the GPU will always be able to compute more than you can put into it. So the network there really does matter. It is the driving force of it. And these networks are very complicated, right? They're not the common kinds of networks that you might see in a traditional IT rack, where you have one networking cable going to one computer, going up to the top of rack switch, and then on.
You'll see very complex network topologies associated with them. The HPC guys have been developing these for years, and it turns out these AI workloads are yet more complicated, because they're more chatty than some of the typical nearest-neighbor style networks that you might get in the HPC market. So I think the expertise that you gain from HPC is immensely helpful, and then it's one step up, yet more complex, on the AI side.
Yeah, it's a great point: the computation is limited by the last machine to provide an answer to the others, because they're all coordinating together, they all need to share results. And the last machine to answer is dictating the pace at which the whole cluster can operate. You know, you mentioned the virtualization from the last 10 years, where those workloads are not correlated, right? Those are client-server kinds of workloads. But instead of communicating externally to a client, these are now clusters that are designed to communicate internally, between the nodes in the cluster, and that's a dramatically different communication pattern.
Yeah, it's interesting to see, again, how the complexity of this network plays its way in. But as we move beyond the complexity of power and networking, you start to get to the actual physical process of getting these systems into play. In some cases, we work with clients from the beginning, all the way back to the design, and one of the things we do as we work through that is to put in place, before we ever get going, a definition of what it means to be ready to go to production. And then we work towards that, so that when we turn the system on in production, it's ready to go. We're trying to avoid this idea of false starts. Phil, maybe we'll ask you to chime in on, you know, the challenges involved in this and some of the things you've seen customers work through.
Yeah, this is one of the really exciting things about Penguin, is that we are kind of the full service provider, right? So it starts with design, build, and now we're sort of talking about deployment. And in deployment, it's important that you've got those design documents, you've got those build documents, so that as you go to deploy and you're starting to connect things together, everything's labeled appropriately so that you know where to plug the ends in. But then also, once it does come up, you can query the network electronically, extract the connectivity information, and be able to verify that what you actually deployed is what you designed.
That then also serves as the basis for ongoing benchmarking of the system before you turn it over to users. Then it's about finding a select set of users that you can use to work out all of your other business processes: How do you authorize users onto the cluster? How do they get their files onto the cluster? How do they get their applications onto the cluster? How do they view their results? And so having a few key, willing guinea pigs, if you will, to help you debug the cluster is an important part of avoiding a false start, where, when you go to turn it over to your wider user community, they're not asking you uncomfortable questions like, "Well, how do I get my files on?" and they're not frustrated by you saying the cluster is ready when they can't actually get in and use it.
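As a minimal sketch of the as-designed versus as-built check Phil describes, you can diff the cable map from the design documents against connectivity discovered from the live network (for example, via LLDP queries to the switches). The node, switch, and port names here are hypothetical:

```python
# Compare designed cabling against what the network actually reports.
designed = {
    ("node-001", "eth0"): ("leaf-01", "port-12"),
    ("node-002", "eth0"): ("leaf-01", "port-13"),
}
discovered = {  # as queried electronically from the switches
    ("node-001", "eth0"): ("leaf-01", "port-12"),
    ("node-002", "eth0"): ("leaf-02", "port-13"),  # miswired cable
}

for endpoint, expected in designed.items():
    actual = discovered.get(endpoint)
    if actual != expected:
        node, port = endpoint
        print(f"MISMATCH {node}/{port}: designed {expected}, found {actual}")
# MISMATCH node-002/eth0: designed ('leaf-01', 'port-13'),
#          found ('leaf-02', 'port-13')
```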
That's a great point. You know, Fadi, you've probably seen some of that as you've worked on system deployment.
For sure, and this is gonna feel completely ridiculous making this statement to this audience, but time is money. I think you all know that. And if you can't get these GPUs up and running quickly, you're losing so much of your value, right? Because every couple of years, new GPUs come out, and for the energy that they take, as we talked about with power earlier, you wanna get the latest generation in so you can do more computation. So having the ability to get these systems up and running as quickly as possible is super critical. With a traditional system, often you can have it up and running in maybe one month. That's pretty common.
I've seen these AI systems, like the last one I deployed, just plagued with bugs. You just couldn't get the thing up and running. It was four, five, six months before we finally got that thing to acceptance testing. It's very, very challenging. Having the things Phil talked about, starting off with really good documentation, seems simple, but then having the tooling that can help you get to that level of consistency, to know that everything is connected correctly and configured correctly, that's the blocking and tackling you have to have the discipline on in order to get these systems up and running quickly.
Yeah, I'm gonna come back to you, Phil, on this next topic, because it follows on to this idea of users coming in, and either they've identified, or maybe we at Penguin have identified in our management of the system, that this is not operating at full speed. But I've got literally tens of thousands, potentially, of network connections, a host of interconnected GPUs, top-of-rack switches, a lot going on, and it's this idea of being able to find that needle in a haystack many times, right? Maybe you can touch on some of the work that we've done and the experience customers have had.
Yeah, and this goes back to the percent of potential that we talked about earlier, right? As you're shaking down your cluster, you're gonna be trying to isolate, you know, why am I not getting the potential that I expect to get? And that's part of that debug process. It also then factors in with when users get on, and they say, "Well, I expected my job to run this fast. Why isn't it running that fast?" Having a set of benchmarks that you've run on that hardware, that you know you're getting 100% of the potential, so that you then can start to work back to: What is the customer's job doing differently? Why is the customer's expectation wrong? And how can we improve the performance of their application on the available hardware?
There's also, you know, you talked about 40,000 network connections in a large cluster. When one of those is wrong, how do you find it? Part of that is having electronic documentation that you can instantly compare: is the system actually wired the way it was designed, or did someone, in the act of servicing something, connect something back incorrectly? It's also understanding the interconnectedness of the various parts of the cluster, so that you can say, "Hey, okay, we can isolate this problem to this set of machines. What's common about that set of machines? Or what has changed in that set of machines?" The information about the cluster over time, but also about the way that it's built, is really the key to finding and isolating those problems.
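A small sketch of the "what's common about that set of machines?" triage Phil mentions: collect the attributes of the misbehaving nodes and surface any value they all share. The inventory data is made up for illustration:

```python
from collections import Counter

# Hypothetical per-node inventory pulled from cluster records.
inventory = {
    "node-101": {"rack": "r1", "firmware": "2.1", "leaf": "leaf-01"},
    "node-102": {"rack": "r1", "firmware": "2.0", "leaf": "leaf-01"},
    "node-103": {"rack": "r2", "firmware": "2.1", "leaf": "leaf-02"},
    "node-104": {"rack": "r2", "firmware": "2.0", "leaf": "leaf-02"},
}
slow_nodes = ["node-102", "node-104"]

# Any attribute shared by every slow node is a candidate root cause.
counts = Counter()
for node in slow_nodes:
    counts.update(inventory[node].items())
suspects = [kv for kv, n in counts.items() if n == len(slow_nodes)]
print("Shared attributes of slow nodes:", suspects)
# Shared attributes of slow nodes: [('firmware', '2.0')]
```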
Great. Fadi?
Yeah, you know, Phil kinda covered one half of the story on the hardware side, but you've got equal problems on the software side as well. With 40,000 cables, you have 40,000 operating system settings that you've also gotta deal with. It feels like there's just an endless number of variables that you have to deal with. Disk performance could be slow, network performance could be slow, it could be the hardware, it could be the software, and this kind of analysis is quite tricky to do, right? It requires a level of expertise that is beyond your traditional, standard operations team, or your standard SRE team, as you might have in a normal business. You need real expertise.
You really need HPC expertise to drive these answers, and even the best can't do it without the tools. So often, what you'll see is people investing heavily in tooling up front, writing custom tools themselves, working with partners to get the tooling, and maybe leapfrogging ahead a little bit to take advantage of the tooling that their vendors have. But at the end of the day, it's just a sea of choices that you have to make, a sea of settings you have to set, and then you need to make sure every node in the cluster has got the exact same settings. Otherwise, it's impossible to track down these bugs.
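As one minimal illustration of that consistency check, you could fingerprint each node's collected settings and flag any node whose fingerprint differs from the majority. The settings shown are stand-ins for whatever you actually gather (sysctls, driver versions, BIOS options, and so on):

```python
import hashlib
import json
from collections import defaultdict

# Hypothetical settings collected from each node.
node_settings = {
    "node-01": {"hugepages": 1024, "nic_mtu": 9000, "driver": "535.129"},
    "node-02": {"hugepages": 1024, "nic_mtu": 9000, "driver": "535.129"},
    "node-03": {"hugepages": 1024, "nic_mtu": 1500, "driver": "535.129"},
}

def fingerprint(settings: dict) -> str:
    # Stable hash of the settings so identical configs group together.
    blob = json.dumps(settings, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

groups = defaultdict(list)
for node, settings in node_settings.items():
    groups[fingerprint(settings)].append(node)

majority = max(groups.values(), key=len)
for nodes in groups.values():
    if nodes is not majority:
        print("Outlier nodes to fix before they drag down a job:", nodes)
# Outlier nodes to fix before they drag down a job: ['node-03']
```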
Yeah, I'd like to highlight that, too. You know, Pete will talk about this a little more later. Our Clusterware software, with its patent-pending AI module, actually monitors the cluster on an ongoing basis, and that consistency, making sure that all the hardware in the cluster is consistent, is one of the key benefits of running something like Clusterware.
Yeah, thanks for bringing that up. I was gonna go to that point as well. It's that software that we've developed to find the needle in the haystack, which is also, you know, a powerful piece of how we help these customers maintain that high level of operating efficiency. Maybe the last topic we'll touch on really is, after all of this, after you get into stable production, you're really at the beginning of the journey, not at the end. You know, as part of our services engagement with customers on these systems, increasingly, the proposal we put to customers is for multi-year services engagements that go beyond day one production, to help them all along the journey of operating it, so that they can use AI versus struggling to run AI infrastructure.
So what I wanted to ask as we kinda close, and maybe I'll start with you, Fadi, is: what are some of the other things people think about in terms of operating the system at scale, when the thing's a fully living, breathing thing that you have to manage in a production world?
Yeah, this is really interesting. I think this is where the hard part starts, to be honest, right? You can get a cluster up and running. You're happy. You've got it configured well, and you think, "Hey, I've got all this value sitting in front of me. How do I extract it?" And then you quickly find out, stuff breaks. When you've got hundreds of machines like that, at any point in time, stuff is broken. You're never 100% healthy, right? So the strategies that you have to manage and deal with broken systems, this is kinda where the expertise of running these things for a while really matters. You know, on previous systems that I've built, we've had replacement strategies, where you've had hot spares sitting on the side. It can get very expensive.
As I mentioned, those GPUs are very expensive. These could be $100,000 nodes, easy. Having a few of them sitting on the side doing nothing is not necessarily great. Understanding how to manage your system, making sure that these long-lived jobs are running well, and if they do fail, knowing how to pick up where you left off, so having that understanding of how to do the checkpointing, that kind of thing really matters. And I think, at the end of the day, when you operate these, again, it's not your traditional SRE or operations teams that are gonna do it.
The teams that are really gonna do well here are the ones that have the expertise, that come from an HPC background, 'cause it's the most similar to it, and then shifting their thoughts over to what AI brings and how it's a little bit different, to get that consistency and, you know, the configuration and to make sure that you've got good strategies for when failures do happen, 'cause it is inevitable they will happen. Having the good strategies in place is really what matters here.
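A toy sketch of the checkpoint-and-resume strategy Fadi describes for long-lived jobs: persist progress periodically so a failure costs one interval, not the whole run. The file name and interval are arbitrary, and real training frameworks provide their own checkpoint APIs:

```python
import json
import os

CKPT = "job.ckpt.json"     # hypothetical checkpoint file
TOTAL_STEPS = 1_000_000
CKPT_EVERY = 10_000

# Resume from the last checkpoint if one exists.
start = 0
if os.path.exists(CKPT):
    with open(CKPT) as f:
        start = json.load(f)["step"]
    print(f"Resuming from step {start}")

for step in range(start, TOTAL_STEPS):
    pass  # ... one unit of training work would run here ...

    if (step + 1) % CKPT_EVERY == 0:
        # Write-then-rename so a crash never leaves a torn checkpoint.
        tmp = CKPT + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"step": step + 1}, f)
        os.replace(tmp, CKPT)
```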
That's great. Phil?
Yeah, you know, highlighting again, you know, the issue of consistency, and making sure that as machines get pulled out for maintenance, when they go back into production, they've been upgraded to whatever is then the current version of firmware, of software, all the rest of it, so that you don't get a mismatch. That's another one of those difficult problems to solve, and it's important that your management tools are making sure things are consistent. But also, speaking to the future, right? As the CTO, I'm thinking forward to, you know, what can we do to bend the curve so we don't need a 40-megawatt data center, right? Are there technologies? We are working with a number of companies that are trying to create more efficient, more-FLOPS-per-watt types of accelerators that could potentially replace GPUs in some of these applications.
Or, you know, targeting AI specifically, for edge, for inference, for training. And what are the technologies that we're going to, as you say, evolve and expand these clusters with over time? Because this is just gonna be your first cluster. We wanna be there, as Mark has said. You know, we wanna be with you for the long term. We wanna help you deploy your second and third and fourth cluster.
That's fantastic. So to wrap up this discussion, I'm gonna thank you guys, and I really appreciate the input that you shared. You can see, right, that operating these systems today in a production environment introduces a myriad of issues for customers. But what we're also trying to do at Penguin is work at the tip of the spear on the next-generation technologies that are coming, that are gonna help customers meet the challenges of AI as they move into the future.
So as we wrap up this section, what I wanted to leave you with is a little bit of a case study that was developed with both a great customer of ours, Shell, and a great partner of ours, AMD. It goes to something Phil talked about: this idea of advanced cooling technologies and some of the innovative technologies that are coming, that we're gonna start to see, I think, on a more repeat basis as AI evolves into the future. So, before we proceed to the next section, we'll go ahead and take a minute to hear the story.
Shell has set a target to become a net zero emissions business by 2050. We have several roles to play. We are a producer of energy, we are a user of energy, and we are a seller of energy, and therefore, a partner of change. Producing immersion coolants is an important part of that, because that allows us to sell more products that help other industries emit less carbon.
Today, 99% of data centers are air-cooled, which means that from an efficiency and sustainability perspective, it's not the best.
When you're developing next generation solutions, the theory is one thing, but what you really want is you want to see it in practice. Having this immersion cooling solution deployed in our data center really allows people to see that it works on the ground.
What we do is we take a tank of coolant, and we take the server, and we rotate it, and we immerse it fully into the coolant.
The analogy that I would actually use is, imagine on a hot day in Houston, standing in front of a handheld fan to try and cool yourself off, versus jumping in a pool. The most effective way of cooling off in that moment is jumping in the pool, and for me, immersion cooling is very similar. So: air cooling versus immersing. You put your servers into that liquid, and you take the heat out a lot faster.
This cluster that we've built with AMD EPYC technology and immersion cooling is an example of a very dense cluster that would be impossible to build if we didn't immerse it.
If I had tried to do this with air cooling, I would need three times as many racks, which means I need three times as much power distribution, three times as much floor space, three times as much networking. Being able to keep a small footprint is critical.
I think the power of collaboration and partnerships like the one we celebrate today with Penguin, AMD, and Shell, is to truly demonstrate that these solutions work, and how can we leverage technology in a way that is fully sustainable and ensures that we can continue to progress as a population?
That's a great video. Hi, my name is Pete Manca, President of IPS. I've been with the company all of three months now. Prior to coming to SGH Penguin, I was SVP and GM at Dell for Dell's APEX business, as well as their hyperconverged and converged businesses. I was running large P&Ls there. I saw the opportunity at Penguin, I saw what was going on in the market, and I saw the people, the technology, and the processes that Penguin had, and I thought this was just a tremendous opportunity, and one that I wanted to be part of. I'm here to talk to you about how we drive success for our customers and how we overcome the challenges that these guys have talked about.
They did a great job of talking about all the different challenges that face customers today in this AI environment. Penguin has the processes, the people, and the solutions to overcome these challenges. And this is what we do. We live it day to day, with 25 years of experience overcoming these challenges. I'm going to come back to this slide real quick. Mark described the loss of value from not having the correct node availability and not having the correct cluster performance. This is really poignant. I think this is the most important slide we put up here: how much customers lose in time to value and in actual revenue opportunities.
If you think about these systems, as Fadi described, they're hundreds of thousands of dollars per server, and you've got 4,000 of them, 16,000 of them. You're talking millions, tens of millions, in some cases hundreds of millions, of equipment. If you're losing half of that value, you're losing a lot of money, number one. Number two, you're losing the opportunity to take advantage of what you bought those systems for in the first place. So this is very, very important. How do we go about solving this? First thing we do is sit down with the customer and talk about the best technology choices for the outcome they want. We design it right from the start. After we get that design down, we run a very predictable deployment process.
We take the software that Phil mentioned, we leverage it to help deploy the systems, manage and monitor them, and then we wrap it with expert services. Again, 25 years of experience developing and servicing these types of systems. That's fundamentally how we provide time to value for our customers. In short order, we deliver assured AI infrastructure, so our customers can have accelerated AI outcomes. That's really what we're about: getting our customers up and running, getting them to value faster, so they can start leveraging these AI systems for their business needs. So how do we do that? I'm gonna talk about two fundamental approaches. First, for customers who want an easy button, we have ready-to-run AI infrastructure called Origin AI.
Origin AI is a prepackaged, predefined set of resources that allows our customers to get up and running very, very quickly. For customers that want a more custom option, we'll work with them and tailor the solution with specific technologies and different partners for their environment. We'll talk about each of these in a bit. Underpinning all of this is our unified process: design, build, deploy, and manage. You're going to hear a lot about that; I'm going to repeat it a lot. Design, build, deploy, and manage. That's fundamentally what we're about. We take the customer from start to finish with their AI implementations through this process. Let's talk about ready-to-run, or Origin AI, first.
These prepackaged solutions come with all the hardware, software, and services you need, the sparing, the warranties, all of it prepackaged. A customer can start from a relatively small 256-node implementation and scale up to a 4K-node implementation, incrementing along the way: start small, grow the business, go larger. It's faster time to value. It's prepackaged and standardized, which allows a much simpler deployment and environment, and it provides a good-margin solution for us as well. So think of these as prepackaged solutions our customers can purchase and get up and running quickly. We announced our first one with NVIDIA recently, about a month ago, and we'll roll these out with additional technology vendors in the future.
For customers who want a more bespoke, tailored option, we'll go in and work with them on their configuration requirements and application needs. We work with multiple partners, and there's a sample of them here; it's not a complete list. We bring together these very complex systems, integrate them, and get them up and running quickly for our customer. I'm gonna come back to this again and spend more time on this slide. This is really fundamentally what we do at Penguin: design, build, deploy, manage. First thing we do is sit down with the customer, figure out their requirements, and help them design the correct solution to solve their business need. We use that experience.
We've had 25 years of experience designing the correct systems. Once we've designed it, it goes into build. You'd think build is pretty simple, just putting these things together. It's not simple at all; this is one of the more complex parts. We're taking thousands and thousands of nodes, pre-building them in our factory, pre-wiring them, pre-configuring them, and testing them there. That takes out the complexity of having to do it at the customer site. Typically, you might drop ship your equipment to the customer site and send your service people in to try to wire, rack, and stack it there. That's nearly impossible. So we take that complexity out, which reduces the risk significantly.
We build it in our factory, test it, ship it racked to our customers, and then deploy it. That gets them up and running much, much quicker. That's the assured AI capability I talked about; the time to value is much faster. We'll project manage the installation and get them up and running. Think of that as getting to day zero or day one operations. After that, it's the ongoing management, day two and beyond, and that's where our managed services capability comes in. We manage the systems for them. We monitor them. All the things the team talked about with regard to complexity, root cause analysis, determining what the issues are, and where your performance might be a challenge, our software can monitor for you and pinpoint the failure, and our managed services will help correct it.
So again, take one last look at this: design, build, deploy, and manage. Fundamentally, this is what Penguin Solutions does. You heard Mark talk about the waves of AI, and I'll skip right to this slide. If you had a You Are Here marker, it would be right around the middle. I think we're starting to see enterprise customers adopt AI and start fine-tuning those models. The next massive wave coming is inferencing at the edge. In terms of scale, you're looking at hundreds of deployments for large language models, maybe thousands or tens of thousands for enterprise, and hundreds of thousands or more at the edge. And think about these systems at the edge. I'll skip to the next slide.
They're just as critical, just as important as those that run in your data center. In fact, if you look at this IDC quote, 77% of organizations define their edge workloads as either highly critical or critical to their business needs. So your infrastructure at the edge has to run as pristinely as it would inside a data center. It's got to have mission-critical capabilities, but at the edge. Think about how hard that is. And when you talk about the edge, it could be as simple as retail point of sale, and I'll give you a case study on that in a bit. It could be oil and gas.
It could be military operations, where you need a much more ruggedized environment, and you don't have IT staff out there. You don't have IT staff everywhere at the edge, so your systems had better run. They'd better be highly available, easily repairable, and remotely monitored. Those are the important things you need at the edge, and Penguin can help our customers there as well. We have multiple options at the edge. The first I want to talk about is our ztC Endurance platform. It's our fault-tolerant platform, seven nines of availability, targeted at customers who can't tolerate downtime. For customers who need something more ruggedized or hot-swappable, we have our ztC Edge products.
These products are really built for remote, ruggedized environments, all managed remotely through our software, which I'll touch on in a bit, and by our services team. So we have an entire portfolio of edge products, and we're gonna continue to build it out, because as these waves continue to progress, we really feel this is going to be an important one, and we're really well positioned to take advantage of it. Let's talk about a case study, an edge case study. Dis-Chem is a pharmacy and retail company in South Africa, and they needed always-on capability for their point-of-sale and inventory tracking systems. They couldn't tolerate downtime, and they had a mishmash of systems, networks, and storage.
Today, they've consolidated down to one ztC Endurance box per location, 250 of them across their locations in South Africa. You can see the quote here from their Head of Infrastructure: "And no downtime in the Stratus-provided hardware." That was critical for them. These stores have to be up and running all the time. You need edge devices that act like the mission-critical components inside your data center, and that's what the ztC Endurance products do for you. Let's talk software a bit now. You can't do all this without really efficient, enterprise-class software. You could throw a million people at the problem, but that's not efficient. What you need is software that helps you monitor and manage your systems. Phil touched on ClusterWare.
We'll start there. ClusterWare is our software that helps you design, test, and burn in your platforms, get them up and running with the correct software installed, and then manage them ongoing in the data center. ClusterWare is really the foundational piece of this software suite. We also have a patent-pending technology called the Shared Infrastructure Modular, AIM. Think of it as using AI to manage our AI installations: it does GPU utilization analysis, workload analysis, and predictive failure analysis, so it's really monitoring the environment, making sure it's up and running, and trying to predict where failures might be. Again, using AI inside to manage AI.
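For a flavor of what predictive failure analysis like that can look like, here is a minimal sketch. The metric names, thresholds, and scoring are assumptions for illustration only; they are not Penguin's actual AIM logic.

```python
# Minimal sketch of a predictive-failure heuristic over node telemetry.
# All thresholds and weights are invented for illustration.

from dataclasses import dataclass

@dataclass
class NodeTelemetry:
    node: str
    gpu_util: float          # average GPU utilization, 0.0-1.0
    ecc_errors_per_day: int  # corrected memory errors observed
    max_temp_c: float        # peak GPU temperature, Celsius

def risk_score(t: NodeTelemetry) -> float:
    score = 0.0
    if t.ecc_errors_per_day > 0:
        score += 0.5   # corrected errors are a leading indicator of failure
    if t.max_temp_c > 85.0:
        score += 0.3   # sustained high temperature accelerates wear
    if t.gpu_util < 0.5:
        score += 0.2   # underutilization can signal a straggler node
    return score

fleet = [
    NodeTelemetry("gpu-0001", 0.92, 0, 78.0),
    NodeTelemetry("gpu-0002", 0.41, 3, 88.5),
]

for t in fleet:
    if risk_score(t) >= 0.5:
        print(f"{t.node}: flag for proactive service (score={risk_score(t):.1f})")
```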
Going around the circle, Penguin Cloud Central: our customers leverage Penguin Cloud Central for bursting to the cloud. If they have workloads that need to burst to a public cloud, they can use it to burst those workloads for excess capacity when they need it. And then finally, the Fault-Tolerant Active Service Network is our remote management software; it allows us to monitor and manage the systems out there on the edge. It's a very comprehensive software suite, something we're gonna really lean into going forward, and it's very important that we continue to innovate in this area, because it really sets us apart. All of this is wrapped by a comprehensive end-to-end services portfolio: professional services to help customers design their systems, customer care to work with customers if there are any issues, and finally, managed services to manage these environments ongoing.
So a full end-to-end suite of services capabilities that we can leverage for our customers. We're also a certified NVIDIA DGX-Ready Managed Service Provider. That's important; we're actually one of the few in the world that are NVIDIA certified today. That's a feather in the cap for the Penguin Solutions teams who have been working on these installations and with these customers for years. Let's do one last case study, because this brings it all together. Voltage Park, an AI cloud service provider, needed to deploy 24,000 GPUs at scale. They needed help managing this. They came to Penguin and said, "We need your software, we need your services, we need your capability to manage this environment so we can sell these services to our customers.
Can't do it without you." So we went in with an H100 solution across four data centers, leveraged our ClusterWare software to help design and build the clusters and then manage them ongoing, with our professional and managed services helping them run the environment. I'm gonna show you a quick video from the CEO of Voltage Park.
Voltage Park is a cloud service provider. We have over 24,000 H100 GPUs deployed across the United States. So the mission of Voltage Park is to democratize compute to everyone. If you're a large enterprise and you want access to a long-term contract with us, or a professor at Stanford and you wanna do a two-week project to train a model, we wanna be the solution provider for you. As a new entrant into the cloud service provider arena, we've realized, and I think everyone knows this, that managing a large AI compute cluster across the U.S. requires not only great technology, but a large team. As a result, we wanted to go to the market and find a managed service provider that had that track record, that history of providing high uptimes for their customers at scale over long periods of time.
Additionally, we wanted this managed service provider to be a seamless extension of our team. Just as importantly, we wanted a partner with top-tier software that can manage the hardware on a preventative basis, finding problems before they occur, and, within the data centers, a great team and processes to do break-fix in a timely fashion. After running a very detailed RFP process, it became very clear early on that Penguin was gonna be the right partner for us. Not only do they have the technical expertise and decades of experience, but they're able to move very, very fast. We were very, very impressed. We're very close to launching 18,000 GPUs across the U.S. at four of our data centers.
We've also been able to leverage the full suite of Penguin Solutions and have had them source and validate which shared service storage provider we should be using, which they did in under a month for us. The complexities of this infrastructure are like nothing that I've ever seen before. We're definitely at the first or second inning of the AI revolution, which is gonna be the biggest revolution that we've seen in our lifetime.
All right, another great customer testimonial. Look, in closing, Penguin has the right people, the right processes, and the right technology to really help customers through their AI transformation. We've shown you a couple of customer clips today, and I think they're great testimonials. Again, I'll go back to our design, build, deploy, manage philosophy. This is really what sets us apart. Penguin Solutions has the right pieces. As I transition to Andy: as you heard earlier, memory bandwidth is an important part of AI systems. Here to talk about that, and how we're gonna help solve that problem, is Andy Mills.
Thank you. Thanks, Pete. So we're gonna take a little bit of a deeper dive into memory and what it means in particular for AI. There are a lot of complex issues arising, driven by AI, as we move into this new era. Some of you may have seen this chart before, but let me explain it a little. You've got a plot of the number of parameters that AI models typically need, which we're gonna relate to memory in a second, running from 2015 to the modern day.
Then you've got this linear growth in what is actually being deployed in terms of GPUs and memory. You can see there are two regions here: the compute-limited era and the memory bandwidth-limited era, and we're well into that right-hand part of the curve now. Look at the growth. In the early years, models could easily fit into a single GPU; you could run most of these models, or multiply them out across multiple GPUs. As you cross that line, look at GPT-2. GPT is probably the most famous family, and GPT-2 had 1.5 billion parameters.
Going to GPT-3, that rose to 175 billion parameters. Those are pieces of data you've got to crunch in terms of GPU and memory. Once you cross that line, you're having to spill out of a single GPU onto multiple GPUs. You're going to the network, which we've heard about before, and you need a lot more GPUs to solve the problem. So this gap that's emerging is called the memory wall, and that's one of the complexities we're actually working to solve. As those of you who've worked with the company for a long time will appreciate, we've been in memory a long time, and we know how to do a lot of things with memory that play well into this. In fact, let me summarize it.
This chart is actually old data; I was literally reading something on the plane coming out here. The newer data: over two years, you've seen a 750x growth in the number of parameters, versus only a 2x growth in memory capacity. That's in the space of just two years. We're in this rapid growth cycle, and it's impacting memory in three ways. Reliability, which I'll touch on in a second, something we've been doing for a long time. Scalability: how do you scale up your memory to deal with this rapid growth in the number of parameters you've got to process per GPU? And lastly, performance.
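Before getting into those three areas, here is a rough sketch of the arithmetic behind that memory wall, not from the presentation itself. The roughly 16 bytes per parameter is a common mixed-precision training rule of thumb (weights, gradients, and Adam optimizer state), and the 80 GB of HBM per GPU is an assumed example figure.

```python
# Rough sketch of why parameter growth runs into the memory wall.
# Byte counts and HBM capacity are assumptions for illustration.

def min_gpus_for_training(params: float, bytes_per_param: int = 16,
                          hbm_per_gpu_gb: int = 80) -> int:
    # ~16 bytes/param: fp16 weights + gradients + optimizer state.
    needed_gb = params * bytes_per_param / 1e9
    return max(1, -(-int(needed_gb) // hbm_per_gpu_gb))  # ceiling division

for name, params in [("GPT-2", 1.5e9), ("GPT-3", 175e9)]:
    print(f"{name}: ~{params * 16 / 1e9:,.0f} GB of training state, "
          f">= {min_gpus_for_training(params)} GPUs just to hold it")
# GPT-2 fits in a single modern GPU; GPT-3 spills across dozens,
# which is exactly the point where you go out to the network.
```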
So let me touch on those three. We've got some really exciting stuff going on in these areas through some of the investments Mark referred to earlier. First one, reliability. We've had a technology out there for a number of years now called Zefr. It means zero-failure memory, where the zero failure refers to the test period we run in our factory: we look for no failures. We're testing these modules under stress and under duress, and if they experience even one failure, they're out. The reason we do that is that AI training runs today can take weeks or months to get through the training period, and the last thing you want is a memory failure dragging out or derailing that process.
So having ultra-reliable memory is absolutely critical. We literally take temperatures up to 70 degrees Celsius, raising them really high, and then we run HPC workloads on these modules for seven or eight hours. Again, a single failure means we push that module to one side. It's not bad memory per se; it's just not meeting the minimum bar we think is necessary. Penguin deploys most of its servers this way now in our Penguin server solutions group, and we have a number of external hyperscalers also using us now. This involves using off-the-shelf memory modules as well as modules we make ourselves.
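A minimal sketch of that zero-failure screening policy follows. The test harness, its duration, and its interface here are illustrative assumptions, not SMART's actual procedure; the point is only the culling rule, where a single error disqualifies a module.

```python
# Sketch of a zero-failure screen: stress every module and cull any
# module that logs even one error. The harness is hypothetical.

def zefr_screen(modules, stress_test):
    """Return (passed, culled); any module with a nonzero error count is culled."""
    passed, culled = [], []
    for module in modules:
        # stress_test stands in for a hardware harness running HPC-style
        # workloads at elevated temperature and reporting an error count.
        errors = stress_test(module, hours=8, temp_c=70)
        (culled if errors > 0 else passed).append(module)
    return passed, culled

# Toy harness: pretend module "dimm-17" logs one corrected error.
fake_harness = lambda m, hours, temp_c: 1 if m == "dimm-17" else 0
print(zefr_screen(["dimm-16", "dimm-17", "dimm-18"], fake_harness))
# (['dimm-16', 'dimm-18'], ['dimm-17'])
```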
The next one is really new and, I think, exciting for us: we've just launched new products this year based on CXL. CXL stands for Compute Express Link. It's a computer bus that lets you attach memory in a much more plug-and-play fashion, so we're now addressing that scalability issue. I'll show you an example in a second, but we just launched, and are now shipping for revenue, our first products based around CXL. This is an exciting growth area for us, because most customers building anything with AI, or even classic enterprise big-memory workloads, are now looking to CXL as their primary method for expanding memory beyond what the CPU itself can provide.
We'll show you a specific example of that in a second. The last one is really where we start to get into the innovation area. We've been partnering with a number of people in the industry, and silicon photonics, in my view, has been sitting in the R&D lab for a long time; it's really coming out to shine now, first in telecom and now in memory. To explain what it allows us to do: if any of you are familiar with HBM, high-bandwidth memory, the memory industry has been trying to scale up memory as best it can by going to 3D stacking technology. The die inside the memory package is now several stacked dies, and they do this with high-bandwidth memory to try to increase the parallelism.
How fast can you get data into a stack? Rather than going in through one door, you can go in through eight at once, for example, and get more data in and out. The limitation is that you have to put that HBM chip on the same substrate, right next to the die of the GPU, literally 1 to 2 millimeters away. So from a scalability standpoint, it's a really complex problem to say, "I need to scale that somehow." What we're working on in the lab, targeted for roughly a 2026 product timeframe, is solving the problem of decoupling that HBM and putting it in an external memory appliance.
That actually opens things up, because you can back that appliance with DDR5 and put a lot of other memory technologies behind it. You're taking the problem from 96 gigabytes up to terabytes of memory. It's no good just doubling the memory; you have to quadruple it, 10x it, 50x it. So there are a lot of exciting things going on there. Let me start with a case study. We're actively working this with a customer, so we can't name them until they launch, but the industry is AI-based analytics. There are a lot of folks out there now trying to take basic servers and just put more memory in them.
This is standard CPU memory extension. I brought one of these with me so you can see physically what it looks like: this is our CXL add-in card, which allows you to add up to 8 additional memory sticks, or DIMMs, inside a server, and you can put multiple of these cards in one server. In this particular case, you can put up to 8 of them in, or 4 typically, depending on the power and heat budget inside that machine. We've seen that before; you've got to keep balancing all those pieces.
The challenge the customer had was that, without this card, there are only about 24 little DIMM sockets; you won't be able to see them on this diagram, but those are the sockets that take these individual memory sticks, and you only get 24 of them. So the industry has been putting 3D stacked, scaled-up memory modules in those sockets. They're expensive, and it's not a linear increase in cost; it's more like a quadrupling once you go to 3D stacked memory. You really want to use the cheapest, most common memory, which is what this card targets: 64 GB or 96 GB, going to 128 GB sticks.
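Those card and socket counts imply the slot arithmetic quoted in the result that follows. Here it is written out; the counts come from the talk, but the pricing note is an assumed illustration, not the customer's actual costs.

```python
# The DIMM-slot arithmetic behind the case-study result.

base_dimm_slots = 24   # native CPU-attached DIMM sockets per server
cards = 8              # CXL add-in cards installed, power permitting
dimms_per_card = 8
total_dimms = base_dimm_slots + cards * dimms_per_card
print(total_dimms)     # 24 + 64 = 88

# Assumed cost intuition: commodity 96 GB sticks versus a roughly 4x
# premium for 3D-stacked high-capacity modules, or buying an entire
# second server just to get more DIMM sockets.
```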
Putting this card inside that box allowed us to take the whole DIMM count from 24 to 88 with a simple add-in card, keeping the GPU power the same. That brought the cost down to about a third of the original, a 66% cost reduction, because the only way you could do this in the past was to add a second server with more CPUs and more memory in it. That was the only way you could scale. So this opens up a whole new avenue for us in terms of scaling memory and eventually, as we showed before, scaling to HBM levels in the future. Okay, and with that, I will hand over to Jack, our COO.
All right. Thank you for that, Andy. Really good stuff on CXL and the advanced memory products. We've heard a lot of great information today about our compute and memory solutions from everybody before me, so my job now is to take all of that and show how it translates into the long-term model I'll present toward the end of my remarks. Our value-creation strategy has three pillars. The first one is profitable growth.
As Mark touched on, if you look at the company over the last three or four years, we've gone through one of the worst memory downturns ever while transitioning from a holding company to an operating company, and all that time we've been profitable. Every single quarter, we've generated profits. So as we focus on this growth, one thing you'll see from us is that we will always focus on profitability. We're going to drive profitability, and we're going to do that, as you'll see, with higher-margin recurring software and services revenue streams in the future. That's what we're really going to focus on, and we'll talk about that.
So we are going to transform our business into a more software- and services-focused company, and we're going to do that by investing to expand our capabilities in these areas. But as we do that, we're still going to maintain our fiscal discipline and, as Mark talked about, our execution. We are an operating company, and we will execute as we go forward, so we're going to continue to focus on that as we drive the growth of the business. And of course, we will deploy our capital prudently. We want to maximize shareholder value with the funds we have, maximizing the investment but also getting the best returns for our shareholders.
So if you look at the last four years, as Mark touched on, our revenue grew at a 13% CAGR through fiscal 2024. But the really important thing is what grew: our compute solutions revenue almost doubled over those four years, and within that, services revenue grew at a 36% CAGR over the same period. So services really started driving the growth in the overall revenue of the business. Look at what that translated to in our non-GAAP gross margins: over the same period, our margins grew from roughly 20% up into the low 30s, an improvement of 11 points.
That was really driven by the higher-margin services revenue we added in this period. It also translates into profitable growth, which means our operating margin went up as well; it's now up almost 6 points, to 10%, as Mark showed before. And to show we're maintaining our fiscal discipline, operating income also grew over the same period. We will continue to grow, but as we grow, we will grow our margins. At the same time, look at our cash: our cash is up to over $380 million.
But at the same time, look at the debt. We knocked our leverage down by 1.2 turns, to about 1.4 times EBITDA right now. So I think this shows you the kind of fiscal discipline we have as a company. We grew the revenue, we grew the margin, and look at our cash and our debt. We will continue to grow in this manner. Okay, one thing you've heard about throughout this presentation is how we're going to invest for growth. There are four main things. First, we're going to expand our software offerings; we want better offerings for our customers that provide them more value.
The key for us is that we have to add value to our customers so they want to come back to us. Second, we're going to grow our service capabilities. As Pete mentioned, we have to service not only the cloud but also on-prem and the edge, so we have to provide better services in all those places. Third, to get there we're going to have to invest in go-to-market: in sales, in marketing, and in customer-focused engineering, people who go talk to the customer. That's going to be big for us, so we can go out and win the new customers we're trying to get.
And fourth, Pete mentioned the edge offerings we have today, but we definitely have to expand them so we can meet AI's needs as it proliferates out to the edge. Now, speaking of investing for growth, we talked about the transaction yesterday: $200 million coming in, 200,000 preferred shares, and this is going to help drive our growth. For the details, for everyone who didn't see the press release, there's a conversion price of about $32.81, based on a 30% premium to the volume-weighted average closing price over the 15-day period ending July 12.
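As a quick sanity check on that arithmetic, using only the figures just quoted, the premium can be inverted to back out the implied average price. A minimal sketch:

```python
# Back-checking the conversion price: a 30% premium over the
# 15-day volume-weighted average closing price (VWAP).

conversion_price = 32.81
premium = 0.30
implied_vwap = conversion_price / (1 + premium)
print(f"Implied 15-day VWAP: ${implied_vwap:.2f}")  # ~$25.24
```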
We will pay a 6% annual dividend, and there are all the little T's and C's the lawyers put together in the agreement, which we won't go through here. But why are we doing this? We want funds, and a partner, to help us accelerate our software development, as we've talked about; build out a focused AI go-to-market organization; scale our services; and expand the edge products Pete talked about earlier. So we're going to make a big investment in AI-driven technologies, and we'll also look at strategic M&A to drive this business going forward.
Of course, this is subject to customary closing conditions, and we expect to close the transaction by the end of calendar 2024. Okay, so now let's go to the good stuff, what you came for: what do we think we're going to do in the future? If you look at our non-GAAP gross margin, right now we're at roughly 32%, and our goal in our long-term model is to get that up to 38%-40%. How are we going to grow the margins? With higher-margin recurring revenue: software and services. That's what we're going to drive to get our margins up. Because, as we've talked about, we're not a hardware company.
We don't just sell hardware. We sell a solution to the customer that has hardware, software, and services, and those latter pieces tend to be higher margin, which will drive margins up. That will then translate into higher non-GAAP operating margin, which we expect to grow into the 16%-18% range over the same period. In doing that, we will of course keep our fiscal discipline in place. We're also going to allocate our resources where they matter: where we need the resources, that's where the people will go. We're not going to invest in products or other things that don't add value to us and our customers.
That's going to be key as we go forward. Turning to the long-term model: our goal is high single-digit to low double-digit revenue growth, the non-GAAP gross margin of 38%-40% I talked about, and a non-GAAP operating margin in the 16%-18% range. We will continue to watch and maintain our costs, but the whole goal is to keep increasing shareholder value as we execute on our long-term plan. And then there's our capital allocation strategy: what are we going to do with our capital? The first thing, which we've talked about and I'll probably talk about some more, is that we are going to invest.
We're going to invest in our go-to-market. We're going to invest in our software offerings and create more of them; we have to make sure we have all the software our customers need up the stack, whether it's on-prem, in the cloud, or at the edge. We will invest in that, and it will help accelerate AI solutions as well. On the balance sheet: we will maintain a strong one. If you look at our leverage right now, we're at 1.4 times EBITDA, so we'll maintain conservative leverage, and we will also reduce leverage where warranted. This year alone, we've already paid down our Term Loan A by $112 million.
We'll continue to keep our debt under control, and we're gonna continue to look for inorganic opportunities to grow the business, whether in software, products, offerings, or services. But also, look at Brazil: we've shown that we will divest a business that doesn't make strategic sense for long-term value. So we'll keep looking at the company, and if we have things we don't think create value, we'll look to divest those as well, to make sure we stay on our strategy and execute toward it. So, looking forward, we're in the really early stages of AI.
We think we're very well positioned to leverage the solutions we talked about today to drive profitable growth. We're gonna increase our operating margins, we'll generate cash, and we're gonna maximize shareholder value in doing it. So with that, I'll turn it back over to Mark for closing remarks.
Thanks, Jack. So I think you can tell, over the last 90 minutes, for those of you looking from the vantage point of 2020, what a different company this is, and not only the strategy piece of it but really the execution of it: the development of the products and solutions, the customer engagement, the proof points that we're on a path to solve for complexity. Looking ahead, here are the themes you might take away from today. One, very few companies have the cultural mindset to help customers at the level Penguin can, at a hardware design level, a software level, and a managed services level. Two, innovative solutions; oil-based immersion cooling is pretty amazing. Three, expand software and services.
Today, when you think of the platform we have, coming from HPC and modified and developed for an AI world, the platform in our cluster management software is going to enable future growth and add more value to our customers. And beyond ClusterWare, there are areas we can invest in, like orchestration or security. We have a roadmap of what we wanna do in software long term that will make us even more sticky and more valuable to our customers. Four, scale new customer acquisition. Look, on every earnings call I've been on since I joined the company, we've talked about high-performance compute and the lumpiness and cyclicality of that business. Our plan over the next 3-5 years is to tease out that lumpiness and add more customers, and we're doing that today.
We will continue to invest in go-to-market and customer engagements to grow our base and our top line, but also to diversify out the noise, hopefully over time, of what has traditionally been a lumpy industry. And last, but not least, the theme of being a prudently managed business: profitability, operating efficiency, process orientation. Going back to the initial theme I mentioned before we got into the AI piece of the day: execution. As CEO, and with our leadership team here, you have our commitment to manage the company well and to keep delivering on profitability and cash flow generation for shareholder return. So again, thank you very much for taking the time to join us today.
We're gonna proceed into the Q&A portion of our agenda, and I'd ask my team members to come up and join me on stage so we can take some of your questions. It looks like we have microphones available, so if you have a question, please raise your hand, and I'll start navigating and get it to the right team member.
Thanks. Thanks, Suzanne. Harsh Kumar, Piper Sandler. Maybe Pete or Mark, could you talk about the scale and scope of Penguin? How many customers do you have? What is a typical deployment? How long does it take? How long is the relationship, typically, from start to finish? And just generally, what kind of customers do you work with?
Sure. Maybe I can start, and then, Pete, I'll ask you to join in. Let's start with the customer engagement model. Typically, that's a 12-to-18-month engagement, and a lot of it is upfront work to identify the customer's requirements. Then we effectively go into the design part of the engagement, designing the solution for that specific customer environment. After that comes the supply chain piece, which has gotten a little healthier.
For a while it was extended, but we're starting to see a more normalized, though not perfect, supply chain, and by the time we finish integrating, testing, and then deploying, we're probably out into the greater-than-12-month timeframe. Depending on the level of complexity, it could be as long as 15-18 months. That's how the cycle breaks down. On the customer set, Mark showed a chart earlier about all those industries. It's really fascinating, and it complements the chart I had showing that something like 80% of global enterprises will be deploying GenAI in the next couple of years. It's just staggering.
Where we've found a lot of our success to date is in industries such as hyperscale and tier-two cloud service providers, oil and gas, certainly education, and financial markets, which are a big target for us. So we're honed in on four or five sectors that are very big for us, chosen based on our capabilities and our relationships. As some of you may remember, when we acquired Stratus, they were in 50% of the Fortune 100, and we continue to work to scale our importance to those kinds of customer relationships; they were very heavy in the financial sector, as well as trading and communications. So, for us, it's been very targeted.
Openly, for those of you who share the same hair disease I have, being gray and older: HPC was a world that early on lived primarily in education, government, research, and laboratory work. It wasn't so much vertical solutions; it was very focused and concentrated. The difference over time is that it's evolved much more into commercial. As a matter of fact, we've gone from what used to be a very balanced 50/50 go-to-market resource model, federal versus commercial, to being very focused on commercial. All of our resource adds are in this space, because the enterprise demand is so significant. That's how we think about the customer opportunity, and we're very engaged in scaling it.
Now, we're a smaller company when you compare us to some of the bigger hardware-only players, and the opportunity for us right now is to expand. At the beginning of our fiscal year, we talked a lot about expansion and growing customers, because these are deep engagements. We're not just selling widgets through a channel; as you saw today, we're working with customers to solve the complexity of their AI deployment. This is not a business where you add hundreds of customers; we're talking about customer alliance partnerships, and I've been very pleased so far with what we've done in fiscal year 2024 in terms of new adds.
You heard from one of them today: 24,000 GPUs. That's a significant deployment. We probably didn't do a good enough job highlighting that everything we're talking about in terms of solving complexity is what they were already trying to do, and it's not even our hardware. Pretty powerful stuff.
Yeah, I think you covered it well. The one area we're gonna focus on more and more, and the example Mark's talking about, the Voltage Park video I showed, that was a software and services deal. That's an example of a customer we won in a quarter and actually took revenue from in that same quarter. So as we increase our software and services capabilities, we'll look for more customers that need that kind of help, where we can come in and help deploy and manage their environments much more quickly. Mark covered it really well: these 12-to-18-month processes can be lengthy and need our expertise, but we think there's opportunity to get some quicker wins with software and services, and we'll target those as well.
Let me add a little bit about the role of hardware. We have this portfolio to solve complexity, and our customers are gonna drive our behavior, but we are still an innovator in hardware. We didn't talk a lot about our hardware roadmap today, but we've built prototypes on RISC-V for different types of AI application environments, and we're gonna continue to lead. One of the deals we won with the government was with a third party called NextSilicon, and we're out developing solutions. So we're on the leading edge in hardware, and we're not gonna apologize for it; it's just that we're a solutions vendor with that mindset. The best data point I can give you is to look at us relative to hardware vendors.
Our gross margin is substantially higher, somewhere in the area of 2x depending on the quarter and the data, than our hardware-only, large-scale competitors, and we're gonna have the discipline to stick to that, because it's a good proxy for whether we're offering more value to our customers. Question here? Yeah.
Thanks, Mark. Kevin Cassidy, Rosenblatt Securities. A question about trying to model your agreements, say the software agreement with Voltage Park. It's 24,000 GPUs, and you've got 75,000 GPUs under management currently. Does that scale exactly? Is this like a third more business? Maybe you can give us some structure.
Yeah, a couple of things I would comment on. I think it'd be a little aggressive to say it's gonna scale one-for-one per GPU; it's not a one-for-one basis. Also remember that the revenue we got for the first 75,000 was primarily with our hardware, so there's a little bit of apples and oranges in the comparison. I would say this, though: it's a really attractive business for us on the software and services side. If you were to dissect the anatomy of our customer engagements, there will be engagements where we're the one throat to choke: Penguin designs it, hardware, software, and managed services, all in one. And there may be some bigger implementations.
We're talking to more and more customers who say, "Hey, we've bought this hardware, and we just can't get it up. We cannot get the hardware working." That's an opportunity for this other sales motion we're talking about. Furthermore, we've had people say, "Look, we've already deployed, we've got it up, we just don't know how to maintain it and make it long-lasting and sustainable." There, we can just plug in our service model. So the solution set is a modular solution set, and as I said earlier, the customers are really gonna drive our direction. It's really hard to say one-for-one, "you're X revenue today at 75,000 GPUs," because there are different mixes of products and solutions in there.
Okay, maybe just a follow-up on that. When you mention your own hardware, would there be a case where you'd decide to use one of your competitors' hardware, just to deliver the right solution?
You know, it's interesting, because I want to stop short of naming competitors, but if you look at the landscape, one of our competitors actually got out of the business and is now back in, and clearly doesn't have the software expertise and the scale. Another has been through a host of acquisitions and a bit of a brain drain, and doesn't necessarily have the capacity to do it all. And a third is a very efficient operating company but hasn't really invested that way. So the overwhelming answer to your question is absolutely, because for us, the solution set is about the customer's mindset of deployment success. Whatever we can do to enable that is great, and we're having conversations to further accelerate our ability to do that and to scale our go-to-market more efficiently as well.
Can I jump in real quick?
Sure.
I think, too, Kevin, remember, this is all recurring revenue. As we get more of the software and services, this is revenue we take over time, so it will help smooth out the lumpiness as well. We don't get it in one quarter; we take it over 12, 24, 36 months, whatever the contract is. So for us, it gives much better visibility into the future, plus the recurring revenue stream, and that's why we like this kind of business a lot.
So that's gonna be the model Jack talked about: how we're gonna build this business over the next five years is by building out more and more of that quality of revenue. Obviously, we're coming from what has traditionally been a hardware systems business, and we're starting to see signs of success. Over five years, I think you'll see more and more of this type of solution sale, with the revenue recognition approach Jack described. We're starting from a hardware business today, and we've got more and more services; we'll continue to invest in scaling this.
Hey, thanks a lot. Ananda Baruah, Loop Capital. Mark, thanks for doing this. Just sticking right there: when you go in, and you can take Voltage Park, for instance, or however you guys envision it, for services and software, do you go after deals on a cluster basis? How is it done? If you could describe that mechanism, that would be great.
Sure. I mean.
Thanks.
Are you asking how do we charge the customer? What's the pricing model?
Or, what are you actually going after, to compete for and to help them with? Is it their whole data center? Is it on a cluster-by-cluster basis, a project-by-project basis?
It's typically on a cluster-by-cluster basis. We'll charge these customers on a per-node or per-GPU pricing model, but we typically want to quote for a cluster.
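A hypothetical sketch of what that pricing shape can look like; the function, rates, and cluster size here are invented purely for illustration and are not Penguin's actual pricing.

```python
# Illustrative per-node / per-GPU pricing, quoted at the cluster level.
# All rates are assumptions, not actual figures.

def cluster_quote(nodes: int, gpus_per_node: int,
                  per_node_month: float = 400.0,
                  per_gpu_month: float = 150.0) -> float:
    return nodes * (per_node_month + gpus_per_node * per_gpu_month)

print(f"${cluster_quote(512, 8):,.0f}/month for a 512-node, 8-GPU/node cluster")
```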
Cool, and just a quick follow-up, and this is a software and services question: are there any notable differences in scaling from the traditional HPC business into GenAI, with these growing, larger clusters and a much more commercial environment? What are the pain points, the sticking points, and the win points, also relative to what other folks are doing on software and services? Thanks.
Mark or Phil, do you want to take this?
Maybe I'll say a few words, and Phil can chime in. So it is different. If you think about high-performance computing, there were a couple of different classes of application types. A subset of what was going on in HPC was architecturally similar to AI, with the same highly interconnected workload type.
For example, weather forecasting and climate simulation workloads in HPC had very similar characteristics to what we're trying to do here, but they ran in environments that had nowhere near the requirements in terms of things like information security and the level of encryption that goes on in a lot of the commercial environments we're dealing with now. So it takes the foundational skills we had, but we've certainly acquired new skills and raised our game in order to drive value at the enterprise level, versus the more educational and research world Mark touched on earlier.
Yeah, and speaking to the difference between the HPC markets, which, as Mark mentioned, were government, education, and certain leading-edge engineering organizations, and the commercial market: the government had a long history of building world-class, high-performance data centers.
Because they were buying these clusters, they were already at the forefront of the power-per-rack curve. As we start working with more commercial entities, they are less likely to have a data center prepared for the high power densities and the interconnectedness we've been talking about. That's where our design services come in, and our ability to be flexible around the design, to fit it to whatever the customer presents us, or to hook that customer up with a data center they could put this in, even taking over a small data center to house a dedicated cluster.
Thanks a lot.
Hi, this is Kyle Bleustein from Barclays. Thank you for hosting today. I wanted to ask about the long-term operating model: is there a timeframe assumption for when you'd hit the gross margin target? And embedded in that, what would the mix assumption be between IPS, memory, and LED? I assume you'd need a much higher contribution from the services business to get there. Any thoughts or color there would be helpful.
I'll ask Jack to take that one.
Yeah, I mean, the range we're looking at is over the next 5+ years for the long-term model. And look at how the business is growing: Penguin will grow faster than our memory business, and definitely faster than our LED business, and as we look to the future, that's where we're putting more resources. So the margins will grow as that business grows. Memory, though, as Andy talked about with CXL and the advanced memory products, those are gonna be higher-margin products, and that's gonna be a driver. If you look at the memory business, its growth will be driven by these new products in the AI space, and that will also help Penguin Solutions' margins going forward.
Awesome. Then just a quick follow-up on the CXL space, on general-purpose adoption, or at least when you'll see more customers around it, since you mentioned you have a first win with revenue shipping today. Do you expect that more in the late-2025, end-of-2025 timeframe? When does that ramp?
No, I mean, we're shipping. We've got more than one customer buying product from us today, but this year, 2024, will be small revenue shipments. The end of calendar 2025 is when you'll really see the new CXL, I think, come out.
Yeah.
Right, Andy, 3.0?
That's right.
Andy, maybe, might be helpful for the audience. Can you talk a little bit about CXL 1.0, 2.0, and 3.0?
Sure. Right now, the industry, actually AMD and Intel, launched CXL 1.1-capable servers as early as last year, and it's really taken this whole 12 months for the industry to catch up with that. So we're just exiting the 1.1 era and going into what we call CXL 2.0. The product we showed earlier is CXL 2.0, so we're just transitioning into 2.0. Next-generation Intel CPUs, Granite Rapids, if any of you are familiar with the terms, and AMD Turin are all gonna transition to 2.0. Those are all just basic memory expansion or pooling products. Once you go to CXL 3.0, which is around the 2026 timeframe, you get into memory pooling, switching, and sharing.
There was a famous article by Microsoft about memory sharing and the fact that there's a lot of wasted memory just sitting there doing nothing. What we're moving into with CXL 3.0 is the ability to share and pool that memory amongst multiple CPUs and put the memory where it's needed. That's the most important aspect: you get switchable, shareable memory that's steerable. And that's where software comes into play; we didn't mention it much in this memory talk, but software-defined memory becomes possible with CXL 3.0. From an AI standpoint, you want that flexibility: CXL 3.0 enables it, and software backs it up.
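A toy sketch of that pooling idea follows; it is purely illustrative and is not a CXL API. The point is only the behavior: a fabric-managed pool grants capacity to whichever host needs it and reclaims it afterward, instead of memory sitting stranded inside one box.

```python
# Toy model of fabric-managed memory pooling: capacity is granted to
# hosts on demand and returned to the shared pool, not to any one host.

class MemoryPool:
    def __init__(self, capacity_gb: int):
        self.free_gb = capacity_gb
        self.grants = {}   # host -> GB currently assigned

    def request(self, host: str, gb: int) -> bool:
        if gb > self.free_gb:
            return False   # pool exhausted; host must wait or spill
        self.free_gb -= gb
        self.grants[host] = self.grants.get(host, 0) + gb
        return True

    def release(self, host: str, gb: int) -> None:
        self.grants[host] -= gb
        self.free_gb += gb  # capacity returns to the pool for any host

pool = MemoryPool(capacity_gb=4096)
pool.request("host-a", 1024)   # host-a borrows 1 TB for a big job
pool.release("host-a", 1024)   # ...and returns it when done
```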
One distinction I'd also like to make is that this is a technology where we're actually out leading the industry. If you think about SMART Modular's history as a very reliable specialty manufacturer, this is a different approach to memory solutions. The fact that we've got early-stage customers buying product from us today, a solid roadmap, and design engagements with people who have bought sample products and are qualifying our products, it's a different model. We're pretty excited about it.
Thank you.
Sure.
Hey, Nick Doyle, Needham. Thanks for doing this, guys. Can you talk about how you're doing the burn-in testing? What it is, why it's important, and how your capabilities differ from competitors'?
Which side? More on the compute side or the memory side, or both?
Both, just in that design phase.
Andy, Phil, could you start with the compute piece?
Yeah. Penguin has, for its 20+ years, had a really strong emphasis on burn-in testing of our clusters as clusters, running cluster workloads, with a long history of a minimum of 24 hours of burn time. On newer products, we typically do closer to 44 hours of burn. We continue that tradition today; part of getting the systems racked up and cabled is so that we can actually test all of those components in the factory before we ship. On memory testing, one of the tests we run when we're doing system builds is actually to exercise the memory. And there was an interaction.
Many, many years ago, Penguin bought memory from SMART Modular, and I think this probably predates Zefr: we showed them what we were doing and how we were finding bad DIMMs, and they took that, ran with it, merged it with their expertise in memory and their deep connections with the memory suppliers, and productized those same memory tests as a way to screen memory modules at scale. When Samsung or Hynix or whoever builds a DRAM die, they have a set of test vectors they test the memory with. But they have to optimize for speed, because they're in a high-volume industry, so they're not really concerned with catching every last little error.
They just want the grossly bad chips culled out. But the Zefr process that SMART now uses lets us burn in at that 24-to-44-hour level, at elevated temperatures, with a much more complicated set of vectors and real-world workloads rather than just test patterns, and lets us capture any single-bit event, because the strongest predictor of whether a memory will have a problem in the future is that it had a problem today.
Right. So that extended testing the Zefr process does to the RAM gives us confidence that, however marginal those bits may have been, we've found and culled them out, and then we can work with the memory supplier to help them optimize their bit patterns and improve their product. It becomes reinforcing at all three levels, with feedback: Penguin is testing, SMART is testing, and we're helping improve even the memory die manufacturers' processes.
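For a flavor of what "test vectors" means here, one classic pattern is the walking-ones test. The sketch below is purely illustrative; real DIMM-level screening runs patterns like this, and far more elaborate ones, from a hardware harness rather than from Python.

```python
# Illustrative walking-ones memory test: march a single set bit across
# every byte, then verify each byte reads back exactly what was written.
# In a zero-failure screen, one flipped bit disqualifies the module.

def walking_ones(buffer: bytearray) -> list[int]:
    """Return byte offsets where a written pattern read back wrong."""
    bad = []
    for bit in range(8):
        pattern = 1 << bit             # a lone 1 marching across the byte
        for i in range(len(buffer)):
            buffer[i] = pattern
        for i, value in enumerate(buffer):
            if value != pattern:
                bad.append(i)          # any single-bit event is recorded
    return bad

print(walking_ones(bytearray(1024)))   # [] on healthy memory
```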
Andy, anything to add?
Yeah, and I was going to say, some of our customers are shocked, if you go to the broader industry where we ship Zefr modules, at just how much fallout there is when we go through this test. A lot of people have said, "Oh, our memory's fine." Then we run it through our process and say, "You had 30% fallout." 30%! You get mixed reactions, but that's because the memory industry was built around shipping modules to a standard enterprise or basic server market that cared less about memory reliability. AI is running models for several weeks at a time; the last thing you want is your memory flipping bits and messing that up.
It's got to be solid through that period. And to be clear, we get extra margin for that extra screening; it's not something we do for free, so it adds to our bottom line as well. We can offer it either as a service or as a product: literally as DIMMs that are pre-tested, or we can take the customer's favorite DIMM, run it through the same process, and ship it out to them. Customers have been pretty happy so far; the results have been excellent.
Yeah, and just to remind you, all of this happens at Penguin Solutions headquarters in Fremont, where we have two 2 MW facilities, as well as our surface-mount technology line, five miles away, where we do the integration of the memory solutions. So we have a pretty tight ecosystem here in the U.S. for the integration of all these components. For example, if you went to our Fremont facility, not only would you see the stack-and-rack and the integration on the compute side, you'd also see things like the oil-based immersion cooling example we showed earlier. We do that before we ship to our customers, and it's part of a quality framework they like to see: they see it working before we actually bring it to their physical site.
Yeah, if I could ask a second one, maybe for Pete. Are you seeing more customer engagement for the Origin AI solution or for the custom, tailored solution? And maybe just talk about what kind of customer would approach you for each.
Yeah, so I think the answer is it's too early to say whether there's a trend going one way or the other. We actually have customers that have leveraged both the tailored solutions and Origin AI; Voltage Park is an example of an Origin AI-type customer. If I look out to the future, I think when we start hitting wave two, with more and more of those enterprise customers, we're going to see more Origin AI. They're going to want to get their systems installed and up and running very quickly. They're much more focused on time to value and on making sure they're up and running and delivering services to their customers. So I think Origin AI is really well positioned for that enterprise market.
The other thing I'd add is that, with our engagement model being higher-complexity, higher-touch engagements, as I said earlier, we've never really been in the business of just selling GPU systems sideways. The Origin AI platform allows us to do more proof-of-concept work quickly, to Pete's point, to get into our customer engagements, and once we're in, we can tailor solutions longer term. So it is a continuum. It's really too early to call, but we're excited about the portfolio. It gives us the option to engage customers more rapidly and get them on their learning curve.
Thank you, this is Dennis on for Brian Chin at Stifel. We wanted to ask: this SK Telecom investment is going to accelerate the timeframe, but how many years do you think it'll take for the company to achieve a more balanced mix of, let's say, hardware versus software and managed services deployments?
You know, as we talked about, and I think Kevin asked the question about go-to-market options as well, I think you're going to see us invest in partnerships. This one is obviously a very strategic one, but go-to-market partnerships let us scale our business in different ways. As you might imagine, we've come from an exclusively direct sales force model, and we've been selling to our customers that way. Given the growth opportunity in front of us, we're evaluating different paths to the customer.
Maybe it's a systems provider who doesn't have our capabilities. Maybe it's a large customer who's influencing a piece of the ecosystem. So between go-to-market investments in partnerships and how we engage customers at scale, whether domestically or internationally, whether it's a hardware systems provider, there's a whole set of options we're looking at to grow faster.
Great, and maybe as a follow-up, what is the target customer for CXL-based products? Where do you have an advantage versus competition from, for example, the DRAM manufacturers selling directly?
We've actually chosen a fairly unique path today with our add-in card story, because one of the first things we identified was that this has to be really easy to deploy. It's no accident that the card looks pretty much like a GPU slot card: you can drop it straight in, the power is there, everything's there. So the target customers, to answer your question, are obviously our AI servers, no question. You can expand memory. Phil and I have been in discussions about petabyte-level memory systems, where you're putting these cards in to greatly expand the amount of memory inside a single box. That's the unifying theme.
But if you go outside the Penguin circle, for us as a memory group, we're seeing the Tier 1 OEMs and Tier 1 server vendors qualifying these kinds of products. We have active engagements with several of them. And with the CXL 2.0 generation coming out probably around February or March of next year, which is when Intel and AMD talk about launching their next-generation platforms, that's when we expect CXL 2.0 add-in cards to get serious. The bigger appliance model, I think, is going to take a little longer, but there's absolute interest and desire to move in that direction.
If I could also add: in the HPC space, there has long been a desire for large memory systems, and there are certain applications in HPC that could take advantage of very large memory sizes. Some of those are in the U.S. government, some in bioinformatics. So the current generation of CXL has a number of applications in HPC that we can already foresee, and we're already working to build proof points so we can target those market verticals. And as new capabilities come out, there's dramatic potential in HPC for new ways of writing algorithms that will accelerate HPC computations by virtue of having this sort of shared memory capability.
Well, one thing I will add, too, is one of the benefits people are seeing. They sometimes say, "Well, CXL is kind of a slower form of memory, right?" No, it's actually 1,000 times faster than SSDs. So what you're seeing is 500% improvements in things like SAP HANA, in memory-intensive databases. Instead of spilling over to disk, which is typically what happens when you run out of memory, you stop that spillover. You spill into CXL, which is only a small step down from your main memory. So there are great benefits in just expanding memory to get rid of that SSD tail problem, as it were.
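A quick back-of-the-envelope model shows why stopping the spill at a CXL tier rather than SSD matters so much. The latencies and the 10% spill rate below are assumed round numbers for illustration, not measured or vendor figures.

```c
/* Back-of-the-envelope tiering model with assumed round-number
 * latencies: average access time when part of a working set spills
 * past local DRAM into either an SSD or a CXL-attached memory tier. */
#include <stdio.h>

int main(void)
{
    double dram_ns = 100.0;     /* assumed local DRAM access            */
    double cxl_ns  = 300.0;     /* assumed CXL hop: a few x DRAM        */
    double ssd_ns  = 100000.0;  /* assumed NVMe read: ~1000x CXL-class  */
    double spill   = 0.10;      /* assume 10% of accesses miss DRAM     */

    double avg_ssd = (1.0 - spill) * dram_ns + spill * ssd_ns;
    double avg_cxl = (1.0 - spill) * dram_ns + spill * cxl_ns;

    printf("avg access, spilling to SSD: %9.1f ns\n", avg_ssd);
    printf("avg access, spilling to CXL: %9.1f ns\n", avg_cxl);
    printf("speedup from the CXL tier:   %9.1fx\n", avg_ssd / avg_cxl);
    return 0;
}
```

Even at a modest assumed 10% spill rate, the disk tier dominates the average access time, which is directionally consistent with the large in-memory database gains described above.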
Yeah, real quick, the customer base could be anywhere from the mainstream server guys all the way out to the edge. If you look at Andy's slide, that card was in a RISC-V inferencing box at the edge, and it's going to use CXL. So it runs the gamut; it's not just one set of customers. It goes from AI at the edge all the way back into mainstream servers. We've seen customers going into memory pooling boxes now, servers, a lot of different use cases for CXL. It really broadens our customer base, and, really for the first time, it gets the memory business into the cloud. We never really did much with data centers. This is the first product that really gets our memory into the data center and into the cloud.
One final comment: it also allows us to build on our long history in non-volatile, persistent memory. If you've heard of Intel Optane, that kind of disappeared and went away; it was very expensive. We've spent a lot of time on this, especially in AI inferencing, where you have to do checkpointing. We're continuing our development of non-volatile memory solutions, which give you memory-class performance where, when you lose power, the data doesn't go away. It stays there. CXL actually makes that problem a whole lot easier to solve than the way it's been solved before. So it opens up, as Jack said, many different opportunities for us. We're very excited about CXL.
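As a rough illustration of why byte-addressable persistence simplifies checkpointing, here is a minimal sketch that maps a file on a persistent-memory mount and makes a checkpoint durable with a single sync. The path is hypothetical, and a real CXL persistent-memory deployment would use the platform's own region and flush mechanics.

```c
/* Minimal checkpoint-to-persistent-memory sketch: map a file on a
 * (hypothetical) pmem/DAX mount, write state at load/store speed, and
 * make it durable with one msync. After the sync, a power loss does
 * not lose the checkpoint. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define CKPT_SIZE (64 * 1024)

int main(void)
{
    /* Hypothetical path; substitute the real persistent region. */
    int fd = open("/mnt/pmem0/checkpoint.bin", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, CKPT_SIZE) != 0) { perror("ftruncate"); return 1; }

    char *ckpt = mmap(NULL, CKPT_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (ckpt == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(ckpt, "model state at step 1000"); /* plain stores, no I/O stack */
    msync(ckpt, CKPT_SIZE, MS_SYNC);          /* durability point */

    munmap(ckpt, CKPT_SIZE);
    close(fd);
    return 0;
}
```

The point of the sketch is that persistence becomes ordinary stores plus one flush, rather than serializing state through a storage stack, which is what makes frequent checkpointing cheap.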
Another question?
Yeah. Peter Wright, P.A.W. Partners. Given the enormity of the opportunity, and the fact that we're maybe in the top of the first inning of the Fortune 1000 deployments, why are we only thinking high-single-digit, low-double-digit revenue growth longer term? And second, could you give a little more insight on how you want to deploy the $200 million you're getting with this new Korean opportunity, and what was some of the thinking behind that relationship?
You know, first of all, I go back to my slide earlier today. When we got up in front of you at the Analyst Day in April 2021, we gave you a long-term model, and we take it very seriously. The revenue projections contemplate the shift in mix we're talking about. We don't think hardware is going to be as significant in the future as it is today. So if you build all of that into a model, as we are projecting, services and software will be a bigger percentage than they are today in our long-term model, and thus the top line won't be the fastest-growing piece of it. The profitability, hopefully falling through to the operating line, will be where our growth shows up.
Secondly, relative to the $200 million, let me start with this: the agreement we announced yesterday was all strategic. It was amazing because, as I said earlier, with these types of announcements there's normally one area of focus to go explore together. Between computing, networking, power and cooling systems, and memory, we have so much in common with this partner that we've got a number of potential areas to explore together, and the amount of interest they showed early on in what we do, as a complement to where they want to be, was uncanny. It was an amazing fit for us as a partner, and vice versa. As it relates to the $200 million investment into the company:
We're at a once-in-a-generation opportunity, for this company, for any company. How often does an AI-type opportunity come around? There are investments in software and investments in expanding services to take on new customers. We told you we're a pretty well-run company; we don't have a lot of people sitting on the sidelines waiting for the next five customers. We have to invest as we scale, and we'll do it prudently, like we've done in the past. Being able to go out internationally and grow our business over time, that's a key area. M&A: if we can find assets that really allow for further differentiation, we will take a look at that. Another thing we've done is invest with our technology partners.
We talked about one of those today, in the area of photonics, which we've mentioned during the course of the year: investing in new technology so that we can adopt it and further differentiate the company. Those are all areas for us as we think about the use of funds, but I go back to the earlier slides. Jack showed you the financial performance of the company, the cash generation, the balance sheet, and so on. We are prudent managers of this business, and we will continue to act as such.
Hi, Kevin Cassidy from Rosenblatt Securities again. Thanks for the day today. You touched on this a little bit; I was going to ask about SK Telecom. I've seen that SK Telecom has immersion cooling. Is that one of the key technologies you're working with them on, or is that something...
Well, again, I want to be careful. We're in the early stages of this relationship, obviously, and when we said potential areas of collaboration, over the course of these discussions there were just so many things going on. Immersion cooling is certainly one of them, Kevin. Another one that's really interesting for us: they're investing very heavily in AI. I'm going to screw this up, but if you Google SK Telecom yesterday, they launched here in the U.S. and Canada an AI application they've developed for animal health diagnostics. I don't know if any of you saw that yesterday.
This company is one of the best-kept technology secrets, and where they really need help is in enabling and deploying all of this, and that's what they admired about us: our ability to do what we do, which is serve large-scale customers in the deployment of technology. They've written a lot of AI software, so there are opportunities beyond just cooling. There are opportunities on the processor side of the house, as they look to invest in and develop their own unique approach to processing for AI. There are opportunities in advanced memory. There are opportunities in networking. These are all opportunities.
When you say working with us, I just want to be careful to note that we're in the early stages of this partnership, which is what makes it so exciting, because I think our challenge is going to be how we prioritize and map all the resources out. We will have a collaboration team on our side and theirs that meets quarterly, beyond the day-to-day work, to prioritize where we should be putting our focus together. Actually, I'll tell a real quick anecdote. Jack came into my office nine months ago, and we were talking about partnerships. And by the way, even more of an anecdote: when I was at Micron, you know what saved that company? Partnerships. IMFT, with Intel: partnerships. The Elpida acquisition: partnerships. That company was on the ropes. I know. I was there.
The stock was $1.87 at one point, and we consolidated the industry, but partnerships saved that company. Now, we don't need to be saved. We've got a great business. We need to accelerate. We need to grow, because this is a once-in-a-lifetime opportunity, and that's where things like the question you asked about enabling partners, or the investment in advancing technologies, come in. SK is a global company; the group has a $250 billion market cap. We have an opportunity to scale with them as a partner. It's an amazing outcome for us. We have time for one last question.
Yeah, thank you for squeezing me in. It's Quinn Bolton with Needham. I guess first: as you focus more on software-and-services-only engagements, like Voltage Park, is that more or less efficient when you're deploying on hardware that you did or didn't design? Do you see a difference in margins or time to deployment if it's somebody else's hardware?
Yeah, we're always more efficient if we're designing it from scratch; I think that's a fairly obvious statement. In the case of Voltage Park, for example, it was standardized hardware that was well known to us, so it wasn't too much of a challenge. But in general, we're far more efficient if we can get in early, design with the customer, bring that hardware, whether it's ours or even a partner's, into our lab, rack it, stack it, test it, burn it in, and ship it already pre-configured. That's the real Penguin value.
And then does your software run only on x86, or are you processor-agnostic? Could you run on ARM, or on RISC-V if it finds its way into the data center?
ClusterWare has been ported to and enabled on ARM for a number of years. Unfortunately, Ampere is more or less the only server-class ARM CPU vendor, and they pivoted toward a large number of cores and away from HPC-style utility. So we haven't had a lot of take-up for the ARM version, but there are some opportunities on the near-term horizon that we think could be very interesting, and there are new CPU architectures, like the RISC-V we've talked about, where having that software in-house will allow us to make the port when it's deemed prudent.
Lastly, for Jack: you set a long-term gross margin target of about 600 to 800 basis points of improvement from where you are today. I think a lot of that is likely driven by the mix shift to services and software, but you also mentioned portfolio optimization. How important would divestitures, or just de-emphasis of lower-margin products, be to that 600-to-800-basis-point improvement?
We will continue to evaluate it. It's not predicated on something going away, right? But we evaluate our products all the time: what makes sense from a margin standpoint? Should we continue to invest in that product? We'll continue that over the next five years, and if we have something we don't like, something we don't think is where we want to be, we'll look at divesting it or minimizing the investment in it.
And I know you all collectively cover a lot of different companies, and I know you've never heard of this, but companies sometimes have a problem: they love investing in new stuff, but they have a really tough time with the discipline of getting out of the sandbox of the other stuff they're doing. That's the discipline I'm talking about with our company.
We will be vigilant there, because we want to fund our growth partly through our balance sheet, but mostly through optimizing our business model long term. And it's a discipline. It's hard, because everyone loves the new stuff, and touching the old stuff is a sacred cow. I know you've never seen it, but it happens. We're very disciplined about how to do that, and as much as we're talking about all this excitement, we have these conversations at our offsites: what do we need to stop doing? We'll be committed to that as well.
So, hey, let me just stop. Before I close, I wanted to recognize a few other people in the audience. Our Chief Legal Officer, Anne Kuykendall, is with us today in the front row, as well as our new CFO, Nate Olmstead. Nate joined us, as many of you know, after being the Chief Financial Officer of Logitech, which today is a $14 billion to $15 billion company. It's a great testament to someone on the outside seeing this opportunity and joining us, as is Pete. I couldn't be more grateful and excited to have someone with Nate's experience on the team.
And finally, in the back row, we have our chair, Penny Herscher, who's been with us for the last few years and has led a world-class board. It includes people like Mark Papermaster, the Chief Technology Officer at AMD; Bryan Ingram, the former COO at Broadcom; Randy Furr, the former CFO at Bloom Energy; Sandeep Nayyar, the current CFO at PAE; Mary Puma, the former CEO of Axcelis, a $4 billion equipment company; and Max Loeb, Chief Marketing Strategist at Bosch. So we've got a team, both at the board and on the executive side, that I couldn't be more pleased to be around. It's an exciting thing. I imagine a lot of what you saw today was helpful; I hope so. But at the end of the day, you're betting on people, and I think we have a world-class team to go execute on these strategies. Thank you very much for joining. Appreciate it.