Today we have Astera Labs, and, you know, they're a recent IPO. They're also a pure play on AI. We're going to hear a lot about Astera in a bit, but I do want to take this opportunity to thank Nick Aberle, who heads the IR team at Astera, and we also have the Co-founder and President of the company, Sanjay Gajendra, so thank you both, gentlemen, for joining. I know this is a busy conference season for you guys. Appreciate you coming over. Maybe, Sanjay, you know, there's a lot of interest in the stock, obviously, since the IPO. There has been some volatility, but you guys have done quite well so far.
Maybe, you know, give us a little bit of background on the team, what markets you're addressing, what problem you're trying to solve as far as AI infrastructure is concerned, and how you think about, you know, the market opportunity longer term?
Great. First of all, thank you for having us, and thank you for attending the session. To answer where we began, let me take you back to 2017, when we started the company. There were really two main trends that we saw when we put the business plan together. The first one was based on a simple premise: the server that we knew in the data center was fundamentally changing, from what used to be a CPU-based architecture to one that had CPUs, GPUs, accelerators, and other kinds of compute elements. At that point, "heterogeneous compute" was the technology term for it.
In heterogeneous compute, when you have multiple processors generating and consuming data, the connectivity fabric or connectivity subsystem that's needed to interconnect all the processors is fundamentally different from a CPU-based design, where the CPU is the king of the motherboard. Now you have multiple kings and queens on the motherboard, all vying to talk to each other. So the nervous system that's needed in a heterogeneous compute server is very different from what the industry had seen for the last, you know, 20, 25 years. We saw that as an opportunity we wanted to tap into. The second big trend we saw was the simple fact that the hyperscalers were starting to get more vertically integrated, meaning they're doing their own chips. Right? You already had Google and Amazon and others doing their own chips.
So we saw a change in how the supply chain needed to develop to address this trend of vertical integration. We built a company that develops connectivity technology for interconnecting all of these processors, accelerators, and so on in a way that is very different from what was done before. That's number one. These products were designed to address the bottlenecks of data, network, and memory so that the GPU can be better utilized when it gets deployed, because these GPUs, as you know, are not cheap. They are thousands of dollars, tens of thousands of dollars. And a fact that applied then and still remains today is that, you know, roughly about half the time these GPUs are actually sitting idle. People write these big checks to Jensen and NVIDIA.
The reality is that half the time they're not utilized, because the connectivity infrastructure is not at the same level and can't move data or move memory fast enough. So our technology is designed to address those bottlenecks. And then, number three, we wanted to build a company that is forward-looking, meaning built for the next, you know, 15, 20 years of industry growth. So we built a team equipped to cater to the future of computing in terms of how devices need to be architected. Some of you may know we use a very unique architecture that's software-defined, where our chips are, you know, 50%-60% implemented in software, which means we can get to fabrication quicker. We can customize our chips quicker. If we have a problem, we can just upgrade the firmware.
Just like your cell phone: you get a software upgrade notification, and suddenly problems are resolved and it brings in more features. In other words, our chips keep getting better over time. Right? So there are some fundamental things we did, including implementing all of our chip development in the public cloud. We don't have servers inside the company. That allows us to scale the company very quickly, which we have done, and to leverage some of the AI tools that are available for, you know, circuit design and circuit simulation and all that. So we have built a company that is really designed to address the foundational aspects of how data centers, servers, and computing are designed. We built a team and infrastructure that's really designed for what you see today, which is these one-year, two-year product cycles.
You know, we are able to iterate fast. We're able to move fast. And that's allowed us to gain a position of strength in the market, where today, for every AI infrastructure being rolled out in the world, there is a good, you know, nine-out-of-ten chance that it features our content. Number one. And it's also given us a front-row seat from which we can continue to innovate and add more product lines. So, long introduction, but hopefully that sets the framework for the other questions I'm sure you have.
Yeah, yeah, sure. Thanks, Sanjay. You know, one of the frequent questions that we get is: look, we all follow NVIDIA. We all know how fast their GPU business is growing. So how should we think about Astera's growth versus, you know, the GPU market? Is it kind of, you know, one to one? Is there an opportunity for you to even outgrow that market? What are the puts and takes? Because we all follow that GPU market very closely.
I think we want to believe that we are going to outpace them, for the simple fact that, you know, we are sort of that Switzerland, right? We get to play in the NVIDIA ecosystem. We get to play in the AMD ecosystem, and in all of the internal ASICs or accelerators as well. To that point, if you look at the latest numbers we shared as part of our Q3 earnings call, our revenue popped up 47% or 48%, something like that.
That's sequential, right?
Sequential, correct. The reason it did that was because we saw a whole bunch of our design wins ramp; we probably have close to 400 design wins on the internal accelerator-based platforms. I've always said I think investors have underestimated how the hyperscalers have reacted in terms of investing in their own accelerator programs. I think that's starting to happen. You saw that in our numbers. You saw what AWS was showcasing last week as part of their re:Invent show. To that point, because we have exposure to multiple AI architectures, internal and third-party GPU, and different classes of devices, from retimers to now fabric-class devices, plus chips, hardware modules, and software, which we call COSMOS, we do have, fortunately, multiple different growth vectors supporting our business. So we do expect it to continue to grow.
On the internal accelerators, we are also fortunate to have two places we can dip into in terms of BOM and revenue drivers. Some of you might know these terms. There is something called the front-end network, which is the GPU cluster talking to the CPU, networking, and things like that. With NVIDIA, we only get to play in the front-end, because the back-end, which is the GPU-to-GPU interconnect, is implemented with NVLink, which is proprietary to NVIDIA. So neither we nor anyone else gets to play there. Whereas the rest of the non-NVIDIA ecosystem all use variants of PCI Express or other standards. And that's an area we get to play in.
It's also very fertile ground, as I like to call it, because what happens in the back-end is that every GPU needs to connect to every other GPU, which means that if there is a 64-GPU cluster, there are 63 connections per GPU, right? Compare that to NVIDIA, going back to your one-to-one question: with NVIDIA, for every GPU, you have one link that we can service. Whereas on the back-end, you have this mesh topology, so we get to play in a multitude of links for a given GPU. Our content is much richer. And therefore our overall growth, to go back to your question: because of how the server is architected and how some of the ecosystem works, we do believe we are well positioned to continue to grow our business.
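To make the link-count arithmetic concrete, here is a minimal sketch, assuming a full-mesh back-end where every GPU connects directly to every other GPU, and a front-end with one serviceable link per GPU. The cluster sizes are illustrative; only the 64-GPU example comes from the conversation.

```python
# Sketch of the serviceable-link arithmetic: full-mesh back-end vs.
# one front-end link per GPU. Cluster sizes are illustrative.

def serviceable_links_per_gpu(num_gpus: int, full_mesh_backend: bool) -> int:
    """Links per GPU that merchant connectivity silicon could address."""
    if full_mesh_backend:
        return num_gpus - 1  # each GPU connects to every other GPU
    return 1                 # front-end only: one link per GPU

for n in (8, 64):
    front = serviceable_links_per_gpu(n, full_mesh_backend=False)
    back = serviceable_links_per_gpu(n, full_mesh_backend=True)
    unique_links = n * (n - 1) // 2  # total GPU-to-GPU connections in the mesh
    print(f"{n} GPUs: {front} front-end link/GPU, "
          f"{back} back-end links/GPU, {unique_links} mesh links total")
```

For 64 GPUs this reproduces the 63 connections per GPU cited above, and roughly a 2,016-link mesh in total, which is why back-end content per GPU is so much richer than front-end content.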
That's great. So obviously, the non-NVIDIA market today is relatively small, and it's probably growing at a faster pace from a low base. But at the same time, it feels like you have an opportunity to address even more content per device, or content per XPU or APU, whatever we call it. So maybe, you know, dig a little deeper into your product portfolio. I think retimers account for the vast majority today. There are some emerging opportunities, but you also announced some new products recently. So maybe you can talk about that.
Yeah, absolutely. So I think we have always said this: our vision is to own connectivity at the rack level. Imagine an AI rack, an NVL72 or NVL36, for folks that follow NVIDIA closely, right? The vision for us is to own the connectivity within the rack, within clusters, and so on. What does that mean for us? It means a few things. One is that we are addressing protocols like, you know, UALink for scale-up, Ethernet for scale-out, PCI Express for peripheral connectivity, and CXL for memory expansion. So we play in all four areas now. You need products like retimers for reach extension. You need controllers that convert from one protocol to another. We have all the CXL controllers.
And then we have the fabric devices, which are switching devices, meaning they take data in on one port and redirect it based on where the data has to go, and so on. So those are the three fundamental device types. The newest product addition is our fabric devices, which I'll touch on in a second. And the third piece of addressing that rack-level connectivity is to address both copper connections and optical connections. So earlier this year, we demonstrated the industry's first PCIe over optics as a POC. We haven't announced products yet, but clearly the vision we have is to address both copper and optical media.
But in general, if you go back to the rack-level vision I was trying to paint, and you think about it from a connectivity chip standpoint, a modules and hardware standpoint, and a software standpoint with COSMOS, which enables observability, optimization, customization, and all that, you can see how we're trying to address the connectivity required within the rack, except for the compute trays. The compute trays are the GPUs and accelerators that will come from a variety of different suppliers or hyperscalers. But in general, that's what we're trying to build. So the latest addition that we announced is our Scorpio series. These are fabric devices. There are two specific categories there. One is what we call the P-Series, which is designed for the PCI Express protocol to interconnect the GPU to the CPU, networking, and storage.
Then we also announced the Scorpio X-Series, which is designed for the GPU-to-GPU interconnect, which is really a nascent field. It's just starting to happen. You know, NVIDIA has it with the NVSwitch, but for everyone else this is a new market. Collectively, this is about a $5 billion opportunity, is what we are modeling for 2028. $4 billion comes from the back-end networking, where we believe UALink will be the standard that brings everything together. We recently replaced Broadcom on the board as the technical expert in that standard. We're working with the hyperscalers that have joined it, you know, Microsoft, Google, Amazon, and others. We are essentially making significant investments on the fabric side.
But at the end of the day, what we're trying to address is this rack-level story that includes chips, hardware, and software, and to provide that comprehensive solution, which can be deployed at scale.
That's great. You mentioned that the Scorpio switch is roughly a $5 billion type of market, and it's a fairly new market, right? It doesn't exist today; it's pretty much all greenfield. But if you look at some of your other products, where you are already a leader in retimers, and then AEC, which is just starting to ramp, how do you size your overall opportunity? Let's take, you know, a three- to five-year time horizon.
Yes. So if you bring the retimer TAM into the mix, we believe the PCIe retimer TAM is around $1.5 billion looking out to 2028, and the Ethernet side is going to be another $1.5 billion. If you think about the drivers behind that from a macro perspective, we already talked about accelerator growth in general as AI servers and AI infrastructure continue to be deployed at scale. But as speeds and complexity continue to increase across all this infrastructure, it really drives the need for additional reach extension. As speeds increase, the distances you can move signals get shorter, and you need more retiming products to fill those voids. So we see a combination of things driving that TAM.
Number one, as you move to more complex and faster protocols, we'll see an ASP increase. So the price of our products will continue to go up. And you'll need more of our devices as well, because you'll need retimers in spots where you didn't need them before. So we see good prospects for both of those areas to grow as a bigger piece of our total TAM over time. The CXL piece is very similar to the back-end X-Series equation, where it is largely a greenfield opportunity today. It's a nascent protocol that we see having very big merit over the longer term as infrastructure players look to expand their memory capabilities. And the catalyst for this really starting to take off, I'd say in 2025 and beyond, is data center server CPUs now starting to be CXL-capable.
So if you look at the announcements from AMD and Intel over the last month, month and a half, Granite Rapids and Turin have both been launched, and those should both be ramping into volume production next year. We've talked about having designs in flight leveraging both of those processors, and also ARM-based solutions that will also be CXL-capable. So you can start to see some of these use cases, where memory expansion really benefits the customer and drives a good value proposition, start to proliferate in that CXL space as well. That's going to be another multi-billion-dollar opportunity. So when you add it all up, we're looking at about a $12 billion TAM looking out to 2028, including the Scorpio piece that Sanjay mentioned earlier.
That's great. Just to give you guys some context, I think you're guiding for about $113 million next quarter. So still a lot of opportunity out there for growth going forward. Sanjay, one of the things you mentioned, which is interesting, is that GPU utilization is about 50% or so. Given all the innovation coming from NVIDIA, from you guys, and from many others in the industry, that's still a surprisingly low number. So if you take a little bit of a longer-term view, maybe a two- to five-year view, is it realistic that we could eventually get to, I don't think 100% is realistic, but what do you think is the right number to think about in terms of utilization?
And then, you know, what role do you play in terms of improving that utilization rate? Because that's going to be very, very critical given the level of investments.
Yeah, no, absolutely. It's funny, one of our customers had done the math, which actually was an eye-opener for me. What he was showing is that, look, today we get around 52% utilization out of the GPU. If the GPU utilization can be improved to 55%-56% because they're using our chips to enable more robust connectivity and avoid problems before they happen, right, then our chips are free. In other words, the way to look at the value of our products, and I think it's reflected in the gross margin we get, is that we are simply helping justify the tens of thousands of dollars that people pay for the GPU.
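The customer math here can be sketched as simple break-even arithmetic. Only the 52% and 55%-56% utilization figures come from the conversation; the GPU price and connectivity cost per GPU below are illustrative assumptions, not Astera or customer figures.

```python
# Break-even sketch of "improve utilization a few points and the
# connectivity chips are free." GPU price and connectivity BOM per GPU
# are assumed for illustration; only the utilization figures are quoted.

GPU_PRICE = 30_000                # assumed GPU cost, dollars
CONNECTIVITY_BOM_PER_GPU = 1_000  # assumed connectivity spend per GPU

base_util = 0.52                  # utilization cited today
improved_util = 0.555             # midpoint of the 55%-56% figure

# Effective GPU value recovered per GPU by the utilization uplift:
recovered = GPU_PRICE * (improved_util - base_util)

print(f"Recovered value per GPU: ${recovered:,.0f}")
print(f"Connectivity spend:      ${CONNECTIVITY_BOM_PER_GPU:,}")
print("chips pay for themselves" if recovered >= CONNECTIVITY_BOM_PER_GPU
      else "uplift does not cover the connectivity spend")
```

Under these assumptions, a 3.5-point uplift recovers about $1,050 of effective GPU value per GPU, which is the sense in which the connectivity spend becomes "free."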
By improving utilization, our parts essentially allow them to recover or utilize more of the GPU, which was an eye-opener for me because it outlines the value that we bring. There's so much spotlight on the GPUs, which, you know, are very important, but today the bottleneck is connectivity, okay? GPUs have figured it out architecturally: they can keep adding more cores, meaning doing more things. But at the end of the day, you have to move the data out from the GPU to the networks, and to your cell phones that are running whatever application you're running. So that is where the bottleneck is. That's where the opportunity is, right? So what will happen going forward?
You know, of course, we are all working hard to make sure we can keep up with the pace the compute guys are setting, right? But on the connectivity side, one limitation is that it is standards-based, so there are timelines involved. It's also mixed-signal analog technology versus digital technology, meaning a GPU can simply add more cores and produce more data, whereas connectivity, because of the analog piece, is much more complicated than just adding more transistors. So to that point, there will be this gap between the data that CPUs or GPUs can generate and how fast it can be moved out. My personal estimate, just based on the innovation that we are doing and the industry is doing, is that utilization will probably grow to, you know, the 60s and low 70s in percentage terms by getting everything right.
But that is probably where the limit would be. That's why you see the speeds increasing. Generally, in the connectivity space, standards were designed to come out every 24 to 36 months. Now we are down to 12 to 18 months. So the industry is reacting. UALink, for example, has got a 128-gig version and a 200-gig version, and those two standards are coming out at the same time. These things don't normally happen in the world of standards and connectivity, but the point I'm trying to make is they're all reacting to it. And I do expect that in the next three to five years things will get a lot better, but it'll probably be in that low-70% range. That's just my guess, by the way.
It's just based on what I've seen in my life in the industry. And then there are also going to be efficiencies. I think models will get more efficient; maybe you don't have to move that much data out. Things will be a little more managed, so there will be other dynamics that start playing in, which might help overall in terms of bandwidth efficiency. And then, obviously, you have the power envelope and the physical size envelope to contend with, meaning it's not simply a problem of how you get efficiency on the bandwidth side. If you want to run faster, you have to provide more power. Where is the power coming from? Because, you know, the GPUs have sucked up all the power that's available.
So there are multiple factors that will play in order to solve that problem.
Yeah. And do you think the utilization is any different when it comes to custom ASICs versus the GPUs?
Yeah. So the custom ASIC approach is interesting, right? The hyperscalers are starting to realize where the money is from an AI use-case standpoint. They're not trying to replace what an NVIDIA GPU can do. They're simply saying: if I look at my overall infrastructure, it'll have a mix of third-party GPU-based servers, internally developed accelerator-based devices, and maybe some other options they come up with. And they're collectively looking at the capital required. The internal accelerators are essentially focused on use cases optimized for cost, based on some common workloads that are being run. That is how they're approaching the market. Therefore, by definition, that can be a little bit more efficient.
Now, if I can add one other point to the previous question that I forgot: we're talking about this efficiency or utilization, and some of the numbers are mind-boggling if you think about it. Even if you look at the number of AI servers that land in a data center, only about 69% of them actually work on arrival, meaning the remaining 31% need some tweaking on day one. That's how complex these systems are. Now, why am I highlighting that, going back to utilization? One of the unique things about Astera is that we not only provide the connectivity function, retiming, switching, and all that, but we also provide a lot of telemetry and diagnostics within our chips. So we use a software-defined architecture.
Our chips have multiple microcontrollers and a lot of sensors: electrical and protocol sensors, as well as environmental sensors like temperature sensors. One single chip will have, you know, eight different temperature sensors. We can detect if a fan stops working in one corner near our chip. A cable is inserted, but it's not fully inserted. Something else is heating up around our chip. We can detect all of that. Why is that important? It goes back to keeping the infrastructure as operational as possible, because if something breaks in the link, guess what? With AI training algorithms, if one link goes down, the whole training stops, because if you have 64 GPUs in a cluster, they all behave like one GPU. One goes down, and everything goes back to the previous checkpoint and restarts.
That takes 45 minutes today, meaning every time something goes wrong, it takes 45 minutes for the AI training to stop and restart. So that gives you an idea of the state of the infrastructure. I keep comparing this to the early internet days, when you had a dial-up connection and you were trying to figure out what you could do with it. That's kind of where things are today, meaning there's a lot more innovation left to be done and a lot more problems to be solved.
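A rough model of why a single flaky link is so expensive at cluster scale: in synchronous training, a failure anywhere stalls every GPU for the full restart window. The 45-minute restart and the 64-GPU full-mesh topology come from the conversation; the per-link failure rate is an illustrative assumption.

```python
# Expected-downtime sketch for synchronous training: one link failure
# rolls the whole cluster back to the last checkpoint. The 45-minute
# restart is quoted; the per-link failure rate is an assumed figure.

RESTART_HOURS = 0.75                # 45 minutes per stop/restart cycle
gpus = 64
links = gpus * (gpus - 1) // 2      # full-mesh back-end, as above

failures_per_link_per_month = 0.01  # assumption for illustration

expected_failures = links * failures_per_link_per_month
lost_hours = expected_failures * RESTART_HOURS
month_hours = 30 * 24

print(f"{links} links -> ~{expected_failures:.0f} restarts/month")
print(f"~{lost_hours:.1f} cluster-hours lost, "
      f"{lost_hours / month_hours:.1%} of the month")
```

Even a 1%-per-month per-link failure rate costs roughly 15 cluster-hours a month here, which is why telemetry that catches a half-seated cable or a failing fan before the link drops is worth real money.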
Interesting. I'm going to pivot to some of the short-term questions that we get. There are a couple of potential concerns about the story. One is competition. Obviously, you've done extremely well; I think you have north of 70%-80% market share in retimers, but you compete with some very large suppliers like Broadcom and Marvell, et cetera. So can you speak to the sustainability of your market share? What are you doing that's different? What's unique? How confident are you about your market share in retimers?
Yeah. So, I mean, obviously it's a big market and you will have competition. You need to be worried if you don't have competition, right? To that point, what I would say is: go back to what I noted in the initial part of the conversation, where I talked about how Astera is built from the ground up to service what is to come in the next 15, 20 years, right? Whether it's the software-defined architecture, the cloud-based infrastructure, or hiring the right people for these challenges. These might all seem simple, but they are foundational. I was at TI. I was a general manager of a large group.
When the cloud transition happened, it was a nightmare, because the 300 people on my team were set up to service the likes of Dell and HP and all that. They were not used to terms like hyperscalers, or to a customer saying, "I don't care about standards compliance; I own both sides," right? That kind of mindset is very difficult to teach when someone has been doing things one way for several years. Or, "I don't want to use a cloud-based server because I'm used to this," but guess what? I only had 200 servers for the entire team. The point I'm trying to make is that being a large company, when you have an inflection point like we're seeing right now, can be a liability, especially when you look at product cycles becoming one year, one and a half years.
So the way we have set up the company is giving us the edge. Now, having said this, these are all large companies with capable engineers. We have to be mindful of them. But what we have always tried to do, and it has worked for us, is follow a three-point formula. One: listen to customers, right? You have to always listen to customers. Two: innovate. There's no other way of getting ahead of the competition than innovating. And the third one is execution. It can't be underlined enough that today the main differentiator from the competition is execution. That's what we're focused on. What we've been able to do is get products out when the customers need them, with the right features, and work with the right tier-one customers and the right design starts.
That's what we have done with some of the partners that we've had. And that's what has helped us. And I do expect that it'll continue to help us.
Yeah, that's great. And it's quite obvious from your margin profile as well; I think it was 77% gross margin in the most recent quarter. The other question, and I'm sure you've gotten this many, many times, a concern for some investors, is the content opportunity as we go from the Hopper generation to the Blackwell generation. Can you address that, both in terms of the retimer content and also the emerging opportunities with your new products within Blackwell?
Yeah, absolutely. So Blackwell, I think, created a lot of confusion for folks who are not familiar with Astera or with sockets like retimers. A retimer is fundamentally used to extend the reach of a signal, okay? Meaning it's solving a physics problem. Depending on the size of the board and the speed of the signal, you may need it, you may not need it, you may need a bunch of them, you might need a few of them, you might place it on board A versus board B, because it's really solving a reach problem. So wherever there's a reach issue, you will see retimers. The Hopper generation was simple. There was just one motherboard that NVIDIA promoted, called the HGX board: eight GPUs, a relatively big board. Retimers featured on that board, eight of them. Life was good. Everyone saw the same board.
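The reach problem described here can be sketched as a simple insertion-loss budget: each retimer resets the signal, so the count needed grows with channel length and with per-inch loss at higher speeds. Every number below (loss budget, per-inch loss, channel length) is an illustrative placeholder, not a PCIe spec value or an Astera figure.

```python
# Toy reach-budget heuristic: how many retimers a channel needs if each
# segment must stay within the loss budget. All numbers are illustrative
# placeholders, not spec values.
import math

def retimers_needed(channel_in: float, loss_db_per_in: float,
                    budget_db: float) -> int:
    """Retimers required so no segment exceeds the loss budget."""
    reach_in = budget_db / loss_db_per_in        # unretimed reach
    segments = math.ceil(channel_in / reach_in)  # segments required
    return max(segments - 1, 0)                  # one retimer between segments

# Same 20-inch channel; per-inch loss rises with signaling rate:
for gen, loss in [("Gen4-ish", 0.7), ("Gen5-ish", 1.4), ("Gen6-ish", 2.0)]:
    print(gen, "->", retimers_needed(20.0, loss, budget_db=16.0))
```

With these placeholder numbers, the same 20-inch channel goes from zero retimers to one, then two, as per-inch loss climbs, which previews why a big board like HGX carries retimers while the smaller boards discussed next may not.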
With Blackwell, what happened is they released three major form factors. One is HGX, the same kind of board, and our retimers feature prominently on the Blackwell-based HGX boards. So everyone is happy there. Then you have the GB200, now GB300, type of form factors. These are smaller boards; they have only two GPUs on them. Going back to the retimer again: it is a smaller board, the reach is short. You are looking at a board about this big, compared to something that was much bigger. So our retimer content is not on that board, and not on the reference design that NVIDIA puts out. Where we are is with the hyperscalers, who all develop their own NICs, or most of them do. So we are on the customized versions of the NVL racks, and that is where you see our content, from retimers to fabric devices.
If you were at re:Invent last week at AWS, some of you might have seen the customized GB200 servers they were showcasing. That's where the design wins are, right? We will see our content show up in the customized racks of GB200, GB300, and MGX, which is a third form factor designed for enterprises. There is a one-GPU version, a four-GPU version, a two-GPU version; it'll be all over the place. There are probably 100-plus configurations there, so our content will vary. What we have categorically noted is that our retimer shipments in 2025 will be bigger than in 2024, okay? Meaning we are not losing those sockets; they'll probably just be placed in different spots across different kinds of servers, including, of course, the custom-ASIC-based servers. But in general, that business will continue to grow.
Next year, we expect the Scorpio business to contribute at least 10% of our overall revenue. We expect CXL to get to production in the second half of next year, and we expect the customized NVL racks to start showing up in the second half of next year. And then you have the AEC business that we talked about, both for PCIe and Ethernet. So overall, I think we feel good about where we are. I think the market will need to learn some more about how these systems are configured and how the retimer business works. But for us, like you noted, we are thinking bigger picture: fabrics, optics, controllers, and solving connectivity at the rack level. That's truly the space, and like Nick said, it's a $12 billion TAM. We are nowhere close to that.
I do feel good about the opportunities ahead of us.
That's great. And that's all the time we have, unfortunately. So we have to conclude this. Thanks everybody for joining us.
All right. Thank you guys. Thank you.