Investor Meeting

Mar 20, 2024

Jihyung Yoo
Head of Investor Relations, Broadcom

Good morning, everyone, and welcome to Broadcom's Enabling AI Infrastructure event. I'm Jihyung Yoo, Head of Investor Relations at Broadcom. On behalf of Broadcom's executive team, I'm pleased and excited to welcome our in-person attendees and our virtual audience. As a reminder, today's presentation includes forward-looking statements. Please see our recent filings with the SEC for risk factors that could cause our actual results to differ materially from those forward-looking statements. Today's presentations are being webcast live, and the slides and recording will be available on Broadcom's IR website after the conclusion of the investor meeting. With that, I'll walk you through the agenda. We're going to start with an overview from Charlie Kawwas, Broadcom's President, Semiconductor Solutions Group. Following that, we'll have Ram Velaga, General Manager of our Core Switching Group, who will discuss scalable AI networks.

Jas Tremblay, General Manager of our Data Center Solutions Group, will talk about AI Server Interconnects. Near Margalit, General Manager of our Optical Systems Division, will talk about Optical Interconnects. Following will be Vijay Janapaty, General Manager of our Physical Layer Products Division, talking about our foundational technology for AI Interconnects. Frank Ostojic, General Manager of our ASIC Products Division, will talk about custom AI accelerators. In conclusion, we'll have closing remarks from Charlie Kawwas. Now with that, after a short video, I'm pleased to introduce Charlie Kawwas, President, Semiconductor Solutions Group at Broadcom.

Speaker 9

In the ever-evolving landscape of technology, there's a legacy that spans decades. Welcome to Broadcom, connecting everything. Born through the Bell Labs heritage, Broadcom invented the transistor in the 1950s to start the digital era. Pioneering fiber optics in the '70s, we set the stage for a new era in data communication. In the '90s, Broadcom revolutionized edge connectivity with DOCSIS, DSL, and PON solutions, and the internet with our networking, storage, and optics products. The 2010s marked a pivotal moment as we powered the smartphone revolution with innovations in Wi-Fi, Bluetooth, RF filters, and touch controllers. Today, at the heart of the AI revolution, we lead with cutting-edge connectivity solutions from the data center to the edge. As we look ahead, Broadcom's technology leadership is shaping the digital future, connecting everything. Broadcom.

Charlie Kawwas
President, Semiconductor Solutions Group, Broadcom

Good morning. How is everyone doing? All right. Well, first of all, welcome to Broadcom Semiconductor headquarters in San Jose, California. We're at the heart of Silicon Valley, and I'm very excited you're here with us. This is the first time we've actually held such an event here on our campus, and I'm glad you'll be with us to visit our labs later on. First of all, thank you for taking the time. Some of you told me you were in London yesterday. Others came from Asia. Others traveled across the country. So thank you for taking a couple of hours this morning to be with us. And for those who are on the webcast, including my 82-year-old mom who dialed in internationally, hello, Mom, I love you. Thank you for being with us.

With that, I have to tell you we're at an inflection point, with incredible technology that everybody's talking about. By the way, not only this week; I think we'll be talking about this for a while. This is going to be an inflection point in the industry that will change our lives and how we work. You're going to see my colleagues and me committed to pushing boundaries and pioneering breakthroughs in enabling AI infrastructure. And with that, the video you just saw: the script, the images, the actual video, and the voice were all generated using a generative AI tool that's based on Broadcom technology. It is incredible what this technology will be able to do for us.

Now, as the video said, our heritage and DNA are rooted in technology innovation, and they go back more than half a century, actually more than 60 years. The transistor that powers every single chip we use in our daily lives was invented by this company, by Bell Labs. The laser that connects all the data centers, all the clouds in the world, was also invented by this company, by HP Microelectronics, which is the grandparent of Avago. For the past 20 years, we've been driving several acquisitions. Up until 2016, we had acquired all of the semiconductor franchises I'm going to talk to you about, specifically in networking. Since then, we started acquiring software companies, beginning with Brocade in 2017 and running through November last year with the latest acquisition, VMware. That created the Infrastructure Software Group.

Today, we're going to be focused on the semiconductor side and specifically AI. Now, the interesting thing is that in the last eight years, we have not acquired a single semiconductor company. Over that period of time, we have driven organic growth. Just looking at the last five years: the business was $17 billion in 2019, and we finished last year at $28 billion. That's all organic growth, based on the heritage and DNA you saw in the video, and on the technology leadership you're going to see throughout the day. That represents about 13% CAGR, much faster than the semiconductor industry. And it's all built on large-scale investment: over $3 billion of R&D per year that we invest in this business. Over the last five years, that's $15 billion of R&D that drove this organic growth.
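
As a quick sanity check on that growth-rate claim, FY2019 to FY2023 is four compounding periods:

$$\text{CAGR} = \left(\frac{\$28\text{B}}{\$17\text{B}}\right)^{1/4} - 1 \approx 13.3\%$$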

Not too many semiconductor companies can say that. Now, how do we deliver this organic growth? We have a very simple strategy that Hock and I apply across Broadcom, and we've been doing it for the last two decades. It has three pillars. The first pillar starts with the market, which is not what many people think of when they think of Broadcom, but it's how we implement this strategy. We choose markets that are durable, and we look at these markets over a span of 10 years. The first question we ask ourselves is: will this market still be there 10 years from now? Most CEOs that I worked for in the past basically look for green shoots and markets with hockey sticks. We are not interested in a market if that's all it has going for it.

We're actually very interested in markets that will still be there in 10 years. And by the way, some of the markets we're in actually decline low single digits, and that's actually very interesting to us, as long as the market is durable. Now, in the case of AI, which we've been investing in for almost a decade, we just happen to have hit a market that's growing significantly, and we're happy to be part of that. But that's not the criterion. Now, the second pillar, which is the most important pillar, is technology. That is the heritage, the DNA, of Broadcom, and especially of the Semiconductor Group here. We are determined that in whatever market we choose, we will bring leadership and innovation.

You cannot do that without R&D investments, and you cannot do that without the engineers you see walking around this campus. These are the engineers who bring us this leadership and the technologies we're going to share with you today. That is the most important thing for us at Broadcom. The third pillar is that, as you play in a market for 10 years and bring the best technology over that span of time, then with excellence in execution, you tend to end up number one in each of these categories. As a result of these three things, we have coined a term: the sustainable franchise. That is the core definition of every business unit or division we have. Today in Broadcom, we have 26 of them.

Inside the Semiconductor Group, there are about 17 of them. Out of these 17, we've selected five end markets that we play in: networking, which is our largest market and largest business; wireless; server storage; broadband; and industrial. As I said, out of the 26 sustainable franchises, 17 are part of this group. Today, we're going to focus on networking, and inside networking we're going to double-click on the AI subsegment and how we enable AI infrastructure. Let's start with the market, the first pillar. The way we look at it, this is composed of two distinct markets. One is the consumer AI space, a space that has very few players, but these few players have billions of users.

The way they make money is based on advertising and user engagement. The interesting thing we've learned with them over the past few years is that engagement is directly coupled to the amount of investment they put into artificial intelligence and machine learning. As a result, each of these consumer AI players, who make up the majority of the market today, is investing tens of billions of dollars in this space. On a personal note, I see it working with my kids; you probably see the same. And it works. The larger the cluster they build, the better the engagement, which means the better the financial returns. The other market is on the enterprise side, which could be cloud or on-prem.

In this space, a lot of people are trying to invest in AI, but to be honest with you, the business case is yet to be proven. There are lots of AI initiatives; I spoke to several CIOs yesterday, and it's still wait and see. Each of these folks is building small clusters to run trials. Even the cloud guys are trialing some, but there is no tangible business case yet like the one we see in consumer AI. So let's go to the second pillar, which is technology. Which products do we believe play here? From a Broadcom vantage point, we focus on two products. One is what we call the AI accelerator. Many people call them GPUs; some call them TPUs; others call them NPUs. For today, let's refer to them as XPUs. There are two ways you can actually develop these products.

One, you can develop a general product that fits everybody's needs, which is great. However, if you're a consumer AI company building these large-scale platforms, these general processors or GPUs actually consume too much power and cost too much to deploy into your network. Some of them have no choice today, because they don't have the ability to build a custom capability; but the few that have the scale of billions of users, generating hundreds of billions, over $500 billion, in revenue, do have that capability. That's why we coined the term custom XPU, or custom AI accelerator. Underpinning all of this, you have to connect all these XPUs, and hence you need a networking technology, AI connectivity. There, the de facto today, and for the future, will be a merchant play.

So today you will hear my colleagues and me focus on the consumer AI buildout, the very large-scale buildout. From a product and technology point of view, you will see us focus on custom AI accelerators, and then we'll ultimately show you the entire AI connectivity portfolio that Broadcom has. In both of these categories, our execution over the last 10 years has been stellar: number one in each. So when you look at the strategy I showed you: we're selecting the market to be in, consumer AI on the semiconductor side; we're selecting the two products and technologies that we're going to invest billions of dollars in; and ultimately, our track record of execution, plus what we're going to show you throughout the day, will keep us in that leadership position.

Let's talk about that journey. We did not start looking at AI in the last one or two years; we've been in this for a long, long time. AI was less than 5% of our semiconductor revenue for the longest time, up until 2022, when we actually started seeing these consumer AI players spend a lot of money. As a result, it jumped by more than 2x, to 10% of our semiconductor revenue. In 2023, we said we'd hit 15%, and we hit 15%. We predicted and targeted 25% for 2024. Well, guess what? It has accelerated. At this point, we have revised our forecast, and we now say we'll do more than $10 billion of revenue across the two categories I shared with you. Now, with that, let me start with the first category.

That first category, because it has very few players, requires very deep, strategic, multi-year, and multi-generational engagement with those few consumer AI players. And we're very proud of these engagements. We actually run engagements like this outside AI too, across our franchises; that's what we do with many of our customers, so it's in our DNA. We've applied this with the first customer for a decade, and my colleagues will show you that engagement. They'll show you all the XPUs we've built, are building, and are planning to build. That's obviously been in production for a while and will continue over the next few years. The second customer, we were pleased to share with you recently; I think many of you celebrated that addition. We just started ramping earlier this year and are in production today.

We're very pleased to have had at least four years of engagement with that customer, across the multiple generations we've built; it is in production at this point and will continue for the next few years. Well, since you're all here today and you traveled here, we wanted to share that we actually have a third customer. I don't hear any excitement. Come on. So we're very, very honored and pleased and happy to tell you that this third customer is also in the consumer AI space. We are in the ramp phase and will be shipping products to that customer in the next few months, and this is something we believe will continue over the next few years as well. So in this space, the key customers around the world can probably be counted on one hand.

With three of them, we have deep, multi-year strategic engagements that we're very proud of, and my colleagues here will share more details on that. Now, this is a buildout that's been happening for a few years. If we go back two years, it started with a cluster that was state of the art at the time, with 4,096 XPUs. The XPU at the time was a couple of hundred watts, and interconnecting 4,000 of them was, compared to today, fairly simple: a single networking layer using our Tomahawk switches. At the time, we were pleased to have achieved that with a single customer. In 2023, we built a cluster using this XPU with over 10,000 XPU nodes, and that requires two layers of Tomahawk or Jericho switches.

This is the lowest-power XPU in the industry today, merchant or custom, at less than 600 W, and it uses the latest technology. It's been shipping since last year. Now, going through 2024, we are extending this to over 30,000, and the plan and objective of many of these consumer AI customers is to take this to hundreds of thousands, and then a million. As you can imagine, we're going to need breakthrough technologies to do that. So while this is what we have done and are shipping today in massive volume, this is what we're working on, and I wanted to show it to you for the contrast in the size of the XPU and the capability that this team and these engineers in this facility are innovating.

Here, and it's hard to see from where you're sitting, but during the demo session, you'll be able to not only see but touch some of these cool toys that we have. I call them toys because many of our engineers, including me, are really driven by these breakthroughs. So, a couple of things about this cool XPU. It has four core dies on one package, not two. Four of them. It has 50% more than what anybody else has announced they're going to be able to do, in terms of bandwidth and in terms of memory. You can actually see the memories here; there are 12 of them. These engineers were able to fit 12 HBMs, the latest and greatest high-bandwidth memory, on a single package. This one here has six; others have eight. We have 50% more.

The chip-to-chip connectivity that you see, with four core dies versus two, can run at 25% higher speeds than anybody else in the industry. Remember, the tagline from the video: we connect everything. Connectivity is what we do. It's in our DNA. So when you have to create an XPU whose core dies need to connect to each other, and those core dies have to connect to HBMs, and then the HBMs and core dies have to connect to a chiplet that takes all of the bandwidth out, that is what Broadcom does. That's what we're good at. This is our heritage, and we can do it better, faster, and at much lower power than anybody else. Now, with that, I thought it might be interesting to share with you how you build a cluster.

How many of you have seen how to build a cluster? Just raise your hand. Nobody? Impossible. Okay, well, let's build one together. So it starts with an XPU. Typically, there's eight of them, unless you're Broadcom and building custom XPUs, and you'll see a server, an open server, in the demo area that can have 12 or 24 XPUs. Not four or eight. We can have as many as 24. Because when you customize it, you can actually significantly cut the size and power. You have to connect these together, and that function is called scaling up these processors. That can be done through just directly meshing them, or having a PCIe switch, or a proprietary switch, or even an Ethernet switch.

After you do that scale-up function, you bring in x86 or ARM processors, which you need as the control plane, and their interconnect with these XPUs is done through PCIe switches. And to get out of this server, you need network interface cards, and these are the NIC cards that you see here. This basic building block is called the AI server, or the AI node. Now here's the cool thing. Broadcom's color is red, and anything you see in red is part of my SAM; this is what we're interested in today. So if it's a custom consumer AI design, these XPUs are part of the SAM we play in. When you scale up, that's part of the SAM we play in. The PCIe switches and NICs, that's part of the SAM we play in.
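
To make that building block concrete, here is a minimal sketch of the AI-node bill of materials just described; the counts are illustrative assumptions rather than any specific design, and the SAM flags mirror the talk's red/blue coloring (the host CPUs, covered next, are the "blue" block):

```python
from dataclasses import dataclass

# Schematic bill of materials for the AI server / AI node described
# above. Counts are illustrative assumptions; in_sam mirrors the
# talk's red/blue coloring of what falls inside Broadcom's SAM.

@dataclass
class Component:
    name: str
    count: int
    in_sam: bool    # True = "red" on the slide

AI_NODE = [
    Component("custom XPU", 8, True),             # 12 or 24 when customized
    Component("scale-up interconnect", 1, True),  # mesh / PCIe / Ethernet
    Component("PCIe switch", 2, True),            # control-plane interconnect
    Component("NIC", 8, True),                    # traffic exits the node here
    Component("x86/ARM host CPU", 2, False),      # control plane, "blue"
]

print("in SAM:", [c.name for c in AI_NODE if c.in_sam])
```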

We don't build processors, x86 or ARM; those are blue, and we do not play in that SAM. Now, we have to take many of these servers and scale them out. That's the next step in architecting a cluster. When you scale out, you're going to have to use the best networking technology, and we believe the best networking technology is Ethernet. As you scale out, typically to 10,000 or 30,000 nodes, you need at least two layers of leaf and spine, as the rough model below illustrates. You will hear my colleagues talk about the flexibility and architecture of how we do this. We're the only company that can support multiple ways of doing it, based either on our leading Tomahawk products or our leading Jericho products. Scale-up plus scale-out is the back end.
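
To see why 10,000 to 30,000 nodes force that second layer, here is a minimal non-blocking folded-Clos capacity model; the radix values are assumed examples (actual port counts depend on how a switch's SerDes are broken out), and real deployments use rail-optimized or oversubscribed variants:

```python
# Back-of-the-envelope folded-Clos (leaf-spine) capacity: every tier
# below the top splits its ports 50/50 between downlinks and uplinks,
# so each extra tier multiplies attachable endpoints by radix/2.

def max_endpoints(radix: int, tiers: int) -> int:
    """Endpoints a non-blocking folded-Clos fabric can attach."""
    return radix * (radix // 2) ** (tiers - 1)

for radix in (128, 512):          # e.g., 128x400G or 512x100G ports
    sizes = [max_endpoints(radix, t) for t in (1, 2, 3)]
    print(f"radix {radix}: 1/2/3 tiers -> {sizes}")
# One tier tops out at the switch radix, so a multi-thousand-XPU
# cluster already needs a second tier; hundreds of thousands need
# a third tier or a much higher radix.
```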

But look, all of this as a training cluster has to reach the Internet, and to do that, you need top-of-rack switches and spine switches, typically also Tomahawks, possibly Jerichos; that's called the front end. Once you do that, you need to interconnect all of these things, shown in those cool golden colors: these are our optics products. And guess what? Everything you see in red or gold is part of our SAM. This is a beautiful picture. Now, it comes with some challenges. If you scale this beyond 30,000, to hundreds of thousands, and to a million, guess what? The number one problem you're going to have is not money. Even though your CFOs will not be happy with any of you spending that kind of money, even if they gave you unlimited funds, power is the number one problem.

Just to give you an example: remember, the lowest-power XPU in the industry today is ours, at about 600 W. The next one coming out, whether from Broadcom or others, will probably be in the range of 1,000 W. If you want to deploy 30,000 of these this year, the XPUs alone will use 30 MW of power. Well, guess what? For most data centers, that's the maximum power they can give you, and you have not yet added power supplies, cooling, or networking. You will not be able to run that with the power strategies that exist today.
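
The power arithmetic behind that statement:

$$30{,}000\ \text{XPUs} \times 1{,}000\ \text{W} = 30\ \text{MW}$$

and that is the accelerators alone, before power-conversion losses, cooling, and the network itself are counted.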

On top of that complexity, this is a heterogeneous system, and it needs a way to scale across multiple players in the ecosystem. It cannot all be built by a single company. There's no single company in the world that can build everything in that data center or cluster; people are required to work with each other. Some companies might have the most important and critical capabilities here, but not all of them. So at the end of the day, there has to be support at the ecosystem level. And to solve these problems, there are three things we are investing our technologies in. One, we believe this important inflection point in the industry has to be open, driven by open standards like Ethernet, PCIe, and other standard capabilities, even at the memory level.

I'll give you an example here, where some of my colleagues at the foundational-technology level have supported the IEEE standard of 100 Gbps, meeting the spec of 2 m reach, but we went far beyond that standard to double the reach, to significantly lower power and cost, and to enable new architectures that the standard today cannot support unless you exceed it. Yet we're fully interoperable. Two, scale. How do you scale to a 1 million-node cluster? The most important thing in these architectures is not going to be just the XPU; actually, our vision, and the way we see this moving forward, is centered around the network. As you move beyond 10,000, 20,000, and 30,000 XPUs, it becomes a distributed compute challenge, and a distributed compute challenge will not be solved without the best networking architecture out there.

That is the commitment we're putting in place: scaling up, scaling out, and interconnecting these networks. And lastly, but most importantly, power-efficient technologies. To do this, we're back to our DNA: technology, innovation, and leadership, delivered in a sustainable way. Next, my colleagues: Ram will cover scalable AI networks and the Ethernet piece. Jas will talk about server interconnect and PCIe. Near will show you some of the coolest and greatest optics technologies. Foundational technologies such as SerDes and DSPs will be covered by Vijay. And to bring it all together, we have Frank covering custom XPUs. With that, I'd like to pass it to Ram.

Ram Velaga
General Manager, Core Switching Group, Broadcom

Good morning. My name is Ram Velaga. I run the switching and routing group at Broadcom. I've been doing this for about 12 years; prior to that, I was with a large OEM in the networking business for another 12-plus years, so about 20-plus years of experience in this business. Before I get started, a couple of things I really want you to think about. One: are we ready to go back to the age of the mainframe, where everything is vertically integrated? Chips, hardware, software, application stack. And more importantly, back in the world of the mainframe, where you're hoping you can get hyperscale volume. That's one. Or would you believe an alternate thesis?

An alternate thesis that basically says: to achieve scale, the history of technology shows that you take individual blocks that all interoperate across a common set of standards, and you build with economics you could only get by combining those individual blocks. Those are the two theses you need to think very hard about. The second thing I'd like you to think about is that this is a distributed computing problem. If there's one thing I'd like you to take away from today, it's that in a distributed computing problem, it doesn't matter how big a GPU you make, because it's not big enough to run the entire workload on one GPU. You have to take multiple GPUs and connect them together.

Whether they're connected inside a data center or across data centers, you cannot get around the fact that you have to connect multiple GPUs. Once you accept the fact that it is a distributed computing problem, and you need a network, then I would make a very strong case for you that the best network in the world, over multiple generations, over and again, has been Ethernet. Right? And I want to give you a few examples on thinking about why it's Ethernet. About 10 years ago, we had these large cloud customers who were building large-scale data centers. They came to us and said, look, we need a lot of bandwidth in the switches. Even large OEMs said, look, you don't need this much bandwidth. We are not building those switches.

They came to us, and as a merchant silicon vendor, every 18-24 months, we doubled the bandwidth of our switches. Right? And today, when you think about almost all the cloud deployments in the world, they're all based on merchant silicon switches. Six years ago, large telcos like AT&T came to us, and they said, hey, we want routers because we want the same economics that our cloud customers are deploying, or our cloud competition is deploying. Can you help us? Most people in the world said, no way in hell that you can have a large service provider telco network like AT&T's core network running on merchant silicon. Six years later, everything is running merchant silicon. 60% of traffic of AT&T's core network today runs on merchant silicon. Why am I telling you this?

18 months ago, I heard a lot of people tell me InfiniBand was going to rule the world: every machine learning deployment would be based on InfiniBand, and there would be no Ethernet. I can tell you today, some of the largest deployments are based on Ethernet, and more are being deployed over Ethernet in the coming year. As Charlie mentioned, we're scaling to clusters of 1 million-plus nodes. Now, why do you need 1 million-plus GPUs in a cluster? It could be anything, a large language model or whatever the reason is. But let's take it for granted that this scale is needed. When this kind of scale is needed, the only way you can connect it is with a network.

By the way, this is a picture of Google's cloud about 20 years ago. Look at this picture. Do you see a mainframe? Anybody see a mainframe here? No. What you see is a bunch of commodity servers, and they actually had the foresight back then not to pick the most expensive or highest-performing CPU; you can go back and read the literature on this. Instead, they picked the CPU that offered the best cycles per given cost. They took that CPU, built these boards, and connected them over the network. You see all these blue, yellow, and orange cables? That's all Ethernet. So they built the world's largest distributed computing system, not just inside a rack, not just inside a data center, but actually extended between data centers.

That's how you build the world's largest compute systems, and they used Ethernet to build it. Now, Sun actually coined the phrase "the network is the computer." I didn't understand it 20 years ago, but as you start to think about the scale of the problem you're facing, it really comes to fruition: the network is the only way you can actually distribute computing and build a very large computer. Now you'd say, okay, great, but why is networking so important? This was a slide presented about 18 months ago at OCP by Meta. What they showed is that when they run these large workloads, anywhere between 18%-57% of the time, the traffic is just sitting in the network.

That means during that time, the GPUs are sitting idle. Now think about it. If, on average, somebody is charging between $20,000-$30,000 per GPU, and you've got 100,000 GPUs, you're talking about a $2 billion-$3 billion infrastructure. And if a $2 billion-$3 billion infrastructure is sitting idle 18%-57% of the time, that's a lot of money. So the whole idea is: we have this expensive infrastructure, but it's sitting idle because traffic is sitting in the network, and we need to fix it.
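
Putting rough numbers on that, using the midpoint of the quoted per-GPU price:

$$100{,}000 \times \$25{,}000 = \$2.5\text{B}, \qquad \$2.5\text{B} \times (0.18\ \text{to}\ 0.57) \approx \$0.45\text{B}\ \text{to}\ \$1.4\text{B}$$

of infrastructure effectively idle while traffic sits in the network.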

So what is it about machine learning that causes this traffic to sit in the network instead of keeping the GPUs busy? There are a couple of things. One: most of you, if I asked you what the bandwidth of a CPU is, if you just go buy a CPU, you'd probably agree that the fastest CPUs are at best pushing 50 Gbps. 50 Gbps. The GPUs today are pushing 400 Gbps, and we expect that sometime late next year, these GPUs are going to be pushing 800 Gbps. So think about it: you're going from 50 Gbps to somewhere between 400 Gbps-800 Gbps. That's 8 to 16 times the bandwidth. So that's number one: GPUs push a lot of bandwidth. Number two: these GPUs are very, very idiosyncratic in the way they work. They all want to talk to each other at the same time.

And when they talk to each other, they actually want to say a lot at the same time. To put that into context, think about it: in the morning, everybody wants to get to work, assuming you actually want to go in rather than work from home, and everybody wants to get there at 9:00 A.M. So everybody decides to drive to work at 9:00 A.M. Then you also decide, I'm not going to take my car; I'm going to drive an 18-wheeler to work. That's number two. Number three, you all decide to use the same single lane, even though you have an eight-lane highway. What do you think is going to happen?

Everybody at the same time, all in 18-wheelers, all using one lane, while the remaining seven lanes sit empty. It creates congestion. The congestion starts to back up, and once it backs up, you essentially have all of these GPUs sitting idle, waiting for the network. So these three attributes are extremely important: high bandwidth; the way these GPUs all want to talk to each other at the same time; and the congestion that creates. One more thing worth noting here is that these workloads also run for a very, very long time. It's like saying you're going to get to work, but your commute is very long, a 100-mile commute, and if your car breaks down along the way, you have to go back and start over.

That's exactly what happens here: these workloads run for a very long time. They checkpoint frequently, but if there's a failure, they have to go back and restart from the checkpoint. So if you're 70% into your job when the failure hits, what do you end up doing? You go back and start again, and a tremendous amount of infrastructure is wasted. These are the things you really have to think about with machine learning and these very, very large-scale deployments. Now, we've thought about this problem for a long time. To a certain extent, we also stumbled into it by luck, so to say. But luck favors those who are prepared. Here's how we solve this problem, in two different ways.

First, let's make sure we're all on the same page here: it's all about traffic management. You want to be able to decide how traffic gets onto these highways, how it's spread across the different lanes, and how it goes from point A to point B. There are two ways we do it. One is what we call endpoint-scheduled, and what we mean by endpoint-scheduled is really a dumb network; that's a fancy way of saying a dumb network. What we mean is that there are customers like Amazon, and they've been very open about this, who have their own NIC, the Annapurna NIC, and that NIC does all the traffic scheduling.

The NIC says, okay, I'm going to put the traffic on this network, and I'm going to send it at path A versus I'm going to send it on path B and path C. And then all we do with our Tomahawk class of devices in between is listen to what the NIC is saying, and we just forward the traffic as fast as we can. So that's what endpoint scheduled is. The other one is what we call switch scheduled. Because customers have a heterogeneous, you know, NIC environment, or they may not necessarily have the NICs which are capable of this traffic management, they'll say, hey, look, I don't want to do it. Let the network do it. And that's where we have our class of products which are called switch scheduled.

The switch essentially takes care of the path the traffic takes from point A to point B. It manages all the congestion so that, effectively, all the lanes of the highway are well utilized and you don't have traffic being dropped between point A and point B. So those are the two approaches, illustrated in the sketch below, and we actually have products for both.
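
To pin down where the path decision lives in each model, here is a toy sketch; it is purely illustrative, with round-robin and random spraying standing in for the real scheduling algorithms, which the talk does not detail:

```python
import random
from collections import Counter

PATHS = 8      # parallel leaf-to-spine paths (the highway lanes)
FLOWS = 16     # concurrent GPU-to-GPU flows
PKTS = 1_000   # packets per flow

def endpoint_scheduled() -> Counter:
    """'Dumb network': the NIC picks a path for every packet it sends
    (round-robin source routing here); switches just forward fast."""
    load = Counter()
    for flow in range(FLOWS):
        for pkt in range(PKTS):
            load[(flow + pkt) % PATHS] += 1   # the NIC's decision
    return load

def switch_scheduled() -> Counter:
    """'Smart fabric': the switch sprays traffic across all paths and
    manages congestion itself, so NIC capability doesn't matter."""
    load = Counter()
    for _ in range(FLOWS * PKTS):
        load[random.randrange(PATHS)] += 1    # the switch's decision
    return load

# Either way, all lanes end up evenly used; what differs is whether
# the endpoint (NIC) or the fabric (switch) makes the choice.
print(sorted(endpoint_scheduled().values()))
print(sorted(switch_scheduled().values()))
```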

For the second approach, we have our Jericho3-AI chip. We announced this product about 18 months ago, and it's in production now; you can deploy large clusters of up to 32,000 GPUs with it. It might be hard to see from where you're sitting, but this is our Jericho3-AI chip. By the way, these chips are 800 mm², the largest you can build in a reticle, and they have multiple HBMs, so building large chips, with HBMs and advanced packaging, comes very naturally to us. This is also the architecture on which Meta recently published a paper; if you want access to it, I'll make sure Ji gets it to you. They compared two clusters, one based on InfiniBand, because there's a raging debate that somehow InfiniBand is magical (other than the fact that it's super expensive) and that Ethernet would not work. So they ran two clusters of 24,000 GPUs each and tested them. Guess what? Ethernet works. It actually works fine. It's half the price or cheaper. And it doesn't melt down. That's a lot of good stuff.

It actually works. More importantly, you can see it delivers 10% better performance than the alternative. Now you could say, hey, 10%, big deal. But think about it: if you have a $2 billion-$10 billion infrastructure, 10% is anywhere from $200 million to $1 billion. And I'd be lucky to get paid that much for the network. So that's an infinite return: for every dollar you spend on the network, you save a dollar or more. So I challenge anybody who tells me InfiniBand is better than Ethernet. That's one. Two, look at what we're doing on what I call the dumb fabric. Now, just because I call it dumb, please do not expect it to be cheap. What I mean by dumb is paying respect to those who build the NICs. I want to be nice to them.

There we have the Tomahawk class of devices. One thing I can tell you is that every 18-24 months, we're doubling the bandwidth. By the way, we are never in the habit of announcing products before we ship them. Never. I have seen others announce products two years ahead of us while we ship a year ahead of them. So you might be sitting there wondering, hey, somebody announced a 100 Tb switch; where is yours? I'll just say: we don't announce. Let history speak for itself. All of you are very good at drawing linear regressions; you can figure out where things fall. So this is our Tomahawk 5 device, with 512 SerDes lanes running at 100 gig each.
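
For reference, that SerDes count implies the switch's aggregate bandwidth:

$$512 \times 100\ \text{Gb/s} = 51.2\ \text{Tb/s}$$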

One thing worth pointing out, by the way, without taking too much time from my colleague Vijay, who'll talk about it: making 512 100G SerDes work, with none of them interfering with one another, and deploying that in production, is both an art and a science that we've perfected over multiple generations. When we say a chip is coming, it comes, and it works. Then you look at this and say, okay, you guys are good at switches; what else are you doing? We understand that to get traffic between the GPU and the switch, you need a NIC.

If there's one company that's going to control the NIC and say, oh, my NIC is not going to work with somebody else's switches, we have to ask: what do we do there? So we decided we are going to build a NIC. This NIC, by the way, is called Thor 2, because we also had a Thor 1 before it. But here's what this NIC is not. It's not a SmartNIC. Please do not confuse it; it's not a SmartNIC. And it's not a SuperNIC; I've heard this new word, SuperNIC, now. Soon you might hear the word Cosmos NIC. We are not any of those things. We are a performance NIC, which basically means we're focused on two things. One is bandwidth: 400 G, 800 G, 1.6 Tb, and so on.

The other is RDMA performance, because it's all about RDMA traffic. The reason we've chosen this architecture is that the power of the NIC becomes extremely important: as I mentioned before, the GPUs push a lot of bandwidth, and the NIC has to keep up with the GPU bandwidth coming out. If you build a NIC that requires very high power, you won't be able to scale. So our entire focus is a performance NIC with very high RDMA performance that we can consistently scale from 400 G to 800 G to 1.6 Tb and so on. That's our approach here. Now here's another thing we're doing. As we build this NIC, we realize that customers are going to use it in different ways.

There are those who will take a NIC on a board and plug it into anybody's GPU. There are customers or partners building their GPUs internally who want a chiplet interface so they can use our NIC's networking capabilities. Or they might just say, give me the IP, and I'll put it in a big chip that I'm building. We're going to make this NIC available in all these form factors, to anybody who wants to leverage it. So what does this all mean? Ethernet works extremely well. And the beautiful part about Ethernet is that it's all based on standard interfaces, and you can pick any of these building blocks from us.

You can buy our Jericho switches. You can buy our Tomahawk class of switches. You can just buy our NIC. We will work with everybody. We are not going to scare you by saying this is not going to work if you have to work with somebody else; we won't do that. That's not how Ethernet has been built. To double-click on Ethernet's performance: somebody might tell you InfiniBand is the best thing for ML. But what if I told you we deliver 10% higher performance than InfiniBand at half the price? Then they might come and say, I'll give it to you for free, because I'll collect the money someplace else. But what this shows is that across different packet sizes, we consistently deliver higher performance than InfiniBand, even where InfiniBand is at its best.

If you think about it, even if you just build a cluster of 4,000 GPUs, you need 8,000 optics, or actually slightly more, about 9,000-plus. Optics, even with the best electronics in them, are flaky; generally, you see at least a 2% failure rate per year, if not 5%. That's actually why some of these large mega-scale cloud customers will tell you they have a bone pile of optics from these large deployments. With that kind of failure rate, you could experience as many as 15 failures per month. Now think about it: a job that runs for a very long period of time, connected with optics, and these optics have high failure rates, and you have to keep going back to checkpoints.
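
The failure arithmetic at the low end of that range:

$$9{,}000\ \text{optics} \times \frac{2\%\ \text{per year}}{12\ \text{months}} = 15\ \text{failures per month}$$

and at a 5% annual rate, closer to 37 per month.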

Combine that with the fact that InfiniBand takes at least 30 times longer to converge than Ethernet, and it basically means you have to keep an army of people standing by with cables to make sure the thing actually runs. That's why they say, if you're going to deploy InfiniBand, make sure your cable length is measured, make sure the tape is in the right place, make sure it's powered exactly the way it should be. Because it's fragile. So why does Ethernet converge so much faster? Because it's based on distributed protocols. For the longest time, you've had things like BGP and other protocols looking at questions like: what does my neighbor look like? What's my link health? And all of this happens in both software and hardware.

With InfiniBand, by contrast, everything happens through a centralized controller. This is one of its biggest Achilles heels, and a big reason people will not deploy it. Now, as we think about these very, very large networks: you have the GPUs, you have the NICs, then you have the optics, and you have the switches. Optics end up consuming a lot of power and cost. So what we're doing in our switches, and in a lot of our technology, is saying: avoid the use of optics as much as you can by extending the reach of copper. We can actually make copper reach 4 m, which is twice what the standard asks for. And just to give you context on what 4 m is, it's the size of an elephant.

Then we say: go as far as you can on copper, and when you can no longer use copper, use optics. But when you use optics, avoid putting too many electronics inside the optics; that's what we call direct-drive optics. And eventually there will come a time when the amount of bandwidth coming out of these switches and accelerators is so high that you can no longer use pluggables, and you'll have to use something called co-packaged optics. We can do these co-packaged optics, which will give you very high density at a fraction of the power and cost. These are the things we're doing to make the entire interconnect come together.

What I'd like to leave you with over the next few slides is, first, that there's no question in anybody's mind (with the exception of one customer who still happens to be on InfiniBand, and whom I think we will move to Ethernet in the next year or two) that Ethernet is the de facto standard for these large machine-learning clusters. And this is not the front-end network; we've already got the front-end network. This is the back-end network. Then you may ask, what are the sizes of these clusters? I'm only sharing data that is publicly available; you can do Google searches and find it. Amazon has clusters based on Ethernet that are over 60,000 servers. Oracle, over 30,000 servers. Meta, over 20,000. Tencent, over 10,000.

Some of them are bigger than these, but these are the numbers they've actually shown publicly. This is all back end. This is all Ethernet. This is all machine learning. Now, we know we can do 10,000, 20,000, 30,000, 60,000, 100,000 today. But there's also a consortium, founded by Broadcom and a couple of others about two years ago, whose idea is to take this to 1 million-plus nodes. When you start thinking about 1 million-plus nodes, the biggest issue that needs to be solved is RDMA. You've probably heard of RDMA: remote direct memory access. RDMA came about roughly 25 years ago, and the original idea was that two CPUs want to talk to each other and share their memory.

So it was built for two machines to talk to each other. Then it slowly scaled from two to 16, to 32, 64, 128, 512. But it was never built for thousands or hundreds of thousands of CPUs or GPUs talking to each other. So there's a whole bunch of things that actually break down in RDMA, and what we as an industry are doing is making significant enhancements to RDMA so it can scale to 1 million-plus clusters.
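
One simplified way to see what breaks: classic reliable-connected RDMA keeps per-peer connection state (queue pairs) in the NIC, which grows quadratically cluster-wide. This sketch illustrates only that one scaling issue, and the per-queue-pair byte count is an assumed placeholder, not a measured figure:

```python
# Full-mesh queue-pair (QP) state for reliable-connected RDMA.
QP_STATE_BYTES = 256   # assumed per-connection context; placeholder

def full_mesh_qps(nodes: int) -> int:
    """QPs cluster-wide if every node talks to every other node."""
    return nodes * (nodes - 1)

for nodes in (2, 512, 100_000, 1_000_000):
    per_nic_mb = (nodes - 1) * QP_STATE_BYTES / 1e6
    print(f"{nodes:>9,} nodes: {full_mesh_qps(nodes):,} QPs total, "
          f"{per_nic_mb:,.1f} MB of QP state per NIC")
# At a million nodes, each NIC would hold ~256 MB of connection
# state under this assumption; hence the industry enhancements.
```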

And by the way, this is not something you're going to see five years from now. This is something you're actually going to see as products in 18-24 months: fully interoperable solutions across different vendors, with very high-scale RDMA spanning those who build internal accelerators, those who build custom silicon accelerators, and everyone in between. So why is this so important? I think this is probably the biggest takeaway slide. You will not have a world with millions of GPUs where one mainframe solution is being sold. The only way this has ever played out in history is with multiple vendors and multiple solutions. And when you have multiple vendors and multiple solutions, what you need is a fabric that interconnects all of them, because this is a distributed computing problem. Just saying "I can build the biggest GPU" doesn't solve the problem.

You build the GPUs that can scale and can be networked across a very, very large fabric. And Ethernet is the fabric, will be the fabric, and you can actually hold me accountable for it. Right? And lastly, not only do we believe in Ethernet, but we also believe in actually making Ethernet based on a very open ecosystem. We don't go and say, hey, we're building the whole box, by the way, along with the cables. Do you want to buy it? Nope. What we do is, we have the silicon. We have a whole bunch of vendors that build hardware all around the world. Then we have a whole bunch of partners who actually build software on top of it, along with all the management and stuff that goes on top of it. That's the approach we're going to take.

Build the best networking devices, make it available to a very, very broad ecosystem, and believe in this idea that this is a distributed computing problem. The only way you're going to solve this problem at scale is by not building a mainframe. With that, I'd like to thank you for your time. Hand over to Jas, please.

Jas Tremblay
General Manager, Data Center Solutions Group, Broadcom

Good morning, everyone. I'll give you guys five seconds to cool down your fingers; I've never seen so much frenetic typing. My name is Jas Tremblay. I've had the privilege of working for Broadcom for 18 years, and I'm currently the general manager of the Data Center Solutions Group, which focuses on server connectivity. Today we're going to talk not about how we network tens of thousands of AI servers, but about how we create a network inside the AI server. Going back to the mainframe days, silicon to software to systems, everything was vertically integrated; you did not need to worry about connectivity, you just built the connectivity for your own solutions, and the same company would build all of that technology. But what happened over the years in the data center space is that more companies wanted to innovate and participate.

So you need to come together at the connectivity level, and one of the most important protocols for that has been PCIe. PCIe is the most-used protocol inside data center systems. The governing body for PCIe is PCI-SIG, founded in 1992, and we've been through five generations of PCIe. There's quite a large community of about 900 companies, with extensive plug fests where we come together to make sure our products work together. PCIe is the protocol for interconnecting inside the server. The data center has been built up from compute servers; Ram showed the picture of the Google data center 20 years ago, and it was compute servers. Within that compute server, you'll find the CPU.

To the CPU, you attach peripherals: Ethernet NICs, NVMe drives, storage adapters, multiple types of technology. They come together with PCIe as the protocol, in a point-to-point fashion. Here you're not building a network; you have the CPU at the heart, with peripherals attached to it, and in the majority of cases they connect over PCIe point to point. Now fast-forward to the AI server. In the demo area, we'll have three instances of different AI servers. They're big. They're complicated. They're pieces of art from a mechanical and cooling perspective. What you'll find inside these AI servers is multiple CPUs, 12 NVMe drives, 8-11 Ethernet NICs, 8-12 XPUs, and other types of devices. The point-to-point methodology does not work.

You actually need to build a network inside that AI server, and the network of choice for that is PCIe. It's very low latency, it's ubiquitous, it's standards-based, and it allows companies to bring together the pieces they need. In fact, having an open internal fabric inside the AI server is key to the freedom to pick the components you want. If you're a cloud provider building your own NIC in-house, or if you want to use different types of accelerators, an open fabric allows you to pick and choose components and build an AI server adapted to your needs. The other element is that if you're a server OEM or ODM, it's very hard to build a complete system for every type of accelerator.

So you want an architecture that can support merchant, custom, and different types of XPUs inside the AI server. That's why the use of PCIe switching as the internal network inside AI servers is so important. Okay. So, picture on the left: we've got a rack. This is an OCP AI rack, composed of multiple AI servers, pictured in the middle. This AI server is about 15 inches tall; we actually have this system in the demo area and we'll show you. There are three trays inside this AI server. The top tray has compute: the CPUs and the fabric. The middle tray has peripherals; that's where you stack up all your NICs and NVMe drives. And the bottom tray, where a lot of the power is and where you need a lot of cooling, is where you put your XPUs.

This one can support custom XPUs and multiple providers of merchant XPUs, so it's really an open platform, with PCIe as the internal fabric and Ethernet as the scale-out fabric. So let's take the top tray, which has the CPUs and the fabric, and double-click on it a little bit. You can see at the top there are four heatsinks, and each of these heatsinks has one of these little chips inside; let me get it out. This is a Broadcom PCIe Gen 5 switch. It's 4.6 Tbps, and you can attach up to 72 devices to it: Ethernet NICs, NVMe drives, and so forth. In this specific server, there are four of these 144-lane PCIe switches, interconnected together.

Each of them aggregates one CPU, two NICs, two XPUs, and four NVMe drives. This building block, the picture here, is replicated four times in the AI server, and it connects to the scale-out network using Ethernet. This network needs to be ultra-low latency, we're talking 120 ns, and it needs to be high bandwidth. But most importantly, it needs to be trusted: it needs to interoperate with many, many types of devices out there, it needs to be standards-compliant, and it needs advanced telemetry and diagnostics. If you're deploying tens of thousands of these AI servers, you need capabilities inside the AI server's network to tell you what's going on and whether the devices are behaving properly.
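
A quick sanity check on those numbers: PCIe Gen 5 signals at 32 GT/s per lane, roughly 32 Gb/s per direction after 128b/130b encoding, and the per-device lane widths below are illustrative assumptions, not a Broadcom reference design:

```python
# 144-lane PCIe Gen 5 switch sanity check.
GBPS_PER_LANE = 32    # Gen 5: 32 GT/s, ~32 Gb/s effective per direction
LANES = 144

print(f"aggregate: {LANES * GBPS_PER_LANE / 1000:.2f} Tb/s")  # ~4.6 Tb/s

# Illustrative lane budget for the quoted per-switch device mix
# (x16 for CPU/NIC/XPU and x4 for NVMe are assumptions):
devices = {
    "CPU":  (1, 16),   # (count, lanes each)
    "NIC":  (2, 16),
    "XPU":  (2, 16),
    "NVMe": (4, 4),
}
used = sum(n * w for n, w in devices.values())
print(f"lanes used: {used} of {LANES}")   # 96 of 144 in this sketch
```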

So we invest in performance, lowest power, and advanced telemetry and diagnostics. Okay. So we talked about the switch; the switch is the core element of that network. But we need to run this over what is effectively a backplane. This is not a cabled network; it's a backplane trace network, very low cost, low power. In some cases, though, the server is 15 inches tall, you need to go from one tray to another and maneuver around, and you may need to go farther than the PCIe spec allows in terms of distance. The first way we handle this is with our SerDes, which we'll show you a little later: we can typically go 40% farther than the standard from a SerDes perspective. But sometimes you need to go farther still, and then you need a retimer.

A retimer basically extends the reach of the PCIe switch using that same protocol; it's a companion device to the switch. So we've got a portfolio of switches and companion retimers. Two weeks ago, we announced the industry's first 5 nm retimer, a companion chip to the switch. If you use our switch with our retimer, because the same standards-compliant SerDes is on both sides, we can go 40% longer reach with 35% less power. This is really important: you might have four switches and 12 retimers in one of these systems, and you really want to optimize them for power and cooling. Okay. So Charlie talked about the franchise model and the fact that we invest in markets that will exist for decades. We've been doing PCIe switches behind the scenes for quite some time.

In fact, we introduced the first PCIe switch in 2003, when there was no AI server and people did not need a lot of PCIe switches. We were first to market in 2003, and we've been first to market with a PCIe switch for every generation of PCIe over the past 20 years: first to market for five generations, and market leader for five generations of PCIe switching. But starting in Gen 4, people began building AI servers and needed an internal fabric. So we are taking this franchise that we've been investing in for 20 years and doubling down on it, increasing the investment for AI. AI needs faster, faster. What I mean by that is that we need to accelerate the cadence and get to new protocol speeds sooner.

We need more bandwidth, more connectivity, more capabilities inside this fabric. We're shipping PCIe Gen 5 switches in volume today, and they power the vast majority of the industry's AI servers, across custom and merchant accelerators. This is the network of choice inside the AI server. We announced our PCIe Gen 5 retimer in 5 nm and our PCIe Gen 6 retimer, also in 5 nm, and we announced that we're going to be sampling PCIe Gen 6 switches at the end of this year. Another thing we're doing is accelerating the cadence. Look at Gen 3 to Gen 4: in that CPU-centric, point-to-point model, PCIe performance was not that critical, and there were eight years between Gen 3 and Gen 4.

Four years between Gen 4 and Gen 5, and that's where AI really started. Two years between Gen 5 and Gen 6, and then we're going to a one-year cadence. We need to speed things up; this network needs to be extremely high-performance and low-latency. Now, so far I've been talking about building the internal fabric that connects the CPUs, NICs, NVMe drives, and XPUs. The other aspect is the scale-up fabric: high-performance networking from XPU to XPU. There are different ways to do this today, but we believe we need an open, low-power, high-performance, low-latency way to do it. And we've partnered with AMD, which had its MI300 launch event in December of last year.

Forrest Norrod and I announced that we're partnering on building a scale-up solution in which Broadcom would build the switch and AMD would build the accelerators, and we're going to work together on this in an open way and bring it to the standards bodies. So today our switches are used as the internal fabric, and with this we're extending them to scale-up as well. Okay. So, on that note: if you're building one of these complex AI servers, you need an internal fabric, and PCIe is the protocol of choice for that; it allows openness and choice for customers. We introduced, with no fanfare, the first PCIe Gen 1 switch in 2003. And for the past 20 years, across five generations, we've been first to market five times and market leader five times.

And now, with the dawn of AI servers, we're doubling down, increasing investment in this space, and increasing the cadence of innovation between generations. So thank you very much. And I'll pass it on to Near.

Near Margalit
General Manager of our Optical Systems Division, Broadcom

Thank you, Jas. Good morning, everybody. I'm Near Margalit, the general manager of the Optical Systems Division here at Broadcom. I've been involved in optical components for over 30 years, and I'm really excited to talk to you about some of the optical technology we're developing here at Broadcom. So we'll start with a slide that Charlie laid out really well: what do these AI clusters look like? We know we want to get to bigger and bigger clusters. Everything inside the rack you'd like to do in copper, interconnecting with PCIe or direct attach copper. But once you go to any kind of scale or distance, you've got to start looking at optical links. And these are the golden lines, both the front-end network and the back-end network.

We need to be able to scale to really large bandwidth. These AI systems continue to consume more and more bandwidth across the system, so we need the optical technology to support that, both in scaling and in cost, and to provide the higher-level bandwidths. Within Broadcom, we're going to talk about three core technologies that we have under our group. The first one is the Vertical Cavity Surface Emitting Laser (VCSEL). This is the workhorse for AI technology across the industry. It can be used with Ethernet, InfiniBand, or NVLink technology. It does have distance limitations because of the multimode fiber itself, limiting reach to 100 m or so, but it's a very low-power, low-cost technology, and it's being deployed widely in most of the AI systems in the world today.

When you go a little bit farther, you need to scale to bigger and bigger clusters, going to hundreds of thousands or millions of units. You want to be able to travel beyond that 100 m reach. So, there you look to Indium Phosphide -based technology or electro-absorption modulated lasers, and that gives you the reach. And again, in both of these markets, we're a leading supplier for this technology. And we're going to talk to you guys a little bit more today about a new technology that we're putting together, which is co-packaged optics, which is the integration of high-speed silicon photonics directly integrated on ASICs, whether it's switches, PCIe switches, or accelerators across the system. And that really provides the future generation for both power and cost leadership for these future generation systems.
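
A crude way to summarize those reach tiers is the sketch below; the 5 m in-rack copper figure is the one quoted later in Vijay's section, 100 m is the multimode limit quoted for VCSELs, and the thresholds are approximate.

```python
# Crude sketch of the interconnect reach tiers described in the talk.
def pick_interconnect(reach_m: float) -> str:
    if reach_m <= 5:        # in-rack: DAC / PCIe copper (figure from Vijay's section)
        return "copper (DAC / PCIe in the rack)"
    if reach_m <= 100:      # multimode fiber limit quoted for VCSELs
        return "VCSEL over multimode fiber"
    return "EML / Indium Phosphide over single-mode fiber"

for r in (2, 50, 500):
    print(f"{r} m -> {pick_interconnect(r)}")
```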

You may have seen that last week we did a press release on the VCSEL and EML technology, where we said we've shipped more than 20 million channels of 100 G per lane technology, really demonstrating core leadership in our optical technology. Going back a bit, we've got a long history of leadership in this area. We own the two fabs that make these optical components, one in Pennsylvania and one in Singapore, one making VCSELs and one making the Indium Phosphide technology. And we've been in this business for a very, very long time. The VCSELs go back to the HP days; we were one of the original teams building VCSEL technology. Back when I was in grad school, people were just starting to build VCSELs, and we've maintained leadership through the decades in all of this technology.

In addition, the Indium Phosphide technology comes out of the original Bell Labs fab, which we own. And again, we've maintained leadership from the early directly modulated lasers all the way to state-of-the-art 100 G per lane electro-absorption modulators. So where are we today? We're obviously talking a lot about AI and the AI data center world, and we continue our leadership in this area. We're the only ones shipping high-volume 100 Gb VCSEL technology. It's very complex technology; for a long time, people thought it wasn't even possible at 100 Gb. We were told that VCSELs are dead.

You don't need to continue any more work on them, because you just can't get them to work at the data rates you need in the future. So not only are we shipping the 100 Gb VCSEL technology in mass volume, millions of units, we're actually going to demonstrate 200 Gb VCSEL technology for you today. Again, a really large technological barrier. This is not ready for production yet, but it's something we're excited about, and we think we can continue to deliver on our long history of leadership in this market. And with the Indium Phosphide technology, we are ready for mass production of 200 Gb EML; we announced that last week. This will be able to go with all the future 200 Gb per lane links, right?

The line speeds continue to go up in all these AI data centers to reduce power and reduce cost, and we are ready for that technology. We're also continuing to work on future generations of EML to run even higher-speed links, and we'll talk a lot about that technology. We talked about co-packaged optics: we also announced last week the first commercial shipments of our Bailly system. This is the combination of our 51 Tb Tomahawk switch with complete optical links integrated directly onto the package, all 512 lanes optically attached to the switch itself. And we'll talk about why that's important. We continue to work on future technologies; we think this is a foundational technology that can scale to all kinds of applications where optics are needed.

So, first and foremost, why do co-packaged optics? Obviously, pluggable transceivers have been around for a long time, have been very effective, and have served the industry extremely well. But with these AI systems, the bandwidth and the number of components continue to scale, and the cost of the optics continues to be a problem for that scalability. So how do you build a roadmap that keeps reducing the cost of optics so you can scale with these larger and larger clusters of GPUs? Our solution is integration, specifically integration into silicon photonics, putting more and more components directly onto an individual chip. That has historically been how semiconductors reduced cost, and we believe that in optics it is the correct way to go.

So we see CPO providing the lowest cost per bit, not only today but also in the future. The second benefit of co-packaged optics is that the optics sit right where the signals are, so you get rid of the complex electrical lanes between the ASICs and the optics. Those lanes burn power, add cost, and add complexity to the link, all of which is unnecessary if the optics are directly on the substrate. We're giving you an example here on the right: just how much power does that save? It's actually quite a bit. Typical 800 G pluggable transceivers in the market today are 14 W. The Bailly system that we're shipping right now is 5 W.
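
As a rough check on those figures, a sketch; the 64-port count is an assumption on my part, taken as 51.2 Tb divided into 800 G ports.

```python
# Per-port power comparison using the figures quoted in the talk.
PLUGGABLE_800G_W = 14.0    # typical 800G pluggable transceiver
BAILLY_CPO_800G_W = 5.0    # Bailly co-packaged equivalent

saving = 1 - BAILLY_CPO_800G_W / PLUGGABLE_800G_W
print(f"per-port saving: {saving:.0%}")          # ~64%, i.e. 'almost 70%'

# Assumption: a 51.2 Tb switch fully populated as 64 x 800G ports.
ports = 64
delta_w = (PLUGGABLE_800G_W - BAILLY_CPO_800G_W) * ports
print(f"per-switch saving: {delta_w:.0f} W")     # ~576 W per switch
```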

So almost a 70% savings over a typical deployment being done today. And does it scale to 200 G? Yes. As pluggable transceivers scale to 1.6 Tb at 25 W, we're going to see similar proportional savings. So it's actually a big deal in terms of the power savings in the optics. We've talked a lot about optics and power being an important issue, and this is a pathway to bring that down significantly. And the last thing, which is not so obvious: what about reliability? People say, you're integrating everything together; can the reliability be there? We think you can actually enhance the reliability of systems with co-packaged optics, because you are integrating more and more components into core silicon elements.

So we see the ability to integrate components into silicon photonics itself as a way to enhance reliability. Now, we're not ready yet to build lasers directly on silicon; that's still science fiction. So what we've done in our system is keep the laser, which is not a core silicon component, as a pluggable component in the system. That way we maintain serviceability for the one element that isn't core silicon. Everything else is built on core silicon technology, similar to our switches and PCIe devices, with a long history of being very reliable. So we actually think CPO is a way to enhance the reliability of optical links. I think Ram cited 2% failure rates for pluggable transceivers in his talk. That's a pretty bad failure rate.

And we see CPO systems as a way to get rid of that poor transceiver reliability. Now, to show what integration means, let me give you a bit of a visual. With pluggable transceivers, and we were in that business a long time ago, you integrate lots of different heterogeneous components on a PCB. The industry has recognized that, and several suppliers have started to look at integrating more of these components into silicon photonics to improve the cost and reliability of these systems. People have done this on pluggables; we know several suppliers have put silicon photonics into pluggables. That's actually a good direction, and it has moved the industry forward.

But Broadcom has aimed a little further ahead and asked: how do we do silicon photonics in a truly high-density fashion, not four or eight channels but 64 channels on an individual photonic chip? Once you have that density, you can start moving the optics off these pluggable transceiver subsystems and directly onto ASIC substrates, getting the full benefits of the technology. And what can you put it on? The first product we're announcing puts it on our Tomahawk 5 switch: 512 lanes of optical connectivity across the entire device. All 512 lanes. Not only are the SerDes working, we've got full optical capability. But it doesn't end there.

We see this technology being applied to a lot of different areas. Frank will show you the ability to put it on custom accelerators if you want optics coming directly out of your end nodes for high-bandwidth connectivity. So again, lots of applications for this, and it's a pretty foundational technology that will hopefully drive the industry forward. So how do we build this? What have we done? First and foremost, we focused from day one on high-density silicon photonics, not just low-density silicon photonics. We're showing you here an optical chip that has a full 64 lanes of 100 G capability and is about the size of a quarter. You'll see in the other demo room what 128 optical transceivers look like; that gets replaced with these tiny engines.

We integrate the muxing technology, the modulation, the photodiodes, and the optical coupling all onto these individual chips. We couple that with advanced-node CMOS silicon for the drivers and TIAs to maintain the lowest power and lowest cost across the system. We then use advanced packaging techniques to stack these chips together for the best performance and reliability. Those optical engines can then go on whatever substrate your ASICs are on. And of course we're showing here the 51 Tb system, with eight optical engines around the die. And finally, how does the end customer consume this? We have ODM partners, which we've announced, putting these systems into boxes that look very similar to a Minipack 3, right?

It looks very much like a standard pluggable optics box. But what's the difference? We gain a tremendous amount of power savings and a tremendous amount of cost savings. So you can consume it in the same format you've been using for pluggable transceivers, but you gain the power, cost, and reliability advantages of this system. It's a very exciting direction. We're focusing on integration and cost, and we're going to show you a short video on how this all gets put together. We're showing here all the different steps of assembling these co-packaged engines; this is an example of bonding the electronic die to the photonic die. Of course, there are optical fibers that have to be attached to these engines. They can't just attach themselves.

So we have robotic assembly of optical components that attaches fibers directly to the optical engines. Once all of that is done, we do testing at the individual die level for the photonic engines and ultimately place them on the end substrate or end product, shown here as an example with eight optical engines. The key things to focus on in these videos are integration, robotics, and minimizing human touch. Why? Because we know that improves reliability, improves cost, and improves the scalability of these systems. So we're really excited about this being our first product with co-packaged technology, and we hope to see a lot of future products use this capability. I'll end my portion of the talk today with three key points.

We've shown industry leadership in optical components over a long history. Specifically, at 100 G per lane today, we're doing extremely well delivering for AI applications. We've shown that we're going to continue to scale at 200 G, with both the VCSEL technology and the EML technology. And we're shipping the first commercial co-packaged optics system with a pluggable laser. This is a really exciting technology that provides both power and cost benefits, with up to a 70% reduction in power and 30% savings in cost. So I'm going to pass it on to Vijay, who's going to talk a little more about some of our foundational technologies.

Vijay Janapaty
General Manager of our Physical Layer Products Division, Broadcom

Good morning. Can you hear me okay? Yeah. Good morning. My name is Vijay Janapaty. I'm the general manager of our Physical Layer Products Division. I've been with the company for more than 25 years; I joined as a young engineer and rose up the ranks. Today my presentation is focused on foundational technology, in particular high-speed links. These links are typically built from SerDes cores, plus the DSPs that are used in the pluggable modules. Okay? So before I do that, let me set up the problem statement. I think Charlie and Ram talked about the 1 million accelerator cluster, right?

If you go into this 1 million accelerator cluster, in a typical case you will find there are 10 million high-speed links. These links are 400 G going to 800 G, and maybe two years down the road it's going to be 1.6 Tbps, right? Today, 400 G is made up of four lanes of 100 G. Tomorrow, it will be four lanes of 200 G. And thereafter, it will be eight lanes of 200 G at 1.6 Tbps, right? So these are very fast links. And the interesting thing is that in this market, and in other networking markets, the bandwidth of these links doubles every two years. So we have to come up with a new, faster link every two years, right?
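
A minimal sketch of that lane arithmetic and the two-year doubling; the progression is exactly the one quoted above.

```python
# Lane arithmetic for the link speeds quoted in the talk.
def link_gbps(lanes: int, gbps_per_lane: int) -> int:
    return lanes * gbps_per_lane

print(link_gbps(4, 100))  # 400G today:      4 lanes of 100G
print(link_gbps(4, 200))  # 800G next:       4 lanes of 200G
print(link_gbps(8, 200))  # 1.6T thereafter: 8 lanes of 200G

# 'Bandwidth doubles every two years' compounds quickly:
bw_gbps = 400
for year in (0, 2, 4):
    print(f"year {year}: {bw_gbps} Gbps per link")
    bw_gbps *= 2
```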

And if you look at the sheer number of these links, and the fact that they're doubling in speed, these links are the number two source of power and cost in an AI cluster. So any saving you make here either brings the power down or makes more power available for the AI accelerators, right? So it's very important to focus on the power and cost of these high-speed links, and that's what we do at Broadcom. Okay? I'm going to double-click on these links now. If you took those 10+ million links, a predominant portion of them today is copper: either on a backplane or on a direct attach copper cable within the rack, with a reach of about 5 m or so.

They have the lowest power and lowest cost; literally, they're free, right? Now, if you have to go beyond 5 m, then of course you have to use optics. With optics, the reach is much greater, but they have the highest power and highest cost. So it's very important to figure out how we reduce the power and cost of those optics. On the copper links, the technology we use to drive them is SerDes, and those SerDes cores are embedded inside a Tomahawk switch, or the XPUs that Frank's going to talk about, or the NICs that Ram talked about. The DSPs, on the other hand, are embedded inside the pluggable optics, right?
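
To make that "number two source of power" claim concrete, here is a toy model; the optical fraction and the modules-per-link count are invented for illustration only, and the 14 W figure just echoes the pluggable number from Near's section. None of these are figures from the talk.

```python
# Toy model of cluster link power. All assumptions are illustrative:
# the talk gives only the 10M link count and the 14 W pluggable figure.
total_links = 10_000_000
optical_fraction = 0.25          # assumption: share of links on optics
w_per_module = 14.0              # 800G pluggable figure quoted earlier
modules_per_link = 2             # assumption: one transceiver per end

optical_links = total_links * optical_fraction
link_power_mw = optical_links * modules_per_link * w_per_module / 1e6
print(f"~{link_power_mw:.0f} MW just in pluggable optics")  # ~70 MW
```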

I think most of you know what these are. So what are our objectives? First, make sure that as many links as possible stay on copper, because it's free, right? And second, for whatever links don't stay on copper, reduce their cost and power. Those are the two objectives that drive our development philosophy. Okay? Before I go into how we're going to do this, let me review the history. Broadcom has a history and a legacy of driving the best SerDes in the industry for at least four generations. If you look at 10 G, 25 G, 50 G, and 100 G, we have been the leader in SerDes. And that powered our products not only to double the bandwidth but to be first in the market.

I'll talk more about that leadership in the next slide when I discuss the 100 G SerDes. But the foundation of that leadership comes from what we believe are four key elements. First, we have very deep high-speed analog expertise; not low-speed analog, high-speed analog. Second, we pair that with DSP expertise that is custom-built for high-speed communication, and we have loads of those cores. Third, we take those two into leading-edge technology nodes, today's 3 nm technology, for example. And not only do we go into the new technology, we do it concurrently with the foundry.

When the foundry is developing the process, we develop our SerDes technology at the same time, so that when the foundry is ready, we are also ready for our products to be first to market, with the right power and cost characteristics. And last but not least, with the scale of operation we have, we deploy our SerDes in hundreds and hundreds of systems, with literally hundreds of cores per chip; Tomahawk 5, for example, has about 500 of them, and some of the XPUs have hundreds of those cores. So there's the system know-how we have gained over time. We have built these four elements generation after generation, and they are the reason why we are number one in the world. Today I'm going to click into 100 G to show you what we did.

Okay? So in 100 G, as I mentioned, objective one is to keep everything on copper as much as possible. That's what you see on the top two there: 45 dB backplane channels. Once you can do that, you have a larger fan-out and can connect more chips together. The second thing we do is a 4 m cable for DACs, direct attach copper. That's not only within the rack but also inter-rack, so some links can actually go to the next rack on copper. That allows a lot of these links to stay on copper, which gives you a lot of power and cost benefit. Now, we didn't stop there.

In this SerDes, which we call Peregrine, our leading 5 nm 100 G SerDes, we did something very unique. We built in a native equalization capability for optics, so you don't have to have a DSP or a retimer in the pluggable; you can drive optics directly from the SerDes itself, from the switch, the XPU, or the NIC. That enables very disruptive use cases like CPO, which you saw from Near. It also enabled, for the first time, an LPO or an LDO; people call it different names. Both of them dramatically reduce the power and cost of optics. Okay? This SerDes is available in every product that Broadcom builds in 5 nm for 100 G, whether switches, routers, XPUs, or NICs.

And all of them carry the same benefits, so you can connect any of them to an LDO, to a CPO, to a 45 dB backplane, or to a DAC cable. We can cut and paste across all the products, and we have a fantastic deployment base with this SerDes. So today we have some new news to share with you: our next-generation SerDes, code-named Condor inside our company. It's built on 3 nm, not 4 nm. And it has the same benefits I talked about at 100 G: very long reach, 45+ dB, 2+ meters of DAC cable. So everything in the rack is pretty much covered without any retimers; you don't need active DACs in the rack.

And again, it provides the same benefits, CPO and linear optics, and it's available now; all of our product teams are designing with it right now. With these specs, we absolutely believe, and we're very confident, that customers are going to be delighted, and we will be the number one SerDes again at 200 G. In the demo area today, we're going to show you this SerDes running on DAC cables, which is the hardest thing to do, so please do see that demo. I'm going to turn to DSPs now. Yes, after doing all of this, there are still going to be some DSPs, and still going to be pluggable modules. And these pluggable modules, of course, are very prevalent today.

In April of 2022, I did a tech day for investors where we talked about our renewed effort in this area, with an innovation to integrate as many components as possible to drive down cost and power for pluggables. We are integrating the drivers, which are typically in a non-CMOS technology, and the TIAs, which are again non-CMOS. We brought everything into CMOS and integrated it into a single device, which drives the cost and power of the module lower. We made that promise in 2022. Today I'm very happy to tell you that on 400 G optics, we did it; we achieved what we wanted to achieve. We also did it on 800 G in 7 nm. In fact, we've overachieved: we got even lower power by moving all of that into 5 nm.

The modules built with Broadcom's integrated DSPs are some of the lowest power and cost in the marketplace today, and we can drive both multimode and single mode. That really helps customers reduce their overall spend as well as the power they consume for these high-speed links. Okay? On 200 G, we actually started investing even sooner. And of course, as you know, 1.6 T is eight lanes of 200; that's how you get to 1.6 T. We have a family of chips for this called Sian. Today we have a demonstration of Sian with a device that has the driver integrated; later this year, we're going to have a device that also has the TIA integrated.

And again, we support multimode and single mode. One of the good things is that the performance of these Sian-based modules is the best in the industry today. We have at least three decades (orders of magnitude) of margin over the competition, and even over the specifications. So we have the best-performing module. Why is that important? The failure rate that Ram talked about is very clearly proportional to the error rate you actually get. Secondly, if your performance is very high, you can bypass a lot of error correction, which gives you lower latency. Lower latency is better for training workloads and things like that. So better performance is going to be very good for the AI industry.

So we're happy to show you that today as well. We have modules with both Broadcom's EMLs and Broadcom's CW lasers, which is the silicon-photonics-based technology; we'll have both of those in the demo area, and we'd like you to take a look. Okay? To wrap up: on foundational high-speed links, we believe we've delivered the best-in-class 100 G per lane ecosystem, driving the objectives I've outlined. We are absolutely on track to lead the industry again at 200 G, both on SerDes and on DSPs. And as I mentioned, our differentiation really comes from core expertise in analog and DSP, as well as the system know-how and scale of deployment we've built over many, many generations. Okay? Thank you. I'll hand it over to Frank.

Frank Ostojic
General Manager of our ASIC Products Division, Broadcom

Thank you, Vijay. Good morning, I'm Frank Ostojic. I've been doing custom silicon since I was born; it's been a long time. That's not my high school graduation picture; that's when Hock Tan asked me to run the custom division, about 16 years ago. I've loved every minute of it. But most important, let me tell you about my team. My team and I came from Hewlett-Packard. We used to do whatever Hewlett-Packard had that was hard and custom. We did those chips, whether they were compute, printers, or graphics. Then shortly after that, we became Agilent, and on top of the Hewlett-Packard chips, we started to do some analog chips and some really cool stuff for Agilent.

Then after that, we became Avago, Hock Tan showed up, and we acquired LSI. I met some incredibly good custom engineers from LSI, and we integrated them into my team; several of them sit on the third floor right there in this building. We gained scale, we gained more customers, and we've been building on that. After that, you know the story: we acquired Broadcom, and we got some amazing analog engineers, system engineers, and other engineers that gave us some crazy IP, some beautiful investments. So let me show you what my team and I have been able to accomplish. Thanks to all that effort, we've been the No. 1 in custom silicon for 10 years.

I give all the credit to those amazing engineers and to the great customers that have been loyal to us and working with us for a long time. Now, something happened in 2014. We met a customer that decided to do something really cool in AI. We developed an AI chip for them, and we started shifting resources and our focus to AI. And that's what we're going to talk about today. First question: why do these consumer AI customers want their own chips? Why do they want to partner with us to create these XPUs? Why can't they use GPUs, merchant chips? What's the benefit? So let's discuss that. The benefit can be explained by a simple equation: performance divided by total cost of ownership. What is total cost of ownership?

It's the cost of the chip, the cost of the power, and the cost of the infrastructure that puts it all together. So let's digest it and zoom in. When you are one of these consumer AI companies building an XPU, you have internal workloads that are very important for your revenue generation and your applications. If you customize the architecture of your accelerator, its bandwidth, and the ratio of compute to memory and I/O, you might be able to run those very specific workloads you care about much more efficiently than on general-purpose hardware. So what happens? We work with our customers to customize the architecture, which comes from them, to make sure they can maximize performance for what they care about.

Then when you look at efficiency and optimization, there's another really good effect: when you optimize hardware, you make it smaller, you make it cheaper, you use less real estate. So when these companies deploy these designs, they save millions and even billions of dollars of CapEx, because the chips are designed exactly for what they want, with the right ratio of memory and the right ratio of I/O. And there's another benefit: when you optimize the energy, picojoules per bit or picojoules per terabit, whatever it might be, you are optimizing cost. Lower power, lower cost. And as you heard, power is a precious commodity; you may or may not be able to build the data center depending on what power footprint you're going to have.
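
A minimal sketch of that performance-per-TCO equation; every number below is invented purely for illustration, not Broadcom or customer data.

```python
# Performance / TCO, with TCO = chip cost + energy cost + infrastructure
# cost, per the breakdown in the talk. All inputs here are hypothetical.
def perf_per_tco(perf, chip_cost, power_kw, infra_cost,
                 usd_per_kwh=0.08, years=4):
    energy_cost = power_kw * 24 * 365 * years * usd_per_kwh
    tco = chip_cost + energy_cost + infra_cost
    return perf / tco

# A general-purpose accelerator vs. a custom XPU tuned to one workload:
merchant = perf_per_tco(perf=1.0, chip_cost=20_000, power_kw=0.7, infra_cost=8_000)
custom   = perf_per_tco(perf=1.2, chip_cost=12_000, power_kw=0.5, infra_cost=6_000)
print(f"custom advantage: {custom / merchant:.2f}x")  # ~1.85x in this toy case
```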

Power might determine where you build the data center, or whether you build it at all. So there are simple but extremely strong economics driving this business and this investment from the type of customers we've been working with. Let's zoom in and talk about what we're doing. You've seen the presentations from Vijay about SerDes and from Ram about our optimized NICs and the Jericho3-AI. By the way, switching was one of the huge assets we gained when we acquired Broadcom, and it's available to us. You saw from Near what we're doing with co-packaged optics. And then in my division, we have a large investment in advanced packaging and buffer memory IP. As Charlie discussed, this is the type of investment we have. We're putting our money where our mouth is.

We've been investing, and no doubt the number one priority for this $3 billion is AI for this type of market. So as I mentioned before, what's this for? Super simple: lowest power and best performance for optimized workloads in these XPUs gives us the best performance per TCO. That is what we're doing. Very focused. That is the mantra of my division: do one thing, do it right, and pick the right customers. So let's zoom in a little more and do a little dissection of these chips. What matters in these chips? Here's a diagram that shows the different aspects of an XPU, and we're going to touch on each: number one, compute; number two, memory; number three, the network I/O.

You can see them at the top and the bottom as chiplets. Number four, last and absolutely not least, one of the hardest parts: reliable packaging technology. So let's talk about those. Number one, the architecture comes from all those brilliant geniuses who work for our customers, who are looking at the workloads and at what they want to do in three years, in five years, in 10 years. That's where the accelerator architecture comes from. We've been working with those folks for a long time, and we have developed a flow that allows them to optimize the construction of that compute.

We have several software engineers on my team whose entire job is optimizing the flow to build those accelerators really small, really fast, and with minimal area, so we can reduce the cost, improve the TCO, and obviously get it done really fast. In other words, compute is owned jointly between the customer and us; it's a shared responsibility. Number two, memory. This is something we own at Broadcom: we take HBMs or other memory solutions and provide the right PHYs, the right connectivity, the right cooling, and the right testing to make sure they're reliable and ready to go.

As Charlie discussed, we're going to demonstrate today how we run these interfaces significantly faster than any standard out there, or any competitor, using proprietary techniques covering everything from power distribution to testing. Number three: well, you listened to my friend Ram Velaga. We gained this incredible asset when we acquired Broadcom, and we're using it to its full extent. At the end of the day, this is going to be dominated by the networking difficulty. So on top of all the cool IP that Ram has, we've created software tools that let us put together chiplets that make these I/Os wider or thinner, to match the exact precision and ratio the customer wants for their workloads. We can do 200 G, 100 G, PCI Express.

If they need a different one next time, we just change it and go to production quickly. So there's a lot of flexibility, thanks to the software automation we've built over all the years we've been working with them. That's the network I/O, and it comes as a full solution: we have the hardware, the firmware, and the software, and we can give that to the customer before they even start the XPU, so they can emulate, simulate, and solve the problems up front. Number four, packaging, and my favorite: we can do 2.5D, and I'm going to show you some cool stuff on 3D and silicon photonics. This is really hard, and I'm going to talk about some of the things we've resolved there. Great. Let's continue.

Now let's talk about experience; we've talked about investment. A few days ago, I landed in San Jose. I'm from Colorado. And I saw that in San Jose, all the signs are AI for this, AI for that, AI testers, software for AI. You go to the bookstore, there are books on AI. Go to Starbucks, and you'll practically get an AI latte. Everything is AI, right? Everybody's putting on the AI badge to participate in this cool stuff. Not us. We've been doing AI custom chips since 2014, and we've been refining a flow, mechanically, electrically, thermally, and for design, to make sure we do it better and better on the chips we've taken to production, the chips we're developing, and the chips where we're still discussing architecture. One decade.

All right. Now, I said we're selective about customers, so let's talk a little bit about them. With one of our customers, we've done 10 years of chips and 10 chips. We have learned from our mistakes, their mistakes, and our vendors' mistakes, and we've coded all those lessons into software for that flow, so that we operate like a machine, with a lot of discipline and a lot of automation to avoid errors. We've taken that cool stuff we invented and gone forward with another customer; we've been working with them for about four years and have done about four chips for them. It takes time to get to production, right? And of course, with each of them there are chips we're developing and chips where we're discussing architecture.

As Charlie discussed, we're excited that we have a third customer. We've done a chip, and we're starting to take it to production. So we have one customer we've been with for a long time, one midterm, and a new one. Very exciting, but focused, and they all have similar goals. All right. How do we do this? What about time to market? In this chart here, I want you to look at this point right here. That's when the XPU officially starts. Let me show you a couple of examples of what we've been able to do. This device, we taped out in seven months. This device, we taped out in nine months. It's not because we're working on another planet, overnight.

It's because we have created a flow that's automated and debugged. We have all these tools ready, plug and play, for all the IP. We have pre-qualified the packages we need thermally, mechanically, and electrically. On co-development, we engage with the customers early; they're familiar with us, and we're familiar with them. It's the same flow we've been using, and improving, for a decade. Then we take it to the fab. Because everything is proven, the only thing that's new is the guts of the compute architecture inside, which together we quickly check and verify; obviously, we emulate before that. We can get to production level in three months. That is simply hard work that happens long before. Let me give an analogy. I'm sure some of you like cool cars. Let's talk about a Corvette.

It's like building a custom Corvette for a very specific track. You know the pictures of the track; you know the lengths, the curves, all that kind of stuff. The customer's working on the engine. While the customer's working on the engine, we've got the brakes ready, the chassis assembled, the tires on, the music playing; everything's ready, the hood is open, and we're just waiting with the pit crew to drop it in, hit the gas, and go. Speed is incredibly important in this market, so you can deliver the right performance per TCO at the right time. This is what we've specialized in: one thing, this type of device. Now, you've seen the chip that Charlie showed. Pretty cool stuff. This is what we're doing for our customers: it can enable 12 HBMs with a lot of silicon.

You can see we have two NICs here and two cores. Our customer can fill that with the most precise elements they need for their internal workloads. Now let's talk a little about the architecture phase. What happens in the architecture phase? This is for the future. We work with these customers for a long time; we know what they need, we know their struggles, and we know our struggles. We have a huge R&D investment in technology for the future; that's what Charlie showed you. We have the silicon photonics; I'm going to show you here the chip that Near was talking about. This is a 2.5D package with HBMs, an accelerator, and the silicon photonics connection. This can save 80 W of system power.

Imagine putting 1 million of these together. What are the green savings? What are the OpEx savings you can get from that? You're going to see this chip working in our demo, with real traffic and real test conditions. And we're hitting it with everything we have: optics tests, reliability tests, thermal tests, to make sure everything's ready for production. Now, this big chip Charlie showed, the one that looks like a coaster for your drink: it's an actual chip, and you'll be able to see it. It's tremendously difficult to get the warpage right and the mechanicals right so it doesn't crack. This is not our first try on this chip; we've done several, and we've fixed all the mistakes, which is what allows it to be a production product. So that's the investment. Now, this is the part I'm really excited about.

This is, if I can catch the light, a 3D wafer. It has a close to 800 square millimeter chip on the bottom with a close to 700 square millimeter chip on top. And we're going to put two of those right here, plus the NICs. You do the math: that's north of 3,000 square millimeters available for networking, for I/O, for acceleration, for whatever our customer needs to put there. But we have to invest years ahead. It's not just showing up with a flag saying we can do it; you have to put in all the engineering and all the capability. You'll be able to see this in the demos; I'll be right there. And finally, I was trying to figure out: how do I show complexity?
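
For the "you do the math" step, a one-line sketch of the area arithmetic (the NICs add further area on top of this, hence "north of"):

```python
# Two 3D stacks, each pairing a ~800 mm^2 base die with a ~700 mm^2 top die.
stacks, base_mm2, top_mm2 = 2, 800, 700
print(stacks * (base_mm2 + top_mm2), "mm^2")  # 3000 mm^2, before adding the NICs
```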

How can I show complexity in a way that makes sense? So I came up with this chart, which is pretty simple. It looks intimidating, but it's simple. This axis is time, and the size of each bubble is a complexity measure of an XPU. If a bubble has about twice the area of another, you can assume the bandwidth is probably twice, the silicon content is probably twice, and the mechanical problems are probably twice as hard. To make it easy to see, I color-coded them: the greens are the first chips we did, and they seemed hard back then, by the way. Then there's a little bit of blue, and then the purples, which look like bowling balls about to hit you. So what do you see here? What trend do you see? What's interesting?

Before that, let's analyze the equation. Complexity is a function of compute performance, network bandwidth, memory bandwidth, power delivery, thermal integrity, and mechanical reliability. Those last three take a long time to figure out; they're very difficult to do. So, some observations: as you can see, there's more content and more complexity, and do you see any green balls around here? They're all big ones. Let me give you an example. I'm a surfer; I surf with my daughter in Santa Cruz and try to keep up with her. My daughter and I look at waves in three ways. There's the 2 ft to 3 ft wave: you can learn to surf on those, have fun, catch a wave with a longboard. Then there are 6 ft waves: you've got to know what you're doing.

You could get hurt, and it's really hard to paddle through those to get out to the surf. And then there are the monsters, 15 ft to 20 ft waves. You don't mess around with those; they're very difficult. However, when I look at this, I see that the surf is favorable for Broadcom, because we're good at surfing those big monsters. That's what we've been doing, and it creates a very difficult barrier to entry. For those of you who are not surfers, imagine skiing: green, blue, then double black with the skeleton signs. That's basically what it looks like, and that's what I like to ski with my kids in Colorado, and I cannot keep up with them. All right. So in summary, it's pretty simple. Number one, focus: we want to do one thing right, and we've been doing that for 10 years.

We do these difficult consumer AI chips, just like we did the hard chips when we started at Hewlett-Packard 30 years ago. Investment: $3 billion, focused and prioritized primarily on AI. Experience: 10 years of fixing things, learning from mistakes, improving the flow, and being diligent about the discipline of using the same flow. And a three- or four-year investment in the future items we think our customers are going to need. I look forward to seeing you at some of the demos. Thank you so much. Now I'll give the time back to Charlie.

Charlie Kawwas
President, Semiconductor Solutions Group, Broadcom

All right. Thank you, Frank. I brought my skiing gear as well as my surfing gear; hopefully you can join us later at the beach or on the ski hills. I hope you've enjoyed all of the innovations that my colleagues and I have shared with you. I know it's run a little longer than we planned, but hopefully it was worth it. We've covered how we enable AI infrastructure with silicon. But we all know silicon alone is not enough: for the silicon to run, we need software. So I've decided to bring out a special guest, a colleague of mine from VMware, Paul Turner. If you can join me here, Paul. Welcome, Paul. Ladies and gentlemen.

Paul Turner
VP of Products, VMware Cloud Foundation Division, Broadcom

Thank you so much. And great to catch up with everybody.

Charlie Kawwas
President, Semiconductor Solutions Group, Broadcom

So Paul is a colleague who just joined us on November 22nd. He's the VP of Products for VMware VCF; he's the product boss. So if you have questions for Paul, stick around; you'd better catch him before he leaves. More importantly, at the beginning of this week Paul announced a very cool thing: the first release of Private AI Foundation. It has AI right in the middle of the name, just as Frank was saying; everything has AI. So Paul, tell us a bit more about this announcement.

Paul Turner
VP of Products, VMware Cloud Foundation Division, Broadcom

Sure, Charlie, delighted to. So, you've heard a lot about consumer AI, and the other side of the AI picture, of course, is enterprise AI. Enterprise AI is slightly different. What we've done with Private AI Foundation, which we first released with NVIDIA, is recognize that you can take foundational models, ones that have probably been trained on custom silicon and optimized by those top-end companies out there, bring them into your enterprise as open-source models, and optimize them just for your use. Do the fine-tuning, do the prompt-tuning, do RAG, retrieval-augmented generation, so that you can optimize those models.

And you can deliver new applications within months, optimizing with only tens of GPUs, because 98%, 99% of the work is already done for these customers; they can build on those foundational models. So yes, we just released VMware Private AI Foundation with NVIDIA, which is a jointly engineered solution with our NVIDIA friends. First to market: we announced it back at VMware Explore, and now it's released into the market.

Charlie Kawwas
President, Semiconductor Solutions Group, Broadcom

Awesome. Well, this is exciting. But I want more clarification: what does the private in Private AI mean?

Paul Turner
VP of Products, VMware Cloud Foundation Division, Broadcom

Yeah, good question. Our customers are some of the biggest enterprises in the world, and some of the most secure government agencies and data centers in the world. Their data is their IP. Not only is it important IP to them, they're very concerned about that data and its privacy. So what Private AI is really doing is working out how we bring the foundational model, the Gen AI capability, to the data, versus bringing the data to the Gen AI models. That's what our customers want: the ability, inside their data center, inside their secured environments, to optimize on top of these models, do it quickly, and iterate really quickly. So that's the path.

Charlie Kawwas
President, Semiconductor Solutions Group, Broadcom

Awesome. As you've heard me say, one of the things we really believe in is open solutions in AI; we talked about open, scalable, and power-efficient. The open piece is something VMware has led the world in, in the data center, for over two decades. So as you release this, tell us a bit more about your partner ecosystem and how you continue to enable an open ecosystem in a Private AI world.

Paul Turner
VP of Products, VMware Cloud Foundation Division, Broadcom

Yeah. It's a huge factor for us. More than 300,000 customers out there run their data centers on VMware. So open, for us, means we must support the ecosystem inside that data center: the application ecosystem and, of course, the hardware ecosystem. Let me walk through the first releases.

We've integrated with NVIDIA: we've tightly coupled with their NeMo framework so you can do that optimization and tuning, and we've integrated with their DPUs and GPUs. But we don't stop there. We're working very closely with other hardware partners, Intel and AMD, and we have a whole set of software partners. Think of Hugging Face; Hugging Face is very interesting. Think of it as the Git-style model repository for all of the open-source models you can use: pick an optimized model for a particular vertical, bring it down, and do the fine-tuning. We work with companies like theCUBE, with a whole set of companies, to help customers innovate, not just on the hardware side but also on the software side.

So go online to VMware.com and you can look at our AI partnerships. Importantly, with every one of those, we build best practices and give free guidance to customers to help them understand the breadth of this ecosystem, because honestly, it's totally confusing to them. How do we make it very easy and give them the best recommendations on the tool sets and tool chains that are right for them, and then be able to support those on any of the hardware ecosystem they choose?

Charlie Kawwas
President, Semiconductor Solutions Group, Broadcom

Awesome. Well, I'd like to thank you for taking the time to join us today, even though this is a semiconductor event; but Paul is part of Broadcom, and you can see that Broadcom is not focused only on the consumer and semiconductor side. Obviously, the enterprise, and an open enterprise solution, is key. Thank you so much.

Paul Turner
VP of Products, VMware Cloud Foundation Division, Broadcom

Thanks so much.

Charlie Kawwas
President, Semiconductor Solutions Group, Broadcom

Appreciate it. Thank you. All right. With that, let me wrap it up. Remember, this is the slide we talked about earlier on; my colleague Paul here discussed it as well. On the semiconductor side, we're focused on the big, shiny blue squares you see here. The market is consumer AI, and we're very excited about it; there is a business case that makes sense for this today. As the enterprise model evolves, our brothers and sisters on the VMware side are already investing and taking products out to help CIOs leverage what they can on the AI side as that business case develops.

Remember, the second pillar is technology. You've heard from all my colleagues how we're number one in each of these categories. We are focused, and have been focused for 10 years, on custom XPUs, where we've been the market leader for the last 10 years. The same goes for AI connectivity: we've been the market leader in connectivity for over 10 years, almost two decades. And we've talked about the two customers. You've heard my colleague Frank talk about the experience; the blocks he was showing on those roadmaps are actual chips that we've built and that have either shipped and are in full production, or are in co-development, the middle ones. Second to none in that space.

And the super exciting news today for us is that Frank now has a third consumer AI customer that's ramping and will be shipping in volume this year. If you look at the broad portfolio we have, this portfolio of AI connectivity and networking is second to none. It's been second to none for over a decade, leading the industry, not just in the past but, more importantly, today and in the future. In Ethernet, you heard Ram with Jericho3-AI and Tomahawk absolutely taking the industry to the next level and continuing that leadership with all of the hyperscalers, especially in the consumer space. You've heard Jas talk about PCIe: we're not just delivering PCIe switches. We realize that for the system to work, we have to deliver the total end-to-end PCIe solution.

That IP is not only in the switches and retimers; it also goes into the XPUs. We have been and still are number one across five generations, and we think we will be in the Gen 6 timeframe as well. On the optics side, Near shared some very cool technology with you. Number one in VCSELs. We are showing you the impossible in the labs today: our PhD engineers said two years ago that nobody could do 200 G VCSELs. Literally. Today, you're going to see it working in our labs. That is mission impossible. We're the only ones shipping 100 G VCSELs, and every system that uses 100 G VCSELs today, no matter who supplies it, is a space we play into.

In EML, we're the leader at 200 G. The cool thing coming now is how we will change the power equation with CPO, and not just for the switches, which we announced last week: we took Tomahawk 5 and these CPO tiles and created the first 51 T CPO switch, which saves over 70% of the power and more than 30% of the cost. But now these consumer AI customers are telling us: if each of these tiles, as Frank showed, can save 80 W and you have to deploy 1 million of them, that's 80 million watts. That's huge. That's the size of two data centers today, just from using that technology.
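
That wrap-up arithmetic, as a sketch (the two-data-center comparison is the speaker's):

```python
# Fleet-level savings: 80 W saved per optically attached XPU tile,
# multiplied across the hypothetical 1-million-unit deployment above.
watts_saved_per_unit = 80
units = 1_000_000
print(f"{watts_saved_per_unit * units / 1e6:.0f} MW saved")  # 80 MW
```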

And lastly, but definitely not least, none of this would happen, including the custom side, without Vijay's foundational technologies. It starts with the SerDes, continues with the DSP, and enables copper, single-mode, and multimode capability. And that is the market we play in. Remember all the red squares and the golden lines; that's where the money is. We play in an open platform to enable AI infrastructure across any size of cluster in this space. At the end of the day, as I was just chatting with Paul, we believe in an open, scalable, and power-efficient system that can get you to this million-unit cluster. You cannot do that without the best networks in the world, and that's Ethernet.

The only way we continue the success we've had over the last two decades is through deep, large, and scalable investments in our technology and our engineers, across all of the technologies we've shared with you. We will continue that innovation, and I'm hoping you'll join us later for the demos to see it. But at the end of the day, even if you're in the right market and have the right technology, the franchise is not sustainable if you do not seamlessly execute on the plan. I'm very proud to say that the teams we have on the semiconductor side are the best in the world in these spaces, and we continue to execute.

With this, thank you again for taking the time to be with us. I personally, and the entire Broadcom team here, appreciate it. Hopefully we can spend a bit more time with you later on. Thank you again.
