NVIDIA Corporation (NVDA)

Rosenblatt’s 5th Annual Technology Summit - The Age of AI 2025

Jun 10, 2025

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Good morning, everyone, and welcome to Rosenblatt Securities' fifth annual Age of AI Scaling Tech Conference. My name is Kevin Cassidy. I'm one of the semiconductor analysts at Rosenblatt, and it's my pleasure to introduce Gilad Shainer. He's NVIDIA's Senior VP of Networking. Also, we have Stewart Stecker. He is NVIDIA's Senior Director of Investor Relations. On NVIDIA, we have a buy rating and a $200 12-month target price. We're bullish on NVIDIA, not only because of their leadership in AI, but now their ability to expand into full rack-scale deployments, including scale-up and scale-out networks. We're fortunate to have Gilad speaking with us today. Gilad is a networking expert. He joined Mellanox in 2001 as a design engineer and has served in senior marketing management roles since 2005.

Of course, NVIDIA acquired Mellanox in 2020. Gilad also serves as Chairman of the HPC-AI Advisory Council organization, is President of the UCF and CCIX consortiums, is a member of the IBTA, and is a contributor to the PCI-SIG PCI-X and PCI Express specifications. Gilad also holds multiple patents in the field of high-speed networking. With that, first, I'll turn it over to Stewart to go over some of NVIDIA's disclosures.

Stewart Stecker
Senior Director of Investor Relations, NVIDIA

Thanks, Kevin. Thanks, everyone, for having us. As a reminder, the content of this call may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to risks and uncertainties facing our business. Back over to you, Kevin.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Thanks, Stewart. Yeah, so I'll kick off the fireside chat with a few questions, and we'll take questions from the audience as well. To ask a question, click on the quote bubble icon at the top right-hand corner of your screen, and I'll read the question to Gilad and Stewart. Keep in mind that this is a fireside chat aimed at understanding NVIDIA's networking strategy. Gilad will not be taking questions around financial guidance. With that, thank you, Gilad, and great to see you again.

Gilad Shainer
SVP of Networking, NVIDIA

Thank you very much, Kevin.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Maybe we'll start with a very high-level question of what is the strategic importance of networking in AI data centers?

Gilad Shainer
SVP of Networking, NVIDIA

It's a good start. First, the data center is the unit of computing today. Previously, it was a single element, a CPU or a GPU, but today it's not the GPU, it's not the server, it's the data center. The data center is the unit of computing that we use. Now, networking defines a data center. The way that you connect those computing elements together will define what that data center can do. It could range from just building a server farm all the way to building an AI supercomputer that can run a single workload at large scale and do amazing things. What we used to call networking, I don't refer to as networking anymore. It's really the computing infrastructure. It's much more than a switch. It's much more than a NIC.

It's a computing infrastructure. That is why it has become so critical, so important. That infrastructure will determine what kind of workloads you can run, what the efficiency of the data center will be, what your return on investment will be, how many users and workloads you can bring in, how many tokens you can support, and how many end users you can host on the data center. This is where the networking, or the infrastructure, is so critical. Now, when you go and design networking for AI data centers, it's a completely different task from designing networking or infrastructure for traditional hyperscale clouds. Here, we're not talking about single-server workloads. We're talking about distributed computing. We're talking about workloads that need to run over multiple compute engines, which could be hundreds, thousands, tens of thousands, or hundreds of thousands.

You need to make sure that every GPU gets the right throughput. Every GPU needs to be fully synchronized. The data that goes over the network needs to hit every GPU at the same time. If you create skews on the network, if you create what we call tail latency, then one GPU is going to finish later than the others. We all know that when you're running AI infrastructure, the last element to complete the task determines the overall performance of the data center. It's the tail latency, the throughput, the latency across the network, making sure there is congestion control. There is a huge number of elements in that infrastructure. That infrastructure will determine what you can do with the data center that you built. That's why it's so important.
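To make the tail-latency point concrete, here is a minimal sketch (illustrative only, not NVIDIA code): in a synchronized step such as an allreduce, the step completes only when the slowest GPU completes, so a single straggler slows every GPU in the job.

```python
# Illustrative sketch: one delayed GPU sets the pace of a synchronized step.

def synchronized_step_time(per_gpu_times_ms: list[float]) -> float:
    """Duration of one synchronized training step across all GPUs."""
    return max(per_gpu_times_ms)  # everyone waits for the last arrival

# Example: 7 GPUs finish in 10 ms; one is delayed to 15 ms by network skew.
times = [10.0] * 7 + [15.0]
print(synchronized_step_time(times))  # 15.0 ms, a 50% slowdown for all GPUs
```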

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Great. When I talk to investors, they've been hearing terms that they're a little confused about. You talk about connecting the entire data center, and even data center to data center, but the terms scale-up and scale-out networking are new to some investors. Maybe you could explain the difference between them and why each is important.

Gilad Shainer
SVP of Networking, NVIDIA

Yeah, and I'll try to make it maybe a little bit simple because I see that there are terms and people try to define what is scale-up and what is scale-out. First, we can start with examples. When we design an AI supercomputer, our scale-up infrastructure is NVLink, and our scale-out infrastructure could be InfiniBand or Spectrum-X. Those are the examples. Now, what's the difference between them? Scale-up is your ability to build a larger compute engine. In a scale-up infrastructure or connectivity, we're taking those GPU ASICs, let's call it like that, or GPU packages, and we want those GPU packages to behave like one. In order to build that one, you need a scale-up infrastructure. That's what the scale-up network does.

It takes those components and makes sure the balance between them is right, the right message rate, the right connectivity, the right elements are there in order to make those engines behave like one. This is why, if you watch Jensen's keynote, he says that his GPU is the rack. His GPU is not the ASIC. We have NVLink 72, so that rack is the GPU. The scale-up network enables that. Scale-up enables you to build larger GPUs out of the different ASIC components. Now, once you define that larger GPU, you need to connect those GPUs together. How many GPUs you connect together depends on what kind of workloads you're going to run. What is the mission that you want to achieve?

Connecting those GPUs together in order to form multiple GPUs that work together and run those larger missions, this is where the scale-out network is needed. There are different requirements for a scale-up infrastructure versus a scale-out infrastructure. One creates a larger compute engine, and the other connects multiple compute engines in order to support the different missions that you want to run on the data center.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Just as an example, it was only last year that NVIDIA's scale-up network connected eight GPUs, and this year it's 72 GPUs.

Gilad Shainer
SVP of Networking, NVIDIA

Correct. We talked about 576 in the keynote. Jensen talked about that as where we're going. It's all determined by the workloads that you need to support. Obviously, as workloads continue to evolve, new workloads continue to emerge, and you need to solve new things, everything in a data center is being added or changed or progressed. One example is that the unit of computing that was maybe a single GPU then became eight GPUs on NVLink. Now it is 72 GPUs, and it is going to go to 576. It is all in order to support the kinds of workloads you need to run or provide today.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Maybe you touched on it, just the workloads. What is happening with the AI workloads and applications that are influencing some of the network requirements?

Gilad Shainer
SVP of Networking, NVIDIA

Yeah. What you actually do is build a data center. That data center is aimed at serving the workloads that you define, the workloads that you want to run on it. Essentially, everything needs to be connected together. There are different elements that you can look at. First is building or co-designing the network with the compute, for example. This is important because you design a data center; you do not design components. I'll give you two examples of what co-design means. One example is that in the traditional world, let's say, there were compute engines that were doing compute, and then there were networking elements that were tasked with moving data. That was the separation between them.

When you design a data center to serve AI workloads, and you have the ability to decide where to put what, there are no boundaries anymore. For example, we took compute algorithms that traditionally ran on compute components, and we're running them on the network. What we call SHARP in InfiniBand is doing data reduction on the data in the network. The network is not just moving data. It's actually participating in the compute cycles. Why are we doing that? Because once you do the reduction operations on the network, for example, you can save half of the bandwidth that you need, and you can complete things much faster. This is an example where you move things from compute to the network.
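As a rough back-of-the-envelope illustration of the "save half the bandwidth" point, here is a sketch using the standard textbook cost estimates for a host-based ring allreduce versus an in-network (SHARP-style) reduction; the numbers are estimates, not measurements.

```python
# Rough per-GPU traffic for an allreduce of `size_bytes`, comparing a
# host-based ring allreduce with an in-network (SHARP-style) reduction.

def ring_allreduce_bytes(size_bytes: float, n_gpus: int) -> float:
    # Ring allreduce: each GPU sends (and receives) ~2*(N-1)/N of the message.
    return 2 * (n_gpus - 1) / n_gpus * size_bytes

def in_network_reduce_bytes(size_bytes: float) -> float:
    # In-network reduction: each GPU sends its data up once and receives the
    # reduced result once; the switches perform the aggregation.
    return size_bytes

size = 1e9  # a 1 GB gradient buffer
print(ring_allreduce_bytes(size, 72) / 1e9)   # ~1.97 GB sent per GPU
print(in_network_reduce_bytes(size) / 1e9)    # ~1.0 GB sent per GPU, ~half
```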

On the other side, the traditional network topology was built around the concept of a top-of-rack switch, which means that all of the NICs go to the switch at the top of the rack, and then you connect those switches together. This is the wrong thing to do if you are building an AI data center. As you mentioned, in the previous generation we had eight GPUs connected on NVLink. Those eight GPUs already communicate between themselves on NVLink. They don't really need to talk to each other again over the scale-out network. Why would you put all of those GPUs on a top-of-rack switch? It does not make sense. You want to spread that connectivity and have every GPU connect to other GPUs in the fabric. This is where we created a multi-rail topology.
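A hypothetical sketch of the multi-rail idea described above (illustrative only; the mapping function and names are assumptions, not NVIDIA's implementation): GPU k in every server attaches to rail k, so GPUs that already share NVLink inside a server are spread across different scale-out rails instead of crowding one top-of-rack switch.

```python
# Illustrative rail-optimized connectivity: map (server, local GPU index) to a
# rail / leaf-switch index by reusing the local GPU index as the rail number.

def rail_assignment(n_servers: int, gpus_per_server: int) -> dict:
    """Map (server, gpu) -> rail index. Hypothetical sketch."""
    return {
        (s, g): g  # every GPU with the same local index shares a rail
        for s in range(n_servers)
        for g in range(gpus_per_server)
    }

# Example: 4 servers x 8 GPUs -> 8 rails. Server 0 GPU 3 and server 2 GPU 3
# share rail 3, while the 8 GPUs inside one server spread across all 8 rails.
rails = rail_assignment(4, 8)
print(rails[(0, 3)], rails[(2, 3)])  # 3 3
```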

Now the network is designed around the way the compute runs, the way the compute algorithms run. Then we take some of the compute algorithms and actually run them on the network, because it's much more efficient to do it there. This is one example of how AI workloads require you to actually design a full data center, design that unit of computing, and do it in full synergy, in full co-design. That's one thing. The other thing, of course, is that AI frameworks continue to evolve. That's why every year there is a new compute engine coming out. There is new network infrastructure, or computing infrastructure, coming out. There is a new GPU. There are new NICs. There are new switches in order to serve scale, because we see increases in scale.

You're moving from thousands of GPUs to tens of thousands. A year after, you go to hundreds of thousands of GPUs. People are now talking about million-GPU scale. You need to be able to grow that element. You have so many routes. Just think about it. With all those GPUs, every GPU communicates on the network. There are so many routes that you need to make sure you send them in the right direction, that no one collides with another one, and that there is no congestion on the network. There is so much complexity in that network. That's why you see that every year there is a new generation, new capabilities, new elements brought into the compute infrastructure to support the full data center design and the different kinds of workloads that we see.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Okay. You're connecting these hundreds of thousands of GPUs now. Maybe I'll roll it back a little bit to the Mellanox acquisition. At Mellanox, you had both Ethernet and InfiniBand. What's the difference between the two offerings for scale-out? If we go into depth on the scale-out networks, how do you decide whether InfiniBand is the right solution or Ethernet is the right solution?

Gilad Shainer
SVP of Networking, NVIDIA

Yeah, actually, we let our customers decide what makes sense for them. Maybe I'll start with a little bit of history. Mellanox did start with InfiniBand. InfiniBand was built for distributed computing workloads. First of all, it was built to be lossless. Unlike a traditional network, where lossy was fine, meaning the network works in a way that if there is a collision, you don't try to solve that collision, you just drop packets because it's okay, you can retransmit the data. When you deal with distributed computing applications, if you drop data, you're going to retransmit it, but you're not just retransmitting data to a single GPU. The fact that you retransmit the data to a single GPU and that GPU becomes late in the whole scheme of the workload means everyone else is now waiting.

You do not want to retransmit data. You do not want to drop data. InfiniBand started as a lossless network. You do not want to drop data. You do not want to create that latency. InfiniBand, in order to do that, brought in congestion control and adaptive routing elements later on, and so forth. It was great for scientific computing, great for HPC, and essentially it is great for AI, because AI is distributed computing. Today, InfiniBand is still the gold standard for AI. Everyone that builds a network always compares their network to InfiniBand. Even when we did Spectrum-X, creating Ethernet for AI, we compared it to InfiniBand. That is the gold standard. It brings elements that exist in no other network, and it is a great solution.

If you build an AI factory, a single job running at large scale, there is nothing better than InfiniBand. It is the gold standard. Now, NVIDIA also brought Ethernet. We designed Ethernet for AI. You can ask, if you have InfiniBand and InfiniBand is so great, why did you guys bring Spectrum-X? The reason is that we believe AI is going to go everywhere. Every data center will run AI. Therefore, there will be AI clouds with multi-tenancy, multiple workloads, multiple users. There will be AI in the enterprise. We are talking about enterprise AI. We see a lot of enterprises now adopting AI. Those areas are being built by people that have been familiar with Ethernet for many, many years. They built their software stacks. They built their management tools all on Ethernet.

If they can continue running with Ethernet and keep their management and keep how they support their enterprise and so forth, that is much better for them. AI is evolving so fast that if they have to start learning how to handle InfiniBand, for example, how to manage InfiniBand, they're going to miss the train. We wanted to help them. We knew that AI is going to go everywhere. Everywhere means that we want to bring Ethernet to AI. We want to enable Ethernet as an option for AI, for people that build AI data centers and are familiar with Ethernet, whose software depends on the Ethernet software ecosystem, all the tools that were created, their own management infrastructure that was built and has progressed over the years and runs on Ethernet. We do not want them to have to recreate it all again.

For them, Ethernet is a great thing. This is why we built Spectrum-X. Now, one important thing to know about what we did with Spectrum-X. Spectrum-X is the first generation of Ethernet for AI, because nothing in Ethernet fit AI before Spectrum-X came. But in a sense, Spectrum-X is actually not a first generation, and the reason is that we brought things from InfiniBand, from the multiple generations of InfiniBand that continued to evolve over the years. We brought those elements to Ethernet. That is why Spectrum-X, on one side, is the first Ethernet for AI, but what we brought inside has years of development on the InfiniBand side. That is why it came in very mature, very quickly, and completely aimed at solving the problems of AI on Ethernet.

For example, we brought losslessness to Ethernet because we do not want to drop packets. We brought adaptive routing capabilities. You have lots of flows between GPUs. You want to make sure that every flow goes on the best available path. It is like solving the routes, in a sense. We brought congestion control to Ethernet. No collisions. You want to make sure there are no collisions, that one application cannot impact another application by creating collisions on the network. We brought many things from InfiniBand into Spectrum-X and actually created Ethernet for AI. Now you have InfiniBand, which is the gold standard. If you are building a supercomputer for a single job, and you know how to manage InfiniBand and have used InfiniBand in the past, there is nothing better than that. If you are running Ethernet, you can keep running Ethernet.
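As a simplified, hypothetical sketch of the adaptive-routing idea mentioned here (the function and load values are illustrative assumptions, not the Spectrum-X implementation): among the candidate uplinks toward a destination, each flow is steered to the least-loaded port.

```python
# Illustrative per-flow adaptive routing: pick the least-loaded candidate port.
# Real switches make these decisions in hardware using live telemetry.

def pick_port(candidate_ports: list[int], port_load: dict[int, float]) -> int:
    """Choose the candidate port with the lowest current load."""
    return min(candidate_ports, key=lambda p: port_load[p])

# Example: four uplinks toward the destination, one of them congested.
load = {0: 0.9, 1: 0.2, 2: 0.5, 3: 0.1}
print(pick_port([0, 1, 2, 3], load))  # -> 3, steering around congested port 0
```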

We brought the best Ethernet for AI with Spectrum-X. A good example is that Spectrum-X is running more than 100,000 GPUs in a single data center for a single workload. There is no other Ethernet technology that has managed to achieve what Spectrum-X did. The reason is that Spectrum-X is built for AI. We have a great Ethernet with Spectrum-X. There is great InfiniBand with Quantum. Now people can choose what makes sense for them based on their workload, what they need to serve, what they're building, their familiarity, their software ecosystem, and so forth.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Just to make sure I understand, you can put those features into Ethernet, and are those more physical-layer features you're adding, so it doesn't affect the customer's Ethernet that they have been running for years?

Gilad Shainer
SVP of Networking, NVIDIA

It's not just in the physical layer. It's a combination. It's the physical layer, the link layer, the transport level. It's the way that the NIC runs with the switches. One of the things that made InfiniBand so great is that it's a platform. It's not a separate NIC and a separate switch. They work together. The NIC gets information from the switch network in order to determine the flow of data. The switch element knows how everything in the data center behaves. When you do routing on the network, you need to know not just your own status. You want to know the status of your neighbors. The neighbors could be the NICs or the switches. If my neighboring switch has some issues, for example, I don't want to continue sending data to the same area. There is global load balancing that happens.

The NICs work in conjunction with the switches. It's a full end-to-end solution. This covers everything from the PHY to the link to the transport. On top of that, you have the whole management stack. You have all the cloud management tools, for example, and the hosting, multi-tenancy, and things like that. Those run on top of that infrastructure, not inside the network. What we brought into Spectrum-X covers the infrastructure layer. Everything that runs on top of that can stay the same. This is why it fits easily for people that build Ethernet data centers and have built their software ecosystem. It goes directly in there. You bring them the elements of the infrastructure that are needed for running AI training or AI inferencing.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Okay. So your Spectrum-X could be mixed in with standard Ethernet. Some other racks could be running standard Ethernet.

Gilad Shainer
SVP of Networking, NVIDIA

Spectrum-X is Ethernet. It is interoperable with any other Ethernet devices. If you build, for example, an AI data center, you build a unit of computing. That means that Spectrum-X will be the scale-out infrastructure, for example. It covers the full stack. Now, that data center can be connected to other parts of your infrastructure. It can connect to storage. It can connect to another data center. It can connect to users, to their desktops, and things like that. This is where you might see other kinds of Ethernet. For connecting to desktops, traditional Ethernet is great. Of course, you can connect that traditional Ethernet into the data center that has Spectrum-X for the scale-out infrastructure.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Okay. Great. Maybe we could touch on, too, you had mentioned the NICs and your DPU, the BlueField. Can you talk about the importance of that, of having control over the DPU within that same network?

Gilad Shainer
SVP of Networking, NVIDIA

Yeah. So the DPU brings another element to that infrastructure. When you build a data center, there is not just one network. In the traditional world, there was one network. If you go to hyperscale clouds, there is one network. If you build an AI supercomputer, it's a different story. You mentioned it already, because you asked about scale-up and scale-out. Those are two networks. There are at least two networks, scale-up and scale-out. Now, there is also the access network, meaning users need to access the data center. That is a third network. There is also storage access, which might even be a fourth network. There are multiple networks in that AI data center, and there are different components for each. If we look at an NVIDIA AI data center, we use NVLink for scale-up. We use Spectrum-X or InfiniBand for scale-out.

That scale-out includes the switch and what we call the SuperNIC. That SuperNIC has a compute element inside of it in order to determine the injection rates, process telemetry data from the network, and so forth. We have the DPU on the access network. What the DPU enables you to do is move the data center operating system off the server compute engine and onto something else. That greatly helps with security. If you build a data center and your hypervisor, for example, runs on the same CPU that hosts the user applications, you have a security threat, because a user could get access to the hypervisor and then control the entire data center. In order to make it much better, you want to separate the infrastructure domain from the application domain.

The CPU will host the users on the system. You run the hypervisor, for example, or other elements of the infrastructure operating system, on a different element that is completely separate from where the applications are running. This is where the DPU plays a role. The DPU is used to run the data center operating system, to provision the servers, to handle secure access for the users coming into the data center, and so forth. The DPU is what we call north-south, the access network. The SuperNIC and the switches, Spectrum or InfiniBand, are part of the scale-out infrastructure. Some people call it the back-end network; some call it the compute network or compute infrastructure. You also have the scale-up, which is the other element, NVLink.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Okay. Great. The DPU is also freeing up the CPU to do the cycles it's good at, while the DPU handles the rest. Yeah, maybe we can switch over and talk a little bit about the scale-up network, NVLink. You have NVLink and NVSwitch. There are other topologies out there. What's the advantage of NVLink over, say, UALink? Even Broadcom, on their earnings calls, has been talking about just using Ethernet for scale-up. Can you give the gives and takes of each one?

Gilad Shainer
SVP of Networking, NVIDIA

I can definitely talk about what we do. First, scale-up is not easy to do. It's really not easy to do. It needs to take those GPU ASICs and make them one. It needs to form one unit out of a lot of ASICs together. Therefore, it's not just the huge amount of bandwidth that needs to run between them. You need a very high message rate so that all the ASICs connect and communicate together as one unit. You need very low latency between them. It's a very tight network. Because of that, we are trying to put everything in a rack. We can use copper, for example, for that connectivity. Copper, first of all, consumes essentially zero power.

Because of the huge amount of bandwidth, if you did it on something else, a good amount of power would be consumed there. You want to make it very resilient and so forth. We want to maximize copper. That's why we want to put everything in a rack, in a closed rack. This is where density becomes an interesting element to deal with. This is where we bring liquid cooling into the game, because we want to pack everything in to increase the density, so we can maximize copper and build it like one unit. There is a good amount of complexity in actually building the NVLink elements. One thing that is obvious about NVLink is that it's working. NVLink is in its 5th generation. Essentially, what made InfiniBand so great is that it had many generations behind it. It continued to evolve.

It continued to get better and better. That is what makes Spectrum-X so great, because we took all the 25 years of InfiniBand and put it on Ethernet. Putting out an idea for a network and saying, "Okay, my first shot is going to be so great," in reality, it is not the case. This is a complicated element. There is a huge amount here. Just think about it: NVLink 72 is like 130 terabytes per second in a single rack. It is like the entire peak Internet traffic running in a single rack. This is what you need to support. It is a few generations in. It continued to evolve over the years, connecting more and more GPUs. We brought SHARP into NVLink. There are actually compute engines, compute algorithms, running on that NVLink when you are running everything together.
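As a rough sanity check on the 130 terabytes per second figure, using the publicly stated figure of roughly 1.8 TB/s of NVLink bandwidth per Blackwell GPU (an assumption pulled in here, not stated in the conversation):

```python
# Back-of-the-envelope check of the "130 TB/s in a single rack" figure.

nvlink_bw_per_gpu_tbps = 1.8   # ~TB/s per GPU (5th-generation NVLink)
gpus_per_rack = 72             # NVLink 72 rack

aggregate_tbps = nvlink_bw_per_gpu_tbps * gpus_per_rack
print(round(aggregate_tbps, 1))  # 129.6, i.e. roughly 130 TB/s per rack
```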

This is NVLink in a nutshell. I tried to give you a little bit of the complexity of it. Obviously, being in its 5th generation just shows that you evolve from GPU generation to GPU generation. You bring more elements, more capability. You need to adjust to the workloads. I'm not sure I mentioned it, but the reason we're on an annual cadence on the infrastructure, not just on the compute, is because there are elements that you need to bring into the infrastructure, including those data algorithms that are added from generation to generation, because the workloads are different, because the workloads are being modified. As the workloads are modified, the compute algorithms need to be modified. That impacts what you put in the infrastructure, which includes NVLink and the rest. This is where the cadence comes from, and it's robust, it's working. It's amazing technology.

It's liquid cooled. It's dense. Fully copper. And that's what makes NVLink NVLink.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Great. You also announced NVLink Fusion. You opened it up so that it is not a closed network. Can you talk about the advantages of NVLink Fusion?

Gilad Shainer
SVP of Networking, NVIDIA

Yeah. Once you get a sense of how complicated scale-up is, people may say, "No, it's easy." No, it's not easy. There is a huge amount of complexity in it. Why not help our customers that want to build their own custom accelerators, for example, leverage what we have invested for years in building the best scale-up infrastructure, with the liquid cooling, with the density, with all the aspects and all the performance of that? Why wouldn't we let our customers leverage that huge investment and make it easier for them to take those custom accelerators, those custom XPUs that they build, and leverage our infrastructure to build a solution for them? We design a data center. We design it as a whole. You can take pieces of it. You can take the GPU.

You can take the CPU. You can take them both together. You can also take the infrastructure if you want to. This is where we are building and working with an ecosystem, which includes MediaTek and Marvell and Alchip Technologies and Astera Labs, for example, and CPU suppliers like Fujitsu and Qualcomm, working with them so they can leverage what we do. We started this talk by saying the infrastructure has become a key element. Essentially, with NVLink Fusion, we enable that key element to be used by people that need or want or require to build their own accelerators. Now they can leverage what we did, what we designed, and actually get a great data center for their own custom elements.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Maybe just to understand that, with NVLink Fusion, you mentioned Qualcomm, just to use them as an example. If their CPU wants to connect, do they pay a license for NVLink? Or do they just start using your NVSwitch?

Gilad Shainer
SVP of Networking, NVIDIA

I think there is an element of NVLink that they will need to connect to. Essentially, they need to get the interfaces. They need to get, for example, an NVLink chiplet that the CPU can connect to. Once they have that, they connect to the NVLink switch. They can acquire the NVLink switch. They can acquire the entire set of elements that come with it, the liquid cooling, all of this. They are taking elements from us. They're taking the API from us. Obviously, we work with them. They can build their own system.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Okay. We are getting lots of questions from the audience. People are asking: you've introduced the Silicon Photonics switches at GTC. When does NVLink go fiber? Also, when we talk about scale-out, is that already fiber? How does Silicon Photonics tie into all of this?

Gilad Shainer
SVP of Networking, NVIDIA

Yeah. There are different elements here. First, on the scale-up, let's put it like that: copper is the best connectivity. Copper is the best connectivity. It's essentially zero power; it doesn't consume power. It's very reliable. It's very cost-effective. You would like to use copper as much as you can, for any connectivity that you can. Therefore, we are trying to put as much compute density in a rack as we can, because within that rack, we can use copper. That's why we're investing a lot in increasing the amount of compute in the rack, so we can use copper, because there is nothing better than copper. Now, when you go to scale-out, this is where you talk about distances, because now you have racks that need to be connected, and you're out of the reach of copper.

You need to go and use optics. You need to use optical connections. In traditional data centers, the amount of connectivity between racks was very, very small. There were not many optical transceivers or optical connections there. When we look at an AI factory, every GPU has a NIC out. If we look at Blackwell, every Blackwell has an 800-gig NIC that goes out. On the scale-out infrastructure, there is actually a good amount of optical connectivity. We need around six transceivers for every GPU. If you build a 100,000-GPU data center, that's about 600,000 transceivers. Now the power associated with the optical network becomes something that can consume up to around 10% of compute. If I'm building 100,000 GPUs and I could add another 10,000, that's not a small number.

Now the power becomes something that you want to look at improving. We all know that the limiting element in building data centers is power. It is not really space. It is actually power. As much as you can save power and redirect it to compute engines, that is a great thing to do. The second thing is that data centers are increasing in size, and it goes fast. Not long ago, we talked about 16,000 GPUs as a large data center. Now you are talking about hundreds of thousands of GPUs. So 100,000 GPUs means 600,000 transceivers. It takes time to install that. It takes time to manage that. You might need to replace elements. There are so many components that you need to deal with.

This is the right time to improve the optical network for the scale-out. The way to improve it is to introduce co-packaged silicon photonics. Co-packaged silicon photonics means that instead of having the optical engines in every transceiver, I take those optical engines and put them next to the switch and package them together with the switch. Now, what did I do here? First, I reduced distances. If the optical engine is in the transceiver, the signal needs to travel through the transceiver, the cage, the PCB, and the substrate to get to the switch. I reduced the distance. With that distance, I reduced the power. Now, at the same iso-power, at the same iso-power, I can put 3x more GPUs.

At the same iso-power of the network, I can connect 3x more GPUs. That's huge. Now I'm reducing transceivers. Now I have one transceiver per GPU, not six. Think about how many elements you remove from the data center. It's not just that I increased the resiliency of the data center because there are now fewer elements; I also reduced the time to operation. I can build the data center much faster. CPO brings great benefits. We started with the scale-out. Because again, it's like 10% of compute power, and I can reclaim that. It's huge. Reduce the number of components. There is a huge amount of benefit in bringing co-packaged optics into the scale-out infrastructure. Now, on the scale-up, as long as I can use copper, I'm going to continue to use copper. We increase the density.
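A back-of-the-envelope sketch of the transceiver and power arithmetic described above; the 6-versus-1 transceiver ratio and the roughly-10%-of-compute figure come from the conversation, while treating the saved power as directly convertible into extra GPUs is a simplifying assumption.

```python
# Illustrative arithmetic for a 100,000-GPU scale-out fabric.

gpus = 100_000
transceivers_per_gpu_pluggable = 6   # pluggable optics, per the discussion
transceivers_per_gpu_cpo = 1         # with co-packaged optics

print(gpus * transceivers_per_gpu_pluggable)  # 600,000 transceivers
print(gpus * transceivers_per_gpu_cpo)        # 100,000 transceivers with CPO

# If the optical network consumes ~10% as much power as the compute does,
# reclaiming most of it leaves room for roughly 10,000 more GPUs at the
# same facility power.
optics_share_of_compute_power = 0.10
print(int(gpus * optics_share_of_compute_power))  # ~10,000 extra GPUs
```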

We use copper because there is nothing better than copper. As long as we can use copper, we use copper. This is where we continue to use copper. We announced that we're going to 576 GPUs on copper with NVLink. Scale-out is multi-rack, over distance, on optics. This is where co-packaged optics will be a great thing.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Do you have an idea if we can get to 576 with copper? When do we have to cut over to optics?

Gilad Shainer
SVP of Networking, NVIDIA

It's a good question. Over the years, there have always been people saying, "Oh, this is going to be the last generation of that." And then, "No, this will be the last generation of that." Every time people say it's going to be the last generation, apparently there is another one. As long as we can pack, we'll pack.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Okay. Great. Let me see if there's a question we haven't covered yet. Here's one: if you're winning in the market, are you displacing Marvell and Broadcom solutions, or solution providers like Coherent? Are you replacing their designs?

Gilad Shainer
SVP of Networking, NVIDIA

The first answer is, not really. The reason is the following. First, there are many infrastructures in the data center. There are many areas that require and need transceivers. On the scale-out infrastructure, we're going to introduce co-packaged optics. The north-south network, for example, requires transceivers. We put transceivers on the NIC and so forth. Since the data centers are growing and the market is growing, there is enough for everyone. Therefore, we're not replacing anything. There are different infrastructures. There are infrastructure areas that require transceivers. That's one thing. The second thing is that we are working with that ecosystem of partners. They are part of our CPO infrastructure. They are contributing to what we're doing on CPO. They're bringing their elements. We're working with the ecosystem. For example, we announced that we're working with TSMC on packaging.

We're working with a lot of the vendors that you mentioned on lasers and optical arrays and the different elements that we need for connectivity. They are contributing to our CPO infrastructure as well. And they still have a good amount of transceivers to continue to supply and support. The data center market is growing. AI is going everywhere. There is enough for everyone.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Great. I think we're out of time now. In summary, you've got the scale-up and scale-out networks. You're tying together hundreds of thousands of GPUs to act as one big GPU. If people want to come into the NVIDIA network, it's open for them to do that. You think you've got the right solutions. Maybe you'd like to give a closing remark as well.

Gilad Shainer
SVP of Networking, NVIDIA

Yeah. I think you had short questions and I had long answers. Sorry for that. In the past, people's data center budget was focused on, "Let's buy as many servers as we can." If something was left over, we might connect them together. If something was left after that, we might do some storage and things like that. I think now folks realize that the infrastructure is key. It's not just network elements, buying a NIC and buying a switch. No. You're buying a spaceship. You're buying a supercomputer. You're buying something that needs to be fully synchronized with the data center. That infrastructure will determine what the data center will do. That infrastructure will determine whether those compute engines are just a server farm or an AI supercomputer for training or inferencing. It's a key element. Its importance will continue to increase.

We'll see innovative technologies coming into the infrastructure. It's something that keeps us excited. This is where the infrastructure is going. I think more people are interested in learning about it. I'm happy that we were able to talk today. I hope that we gave people a better understanding of the infrastructure that we've built.

Kevin Cassidy
Semiconductor Analyst, Rosenblatt Securities

Yeah. That's great. Thank you. Thanks, Stewart. Thank you, Gilad. Thank you very much.

Stewart Stecker
Senior Director of Investor Relations, NVIDIA

Thanks, Kevin. Thanks, everyone.
