
Piper Sandler Webinar: Networks for AI

Jun 28, 2023

Moderator

Okay. Good morning and good afternoon, everyone. A huge thank you to everyone who has signed on for this webinar. We realize that your time is extremely important, and we appreciate you spending some of that valuable time with us today. Also, a special warm welcome to the NVIDIA team: Colette, CFO, whom a lot of you know already, and Gilad, SVP of Networking. And a special thanks to Simona and Stuart, who made this all possible. Before we get into the networking piece, I did have one or two questions for Colette, based on some speculation that's percolating in the media about the U.S. and China.

Colette, maybe in light of last night's press articles regarding potential new export controls on AI chip shipments to China, what can you tell us about the potential impact to your business at NVIDIA?

Colette Kress
CFO, NVIDIA

Thank you so much. Thanks for the question. Let me see if I can provide a little bit of understanding. We are aware of reports that the U.S. Department of Commerce is considering further controls that may restrict exports of our A800 and our H800 products to China. However, given the strength of demand for our products worldwide, we do not anticipate that such additional restrictions, if adopted, would have an immediate material impact on our financial results.

Over the long term, restrictions prohibiting the sale of our data center GPUs to China, if implemented, would result in a permanent loss of opportunity for the U.S. industry to compete and lead in one of the world's largest markets, and would have an impact on our future business and financial results.

Moderator

Well, that's about as clear as it can be, Colette. Thank you for that. One of the questions we're getting a lot this morning is for some context around the percentage of your data center revenue that is driven by sales to China. Could you maybe speak to that?

Colette Kress
CFO, NVIDIA

Historically, there has been a bit of a range in what we've seen. We believe the contribution of sales to China has been in the range of approximately 20% to 25% of our data center revenue. Keep in mind, this includes all of our compute products and systems as well as our networking.

Moderator

Okay, great. I did have one last one that we've been getting a lot. You were able to pivot very quickly to the A800 in a matter of a few weeks, and people almost believe that was a software change. I just wanted to clarify: when you made that switch to the A800, was it a software change or a hardware change?

Colette Kress
CFO, NVIDIA

Our move to the A800 was not a software change. It was absolutely a hardware change that we made to create the A800, as well as to create the H800.

Moderator

Thank you, Colette. This was supposed to largely be a networking session, so with that, Simona or Aaron, if you want to turn the presentation over to Gilad, we can go ahead and get into the networking piece of things.

Gilad Shainer
SVP of Networking, NVIDIA

Yeah, thank you very much. Nice to be here. I'm Gilad, SVP of Networking. I came to NVIDIA through the acquisition of Mellanox, where I had been almost from the beginning, more than 20 years, and I've now been at NVIDIA for roughly three years. I started as a network designer; I did part of the design of some of the early InfiniBand devices, and over time moved to looking at the entire platform, the software capabilities and so forth. Excited to be here, and I think we can move to the next slide.

I think this is going to be the slide with the most words on it; we'll make sure the coming slides have far fewer. I need to say this statement: our presentation may contain forward-looking statements, so please do refer to our SEC filings for the risks and uncertainties facing our business. With that, we can move to the next slide. We're here to talk about networks for AI, obviously. When we talk about AI, it's not just a matter of a network, and not just a matter of designing components. You need to look at the entire system. What are we designing the network for?

Essentially, we want to design a fully accelerated, network-balanced compute system. It doesn't start at the NIC, and it doesn't start at the switch. It starts at the GPU level, at the memory I/O on the GPUs, all the way out to the network, and we need to start at the application framework, the application level, and look at everything in between. That's the great advantage of NVIDIA. If you look at NVIDIA as a networking company, it's the only networking entity that actually builds, designs, and uses full AI platforms. Looking from a software perspective, the platforms, the SDKs, the libraries, there is a ton of software that is part of the system.

There is the software that connects the network and the GPUs, for example, and then the full hardware capabilities: the accelerated compute, the DPUs, the switches, the NICs, and so forth. The ability to build and see the entire system gives you the unique opportunity to place the right data algorithms in the right place. There are data algorithms that you don't want to run on a GPU; you actually want to run them on the network, and that's what we call in-network computing. There are elements that traditionally you might do on the network, but if you can, you probably want to do them on a GPU because that is going to be much more effective.

Because we are able to move algorithms around, we're able to build a very effective system that delivers the highest levels of performance at very large scales. With that understanding, you don't just build a network; you actually build the entire data center. Let's talk about data centers, and then dive into the network. We all know the data center is the computer today. In the past, the CPU was the computer, and then the server became the computer. Now, the data center is the computer. We're putting down data centers to run workloads, to run AI applications. If you look at a data center, there is a big collection of GPUs, and the way that you connect those GPUs defines the data center.

The way that you connect the GPUs defines what you can do with them and what kinds of workloads you can run on them; it essentially defines the data center. We can look at different examples. First, let's look at the traditional cloud. A traditional cloud is a data center built to support many, many users and a variety of workloads, and most of those workloads are very small, a lot of them even single-node. Such a traditional cloud is connected today with a traditional Ethernet network, because traditional Ethernet is good enough for those kinds of platforms: lots of users, lots of applications, almost all of them very small. Now we're seeing the creation of new kinds of clouds.

A new class of clouds: clouds to support generative AI workloads, clouds to support AI workloads. AI workloads are different from the traditional workloads that run on traditional clouds. They are not just running on a single node; they need to run on multiple GPUs, across nodes. Even more important, when we talk about AI workloads, we're actually talking about distributed computing. Distributed computing is completely different from disaggregated computing, completely different from hyperscale. It's something new, and that something new in the cloud requires elements that support distributed computing. Now you start talking about latency, tail latency, and effective bandwidth. Those are completely different requirements. Traditional Ethernet is still fine for the north-south traffic of GenAI clouds.

You need user access, services, and control of that cloud, but you need a new class of Ethernet to support the new class of workloads in that new class of clouds. That's exactly what Spectrum-X is: the first Ethernet designed from the ground up for AI. We kept the Ethernet interface so people can utilize the Ethernet ecosystem, essentially combining both. If the data center's mission is to run massive large-scale workloads, massive large-scale applications, large language models, and complex training, that is a different kind of data center. We're not talking about many users and a variety of workloads.

We're talking about far fewer users, and workloads that are going to consume all the GPUs in that system for a single application. It's not a matter of how many GPUs you can connect; it matters how many GPUs a single workload can consume with that network. It's a completely different thing. If you want to support a workload that's going to consume tens of thousands or hundreds of thousands of GPUs, the only option, and that's why it has become the gold standard, is the combination of NVLink and InfiniBand. NVLink and InfiniBand are not the same architecture as Ethernet. They are a completely different kind of architecture, specifically designed for distributed computing and optimized over the years, and that's the network that can support running workloads across a large scale of GPUs.

This is where we have different kinds of networks; it's not one network fits all. This, by the way, shows you that both Ethernet and InfiniBand coexist, and they will continue to coexist, because in every system you build, you need a network for user access, control, and north-south traffic, and Ethernet is great for that. For AI infrastructure, you also need a network for east-west traffic, a network for the compute, for distributed computing. This is where NVLink and InfiniBand are the gold standard.

If we go to the next slide: in the coming slides, I'm going to refer to a couple of terms, and we want to make sure everyone understands them; it will make life easier. In particular, NCCL and SHARP. What's NCCL? NCCL is short for NVIDIA Collective Communications Library. It's a software SDK for AI communications across multiple GPUs. Essentially, this software framework supports mainly two multi-GPU collective communications: one is the reduction, or reduce, operation, and the other is all-to-all communication.
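For readers who want to see those two collectives concretely, here is a minimal sketch using PyTorch's torch.distributed with the NCCL backend. The script name and tensor sizes are illustrative assumptions, not from the talk.

```python
# Minimal sketch: the two NCCL collectives described above (reduce/all-reduce
# and all-to-all), driven through PyTorch's torch.distributed NCCL backend.
# Example launch (hypothetical file name): torchrun --nproc_per_node=<gpus> nccl_collectives.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")   # NCCL handles GPU-to-GPU/network transport
    rank = dist.get_rank()
    world = dist.get_world_size()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # All-reduce: every rank contributes a tensor, every rank gets the sum.
    grad = torch.ones(1 << 20, device="cuda") * rank
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)

    # All-to-all: every rank exchanges one equal-sized chunk with every other rank.
    send = torch.arange(world, device="cuda", dtype=torch.float32) + rank * world
    recv = torch.empty_like(send)
    dist.all_to_all_single(recv, send)

    if rank == 0:
        print("all-reduce checksum:", grad[0].item(), "all-to-all:", recv.tolist())
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Timing these two calls at increasing message sizes is essentially the measurement approach Gilad describes next.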

NCCL essentially connects the GPU side and the network side to support those two operations: reductions and all-to-all. If you want to measure networking performance for AI, NCCL is a great option to test with. You can look at your performance for NCCL reduction operations and your performance for NCCL all-to-all operations, for example, and that will demonstrate the impact of the network. It's a good way to measure the network. SHARP is a technology that is part of NVIDIA in-network computing, implemented in the InfiniBand switch ASICs.

It's not something that runs on a CPU. It's embedded within the switch ASIC, which enables the switch network to perform data reduction operations on data as it is being transferred within the data center. Previously, those data reduction operations, which are part of NCCL, were done on the host, and running them on the host takes a big toll on the host. This is part of NVIDIA's advantage: the ability to move algorithms from one side to the other and run them in the right place. Moving the data reduction operations to run on the switch network cuts the amount of data you need to send over the network in half. It's a huge impact.

It means that a 400 gigabit per second end-to-end InfiniBand network with SHARP is better than an 800 gigabit per second end-to-end network without SHARP. That's an amazing capability of InfiniBand, and it's one of the elements that makes InfiniBand the gold standard for AI factories. If you look at the impact of SHARP on NCCL, you can see it on the right: we're gaining 1.7x higher performance because of SHARP, because of running NCCL reductions on the switch network, compared to the best other network you can get. Even compared to the best theoretical performance on Ethernet, it's 1.7x.
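A back-of-the-envelope sketch of why in-network reduction roughly halves wire traffic, which is the intuition behind the 400G-with-SHARP versus 800G-without comparison. The buffer size and GPU count below are illustrative assumptions, and the model ignores protocol overheads; it is not NVIDIA's own accounting.

```python
# Simplified traffic model: host-based ring all-reduce vs switch-based reduction.
def ring_allreduce_bytes(buffer_bytes: float, n_gpus: int) -> float:
    # Ring all-reduce = reduce-scatter + all-gather, each moving (n-1)/n of the
    # buffer, so each GPU sends roughly 2*(n-1)/n * S bytes over its link.
    return 2 * (n_gpus - 1) / n_gpus * buffer_bytes

def in_network_allreduce_bytes(buffer_bytes: float, n_gpus: int) -> float:
    # SHARP-style reduction: each GPU sends its buffer up to the switch once
    # and receives the reduced result once, roughly S bytes sent per GPU.
    return buffer_bytes

S = 1e9   # 1 GB gradient buffer (illustrative)
N = 512   # participating GPUs (illustrative)
ring = ring_allreduce_bytes(S, N)
sharp_like = in_network_allreduce_bytes(S, N)
print(f"ring all-reduce, sent per GPU:      {ring / 1e9:.2f} GB")
print(f"in-network reduction, sent per GPU: {sharp_like / 1e9:.2f} GB")
print(f"traffic ratio: {ring / sharp_like:.2f}x")   # approaches 2x for large N
```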

This is one of the key things that makes InfiniBand the gold standard for large-scale AI, for AI factories. Now we can go back and talk about the different networks. We'll start in the cloud, and then go to Spectrum-X next. In the cloud itself, there are essentially two kinds of Ethernet networks, two worlds of Ethernet. There is a network doing the north-south connectivity: control access, user access, the cloud services. Cloud services and user access are usually loosely coupled applications, and you typically use TCP for that traffic. Jitter is fine there, because it's user access, and latency is not critical.

Predictable, constant bandwidth is not that important either. What's important is dealing with heterogeneous traffic: you need to handle multiple loosely coupled processes and enable them to run on the network. This is where traditional Ethernet has been used; this is what Ethernet was designed for. This is the cloud network that we all know about. Now, in the cloud, there is a second network, which is the compute network, what we call east-west. In traditional clouds, there is not much east-west traffic, because most of the workloads, most of the users, are running on even a single node.

Therefore, in a traditional cloud, you can take the same north-south network and use it for east-west traffic, and that's fine. That works. Now, if you want to host AI workloads, if you want to build clouds for generative AI applications, the east-west network needs to be something completely different, because now it needs to deal with distributed computing. Distributed computing is very sensitive to latency, but even more, it's sensitive to tail latency. In distributed computing, you run an application across many GPUs. Let's say I'm running on 500 GPUs.

If just one GPU's communication out of those 500 is late, only one, the entire workload will be delayed. Tail latency is a critical element for AI performance. It's almost irrelevant for north-south traffic, but it's critical for east-west. Effective bandwidth is important, and you want to provide constant performance; you cannot have changes in performance levels, and you need to deal with burstiness. The requirements for distributed computing are completely different, I would say the opposite of what you need in north-south. You cannot use traditional Ethernet for east-west; you need a different class of network to support the new needs of AI applications in the cloud. That's the reason we did Spectrum-X.
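A toy illustration of the straggler effect Gilad describes: in a synchronous collective, the step finishes only when the slowest GPU's communication finishes. The probabilities and latencies below are made-up illustrative numbers.

```python
# Toy model: average step time when each GPU has a small chance of a slow
# communication. One straggler out of 500 stalls the entire synchronous step.
import random

def step_time(n_gpus: int, base_ms: float, jitter_prob: float, jitter_ms: float) -> float:
    per_gpu = [base_ms + (jitter_ms if random.random() < jitter_prob else 0.0)
               for _ in range(n_gpus)]
    return max(per_gpu)   # a synchronous step waits for the last arrival

random.seed(0)
N, BASE, SLOW = 500, 1.0, 10.0     # 500 GPUs, 1 ms nominal, 10 ms when delayed
steps = 1000
avg = sum(step_time(N, BASE, 0.005, SLOW) for _ in range(steps)) / steps
# Even a 0.5% per-GPU chance of delay means most steps contain a straggler.
print(f"ideal step: {BASE:.1f} ms, average with tail latency: {avg:.2f} ms")
```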

That's the reason we designed Spectrum-X: we needed a new class of Ethernet for these kinds of infrastructures. Next slide, please. Now let's look at Spectrum-X. On the left side, you see the stats: 51.2 terabits per second, the number of ports, and so forth. On the right side, you can see a snapshot of the software that is being developed for Spectrum-X. There is a ton of software: the DOCA SDKs that run on the BlueField DPUs to provide network virtualization and the isolation between applications and infrastructure; the Spectrum SDK for the switches; Magnum IO, the SDK that includes the NCCL framework I mentioned before.

You have the operating systems that run on the Spectrum-4 switches, SONiC and Cumulus, and other elements like that. There is a ton of software with it. Now, Spectrum-X was designed from the ground up for AI. We designed new capabilities for Ethernet, and what's interesting is the combination of those elements. I'm going to go through them. First, lossless Ethernet. You don't want to drop packets: dropping packets means you're creating jitter, and jitter reduces AI performance. On top of lossless Ethernet, you want to support adaptive routing, and not flow-by-flow adaptive routing.

We do see flow-level adaptive routing in traditional Ethernet switches. Flow-by-flow means that once you start a stream of data, you don't change the path of that stream before it ends. That's not good for AI. In AI, you want fine-grained, packet-by-packet adaptive routing, and that's something enabled by being lossless. Even more, you want to do packet-by-packet adaptive routing on a lossless network with shallow buffers, not deep buffers. There are Ethernet options out there, for example, that are sometimes referred to as a fabric rather than Ethernet, because internally they don't actually run Ethernet. Those depend on deep buffers, big buffers in the switches acting as big shock absorbers.
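To make the flow-by-flow versus packet-by-packet distinction concrete, here is a simplified sketch of load balancing across four equal-cost uplinks. It is illustrative only, not vendor code; real per-packet designs must also restore packet order at the receiving end.

```python
# Per-flow hashing vs per-packet spraying across 4 equal-cost uplinks.
# With a handful of large AI flows, a flow hash can pile several flows onto
# one link while others sit idle; per-packet spraying keeps links balanced.
from collections import Counter

LINKS = 4
flows = [("gpu0->gpu7", 1000), ("gpu1->gpu6", 1000),
         ("gpu2->gpu5", 1000), ("gpu3->gpu4", 1000)]   # (flow id, packet count)

# Flow-by-flow: every packet of a flow follows the link chosen by a hash.
per_flow = Counter()
for flow_id, pkts in flows:
    per_flow[hash(flow_id) % LINKS] += pkts

# Packet-by-packet: each packet independently takes the least-loaded link.
per_packet = Counter({link: 0 for link in range(LINKS)})
for _, pkts in flows:
    for _ in range(pkts):
        link = min(per_packet, key=per_packet.get)
        per_packet[link] += 1

print("per-flow load  :", dict(per_flow))     # often unbalanced, some links unused
print("per-packet load:", dict(per_packet))   # ~1000 packets on every link
```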

If there is congestion, they can hold data for a while. But deep buffers mean long tail latency, and long tail latency is not good for AI workloads. You don't want deep buffers. The IP here is combining lossless Ethernet, fine-grained adaptive routing, and shallow buffers. That combination does not exist in traditional Ethernet. This is one part of the Spectrum-X advantage. The second part is congestion control. You need to eliminate hotspots. In Spectrum-X, we designed congestion control that is based first on telemetry information, but we also have unique capabilities in the network to identify the latest latency changes, so you can react to hotspots before they impact application performance.
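As a rough illustration of telemetry-driven reaction to latency changes, here is a generic delay-based congestion control sketch. This is not the Spectrum-X algorithm; the target values, gains, and samples are assumptions used only to show the common pattern of shrinking the sending window when measured delay drifts above a target.

```python
# Generic delay-based congestion control sketch (illustrative, not Spectrum-X):
# shrink the sending window when telemetry shows latency above a target,
# probe gently for more bandwidth when latency is healthy.
def adjust_window(window: float, measured_rtt_us: float,
                  target_rtt_us: float = 20.0,
                  gain: float = 0.8, additive_step: float = 1.0) -> float:
    if measured_rtt_us > target_rtt_us:
        # Multiplicative decrease proportional to how far latency has drifted.
        return max(1.0, window * (1 - gain * (1 - target_rtt_us / measured_rtt_us)))
    # Latency healthy: additive increase.
    return window + additive_step

window = 64.0
for rtt in [18, 19, 25, 40, 35, 22, 19, 18]:   # telemetry samples, microseconds
    window = adjust_window(window, rtt)
    print(f"rtt={rtt:3d} us -> window={window:6.1f} packets")
```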

This is important because it is the key to providing traffic isolation, to eliminating noise, to making sure noise cannot impact AI performance. In a cloud, you run many, many workloads, and you want to make sure that the small-scale workloads will not impact the large-scale workloads. They're running on the same network, but you want to isolate the noise from the small workloads so it will not impact the AI workloads. That's exactly what we're doing with telemetry-based congestion control and the capabilities to identify latency changes and hotspots before they can have a negative impact. What does that give us?

That gives us 1.6x higher AI fabric performance over traditional Ethernet. We're talking about not just 95% effective bandwidth at scale and under load, but keeping that performance constant, predictable performance, even when you have a lot of other workloads running in the same environment, and we did it without giving up the security and the virtualized network; everything is part of that. Spectrum-X brings the speeds and feeds you need for AI, but it does it with an Ethernet interface. People can leverage the Ethernet ecosystem for services that were built for Ethernet, for cloud services and things of that sort. Now they actually have an Ethernet that was designed for AI. Next slide, please.

As we look to support larger-scale AI workloads, this is where we go to InfiniBand. On the left side, you can see the latest InfiniBand generation. One thing you need to understand is that InfiniBand is designed on a different kind of architecture than Ethernet. Ethernet was built for wide area networks, and over time, within the data center, more and more protocols and algorithms were layered on top of it: PFC, for example, or BGP. Ethernet is a very complicated protocol, and when you build an Ethernet network, you actually need to choose between features and performance.

That's why in Ethernet there is no one switch that fits all. You see a variety of switches coming from different kinds of vendors, and the reason is that no one switch fits all. There are switches with shallow buffers and more ports, but without much performance for distributed computing, supporting mostly cloud interfaces. There are switches with deep buffers to support telco and service applications, but those come with tail latency issues, a reduced number of ports, and so forth. You need to choose between features, performance, and other trade-offs.

In Spectrum-X, we designed in the right elements that you need for AI and created things that don't exist in traditional Ethernet. InfiniBand, when you look at it, is a different kind of architecture; it does not use the same architecture. InfiniBand was designed from the beginning to support distributed computing. For that reason, the InfiniBand protocol is very simple and lightweight. Because it's so simple, the concept of leaf and spine has no real meaning in InfiniBand. Thinking in Ethernet terms, there is a leaf and a spine, and you try to build two levels of network, two levels of switches, and not go beyond that. That constraint doesn't exist in InfiniBand.

With InfiniBand, you can use as many switch layers as you want. Most of the large-scale systems out there use three levels of switches; some even use four, and if you want to use five, use five. There is no performance penalty and no issues around that. You can build any size of system you want. It's like designing a Formula race car: how many seats am I going to put in that car? Who cares? It's a different kind of design. If you look at three levels of switches with InfiniBand, which is what most systems use today, that can go all the way to 65,000 GPUs.

If you go to four levels, and we already have multiple four-level systems out there, you can go to two million GPUs in an InfiniBand network. If you want to go to five, go to five. There is no limit on how many GPUs you can connect together, and even more, we haven't seen a limit on how many GPUs you can use for a single workload. That's the important part. There is no limit there, and that's why InfiniBand is the gold standard for large-scale AI. InfiniBand pioneered RDMA, obviously, so there are a lot of elements in RDMA, but InfiniBand also pioneered and leads with in-network computing. We saw the impact of SHARP: SHARP gives you 1.7x on NCCL compared to the best Ethernet network you can build.
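The scaling numbers above fall out of standard fat-tree arithmetic. The sketch below assumes a fully non-blocking fat tree built from 64-port switches; the 64-port radix is our assumption (it matches current-generation InfiniBand switch ASICs), not a figure quoted in the talk.

```python
# Hedged worked example: maximum end ports of a non-blocking fat tree with
# radix-k switches and L switch levels is 2 * (k/2)**L.
RADIX = 64   # assumed switch radix

def max_end_ports(levels: int, radix: int = RADIX) -> int:
    return 2 * (radix // 2) ** levels

for levels in (2, 3, 4, 5):
    print(f"{levels} switch levels -> up to {max_end_ports(levels):,} GPUs")
# 3 levels -> 65,536 (the ~65,000 figure); 4 levels -> 2,097,152 (~2 million)
```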

It's a pure software-defined network. It was designed as an SDN before people knew what SDN was, which means you can control the entire routing from a single place. You can optimize the routing for the workloads, build different kinds of network topologies, and deal with changes in the network quickly: ports go down, ports come up, and you can reconfigure it very quickly. There is a huge amount of benefit in a pure software-defined network. What does that give us? If we're looking at total performance, it's more than 2x, and that's being gracious. As many GPUs as you want.

It's a network with the lowest latency, again at large scale and under load, and extremely low latency. We know the impact of SHARP on NCCL operations. Nearly 100% effective bandwidth at scale. It's an amazing network, and it has been developed over more than 20 years; every generation brings new capabilities. The upcoming generation, Quantum-3, the things we are planning there are completely amazing. Those will take InfiniBand to the next level compared to anything else. Now, with InfiniBand, there is also a ton of software, right?

We have the SDKs: Magnum IO and NCCL are two, obviously, plus management of the network and the ability to simulate everything. There is a ton of software as well. That's why it's important to do end-to-end design, and that's what we're doing. Next slide, please. This is where we look at the impact of the network. The network is a very small part of the data center expense, but it has a huge impact on AI performance, and essentially, the network pays for itself. InfiniBand offers the highest scalability out there; again, you can build any size of system you want with it.

Three levels, four levels, five levels: there is effectively no limit to the number of GPUs you can connect together. If we look at performance, we took NCCL here again, because NCCL is a good indication of network performance for AI. First, we see Spectrum-X. It's a completely different design for Ethernet, and it enables the Ethernet ecosystem. If you want the Ethernet ecosystem and you need performance for AI, it's Spectrum-X. If you want to build a system that's going to go to scale, if you want the highest level of performance, you can also bring InfiniBand to the cloud. There is no reason not to.

If you look at InfiniBand, the gain on top of that is kind of amazing. If you look at the impact on total AI performance, the network is essentially free; it completely pays for itself. Even if someone were to offer me traditional Ethernet completely for free, it's not going to be good enough. I'd rather pay and get InfiniBand, because I'm going to get much more out of it. Essentially, I'm building an AI infrastructure, and the network there is essentially free. With that, to the next slide. I think this is the last one. If you look at our networking revenues, the revenue has more than doubled since the Mellanox acquisition.

Within that, you can see the breakdown between InfiniBand, Ethernet, and other. InfiniBand has more than tripled. It's growing very fast and will continue to grow. Then Spectrum-X: Spectrum-X is a new class of Ethernet that is needed for a new class of clouds. Therefore, Spectrum-X will boost the cloud AI network market and will increase Ethernet revenues as we move forward. Overall, we believe that every data center will become an accelerated data center in the future. There will not be data centers that are not accelerated.

We used to be in a situation where we got 2x performance every two years just by doing nothing. That doesn't work anymore. If you want to keep increasing capabilities, it's accelerated computing, and therefore, every data center will become an accelerated data center. Every server will have a DPU; every data center will have those elements. As such, we are talking about a $60 billion market opportunity for NVIDIA on the networking side. With that, thank you for listening. It took some time, and I'm happy to answer questions.

Moderator

Hey, Gilad, thank you so much. That was extremely informative and actually answered a whole bunch of the questions I had. One concern I do get from investors is that they already come to NVIDIA for compute, and now they come to NVIDIA for networking as well, based on the merits of what you just talked about. We get a lot of questions along the lines of, we're basically tied to NVIDIA. People in the semiconductor business and in IT, as you know, always want options. Could you maybe talk about whether there is a workaround to that? Are you the only ones that make InfiniBand, or is it farmed out to other places that do it for you?

Gilad Shainer
SVP of Networking, NVIDIA

Yeah, well, InfiniBand, it's a standard technology, right?

Moderator

Yeah.

Gilad Shainer
SVP of Networking, NVIDIA

It's not a proprietary technology; it's a standard, the same as Ethernet is a standard in that sense. Companies can definitely create InfiniBand devices, and there are some companies that build InfiniBand devices for different kinds of applications: there's a company building devices for long-haul connectivity, there are InfiniBand elements for FPGAs, and so forth. InfiniBand is open, so everyone can use it. There's always an alternative for networking: if you don't want to use InfiniBand, use Ethernet; if you don't want to use Ethernet, you can use InfiniBand. You can always choose between them.

Moderator

Okay.

Gilad Shainer
SVP of Networking, NVIDIA

I understand the concern, but when you look at AI, AI requires data center scale, and if you look at that, you want to have the right elements inside. As I said before, we used to get 2x performance every two years; now that's not the case. Therefore, we're going to see more and more specialized technologies, more usage of accelerated computing, and more use of technologies that enable you to achieve your goals. Optimizing workload performance cannot be done at the level of discrete compute or networking devices; you want a full-stack approach.

What's important, I would say, is time to market, time to solution. A customer considers total cost of ownership, performance, power, reliability, and the time to build and deploy large-scale architectures, and this is what NVIDIA delivers. We build the full platforms, we do a huge amount of optimization, and our customers can take it as a whole. If customers want to take pieces, they're welcome to take pieces of it and mix and match with other things that exist in the market.

Moderator

Great. Gilad, one more for you. You are sort of the gold standard as a company in accelerated data centers. When it comes to InfiniBand network adoption, have you noticed a big difference in metrics for training versus inferencing, either in terms of ports or any other metric you can talk about? Investors generally feel like inferencing is coming and is going to be a huge opportunity, so I wanted to address that.

Gilad Shainer
SVP of Networking, NVIDIA

Yes, it's definitely a good question. Training requires very large-scale clusters that are tightly coupled and optimized for massive data and compute. Inferencing typically requires much smaller-scale clusters. But what's happening now is that generative AI is becoming mainstream, and therefore the number of separate inferencing jobs will dramatically increase. Inferencing will require a larger number of accelerated servers and the flexibility to support that. We're probably going to see people deploy systems that they want to use for both training and inferencing, and in such a case, InfiniBand is obviously a great option.

Now, if someone is just doing inferencing and doesn't need to go to the large scales, of course they can use Spectrum-X for that. But we're probably going to see systems used for both; it makes sense to build a system for both, and InfiniBand is a good option for them.

Moderator

Great. I have one more for you, Gilad. There is a perception in the investment community, even among people who know generative AI very well, that InfiniBand only works with NVIDIA's GPUs. Is that accurate? Listening to you talk, it seems like that's not the case, but I wanted to ask since you're the expert on the topic.

Gilad Shainer
SVP of Networking, NVIDIA

InfiniBand is open to be used with any other accelerated or non-accelerated compute platforms. At NVIDIA, we develop the full-stack platform, and our customers can choose to take it as a whole, copying our design as a whole, or they can take pieces of it. They can take our GPUs and use them with other networks; they can take our network and use it with other compute elements. It's free to use in any platform; it's definitely not tied. Now, obviously, there are a lot of benefits to going end-to-end.

We invest a lot of effort so our customers will have much faster time to compute, much faster time to solution, much faster time to build their systems. When you build an AI supercomputer, you don't want to spend nine months building it; that's nine months out of the lifetime of a very expensive system. We're doing what we're doing so they can build systems in weeks, not months, and get the full performance out of them. Again, people can choose to take our components; they can use our network with any other compute element and so forth.

Moderator

Wonderful. I know you're on the road and I want to be very mindful of your time; we've got two minutes, so I'll just ask one final question. In a typical setup, as you go and deploy an accelerated AI data center, do you typically find the entire data center to be either-or? Is it all InfiniBand, or all Ethernet, or is there a possibility to mix and match some of your offerings depending on what the lines are supposed to do?

Gilad Shainer
SVP of Networking, NVIDIA

Yeah. First, obviously, there are all-Ethernet systems out there; we all know that. In such systems, there are different kinds of Ethernet, and we created Spectrum-X to bring the right class of Ethernet to the AI compute fabric. You can definitely have Ethernet-only systems. If you want to build a GenAI cloud and leverage the Ethernet ecosystem for some of the elements, so you don't need to develop all the software yourself, Spectrum-X is a good answer. It gives you the speeds and feeds needed for AI and the ecosystem friendliness of Ethernet.

Those systems are definitely going to exist. On the other side, when we're talking about large scale, we have systems that combine both InfiniBand and Ethernet. It's not one versus the other; they completely coexist. In a large AI factory, a large system running large language models or doing training, you have Ethernet for the north-south access. InfiniBand was not built for user access; that's not its purpose. For that interface, we have Ethernet.

For the compute fabric, once you want to connect a large number of GPUs, thousands to tens of thousands to hundreds of thousands of GPUs in a single workload, InfiniBand gives you the connectivity there. If you look at the systems we design, the systems we recommend, looking at what we did and copying it lets you leverage everything we designed. Our systems include both InfiniBand and Ethernet; they completely coexist. It's not that one replaces the other. You have both networks, and each one has its own purpose.

Moderator

With that, we have come to the end of this presentation. Gilad, I cannot thank you enough for your time, particularly as I know you're on the road. Colette, thank you again for your time; I appreciate your comments and thoughts earlier on. Simona, Stuart, thank you for your help in putting this together. With that, until next time. Thank you.

Gilad Shainer
SVP of Networking, NVIDIA

Thank you very much.
