
GTC Financial Analyst Q&A

Mar 19, 2025

Jensen Huang
Founder, President, and CEO, NVIDIA

Sorry, I was like—I was on TV. I'm just kidding.

Toshiya Hari
VP of Investor Relations and Strategic Finance, NVIDIA

Thank you.

Jensen Huang
Founder, President, and CEO, NVIDIA

Chuck and I were on TV. Chuck Robbins, Cisco.

Colette Kress
CFO, NVIDIA

Oh, good. Also on Cramer this morning. Just a few interviews, right?

Jensen Huang
Founder, President, and CEO, NVIDIA

Yes. That was fun.

Colette Kress
CFO, NVIDIA

Okay. Great to see everybody, both yesterday as well as last night at our cocktail hour. This is an opportunity to speak with Jensen and really talk about what our announcements at GTC mean for our investor community. I kindly remind you to look at our disclosure statement, the fine print in front of us. I also want to make one announcement for you all. Toshiya Hari is here with you as our new lead of investor relations. He started just about yesterday; he's now been here a good 48 hours. Please make sure you ask him a ton of questions. We're really pleased to bring him on board here in California after his many, many years in semis at Goldman.

Truly, truly excited to have him as part of the team. With that, I'm going to turn the mic over to Jensen for some opening remarks.

Jensen Huang
Founder, President, and CEO, NVIDIA

Good morning. Great to see all of you. Let's see. We announced a whole lot of stuff yesterday. Let me put it all in perspective. The first is, as you know, everybody's expecting us to build AI infrastructure for cloud. That, I think everybody knows. The good news is that the market's understanding of R1 was completely wrong. That's the good news. The reason for that is because reasoning is a fantastic new breakthrough. Reasoning yields better answers, which makes AI more useful, solving more problems, which expands the reach of AI. Of course, from a technology perspective, it requires a lot more computation. The computation demand for reasoning AIs is much, much higher than the computation demand of one-shot pre-trained AI. I think everybody now has a better understanding of that. That's number one.

The first thing is inference: Blackwell is incredibly good at it, and we're building out AI clouds. The investments of all the AI clouds continue to be very, very high. The demand for computing continues to be extremely high. I think that is the first part. The second part, the part that I think people are starting to learn about, is what we announced yesterday, and I could have done a better job explaining it, so I'm going to do it again. In order to bring AI to the world's enterprise, we have to first recognize that AI has reinvented the entire computing stack. If so, all of the data centers and all the computers in the world's enterprises are obviously out of date.

Just as we've been re-modernizing the world's AI clouds, all the world's clouds for AI, it's sensible that we're going to have to re-rack, if you will, reinstall, modernize, whatever word you like, the world's enterprise IT. Doing so is not just about a computer; you have to reinvent computing, networking, and storage. I didn't give it very much time yesterday because we had so much content, but that part, enterprise IT, represents about half of the world's CapEx, and that half needs to be reinvented. Our journey begins now: our partnerships with Dell and HPE, and this morning, the reason why Chuck Robbins and I were on CNBC together, is to talk about this reinvention of enterprise IT. Cisco is going to be an NVIDIA networking partner.

I announced yesterday that basically the entire world's storage companies have signed on to be NVIDIA storage technology and storage platform partners. Of course, as you know, computing is an area that we've been working on for a long time, including building some new modular systems that are much more enterprise-friendly. We announced those yesterday: DGX Spark, DGX Station, and all of the different Blackwell systems that are coming from the OEMs. Okay. That's the second. Now we're building AI infrastructure not just for cloud; we're building AI infrastructure for the world's enterprise IT. The third is robotics. When we talk about robotics, people think robots. This is the great thing. It's fantastic. There's nothing wrong with that. The world is tens of millions of workers short. We need lots and lots of robots.

However, don't forget the business opportunity is well upstream of the robot. Before you have a robot, you have to create the AI for the robot. Before you have a chatbot, you have to create the AI for the chatbot. The chatbot is just the very end of it. In order for us to enable the world's robotics industry, upstream there is a bunch of AI infrastructure we have to go create to teach the robot how to be a robot. Now, teaching a robot how to be a robot is much harder than even chatbots, for obvious reasons. It has to manipulate physical things, and it has to understand the world physically. We have to invent new technologies for that. The amount of data you have to train with is gigantic. It's not words, it's video. It's not just words and numbers.

It's video and physical interactions, cause and effect, physics. That new adventure we've been on for several years, and now it's starting to grow quite fast. Our robotics business includes self-driving cars, humanoid robots, robotic factories, robotic warehouses, lots and lots of robotic things. That business is already many billions of dollars; the automotive part alone is at least $5 billion today, and it's growing quite fast. Yesterday, we also announced a big partnership with GM, who's going to be working with us across all of these different areas. We now have three AI infrastructure focuses, if you will: cloud data centers, enterprise IT, and robotic systems. Those are the three buckets I talked about yesterday.

Foundationally, of course, we spoke about the different parts of the technology: pre-training and how it works, how pre-training works for reasoning AIs, how pre-training works for robotic AI, how reasoning inference impacts computing, and therefore directly how it impacts our business. That answers a very big question that a lot of people seem to have, and I never understood why, which is: how important is inference to NVIDIA? Every time you interact with a chatbot, every time you interact with AI on your phone, you're talking to NVIDIA GPUs in the cloud. We're doing inference. The vast majority of the world's inference is on NVIDIA today. It is an extremely hard problem. Inferencing is a very difficult computing problem, especially at the scale that we do it. So I spoke about inferencing at the technology level.

At the industrial level, at the business level, I spoke about AI infrastructure for cloud, AI infrastructure for enterprise IT, and AI infrastructure for robotics. Okay. I'll just leave it at that. Thank you.

Ben Reitzes
Analyst, Melius Research

Hi. It's Ben Reitzes with Melius Research. Thank you for having us. This is obviously a lot of fun. I said that at the last meeting. Jensen, I was thinking about this question quite a bit. I want to ask you a big question, a big picture question about TAM. You talked about your share of data center spend last year or data center spend as your TAM, and your share was about 25%-30% last year. You used the Dell'Oro forecast. They go to a trillion dollars, but that's about a 20% CAGR. The street here has your data center growing about 60%, but then slowing to the rate of 20% thereafter. I'm just thinking like you're in all these areas that are growing faster than the market. You're at a 25%-ish share of the overall data center spend.

I'm thinking of Dell'Oro. I doubt they have robotics and AV data center infrastructure in there. My question is, with that backdrop, why wouldn't your share of data center spend go up over a three- to five-year period versus this 25%? And isn't the actual CAGR higher than the Dell'Oro number you put up there? It doesn't seem like autonomous and robots are in there. Why wouldn't your share go up if you're in all the right areas?
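[For reference, here is the arithmetic embedded in that question as a quick sketch. The starting TAM dollar figure is an illustrative placeholder; the share, the 20% CAGR, and the 60%-then-20% growth path are the figures quoted above.]

```python
# Rough sketch of the question's math. The $400B starting TAM is a
# placeholder; share and growth rates are the figures quoted above.
tam = 400.0          # $B, assumed data center spend today
nvda = 0.27 * tam    # ~27% share, midpoint of the 25%-30% quoted

for year in range(1, 6):
    tam *= 1.20                          # Dell'Oro-style 20% CAGR
    nvda *= 1.60 if year == 1 else 1.20  # street model: 60%, then 20%
    print(f"year {year}: implied NVDA share = {nvda / tam:.0%}")

# One year of 60% growth against a 20% market steps share up to ~36%,
# after which matching the market holds it flat. The question is why
# share would plateau if robotics and AV infrastructure are not even
# counted in the forecast.
```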

Jensen Huang
Founder, President, and CEO, NVIDIA

Excellent question. Yesterday, as I was explaining it, remember I said two things. I said one dynamic is that the world is moving from the general-purpose computing platform to the GPU-accelerated computing platform. Whatever the world's CapEx turns out to be, it is very, very certain that our percentage of it is going to be much higher going forward because of that platform shift. It used to be 100% general-purpose computing, 0% accelerated computing, but it is well known now that out of a trillion dollars, the vast majority of it would be for accelerated computing. That's number one. Whatever forecast people have for data centers, I think NVIDIA's proportion of it is going to be quite large. We're not just building a chip. We're building networking and switches.

We're basically building system components for the world's enterprise, for the world's data center. That's number one. The second thing that I said that nobody's got right, none of these forecasts has it, is this concept of AI factories. Are you guys following me? It's not a multipurpose data center. It's a single-function AI factory. These GPU clouds and Stargates and so on and so forth, these AI factories, are not accounted for. Do you guys understand? No, because nobody knows how to go do that. These multi-hundred-billion-dollar CapEx projects that are coming online are not part of anybody's data center forecast. Are you guys following me? How could they possibly know about these things? We're inventing them as we speak. I think there are two ideas here.

I tried to, and I did not want to be too prescriptive in doing so, because nobody knows exactly. Here are some fundamental things that I do know. I fundamentally believe that accelerated computing is the way forward. If you are building a data center and it is not filled with accelerated computing systems, you are building it wrong. Our partnerships with Cisco and Dell and HPE and all this enterprise IT gear, that is what that is about. That is number one. Number two, AI factories. GM is going to have AI factories. Tesla already has AI factories. Just as they have car factories, they have AI factories to power those cars. Every company that has factories will have an AI factory with it. Every company that has warehouses will have AI factories with it.

Every company that has stores will have AI factories for those stores, to build the intelligence to operate the stores. And that store, of course, as you know, is also an e-tail store. There's an AI that runs Amazon's e-tail store. There's an AI that runs Walmart's digital store. In the future, there's going to be AI that also runs the physical store. A lot of the systems will be AI, and the physical systems and robotics inside will be AI-driven. I think those parts of the world are just simply not quantified. Does that make sense? No analyst has figured that out. Not yet. It will be common sense here pretty soon.

There's no question in my mind that out of $120 trillion of global industries, a very large part of that, trillions of dollars of it, will be AI factories. There's no question in my mind now. Just as manufacturing energy is an entire industry today, manufacturing intelligence will be an entire industry. We are, of course, the factories of that. That layer has not been properly quantified, and it is the largest layer of what we do, by far. We are also still going to, in the process, revolutionize how data centers are built, and they will be 100% accelerated. I am absolutely certain that before the end of the decade, 100% of the world's data centers will be accelerated.

CJ Muse
Analyst, Cantor

Yeah. Good morning. CJ Muse with Cantor. Thank you for hosting this morning. One of the key messages yesterday was that inference scaling laws are actually accelerating, led by test-time scaling for enhanced reasoning. So my question: how does the work NVIDIA is doing to push the inference Pareto frontier impact how you think about the relative sizing of the inference market and your competitive positioning? And then could you discuss Dynamo briefly? Is there a way to isolate the productivity gains from this optimization software that we should be thinking about? Thanks so much.

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah. Really appreciate it. Going backwards: as you know, people say NVIDIA's position is so strong because our software stack is so strong, including our training components within it. It's not just one thing, because training is a distributed problem. In the computing stack, our software runs on the GPU. It runs on the CPU. It runs in the NICs. It runs on the switches. It runs all over the place. You have to figure out which framework, which library, which systems to integrate it all into. Because no one company in the world develops the software holistically and totally for training, we have to break it up into a whole lot of parts and integrate it into a whole lot of systems, which is also the reason why our capability is so respected: it seems like everywhere you look, there's another piece of NVIDIA software.

That is true. Now, inference is software scaling at a level that nobody has ever done before. Inference is enormously complicated. The reason, as I was trying to explain yesterday to give it some texture, some color, is that this is the world's extreme computing problem, because everything matters. It is a supercomputing problem where everything matters, and that is quite rare. A supercomputing application is used by one person; it is not used by millions of people at the same time. Here, everybody cares. The answer matters. The amount of software we have to develop for that is quite large. We put it under the umbrella called Dynamo. It has a lot of pieces of technology in it. The benefit of Dynamo, ultimately: without it, just as without all of NVIDIA's training software, how do you even do it?

The first part is it enables you to do it at scale. One of the really innovative AI companies is a company called Perplexity. They find incredible value in Dynamo and the work that we do together. The reason for that is because they're serving AI at very large scale. Anyway, that's Dynamo. The measurable benefit is hard to pin to an exact X factor, but it's probably less than 10x between doing a good job and not doing a good job. And if you don't have it, you can't do it at all. It's essential; it's just hard to put an X factor on it. With respect to inference, here's the thing. Reasoning generates a whole lot more tokens. The way reasoning works, the AI is just talking to itself.

Do you guys understand what I'm talking about? It's thinking. You're just talking to yourself. On the one hand, the reason why we used to think that 20 tokens per second was good enough for chatbots is because a human can't read much faster anyway. What's the point of inferencing much faster than 20 tokens per second if the human on the other side can't read it? When thinking, though, the AI can think incredibly fast. We are now certain that we want the performance of inference to be extremely high so that the AI can think some of it out loud and most of it to itself.

We now know that's likely the way AI is going to work: internal thinking, out-loud thinking, then final answer production, and then the interactive part, which is getting more color, getting more explanation, and so on. The thinking part is no longer one shot. Not thinking is just a knee-jerk reaction of an answer. The number of thinking tokens versus one-shot tokens, it's at least 100 times. The point of putting a number on it was almost just to put an arrow on it, because I know there's no way you could put an exact answer on it. And yet, as you know, people love simple answers. You guys know that I always have a hard time with that because it always seems more complicated in my head.

I think 100X is easily so, most likely nearly all the time 100X. Now, here's the part that is the X factor on top of that. This is the part that people don't consider. You now have to generate a lot more tokens, but you have to generate it way faster. The reason for that is nobody wants to wait until you're done thinking. It's still an internet service. The quality of the service and the engagement of the service has something to do with the response time of the service, just like search. If it takes too long, and it's measurably so, if it takes too long, people just give up on it. They don't want to come back. They won't use it.

Now we have to take this problem where we're using a larger model because it's more reasoning, more capable, it also produces a lot more tokens, and we have to do it a lot faster. How much more computation do you need? Which is the reason why Blackwell came just in time: Grace Blackwell with NVLink 72, FP4, better quantization, the fast memory on Grace. Every single part of the architecture, people now go, "Man, how did you guys realize all that? How did you guys reason through all of that? And how did you guys get all that ready?" Grace Blackwell, NVLink 72, came just in time. I still think the 40X that Grace Blackwell provides is a big boost over Hopper, but I also think, unfortunately, it's many orders of magnitude short. That's a good thing. We should be, and hopefully will be, chasing the technology for a decade, if not more.
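[A back-of-the-envelope sketch of the multiplier being described here. The token counts are illustrative assumptions; only the 20 tokens per second reading rate and the 100x figure come from the conversation.]

```python
# Illustrative numbers: only the 20 tok/s reading rate and the 100x
# token multiplier come from the discussion above.
one_shot_tokens = 200        # assumed length of a knee-jerk chatbot answer
reading_rate = 20            # tokens/s, "good enough" for a human reader

reasoning_tokens = 100 * one_shot_tokens   # "at least 100 times"

# Users won't wait longer than before, so the thinking tokens must fit
# in roughly the same interactive window as the old one-shot answer.
window_s = one_shot_tokens / reading_rate          # ~10 s
required_rate = reasoning_tokens / window_s        # tokens/s per user

print(f"one-shot: {one_shot_tokens} tokens at {reading_rate} tok/s")
print(f"reasoning: {reasoning_tokens} tokens -> {required_rate:,.0f} tok/s")

# ~100x the tokens at ~100x the per-user rate: the two multipliers
# compound, which is the demand Grace Blackwell / NVLink 72 targets.
```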

Stacy Rasgon
Senior Analyst, Bernstein Research

Great. Thanks. Stacy Rasgon, Bernstein Research. We wondered if Blackwell and Hopper are both going to grow in the quarter. I'm just kidding.

Jensen Huang
Founder, President, and CEO, NVIDIA

Listen, listen, listen, listen. Listen, that joke, that joke was kind of like my pin particle joke. Okay. Yeah. That's the kind of stuff that's funny only to your closest friends.

Stacy Rasgon
Senior Analyst, Bernstein Research

Probably true. Hopefully, they're all here. What I did want to ask about.

Jensen Huang
Founder, President, and CEO, NVIDIA

Stacy, not one person on the internet knows what we're talking about. Everybody in this room is cracking up. That's what I'm talking about.

Stacy Rasgon
Senior Analyst, Bernstein Research

I liked your pin particle joke. What I did want to ask about was the chart you showed yesterday, though, the Hopper traction versus the Blackwell traction. It showed, I think, 1.3 million Hopper shipments for calendar 2024, and it talked about 3.6 million Blackwell GPUs this year. I guess that's 1.8 million chips because it's a two-for-one. How do I interpret that chart? Because 1.8 million Blackwell chips would be, I don't know, $50 billion to $70 billion worth. Traction seems great, but that seems like a lot. Can you maybe just describe what that chart was actually trying to tell us, how to interpret it, and what the read is for the rest of the year?
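[The arithmetic behind this question, as a sketch. The die and package counts are the figures quoted; the per-package price range is purely an assumption for illustration, not a disclosed NVIDIA price.]

```python
# Figures quoted in the question; the ASP range is hypothetical.
blackwell_dies = 3_600_000
dies_per_package = 2                            # "two-for-one"
packages = blackwell_dies // dies_per_package   # 1.8 million

assumed_asp = (30_000, 40_000)                  # USD per package, a guess
low = packages * assumed_asp[0] / 1e9
high = packages * assumed_asp[1] / 1e9
print(f"{packages / 1e6:.1f}M packages -> ${low:.0f}B to ${high:.0f}B")
# ~$54B-$72B at the assumed ASPs, which is how one lands on the
# "$50 billion to $70 billion" range in the question.
```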

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah. I really appreciate that. This is one of those things where, Stacy, I was arguing with myself whether to do it or not. Here's the question I was hoping to answer. Everybody's going, "R1 came. The amount of compute that you need has gone to zero. The CSPs are going to cut back." There were rumors of somebody canceling. It was so noisy. I know exactly what's happening inside the system. Inside the system, the amount of computation we need has just exploded. Inside the system, everybody's dying for their compute. They're trying to rip it out of everybody's hands. I didn't know exactly how to convey that without just giving a forecast. What I did was: the people they were asking about were really just the top four CSPs.

Those are the CapEx numbers that everybody kind of monitors. I just took the top four CSPs and compared year to year. I obviously, therefore, underrepresented the demand. Okay. I understand that. I am simply trying to show that the top four CSPs are fully invested. They have lots of Blackwells coming. The CapEx, the capital investment that they are making, is solid. Here are the things that I did not include that obviously are quite large: the internet services that are not public clouds, but are internet services. For example, X and Meta; I did not include any of that. I obviously did not include enterprises, car companies, and AI factories. I did not include international. I did not include a mountain of startups that all need AI capacity. I did not include Mistral.

I didn't include SSI, TMI, all of the great companies that are out there doing AI. I didn't include any of the robotics companies. I basically didn't include anything. I understood that, which then begs the question, and I was asking myself the same thing on stage, "Why did you do it?" There were just so many questions about the CSPs' investments that I kind of felt like maybe I'll just get that behind us. I hope it did. I think that the CSPs are fully invested, fully engaged. There are two things that are driving them. One, they have to shift from general-purpose computing to accelerated computing. The idea of building data centers full of traditional computers is not sensible to anybody. Nobody wants to do that.

Everybody's moving to this new way of doing computing, modern machine learning. Second, there are all these AI factories that are built for just one purpose only, and that's not very well characterized or well followed by most. I think, in the future, these specialized AI factories, and that's why I call them AI factories, are really what the industry is going to look like someday, above this $1 trillion of data centers.

Stacy Rasgon
Senior Analyst, Bernstein Research

What was the number, though? That was, what, shipments plus orders to those four customers within the first 11 weeks of the year? That 3.6?

Jensen Huang
Founder, President, and CEO, NVIDIA

No, that 3.6 is what they have ordered from us.

Stacy Rasgon
Senior Analyst, Bernstein Research

Ordered. Okay.

Jensen Huang
Founder, President, and CEO, NVIDIA

Of just Blackwells.

Stacy Rasgon
Senior Analyst, Bernstein Research

Okay.

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah, so far.

Stacy Rasgon
Senior Analyst, Bernstein Research

So far.

Jensen Huang
Founder, President, and CEO, NVIDIA

I know. The year just started.

Stacy Rasgon
Senior Analyst, Bernstein Research

Okay.

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah. Exactly. Yep.

Stacy Rasgon
Senior Analyst, Bernstein Research

Thank you.

Jensen Huang
Founder, President, and CEO, NVIDIA

Our demand's much greater than that, obviously. Yeah.

Vivek Arya
Managing Director and Senior Analyst, Bank of America Securities

Thank you. Good morning. Hi, Jensen. Hi, Colette. Vivek Arya from Bank of America Securities.

Jensen Huang
Founder, President, and CEO, NVIDIA

Hi, Vivek.

Vivek Arya
Managing Director and Senior Analyst, Bank of America Securities

Thanks for hosting a very informative event. Jensen, I had a near-term and sort of intermediate-term question. On the near term, Blackwell execution: it's an incredibly complex product, obviously with very strong demand, but it has pressured gross margins a lot, right? I think some growing pains were to be expected, but we have seen margins go from the high 70s to the low 70s. Can you give us some confidence and assurance that as you get to Blackwell Ultra and Rubin, we should expect margins to start heading back, that they will be more profitable products versus what Blackwell has been so far? That's kind of the near term. As we look out into 2026, Jensen, what are the hyperscale customers telling you about their CapEx plans in general?

Because there's a lot of building of infrastructure, but from the outside, we don't always get the best metrics to kind of visualize what the ROI is on these investments. As you look out at 2026, what's your level of confidence that their ability and desire to spend in CapEx can kind of stay on this pace that we have seen the last few years? Thank you.

Jensen Huang
Founder, President, and CEO, NVIDIA

Yep. Our margins are going to improve because, as I was explaining yesterday, we changed the architecture from Hopper to Blackwell, not just chip to chip; we changed the system architecture and the networking architecture completely. When you change the architecture that dramatically across the system, and now we've succeeded in doing so, there are so many components whose costs are hard to quantify exactly that, all accumulated, it makes the transition challenging. Everybody's cost is a little higher. Everybody's new connector is a little higher. Everybody's new cable is a little higher. Everything new is a little higher. Now that we ramp up into production, we'll be able to get those yields up and those costs down. Okay.

I'm very certain, I'm quite confident, that yields will improve as we use this basic architecture, now called Grace Blackwell, this new NVLink 72 architecture. We're going to ride this for about three to three and a half, four years. Okay. We have opportunities between here and full ramp to improve yields and improve gross margins. That's that. In terms of CapEx, today we're largely focused on the CSPs, but I think very soon, starting almost now, you're going to see evidence that in the future, these AI factories will be built even independently of the CSPs. We're going to see some very large AI factory projects, hundreds of billions of dollars of AI factory projects. That's obviously not in the CSPs. The CSPs will still invest, and they will still grow.

The reason for that is very clear. Machine learning is the present and the future. You're not going to go back to hand coding. The idea of large-scale software capabilities being developed by humans only as we're sitting there typing, that's almost quaint. It's cute. It's funny. It's not going to happen long term, not at scale. We now know that machine learning and accelerated computing is the path forward. I think that is a sure thing now.

The fact that every single CEO understands this, the fact that every single technology company is here, that we have partnerships from the Dells and HPEs, which are very understandable, to Cisco, who has now also joined us; that the industries are here, healthcare and retail, GM and the car companies, startups to traditional companies; you're starting to see that people realize this is the computing method going forward. The part of data center that I know with very great confidence is that the percentage of that purple is going to keep becoming gold. Remember, that purple is compute CapEx. That has the opportunity to be 100% gold, right? I think that journey is fairly certain for me now. Okay.

The rest of it, how much additional AI factory gets built on top of that, is what we'll have to discover. The way I reason about the fact that it's going to be significant is that every single industry in the world is going to be powered by intelligence. Every single company is going to be manufacturing intelligence. Out of that $120 trillion or so in the world, how much of it is going to be about intelligence manufacturing? Pick your favorite number, but it's measured in trillions.

Tim Arcuri
Managing Director and Equity Research Analyst, UBS

Hi, it's Tim Arcuri at UBS. Thanks. Jensen, I wanted to ask about custom ASICs. I ask because we listen to some of the same CSPs that you put up on that slide, and we listen to some of the companies who are making custom ASICs. Some of the deployment numbers sound pretty big. I wanted to just hear your position: how you're going to compete with custom ASICs, how they can possibly compete with you, and maybe how some of your conversations with these same customers inform your view on how competitive custom ASICs will be to you. Thanks.

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah. First of all, just because something gets built does not mean it is great. Number two, if it is not great, well, all of those companies are run by great CEOs who are really good at math. Because these are AI factories, it affects your revenues, not just your costs. It is a different calculus. Every company only has so much power. You just have to ask them. Every single company only has so much power. Within that power, you have to maximize your revenues, not just minimize your cost. This is a new game. This is not a data center game. This is an AI factory game.

When the time comes, that simple calculus, that simple math I was showing yesterday, still has to be done, which is the reason why so many projects are started and so many are not taken into production. Because there's always another alternative. We are the other alternative, and that alternative is excellent. Not normally excellent, as you know. Everybody's still trying to catch up to Hopper. I haven't seen a competitor to Hopper yet, and here we're talking about 40x more. Our roadmap is at the limits of what's possible, not to mention we're really good at it and completely dedicated to it. A lot of people have a lot of businesses to do. I've got this one business to do. We're all in on this. 35,000 people doing one job. Been doing it for a long time.
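[A stylized version of the power-constrained calculus being described here; every number below is a hypothetical placeholder, including the efficiency figures for the two chip platforms.]

```python
# Stylized AI-factory math: power is the fixed input, so revenue is set
# by tokens per joule, not by chip price. All numbers are hypothetical.
def annual_revenue_usd(power_mw: float, tokens_per_joule: float,
                       usd_per_million_tokens: float) -> float:
    joules_per_year = power_mw * 1e6 * 3600 * 24 * 365
    tokens = joules_per_year * tokens_per_joule
    return tokens / 1e6 * usd_per_million_tokens

# Same 100 MW facility, same token price; only efficiency differs.
for name, tpj in [("platform A (assumed)", 40), ("platform B (assumed)", 25)]:
    rev = annual_revenue_usd(100, tpj, usd_per_million_tokens=0.10)
    print(f"{name}: ${rev / 1e9:.1f}B/year")

# A cheaper chip that produces fewer tokens per joule earns less from
# the same power envelope: "it affects your revenues, not just your cost."
```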

The depth of capability, the scope of technology, as you saw yesterday, is pretty incredible. It is not about building a chip. It is about building an AI factory. We are talking about scale-up and scale-out. We are talking about networking and switches and software. We are talking about systems. These system architectures are insane. Even on the systems themselves, notice that 100% of the computer industry has standardized on NVIDIA's system. Why? Try to build an alternative. Building the alternative is not even thinkable; look at how much investment we put into building this one. Even the system is hard. People used to think a system is just sheet metal. Hardly. 600,000 parts is hardly sheet metal. All of the technology is hard. We are pushing every single dimension to the limit because we are talking about so much money.

The world is going to lay down hundreds of billions of dollars of investment in just the next two, three years. Let's do the thought experiment. Let's say you want to stand up a data center, and you want it to be fully operational in two years' time. When do you have to place the PO for that? Today. Let's suppose you had to place a $100 billion PO on something. What architecture would you place it on? Literally, based on everything you have today, there's only one. You can't reasonably build out giant infrastructures with hundreds of billions of dollars behind them, hoping to turn them on and get the ROIC, unless you have the confidence that we are able to provide you. We can provide you complete confidence, and singularly so.

We're the only technology company where, if I had to go place $100 billion on an AI factory, oh, that's interesting, I did. We are literally the only company willing to place $100 billion of POs across the industry to go build it out. You guys know that; that's the depth of our supply chain. We are. We have. Give me another one that has that depth and that length. Now it's to the point where we've got to go work with the supply chain upstream and downstream to prepare the world for hundreds of billions of dollars, working towards trillions of dollars, of AI infrastructure build-out.

Our partnerships with power companies and all of the cooling companies, the Vertivs, the Schneiders, our partnership with BlackRock, the partnership network necessary to prepare the world to go build out trillions of dollars of AI infrastructure, that's underway as we speak. So what architecture and what ASIC chip do you go select? That doesn't even make sense. It's a weird conversation, even. I think that, one, the game is quite large. The investment level, and therefore the risk level, is quite high. The certainty that you're selecting the best is quite important. The certainty that you can execute is vital. We are the company you can build on top of. We're the platform that you can build your AI infrastructure on. We're the company that you can build your AI infrastructure strategy on.

I think it includes chips, but it's much, much more than that.

Mark Lipacis
Analyst, Evercore ISI

Hi, Mark Lipacis from Evercore ISI. Thank you so much for the informative presentation yesterday and for sharing your insights today. Jensen, you brought up the expression homogeneous clusters. Why is this concept important? Why would it be better than a heterogeneous cluster? I think the investor question here would be: as your customers scale to one-million-node clusters, is it your view that those will more likely be homogeneous than heterogeneous? Thank you.

Jensen Huang
Founder, President, and CEO, NVIDIA

The answer to the last question, Mark, and I appreciate the question, is yes. It would not be my style to just leave it there, because your understanding is too important to me. Let me work it backwards. GTC: the first GTC NVIDIA ever had was at the Fairmont Hotel. It is not far from here, right next to Great America. Literally, the entire GTC was this room divided in half. Literally, 100% of the audience was scientists, because they were the only users of CUDA. My recollection of GTC and what GTC means to me is still that: a whole bunch of computer scientists, a whole bunch of scientists, a whole bunch of computer engineers, and we are all building this computing platform.

I've always somehow had in my head that my GTC talks could be nerdier than usual. I show some charts that no CEO would or should. If you saw the Pareto frontier: the frontier is simply a way of saying that underneath that curve, in our simulation, were hundreds of thousands of other dots. Meaning that data center, that factory, behaves differently depending on the workload and depending on the work style, the prompt style. Remember, a prompt is how you program a computer now. Every time you prompt it differently, you're actually programming the computer differently, and therefore it processes it differently.

Depending on your style of prompting, deep research or just a simple chat or search, depending on that spectrum of questioning or prompting, depending on whether it's agentic or not, the data center's configuration is different. The way you configure it is with a software program called Dynamo. It's kind of like a compiler, but it's a runtime. It sets up the computer to be parallelizable in different ways. Sometimes it's tensor parallel, pipeline parallel, expert parallel. Sometimes you want the computer to do more floating point. Sometimes you want it to do more token generation, which is more bandwidth-challenged. All of this is happening at the same time.

If every single computer has a different behavior, or if every computer has a different programming model, the person who's developing this thing called Dynamo would just go insane. The computers would be underutilized because you never know exactly what is needed. It's kind of like in the morning you need more of this, and at night you need more of that. If everything is fungible, then you don't care, which is the reason why homogeneous is better. Every single computer is fungible. Literally from Hopper on, every single computer can be used for context processing or decode, from prefill to decode, from tensor parallel to expert parallel. Every single computer can be flexibly used in that way, running Dynamo. The utilization of the data center will be higher, the performance will be higher, everything will be better.

Energy efficiency will be better. If this computer can only be good for prefill, or this computer is only good for decode, or this computer is only good at expert parallel, it's kind of weird. Too hard.
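[A minimal sketch of the kind of reconfiguration being described. This is not the Dynamo API; all names, roles, and numbers here are hypothetical. The point is only that a homogeneous pool can be re-partitioned as the workload mix shifts.]

```python
from dataclasses import dataclass

# Hypothetical illustration, not the Dynamo API: a runtime assigns
# roles to a homogeneous pool of GPUs based on the current traffic mix.
@dataclass
class PoolPlan:
    prefill_gpus: int     # context processing (FLOP-heavy)
    decode_gpus: int      # token generation (bandwidth-heavy)
    tensor_parallel: int  # layers sharded across GPUs
    expert_parallel: int  # MoE experts sharded across GPUs

def plan(total_gpus: int, prefill_share: float, tp: int, ep: int) -> PoolPlan:
    """Re-partition one fungible pool as the workload mix shifts."""
    prefill = int(total_gpus * prefill_share)
    return PoolPlan(prefill, total_gpus - prefill, tp, ep)

# Morning: long-context "deep research" traffic (prefill-heavy).
print(plan(72, prefill_share=0.6, tp=8, ep=1))
# Evening: chatty reasoning traffic (decode/bandwidth-heavy).
print(plan(72, prefill_share=0.2, tp=4, ep=8))
```

[Because every GPU in the pool is fungible, the same 72 GPUs can flip between these plans at runtime; with heterogeneous hardware, each role would be pinned to specific machines and utilization would drop, which is the argument above.]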

Joe Moore
Semiconductor Industry Analyst, Morgan Stanley

Yeah. Joe Moore, Morgan Stanley. Very compelling conversation about reasoning and the 100x increase in compute requirements. I guess one of the anxieties the market had was DeepSeek talking about doing that reasoning on fairly low-end hardware, even consumer hardware, and China maybe being forced to do this on low-end hardware. Can you talk about that disparity and how this complexity pans out for you guys?

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah. They were talking about a technology called distillation. With distillation, you train the largest model you can. Okay. ChatGPT-4, I think, is a couple of trillion parameters. R1 is 680 billion. Llama 3 is 400 billion or so; is it 280, 400, something like that? I forget anymore. The version that people mostly run is 70B. R1 is 680 billion. ChatGPT is 1.4 trillion. The next one, I'm going to guess, is 20 trillion. You want to build the smartest AI you can. All right? Number one. The second thing you want to do is distill it: quantize it, reduce the precision, distill it into multiple different configurations. Some of it you might continue to run in the largest form because, quite frankly, the largest is actually the cheapest. Let me tell you why. As you know, Joe, there are many problems

where getting the smartest person to do it is the cheapest way to do it. There are actually a lot of problems like that. Getting the cheapest person to do something is not necessarily the cheapest way to get it done. Do you guys agree? Are you guys following me? Everybody goes, "I don't understand what you're saying." No. In this audience, I hope that you understand. Okay. It turns out that there are many problems where it would actually cost less at runtime to have the smartest model do it. Depending on the problem you're trying to solve, you're going to use the cheapest form. You take the largest and you distill it into smaller and smaller versions, and even the smallest version is only like one billion parameters, which fits on a phone.

I surely would not use that to do research. You know what I am saying? I would use the larger version to do research. They all have a place, but it is a technology called distillation. People just got worked up about it. Even today, you can go into ChatGPT and there is a whole bunch of configurations. Do you guys see that? Is it o3-mini or something like that? It is a distilled version of o3. If you like to use that, you can. Me, I use the plain one. It is just up to you. Does it make sense? Yeah. It is a technology called distillation. Nothing changed.
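[For reference, a minimal sketch of the distillation recipe being described, written in PyTorch style. The temperature-scaled soft-target loss is the standard Hinton-style formulation, which is an assumption here; the model sizes in the comments are illustrative, and `teacher`/`student` are hypothetical callables.]

```python
import torch
import torch.nn.functional as F

# Standard knowledge-distillation step (Hinton-style, an assumption):
# a small student learns to match a large, frozen teacher's output
# distribution. Sizes and hyperparameters are illustrative only.
def distill_step(teacher, student, tokens, labels, T=2.0, alpha=0.5):
    with torch.no_grad():
        teacher_logits = teacher(tokens)   # e.g. a ~680B reasoning model
    student_logits = student(tokens)       # e.g. a ~1B phone-sized model

    # Soft targets: match the teacher's temperature-smoothed distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    # Hard targets: still fit the ground-truth next tokens.
    hard = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
    )
    return alpha * soft + (1 - alpha) * hard
```

[Quantization, mentioned alongside distillation (FP4 on Blackwell), is the complementary step: keep the distilled weights but store and compute them at lower precision, cutting memory and bandwidth per token.]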

Toshiya Hari
VP of Investor Relations and Strategic Finance, NVIDIA

Yeah. This is probably going to have to be the last question.

Jensen Huang
Founder, President, and CEO, NVIDIA

I want to stay. I was late.

Toshiya Hari
VP of Investor Relations and Strategic Finance, NVIDIA

Okay.

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah. Make the next meeting wait. I'm just kidding. The next meeting is lunch.

Aaron Rakers
Technology Analyst, Wells Fargo

Perfect. Thanks for taking the questions, Aaron Rakers at Wells Fargo.

Jensen Huang
Founder, President, and CEO, NVIDIA

Nice to see you.

Aaron Rakers
Technology Analyst, Wells Fargo

Jensen, we talk a lot about the compute side of the world and how these scaled-up architectures are evolving. One of the underlying platforms that I feel is key to your success and your strategy is NVLink. I am curious: as you move from 72, and I know there is some definitional difference around 72 versus 144, you talk about 576. How do we think about the evolution of NVLink to support continued scaled-up architectures, and how important is that to your overall strategy? Thank you.

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah. NVLink. As I said yesterday, distributed computing, it's like this room, let's say. Let's say we have a problem we have to work on together. It is better, we can get the job done faster, if we actually have fewer people, but they're all smarter and they all do things faster. You want to scale up first before you scale out. Does that make sense? We all love teamwork, but the smaller the team, the better. Therefore, you want to scale up the AI before you scale out the AI. You want to scale up computing before you scale out computing. Now, scale up is very hard to do.

Back in the old days, monolithic semiconductors, otherwise known as the beginning of Moore's Law, were the only way we knew how to scale up. Does that make sense? Are you guys following me? Just make bigger and bigger and bigger chips. At some point, we didn't know how to scale up anymore because we're at reticle limits. That's the reason why we invented, and you don't remember it anymore, but back in the old days during GTC, I was talking about this incredible SerDes we invented. The most high-speed, energy-efficient SerDes. NVIDIA is world-class at building SerDes. I'd go so far as to say we are the world's best at building SerDes. If there's a new SerDes that's going to get built, we're going to build it.

Whether it's the chip-to-chip interface SerDes or the package-to-package interface SerDes, which enabled NVLink, our SerDes is absolutely the world's best, always at the bleeding edge. The reason for that is because we're trying to extend beyond Moore's Law. We're trying to scale up past the reticle limits. Now, the question is, how do you do that? People talk about silicon photonics. There's a place for silicon photonics, but you should stay with copper as long as you can. I'm in love with copper. Copper is good. It's time-tested. It works incredibly well. We want to stay with copper as long as we can. It's very reliable, very cost-effective, very energy-efficient. You go to photonics when you have to. That's the rule.

We scale up as far as we can with copper, which is the reason why NVLink 72 is in one rack, and which is the reason why we pushed the world to liquid cooling. We pushed that knob probably two years earlier than anybody wanted in the beginning, but everybody's there now. That way we could scale up NVLink to 72. With Kyber, our next-generation system, we can scale up to 576 using copper in one rack. We should scale up as far as we can, use copper as far as we can, and then use silicon photonics, CPO, if necessary. For the necessary part, we've prepared the world; we've now built the technology. In the next two, three years, we can start scaling out to millions of GPUs. Does that make sense?

Now we can scale up to thousands, scale out to millions. Yeah. Crazy, crazy technology. Everything is at the limits of physics.
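[A toy latency model of the scale-up-before-scale-out argument; the bandwidth numbers and the simple compute/communication split are made up for illustration and are not NVLink or network specs.]

```python
# Toy model: step time = compute time + communication time. Inside one
# scale-up domain the fabric is assumed 10x faster than the scale-out
# network; every number here is a placeholder, not a spec.
def step_time(gpus: int, work: float = 1.0, msg: float = 1.0,
              domain: int = 72, bw_up: float = 10.0, bw_out: float = 1.0):
    compute = work / gpus                 # ideal compute scaling
    comm = msg / (bw_up if gpus <= domain else bw_out)
    return compute + comm

for n in (8, 72, 144, 576):
    print(f"{n:4d} GPUs -> step time {step_time(n):.3f}")

# Once you spill past the fast domain, communication dominates and more
# GPUs can even be slower. Making the domain itself bigger (72 -> 576
# with Kyber) is the "fewer, smarter workers" move before scaling out.
```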

Blayne Curtis
Equity Research Analyst, Jefferies

Blayne Curtis, Jefferies. Appreciate the bonus time, too. Thanks for taking the question. I want to ask you: I know you were kind of making a joke that you couldn't give Hopper away. That's not the point; you were trying to say the performance is so incrementally better that new sales are going to go there very quickly. I'm kind of curious about the life cycle. You get this question a lot. I think on the last earnings call, you talked about people still using A100s, right? It's diversity of workload. I think there was this perception that inference was a lesser workload. You clearly made the case that now you need the best systems, with Dynamo. Price per token is very important. I just want to think about how you have seen really just greenfield deployments. Do you see people rip and replace?

If power is the constraint, when do we see that in terms of people just pulling hardware out? What is the life cycle of a GPU these days, now that the workload spans both training and inference?

Jensen Huang
Founder, President, and CEO, NVIDIA

I really appreciate that question. First of all, the life cycle of NVIDIA accelerators is the longest of anyone's. Would you guys agree? There you go. The life cycle of NVIDIA's GPUs is the longest of any accelerator in the world. Why does that matter? It directly translates to cost. Easily three years longer, maybe four. The versatility of our architecture means you can run it for this and that. You can use it for language and images and graphics. Isn't that right? All these data processing libraries, none of them have to be at the forefront. Even using Ampere for data processing is still an order of magnitude faster than CPUs. And they still have CPUs in data centers.

We have a whole waterfall of applications they can put their older GPUs into, then use the latest generation for their leading-edge work, their factory work, AI factories and such. I also said something else: if a chip's not better than Hopper, quite frankly, you couldn't give it away. This is the challenge of building these data centers. The reason for that is the cost of operation, the cost of building it up, the TCO of it, the risk of building it up. A $100 billion data center, okay? Which is only 2 GW, by the way. Every gigawatt of data center is about $40 billion to $50 billion to NVIDIA, roughly, let's say $50 billion. When somebody's talking about 5 gigawatts, that math is pretty clear.

You've got to do all the math as I explained yesterday. It becomes very clear that if you're going to build that new data center, you need the world's best of all these areas. Once you build it up, you have to start thinking about how do you retire it someday. Now the versatility of NVIDIA's architecture really, really kicks in. We are not only the world's best in technology, so your revenues will be the highest. We're also the best from a TCO perspective because operationally, we're the best. Very importantly, the life cycle of our architecture is the longest. If our life cycle is six years instead of four, that math is pretty easy to do. My goodness, the cost difference is incredible. Yeah.

People are starting to come to terms with all of that, which is the reason why these very specific niche accelerators or point products are kind of hard to justify building up a $100 billion data center with.
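[The math gestured at in this answer, worked through with the figures as quoted; the straight-line depreciation is an assumption for illustration.]

```python
# Figures as quoted above; depreciation method is an assumption.
cost_per_gw_b = 50               # "$50 billion per gigawatt, roughly"
for gw in (2, 5):
    print(f"{gw} GW -> ~${gw * cost_per_gw_b}B of data center")

capex_b = 100                    # the $100B, 2 GW example
for life_years in (4, 6):
    annual = capex_b / life_years          # straight-line, assumed
    print(f"{life_years}-year life -> ${annual:.1f}B/year")

# Stretching useful life from 4 to 6 years cuts the annual charge from
# $25B to about $16.7B, roughly a third: the TCO edge attributed to
# architecture versatility and the long waterfall of workloads.
```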

Pierre Ferragu
Analyst, New Street Research

Thanks for the time. Pierre Ferragu, New Street Research. That's really music to my ears, this idea that NVIDIA chips have a very long lifetime, and they're actually amortized relatively rapidly on balance sheets. It really makes me think that, in the future, data centers are going to have a business model that is very equivalent to the foundry business model. You buy super expensive equipment to manufacture chips, you depreciate it, and it's still running 20 years down the line. I'm trying to think about how the industry is going to grow with that framework in mind. I look at the economics, for instance, of OpenAI, the economics they shared with investors in their last round. My understanding is that in 2028, 2029, they're going to be deploying something like a $250 billion data center.

They're going to use it at their frontier to develop their most advanced models. Then if they want to move to the next frontier data center, that might be like a $350 billion data center at the beginning of the next decade, and they'll have to hand off a $250 billion data center somewhere. We'll have to figure out how to fill that data center so that the frontier of the industry can keep moving towards the more advanced technology. I'm still struggling to see how this is going to work, because the inference of OpenAI alone is probably not going to be enough to fill such a big data center every year or every other year, if they change their leading-edge data center every year or every other year. Does that make sense?

Jensen Huang
Founder, President, and CEO, NVIDIA

Yes.

Pierre Ferragu
Analyst, New Street Research

Thank you.

Jensen Huang
Founder, President, and CEO, NVIDIA

Except, yeah, except I don't think they're going to have any trouble. The reason for that is this: the demand for inference will be larger than the demand for training. It already is. The number of computers they use for inference is nowhere near enough. That's the reason why, as we're moving to these reasoning models, and I'm super enjoying using ChatGPT, I use it every day, the response time is getting slower. During certain parts of the day, I can tell when you guys are all on it; I pretty much don't get my answer back. Are you guys following me? Obviously, that's a problem. Obviously, they know that. That's the reason why they're clamoring for more capacity. They're just racing because the inference workload is just too high. Someday, I believe, training will be only 10% of the world's capacity.

They need that 10% to be the fastest of the whole 100%. Okay? Every year, they're going to come up with the new state of the art. The reason for that is they want to stay at the leading edge. They want to have the best products. Just like NVIDIA today: listen, we do everything we can, we spare no expense, to make sure that we have the state-of-the-art technology, because we want to stay ahead. They want to stay ahead. There are five companies out there that need to stay ahead. Everybody will invest in the necessary state-of-the-art technology to stay ahead. That training capacity will ideally represent 10%, 5%, 20% of their total capacity. Let's use an example. You used an example: TSMC.

TSMC's prototyping fab, the one they run my tape-out chips on, is very, very optimized for low latency, because I need to see my prototypes as soon as possible. The cycle time for a prototype is only, let's pick, two months. The cycle time for production is six months. It is the same equipment. They have a fab that is only for prototyping, and that fab represents, I do not know, 3%, 5% of their overall capacity. The rest of their capacity is used for manufacturing, the inference, if you will; only that 3% is used for prototyping. I think if you told me that in the near future, OpenAI will invest $350 billion, or pick your favorite number, every single year, I completely believe it. It is just that the trailing capacity will be used for inference. Their revenues will just have to support that.

I believe that the production of AI, the production of intelligence will support that level of scale.

Toshiya Hari
VP of Investor Relations and Strategic Finance, NVIDIA

I think maybe this will be our last question.

Brett Simpson
Analyst, Arete Research

Thanks. It's Brett Simpson at Arete Research. Jensen, it feels like there's a tipping point for reasoning inference at the moment, which is great to see. A lot of folks in this room are concerned about the macro backdrop overall at the moment. They're concerned about tariffs, or potentially what impact tariffs have on the sector. Maybe this leads to a U.S. recession of some sort. I'd love to get your perspective, and maybe also Colette's perspective, just thinking through: if there is a scenario where we see a U.S. recession, what does that do to AI demand? How do you think about the impact on your business if this comes through?

Jensen Huang
Founder, President, and CEO, NVIDIA

If there's a recession, I think that the companies that are working on AI are going to shift even more investment towards AI, because it's the fastest-growing thing. Every CEO will know to shift towards what is growing. Second, tariffs. We're preparing, and we have been preparing, to manufacture onshore. TSMC is investing $100 billion in fabs here in Arizona, and in the fab that they already have, we're in it. We are now running production silicon in Arizona. We will manufacture onshore. The rest of the systems, we'll manufacture as much onshore as we need to. The ability to manufacture onshore is not a concern of mine. We are the largest customer of many companies, and they're excellent partners of ours.

I think everybody realizes the value of onshore manufacturing and a very agile supply chain. We have a super agile supply chain with manufacturing in so many different places; we can shift things around. Tariffs will have a little impact on us short term. Long term, we're going to have manufacturing onshore. Okay. How about I take one more? The last question has to be a happy question. Not that tariffs was a sad question; tariffs was a happy question. How about one more question? The ultimate responsibility, sir.

Will Stein
Semiconductor and Artificial Intelligence Equity Research Analyst, Truist

It's Will Stein from Truist. It might not sound so much like a happy question, but I'm hoping you can put a happy spin on it.

Jensen Huang
Founder, President, and CEO, NVIDIA

Oh, Will.

Will Stein
Semiconductor and Artificial Intelligence Equity Research Analyst, Truist

Another question. Jensen, what do you see as the biggest technical or operational challenges today? You just mentioned, with regard to U.S. production, that that's not really weighing heavily on you. What is, and what is the company doing to turn it into an opportunity that'll result in even better revenue growth going forward?

Jensen Huang
Founder, President, and CEO, NVIDIA

Yeah, Will, I really appreciate the question. In fact, almost everything I do started out with a dream, or an irritation, or a concern, or some problem. What are the things that we did? What are the things that we're doing? One of the things you're seeing us do: no company in history has ever laid out a roadmap, a full roadmap, three years out. No technology company has ever done that. It is like announcing our next four phones today. Are you guys following me? Nobody does that. The reason why we do it is because, one, the whole world is counting on us. The amount of investment is so gigantic, they cannot have surprises.

We're in the infrastructure-building business, not in the consumer products business. We're in the infrastructure business. The most important things to our partners are trust, no surprises, confidence in execution, confidence in your ability to build the best. They use all of the same thoughts and words that you would use to describe, for example, TSMC. Are you guys following me? All the same ideas. We are an infrastructure-building company now. Many of the things that you saw me do are a reaction to that. I know it looks quite strange for a technology company to sit here and tell you about all these things: while you're sitting here buying this one, I'm already telling you about the next one, and while you're about to place an order for the next one, I'm already telling you about the one after that.

We're forcing you to live in regret all the time. Okay? However, I also know that they need to run their business every single day. They can't wait three years to run their business and do a better job. They've got to run it every single day. They have no choice, because we're in the AI factory business. One, we're in the AI infrastructure business: no excuses, no surprises, long roadmap. Two, we're in the AI factory business: we've got to make money every day, and they're going to buy some every day. No matter what I tell them about Rubin, they're going to buy Blackwells. There's no question. No matter what I tell them about Feynman, and I couldn't wait to tell you guys about Feynman, but I thought the keynote was long enough, they're still going to buy.

Does that make sense? We're in the AI factory business. People have to make money every day. Here's the third thing. We're a foundational business for so many industries. AI is foundational for so many industries, and so far we have only served the cloud. In order for us to serve the telecommunications network, yesterday we announced, with T-Mobile and Cisco in the U.S., that we're going to build a 6G AI-RAN. It's all completely built on top of Blackwells. That ecosystem, what's its CapEx spend? $100 billion a year? That has to be retooled, re-architected, and reinvented. You can't just do that in the cloud; we have to do that for AI industries, for AI enterprise IT.

You can't just rely on the cloud to do that. There are different architectures, different go-to-markets, different software stacks, different product configurations, different purchasing styles, and therefore the product has to fit the style of purchasing. Each one of these industries needs an AI infrastructure. We now know three things about our company. One, we're an AI infrastructure company; the supply chain in front of us and behind us, everything from land, power, and capital, we are a part of. Two, we're in the AI factory business. Three, we're an AI foundational company that the whole world is depending on. We have to bring the technology to them. It's not just about AI to them; it's about the entire computing platform, which includes networking and storage and computing. We have the might to do that.

We have the technical skills to do that. You heard all these things, frankly, in the keynote yesterday, while I also had to remember that it still had to be a little entertaining. We did a lot of work yesterday. We set the blueprints out not just for our company, but for the companies that are here, the industries that are here, and the associated companies before and after us in the supply chain. So many companies were affected by what I said yesterday. We are laying the foundations for all of that. I want to thank all of you for coming to GTC. It's great to see all of you. This is an extraordinary moment in time.

I do really appreciate the question and the comment about R1 and the misunderstanding of it. There's a profound and deep misunderstanding of it. It's actually a profoundly and deeply exciting moment. It's incredible that the world has moved towards reasoning AIs, and even that is just the tip of the iceberg. I hope to catch up with you guys again. We're going to Computex. I hope you're coming to Computex. This year's Computex is going to be gigantic. Okay? We have lots and lots of things to do at Computex this year because, as you know, the computing ecosystem starts there. We've got a mountain of work to do there. I look forward to seeing you guys there. All right, you guys. Thank you.
