
Status Update

Dec 6, 2023

Lisa Su
Chair and CEO, AMD

Good morning! Good morning, everyone. Welcome to all of you who are joining us here in Silicon Valley, and to everyone who's joining us online from around the world. It has been just an incredibly exciting year, with all of the new products and all the innovation that has come across our business and our industry, but today, it's all about AI. We have a lot of new AI solutions to launch today, and news to share with you, so let's go ahead and get started. Now, I know we've all felt it this year. I mean, it's been just an amazing year. If you think about it, a year ago, OpenAI unveiled ChatGPT, and it really sparked a revolution that has totally reshaped the technology landscape. In this short amount of time, AI hasn't just progressed, it's actually exploded.

The year has shown us that AI isn't just kind of a cool new thing, it's actually the future of computing. And at AMD, when we think about it, we actually view AI as the single most transformational technology of the last 50 years. Maybe the only thing that has come close is the introduction of the internet, but what's different about AI is that the adoption rate is just much, much faster. So, you know, although so much has happened, the truth is, right now, we're just at the very beginning of the AI era, and we can see how capable it is of touching every aspect of our lives. So if you guys just take a step back and look, I mean, AI is already being used everywhere.

Think about improving healthcare, accelerating climate research, enabling personal assistants for all of us, greater business productivity, things like industrial robotics and security, and lots of new tools for content creators. Now, the key to all of this is generative AI. It requires significant investment in new infrastructure to enable training and all of the inference that's needed, and that market is just huge. Now, you know, a year ago, when we were thinking about AI, we were super excited, and we estimated the data center AI accelerator market would grow approximately 50% annually over the next few years, from something like $30 billion in 2023 to more than $150 billion in 2027, and that felt like a big number.

However, you know, as we look at everything that's happened in the last 12 months, and the rate and pace of adoption that we're seeing across the industry, across our customers, across the world, it's really clear that the demand is just growing much, much faster. So if you look now at what it takes to enable AI infrastructure, of course it starts with the cloud, but it goes into the enterprise, and we believe we'll see plenty of AI throughout the embedded markets and into personal computing. We're now expecting that the data center accelerator TAM will grow more than 70% annually over the next four years, to over $400 billion in 2027. Doesn't that sound exciting for us as an industry? I have to say, for someone like me, who's been in the industry for a while, this pace of innovation is faster than anything I've ever seen before.

For us at AMD, we are so well-positioned to power the end-to-end infrastructure that defines this new AI era, from massive cloud server installations, to on-prem enterprise clusters, to the next generation of AI in embedded devices and PCs. Our AI strategy is really centered around three big strategic priorities. First, we must deliver a broad portfolio of very performant, energy-efficient GPUs, CPUs, and adaptive computing solutions for AI training and inference, and we believe, frankly, that you're gonna need all of these pieces for AI. Second, it's really about expanding our open, proven, and developer-friendly software platform to ensure that leading AI frameworks, libraries, and models are all fully enabled for AMD hardware, and that it's really easy for people to use. And then third, it's really about partnership.

You know, you're gonna see a lot of partners today. That's who we are as a company. It's about expanding the co-innovation work and working with all parts of the ecosystem, including cloud providers, OEMs, software developers—you're gonna hear from some, you know, really AI leaders in the industry—to really accelerate how we work together and get that widespread deployment of our solutions across the board. So we have so much to share with you today. I'd like to get started, and of course, let's start with the cloud. Generative AI is the most demanding data center workload ever. It requires tens of thousands of accelerators to train and refine models with billions of parameters, and that same infrastructure is also needed to answer the millions of queries from everyone around the world to these smart models.

It's very simple: the more compute you have, the more capable the model, the faster the answers are generated. The GPU is at the center of this generative AI world. Right now, I think we all know it, everyone I talk to says it, the availability and capability of GPU compute is the single most important driver of AI adoption. Do you guys agree with that? That's why I'm so excited today to launch our Instinct MI300X. It's the highest performance accelerator in the world for generative AI... MI300X is actually built on our new CDNA 3 data center architecture, and it's optimized for performance and power efficiency. CDNA 3 has a lot of new features.

It combines a new compute engine, it supports sparsity and the latest data formats, including FP8, it has industry-leading memory capacity and bandwidth, and we're gonna talk a lot about memory today. And it's built on the most advanced process technologies and 3D packaging. So if you compare it to our previous generation, which frankly was also very good, CDNA 3 actually delivers more than 3x higher performance for key AI data types, like FP16 and BF16, and a nearly 7x increase in INT8 performance. So if you look underneath, how do we get MI300X? It's actually 153 billion transistors, 153 billion, across a dozen 5 nm and 6 nm chiplets. It uses the most advanced packaging in the world, and if you take a look at how we put it together, it's actually pretty amazing.

We start with four I/O dies in the base layer, and what we have on the I/O dies is 256 MB of Infinity Cache and all of the next-gen I/O that you need: things like 128-channel HBM3 interfaces, PCIe Gen 5 support, and our fourth-gen Infinity Fabric that connects multiple MI300Xs at 896 GB per second. And then we stack eight CDNA 3 accelerator chiplets, or XCDs, on top of the I/O dies, and that's where we deliver 1.3 petaflops of FP16 and 2.6 petaflops of FP8 performance. And then we connect these 304 compute units with dense through-silicon vias, or TSVs, that support up to 17 TB per second of bandwidth.

Of course, to take advantage of all of this compute, we connect eight stacks of HBM3, for a total of 192 GB of memory at 5.3 TB/s of bandwidth. That's a lot of stuff on that chip. I have to say, it's truly the most advanced product we've ever built, and it is the most advanced AI accelerator in the industry. Now, let's talk about some of the performance and why it's so great. For generative AI, memory capacity and bandwidth are really important for performance. If you look at MI300X, we made a very conscious decision to add more flexibility, more memory capacity, and more bandwidth, and what that translates to is 2.4x more memory capacity and 1.6x more memory bandwidth than the competition.
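The 2.4x and 1.6x figures above are straightforward ratios against the MI300X's 192 GB and 5.3 TB/s. A quick sanity check, assuming the unnamed comparison part is an 80 GB, 3.35 TB/s competitor accelerator (the talk never states the baseline, so those two numbers are an assumption here):

```python
# Hedged sanity check of the quoted memory ratios.
MI300X_CAPACITY_GB = 192
MI300X_BANDWIDTH_TBS = 5.3

COMPETITOR_CAPACITY_GB = 80      # assumed baseline, not stated in the talk
COMPETITOR_BANDWIDTH_TBS = 3.35  # assumed baseline, not stated in the talk

capacity_ratio = MI300X_CAPACITY_GB / COMPETITOR_CAPACITY_GB
bandwidth_ratio = MI300X_BANDWIDTH_TBS / COMPETITOR_BANDWIDTH_TBS

print(f"capacity: {capacity_ratio:.1f}x, bandwidth: {bandwidth_ratio:.1f}x")
# 192/80 = 2.4x capacity; 5.3/3.35 rounds to 1.6x bandwidth
```

Under those assumed baseline specs, the arithmetic reproduces the quoted 2.4x and 1.6x.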

Now, when you run the lower-precision data types that are widely used in LLMs, the new CDNA 3 compute units and memory density actually enable MI300X to deliver 1.3x more teraflops of FP8 and FP16 performance than the competition. Now, these are good numbers, but what's more important is how things look in real-world inference workloads. So let's start with some of the most common kernels used by the latest AI models. LLMs use attention algorithms to generate precise results. So for something like FlashAttention-2 kernels, MI300X actually delivers up to 1.2x better performance than the competition. And if you look at something like the Llama 2 70B LLM, and we're gonna use this a lot throughout the show, MI300X, again, delivers up to 1.2x more performance.
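For readers unfamiliar with the kernel being benchmarked: FlashAttention-2 computes ordinary scaled dot-product attention, just tiled so the full score matrix never materializes in GPU memory. A plain-Python sketch of the reference math only (nothing like the actual fused GPU kernel):

```python
import math

def attention(Q, K, V):
    """Naive scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    FlashAttention-2 produces exactly this result, but computes it in
    tiles to avoid storing the full len(Q) x len(K) score matrix."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Similarity of this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]   # softmax over the keys
        # Weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

The benchmarked kernels differ from this sketch only in how the computation is scheduled on hardware, which is why kernel-level speedups translate directly into end-to-end LLM latency.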

What this means is, the performance at the kernel level directly translates into faster results when running LLMs on a single MI300X accelerator. We also know, we talked about these models getting so large, so what's really important is how that AI performance scales when you go to the platform level and beyond. Let's take a look at how MI300X scales, starting first with training. You know, training is really hard; people talk about how hard training is. When you look at something like MPT, the 30 billion parameter model from Databricks, it's a pretty good example of something that is used by multiple enterprises for, you know, a lot of different things. You can see here that the training performance for MI300X is actually equal to the competition, and that means it's a very, very competitive training platform today.

But when you turn to the inference performance of MI300X, this is where our performance really shines. We're showing some measured data here on, you know, two widely used models: BLOOM 176B, the world's largest open multi-language AI model, which generates text in 46 languages, and Llama 2 70B, which is also very popular, as I said, with enterprise customers. And what we see in this case is that a single server with eight MI300X accelerators is substantially faster than the competition, 1.4x to 1.6x. So these are pretty big numbers here. And what this performance does is, it just directly translates into a better user experience. You guys have used it. When you ask the model something, you'd like it to come back faster, especially as the responses get more complicated.

So that gives you a view of the performance of MI300X. Now, excited as we are about the performance, we are even more excited about the work we're doing with our partners. So let me turn to our first guest, very, very special. Microsoft is truly a visionary leader in AI. We've been so fortunate to have a deep partnership with Microsoft for many, many years across all aspects of our business, and the work we're doing today in AI is truly taking that partnership to the next level. So here to tell us more about that is Microsoft's Chief Technology Officer, Kevin Scott.... Kevin, it is so great to see you. Thank you so much for being here with us.

Kevin Scott
EVP & CTO, Microsoft

It's a real pleasure to be here with you all today.

Lisa Su
Chair and CEO, AMD

You know, we've done so much work together on, you know, EPYC and Instinct over the years. Can you just, you know, tell our audience a little bit about that partnership?

Kevin Scott
EVP & CTO, Microsoft

Yeah, I think Microsoft and AMD have a very special partnership, and as you mentioned, it has been one that we've enjoyed for a really long time. It started with the PC, it continued then with a bunch of custom silicon work that we've done together over the years on Xbox. It's extended through the work that we've done with you all on EPYC for the high-performance computing workloads that we have in our cloud. And like, the thing that I've been spending a bunch of time with you all on the past couple of years, like actually a little bit longer even, is on AI compute, which I think everybody now understands how important it is to driving progress on, like, this new platform that we're trying to deliver to the world.

Lisa Su
Chair and CEO, AMD

I have to say, we talk pretty often.

Kevin Scott
EVP & CTO, Microsoft

We do.

Lisa Su
Chair and CEO, AMD

You know, Kevin, what I admire so much is just your vision, Satya's vision, about where AI is going in the industry. So can you just... I mean, give us a perspective of where we are on this journey?

Kevin Scott
EVP & CTO, Microsoft

Yeah, so we have been, with a huge amount of intensity over the past five years or so, trying to prepare for the moment that I think we brought the world into over the past year. So, it is almost a year to the day since the launch of ChatGPT, which I think is perhaps most people's first contact with this new wave of generative AI. But the thing that allowed Microsoft and OpenAI to do this was just a deep amount of infrastructure work that we've been investing in for a very long while. And, you know, one of the things that we realized fairly early in our journey is just how important compute was going to be, and just how important it is to think about the sort of full systems optimization.

So the work that we've been doing with you all has been not just about figuring out, like, what the silicon architecture looks like, but that's been a very important thing in making sure that, like, you know, we together are building things that are gonna intercept where the actual platform is going to be-

Lisa Su
Chair and CEO, AMD

Yeah

Kevin Scott
EVP & CTO, Microsoft

... you know, years in advance. But also, just doing all of that software work, that needs to be done to make this thing usable by, by all of the developers of the world.

Lisa Su
Chair and CEO, AMD

You know, I think that's really key. I think, you know, sometimes people don't understand. They think about, like, AI as this year, but, I mean, the truth is, we've been building the foundation for so many years. Kevin, I wanna take this moment to really acknowledge that Microsoft has been so instrumental in our AI journey. I mean, the work we've done over the last several generations, the software work that we're doing, the platform work that we're doing, we're super excited for this moment. Now, I know you guys just had Ignite recently, and Satya previewed some of the stuff you're doing with MI300X, but, you know, can you share that with our audience?

Kevin Scott
EVP & CTO, Microsoft

Yeah, we're super enthusiastic about 300X. Satya announced that the MI300X VMs were going to be available in Azure. Like, it's really, really exciting right now, sort of seeing the bring-up of GPT-4 on MI300X, seeing the performance of Llama 2, like, getting it rolled into production. And the thing that I'm excited to share today is that we will have the MI300X VMs available in preview today.

Lisa Su
Chair and CEO, AMD

I completely agree with you. The thing that's so exciting about AI is every day we discover something new, and-

Kevin Scott
EVP & CTO, Microsoft

Every day

Lisa Su
Chair and CEO, AMD

... and we're learning that together. So, Kevin, we're so honored to be Microsoft's partner in AI. Thank you for all the work that your teams have done, that we've done together, and we look forward to a lot more progress.

Kevin Scott
EVP & CTO, Microsoft

Yeah, likewise.

Lisa Su
Chair and CEO, AMD

Thank you for being here.

Kevin Scott
EVP & CTO, Microsoft

Thank you very much.

Lisa Su
Chair and CEO, AMD

All right, so look, we certainly do learn a tremendous amount every day, and we're always pushing the envelope. Let me talk to you a little bit about how we bring more people into our ecosystem. So, when I talk about the Instinct platform, you have to understand, our goal has really been to enable as many customers as possible to deploy Instinct as fast and as simply as possible. And to do this, we really adopted, you know, industry standards. So we built the Instinct platform based on an industry-standard OCP server design, and I'd actually like to show you what that means, 'cause I don't know if everyone understands. So let's, let's bring her out. Her or him, I don't... Let me show you the most powerful Gen AI computer in the world.

Now, those of you who follow our shows know that I'm usually holding up a chip. But we've shown you the MI300X chip already, so we thought it would be important to show you just what it means to do generative AI at a system level. What you see here is eight MI300X GPUs, and they're connected by our high-performance Infinity Fabric in an OCP-compliant design. Now, what makes that special? So this board actually drops right into any OCP-compliant design, which is the majority of AI systems today, and we did this for a very deliberate reason. We wanna make this as easy as possible for customers to adopt, so you can take out your other board and put in the MI300X Instinct platform.

If you take a look at the specifications, we actually support all of the same connectivity and networking capabilities as our competition: PCIe Gen 5, support for 400Gb Ethernet, that 896 GB/s of total system bandwidth, but all of that with 2.4x more memory and 1.3x more compute per server than the competition. So that's really why we call it the most powerful Gen AI system in the world. Now, I've talked about some of the performance in AI workloads, but I wanna give you just a little bit more color on that. You know, when you look at deploying servers at scale, it's not just about performance. Our customers are also trying to optimize power, space, CapEx, and OpEx, and that's where you see some really nice benefits of our platform.

So when you compare our Instinct platform to the competition, I've already shown you that we deliver comparable training performance and significantly higher inference performance. But in addition, what that memory capacity and bandwidth give us is that customers can either run more models, if you're running multiple models on a given server, or run larger models on that same server. So in the case where you're running multiple different models on a single server, the Instinct platform can run twice as many models for both training and inference as the competition. And on the other side, if what you're doing is trying to run very large models, you'd like to fit them on as few GPUs as possible. And so with the FP16 data format, you can run twice the number of LLMs on a single MI300X server compared to our competition.
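The "fit large models on fewer GPUs" point is mostly weight-memory arithmetic. A rough sketch, assuming 2 bytes per parameter for FP16 and a hypothetical ~20% allowance for KV cache and activations (that overhead factor is an illustration, not an AMD figure; real overhead varies with batch size and sequence length):

```python
import math

def gpus_needed(params_billions, bytes_per_param, hbm_gb=192, overhead=1.2):
    """Rough GPU count to hold a model: weights in GB (billions of params
    times bytes each) plus an assumed ~20% runtime overhead, divided by
    per-GPU HBM capacity and rounded up."""
    weight_gb = params_billions * bytes_per_param
    return math.ceil(weight_gb * overhead / hbm_gb)

# Llama 2 70B in FP16: ~140 GB of weights.
print(gpus_needed(70, 2))              # -> 1, fits in 192 GB of HBM3
print(gpus_needed(70, 2, hbm_gb=80))   # -> 3 on a hypothetical 80 GB part
print(gpus_needed(176, 2))             # BLOOM 176B needs several even at 192 GB
```

Under these assumptions a 70B-parameter FP16 model fits on a single 192 GB MI300X but spreads across three 80 GB accelerators, which is the CapEx argument being made.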

And this directly translates into lower CapEx, and especially if you don't have enough GPUs, this is really, really helpful. So to talk more about MI300X and how we're bringing it to market, let me bring our next guest to the stage. Oracle Cloud and AMD have been engaged for many, many years in bringing great computing solutions to the cloud. Here to tell us more about our work together is Karan Batta, Senior Vice President at Oracle Cloud Infrastructure. Hey, Karan.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

Hi, Lisa.

Lisa Su
Chair and CEO, AMD

Thank you so much for being here. Thank you for your partnership. Can you tell us a little bit about the work that we're doing together?

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

Yeah, thank you. Excited to be here today. You know, Oracle and AMD have been working together for a long, long time, right? Since the inception of OCI back in 2017. And so, you know, we've launched every generation of EPYC as part of our bare metal compute platform, and it's been so successful, with, you know, customers like Red Bull as an example. And we've expanded that across the board for all of our portfolio of PaaS services like Kubernetes, VMware, et cetera. And then we are also collaborating on Pensando DPUs-

Lisa Su
Chair and CEO, AMD

Yes. Yes, absolutely.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

-where we offload a lot of that logic so that customers can get much better performance, flexibility. And then, you know, earlier this year, we also announced that we're partnering with you guys on Exadata-

Lisa Su
Chair and CEO, AMD

Yes

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

... which is a big deal, right? So, we're super excited about our partnership with AMD and then what's to come with 300X.

Lisa Su
Chair and CEO, AMD

Yeah, I mean, look, we really appreciate it. OCI has really been, you know, a leading customer as we talk about how we bring new technology into Oracle Cloud. Now, you're spending a lot of time on AI as well. Tell us a little bit about your strategy for AI and how we fit into that strategy.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

Absolutely. You know, we're spending a lot of time on AI, obviously.

Lisa Su
Chair and CEO, AMD

Everyone is. We are.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

Everybody is. It's the new thing. You know, we're doing that across the stack, from infrastructure all the way up to applications. Oracle's an applications company as well.

Lisa Su
Chair and CEO, AMD

Mm-hmm.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

And so we're doing that across the stack, but from an infrastructure standpoint, we're investing a lot of effort into our core compute stack, our networking stack. We announced clustered networking, and what I'm really excited to announce is that we're gonna be supporting MI300X as part of that bare metal compute stack.

Lisa Su
Chair and CEO, AMD

We are super thrilled about that partnership. We love the fact that you're gonna have 300X. I know your customers and our customers are talking to us every day about it. Tell us a little bit about what customers are saying.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

Yeah, we've been working with a lot of customers. Obviously, we've been collaborating a lot at the engineering level as well with AMD. You know, customers are seeing incredible results already from the previous generation, and so I think that will actually carry through with the 300X. And so much so that we're also excited to actually support MI300X as part of our generative AI service that's gonna be coming up live very soon as well. So we're very, very excited about that. We're working with some of our early customer adopters like, you know, Naveen from Databricks Mosaic.

Lisa Su
Chair and CEO, AMD

Absolutely.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

So, you know, we're very excited about the possibility. We're also very excited about the fact that, you know, the ROCm ecosystem is going to help us continue that effort moving forward. So we're very pumped.

Lisa Su
Chair and CEO, AMD

That's wonderful. Karan, thank you so much. Thank your teams. We're so excited about the work we're doing together and look forward to a lot more.

Karan Batta
SVP of Product, Oracle Cloud Infrastructure, Oracle

Thank you, Lisa.

Lisa Su
Chair and CEO, AMD

Thank you. Now, as important as the hardware is, software actually is what drives adoption, and we have made significant investments in our software capabilities and our overall ecosystem. So let me now welcome to the stage AMD President Victor Peng, to talk about our software and ecosystem progress.

Victor Peng
President, AMD

Thank you, Lisa. Thank you, and good morning, everyone. You know, last June at the AI event in San Francisco, I said that the ROCm software stack was open, proven, and ready. And today, I'm really excited to tell you about the tremendous progress we've made in delivering powerful new features, as well as high performance, on ROCm, and how the ecosystem partners have been significantly expanding support for Instinct GPUs and for our entire product portfolio. Today, there are multiple tens of thousands of AI models that run right out of the box on Instinct, and more developers are running on the MI250, and soon they'll be running on the MI300. So we've expanded deployments of our GPUs, CPUs, FPGAs, and adaptive SoCs in the data center, at the edge, in client, and in embedded applications, really end-to-end.

We're executing on that strategy of building a unified AI software stack so any model, including generative AI, can run seamlessly across our entire product portfolio. Now today, I'm gonna focus on ROCm and the expanded ecosystem support for our Instinct GPUs. We architected ROCm to be modular and open source to enable very broad user accessibility and rapid contribution by the open source and AI communities. Open source and the ecosystem are really integral to our software strategy, and in fact, really, open is integral to our overall strategy. This contrasts with CUDA, which is proprietary and closed. Now, the open source community, everybody knows, moves at the speed of light in deploying and proliferating new algorithms, models, tools, and performance enhancements, and we are definitely seeing the benefits of that in the tremendous ecosystem momentum that we've established.

To further accelerate developer adoption, we recently announced that we're gonna be supporting ROCm on our Radeon GPUs. This makes AI development on AMD GPUs more accessible to more developers, startups, and researchers. So our foot is firmly on the gas pedal with driving the MI300 to volume production and our next ROCm release. So I'm really super excited that we'll be shipping ROCm 6 later this month. I'm really proud of what the team has done with this really big release. ROCm 6 has been optimized for gen AI, particularly large language models; it has powerful new features, library optimizations, and expanded ecosystem support, and it increases performance by large factors. It really delivers for AI developers. ROCm 6 supports FP16, BF16, and the new FP8 data types for higher performance while reducing both memory and bandwidth needs. We've incorporated advanced graph and kernel optimizations and optimized libraries for improved efficiency.

We're shipping state-of-the-art attention algorithms like FlashAttention-2 and PagedAttention, which are critical for performant LLMs and other models. These algorithms and optimizations are complemented by the new release of RCCL, our collective communications library for efficient, very large-scale GPU deployments. So look, the bottom line is ROCm 6 delivers a quantum leap in performance and capability. Now, I'm gonna first walk you through the inference performance gains you'll see with some of these optimizations on ROCm 6. So for instance, running a 70 billion parameter Llama 2 model, PagedAttention and other algorithms speed up token generation by paging attention keys and values, delivering 2.6x higher performance. HIP Graph allows processing to be defined in graphs rather than single operations, and that delivers a 1.4x speed-up. FlashAttention, a widely used kernel for high-performance LLMs, delivers a 1.3x speed-up.
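The core idea behind PagedAttention, as popularized by vLLM, is to manage the KV cache like virtual memory: fixed-size blocks drawn from a shared pool, instead of one large contiguous allocation reserved per sequence. A toy allocator sketching just that bookkeeping (block and pool sizes are arbitrary illustration values, and the real implementation is far more involved):

```python
class PagedKVCache:
    """Toy sketch of PagedAttention-style KV-cache bookkeeping: a sequence
    consumes memory only for the tokens it has actually generated, one
    fixed-size block at a time from a shared pool."""

    def __init__(self, num_blocks=16, block_size=4):
        self.block_size = block_size
        self.free = list(range(num_blocks))   # physical block ids still available
        self.tables = {}                      # seq id -> its list of block ids
        self.lengths = {}                     # seq id -> tokens written so far

    def append_token(self, seq):
        n = self.lengths.get(seq, 0)
        if n % self.block_size == 0:          # current block full: claim a new one
            if not self.free:
                raise MemoryError("KV-cache pool exhausted")
            self.tables.setdefault(seq, []).append(self.free.pop())
        self.lengths[seq] = n + 1

    def free_seq(self, seq):
        # A finished sequence returns its blocks to the pool immediately,
        # so other requests can reuse them.
        self.free.extend(self.tables.pop(seq, []))
        self.lengths.pop(seq, None)
```

Because blocks are only claimed as tokens are generated, far more concurrent sequences fit in the same HBM, which is where the quoted token-generation speedup comes from.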

So all those optimizations together deliver an 8x speed-up on the MI300X with ROCm 6 compared to the MI250 on ROCm 5. That's 8x performance in a single generation. So this is one of those huge benefits we provide to customers with this great performance improvement with the MI300X. So now let's look at it from a competitive perspective. Lisa highlighted the performance of large models running on multiple GPUs. What I'm sharing here is the performance of smaller models running on single GPUs, in this case the 13 billion parameter Llama 2 model. The MI300X and ROCm 6 together deliver 1.2x higher performance than the competition. So this is the reason why our customers and our partners are super excited about creating the next innovations in AI on the MI300X.

So look, we're relentlessly focused on delivering leadership technology and very comprehensive software support for AI developers. To drive that, we've been significantly strengthening our software teams through both organic and inorganic means, and we're expanding our ecosystem engagements. So we recently acquired Nod.ai and Mipsology. Nod brings world-class expertise in open source compilers and runtime technology. They've been instrumental in the MLIR compiler technology, as well as in the communities, and as part of our team, they are significantly strengthening our customer engagements and accelerating our software development plans. Mipsology also strengthens our capabilities, especially in delivering to customers in very AI-rich applications like autonomous vehicles and industrial automation. So now let me turn to the ecosystem.

We announced our partnership with Hugging Face just last June. Today, they have 62,000 models running daily on Instinct platforms, and in addition, we've worked closely with them on getting these LLM optimizations into their Optimum library and toolkit. Our partnership with the PyTorch Foundation has also continued to thrive, with CI/CD pipelines and validation enabling developers to target our platforms directly. We continue to make very significant contributions to all the major frameworks, including upstream support for AMD GPUs in JAX, OpenXLA, CuPy, and even initiatives like DeepSpeed for science. Just yesterday, the AI Alliance was announced with over 50 founding members, including AMD, IBM, Meta, and other companies. I'm really delighted to share some very late-breaking news.

AMD GPUs, including the MI300, will be supported in the standard OpenAI Triton distribution, starting with the 3.0 release. We're really thrilled to be working with Philippe Tillet, who created Triton, and the whole OpenAI team. AI developers using OpenAI Triton are more productive working at a higher level of design abstraction, and they still get really excellent performance. This is great for developers, and it's aligned with our strategy to empower developers with powerful and open software stacks and GPU platforms. This is in contrast to the much greater effort developers would need to invest working at a much lower level of abstraction in order to eke out performance. Now, I've shared a lot with you about the progress we've made on software, but the best indication of the progress we've really made is the people who are using our software and GPUs, and what they're saying.
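To make the "higher level of design abstraction" point concrete: a Triton kernel describes what one program instance does to a block of elements, and a grid of instances covers the whole tensor; the compiler maps those tiles onto GPU hardware. A pure-Python analogue of a blocked vector add, illustrative only and not real Triton code:

```python
BLOCK = 4  # elements handled per program instance (illustrative tile size)

def add_kernel(pid, x, y, out, n):
    """What one Triton-style program instance does: process its own
    BLOCK-wide tile, with a bounds check standing in for Triton's
    load/store mask on the final partial tile."""
    start = pid * BLOCK
    for i in range(start, min(start + BLOCK, n)):
        out[i] = x[i] + y[i]

def vector_add(x, y):
    n = len(x)
    out = [0.0] * n
    grid = (n + BLOCK - 1) // BLOCK   # enough instances to cover all n elements
    for pid in range(grid):           # on a GPU, these instances run in parallel
        add_kernel(pid, x, y, out, n)
    return out
```

The developer reasons about tiles and masks rather than threads, warps, and shared memory, which is the productivity gain being described.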

So it gives me great pleasure to have three AI luminaries and entrepreneurs from Databricks, Essential AI, and Lamini join me on stage. Please give a very warm welcome to Ion Stoica, Ashish Vaswani, and Sharon Zhou. Yeah. Ashish, you sit here. Sharon, Ion, please... Sorry, Sharon. Okay, great. Welcome, Ion, Ashish, and Sharon. Thank you so much for joining us here. Really appreciate it. So I'm gonna ask each of you, first, a bit about the mission of your company, and then to share the innovations you're doing with our GPUs and software, and what the experience has been like. So Ion, let me start with you.

Now, you're not only a founder of Databricks, but you're also on the faculty at UC Berkeley, you know, as Director of the Sky Computing Lab, and you've been involved with Anyscale and many AI, you know, startups. So maybe you could talk about, you know, your engagement with AMD, as well as your experience with the MI200 and the MI300.

Ion Stoica
Co-Founder and Executive Chairman, Databricks

Yeah. Thank you very much. Very glad to be here, and yes, indeed, I've collaborated with AMD wearing multiple hats. I am director of the Sky Computing Lab at Berkeley, which AMD is supporting, and also a founder of Anyscale and Databricks. And in all my work over the years, one thing I really focus on is democratizing access to AI. What this means is improving the scale and performance, and reducing the cost, of running these large AI applications, which covers everything from training and fine-tuning to inference and generative AI applications. Just to give you some examples, we developed vLLM, which is arguably now the most popular open-source inference engine for LLMs. We have developed Ray, another open-source framework, which is used to distribute machine learning workloads.

Ray has been used by OpenAI to train ChatGPT, and more recently, one of the projects at Sky Computing is SkyPilot, which helps you run your machine learning applications and workloads across multiple clouds. And why do you want to do that? Because you want to alleviate the scarcity of GPUs and reduce costs. Now, when it comes to our collaboration, we collaborate on all of these projects, and one thing which was a very pleasant surprise is that it was very easy to include ROCm in our stack. It really runs out of the box from day one. Of course, you need to do more optimization after that, and this is what we are working on.

So, for instance, we have support for the MI250 in Ray, and we are actually collaborating with AMD, like I mentioned, to optimize inference for vLLM running on the MI250 and MI300X. From the point of view of SkyPilot, we're really looking forward to having more and more MI250s and MI300Xs in various clouds, so we have more choices to-

Victor Peng
President, AMD

Well, honestly, thank you so much for the collaboration across all of these projects. So Ashish, why don't you tell us about Essential's mission, and also your experience with ROCm and Instinct?

Ashish Vaswani
Co-Founder and CEO, Essential AI

Thank you. Thanks. Great to be here, Victor. At Essential, we're really excited to push the boundaries of human-machine partnership in enterprises. We're at the beginning stages of what we should be able to do; we'll be able to do 10x or 50x more than what we can just do by ourselves today. So we're extremely excited. And our belief is that what that's gonna take is a full-stack approach: building the models and the serving infrastructure, but more importantly, understanding workflows in enterprises today and giving people the tools to configure these models, to teach these models, for their workflows end to end, so the models learn with feedback.

They get better with feedback, they get smarter, and eventually they're even able to guide non-experts to do tasks they were not able to do before. So we're really excited. We were actually lucky to start benchmarking the MI250s earlier this year. We wanted to solve a couple of hard scientific problems, and we were like: "Hey, are we gonna get long context?" Check. "Are we gonna be able to train larger models, and serve larger models on fewer chips?" Check. And as we saw, the ease of using the software was also very pleasant. And then we saw how things were progressing.

For example, within two months, I believe, FlashAttention, which is a critical component for scaling to longer sequences, appeared. So we were generally very happy, just impressed with the progress, and excited about the chips.

Victor Peng
President, AMD

... Thanks so much, Ashish. And Sharon! So Sharon, Lamini has a very innovative business model, working with enterprises on their private models. Why don't you share the mission and how the experience with AMD has been?

Sharon Zhou
Co-Founder and CEO, Lamini

Yeah, thanks, Victor. So by way of quick background: I'm Sharon, co-founder and CEO of Lamini. Most recently, I was on the computer science faculty at Stanford, leading a research group in generative AI. I did my PhD there as well, under Andrew Ng, and I teach about 250,000 students and professionals online in generative AI. I left Stanford to co-found Lamini on the premise of making the magical, difficult, expensive pieces of building your own language model inside an enterprise extremely accessible and easy to use, so that companies who understand their domain-specific problems best can be the ones who actually wield this technology, and more importantly, fully own it.

In just a few lines of code, you can run an LLM and imbue it with knowledge from millions of documents, which is something like 40,000x more than hitting Claude 2 Pro through that API. So a huge amount of information can be imbued into this technology using our infrastructure, and more importantly, our customers get to fully own their models. For example, NordicTrack, one of our customers, makes all the ellipticals and treadmills in the gym; its parent company is iFit.

They have over 6 million users on their mobile app platform, and they're building an LLM that can actually serve as a personal AI fitness coach, imbued with all the knowledge they have in house on what a good fitness coach is. It turns out it's actually not a professional athlete; they tried to hire Michael Phelps, and it did not work. So they have real knowledge inside their company, and they're imbuing the LLM with it so that we can all have personal fitness trainers. We're very excited to be working with AMD. We've actually had an AMD cloud in production for over a year on MI200s, so MI210s and MI250s, and we're very excited about the MI300s.

I think something that's been super important to us is that with Lamini software, we've actually reached software parity with CUDA on all the things that matter for large language models, including inference and training. I would say even beyond CUDA; we have gone beyond CUDA for the things that matter for our customers. That includes higher memory capacity, which means bigger models, and our customers wanna be able to build bigger and more capable models.

And then a second point, which Lisa touched on earlier today, is that given higher bandwidth, these chips can return results with lower latency, which matters for the user experience of a personal fitness coach, certainly, but for all of our customers as well.
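Sharon's bandwidth point can be made concrete with a back-of-envelope calculation. This is only an illustrative sketch, not anything from the presenters: it assumes a hypothetical 70B-parameter model stored in fp16 (2 bytes per parameter) and uses the roughly 5.3 TB/s peak memory-bandwidth figure for the MI300X to bound how fast the weights can be streamed per generated token.

```python
# Illustrative only: per-token decode latency for an LLM is often bounded
# below by the time needed to stream the model weights from HBM once.

def min_ms_per_token(params: float, bytes_per_param: float, bandwidth_tb_s: float) -> float:
    """Lower bound on per-token latency: weight bytes / memory bandwidth."""
    model_bytes = params * bytes_per_param
    return model_bytes / (bandwidth_tb_s * 1e12) * 1e3  # seconds -> ms

# Hypothetical 70B-parameter model in fp16 (~140 GB of weights) on one GPU
# with ~5.3 TB/s of peak memory bandwidth:
print(f"{min_ms_per_token(70e9, 2, 5.3):.1f} ms/token minimum")
```

Real latency also depends on KV-cache reads, kernel efficiency, and batching, so this is only a floor, but it shows why higher memory bandwidth translates directly into lower per-token latency.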

Victor Peng
President, AMD

Super exciting. That's great. So Ion, back to you. Changing this up a little bit: you heard that several key components of ROCm are open source, and we did that for rapid adoption and to get more enhancements from the community, both open source and AI. So what do you think about this strategy, and how might this approach help some of the companies that you've founded?

Ion Stoica
Co-Founder and Executive Chairman, Databricks

Yeah. So obviously, given my history, I really love open source. I love the open-source ecosystem, and over time we try to make our own contributions to it. And I think one thing to note is that many of the Gen AI tools today are open source; we are talking here about Hugging Face, PyTorch, Triton, like I mentioned, vLLM, Ray, and many others. And many of these tools can run today on AMD and the ROCm stack. This makes ROCm another key component of the open-source ecosystem, and I think this is great.

In time, and I'm sure actually quite fast, the community will take advantage of the unique capabilities of AMD's MI250 and MI300X to innovate and improve the performance of all these tools running at the higher levels of the Gen AI stack.

Victor Peng
President, AMD

Great, that's our purpose and aim, so I'm glad to hear that. Now I'm gonna jump out of order over to Sharon.

Sharon Zhou
Co-Founder and CEO, Lamini

Oh.

Victor Peng
President, AMD

So, Sharon, how do you think AI workloads are evolving, and what role do you think AMD Instinct GPUs, since you have great experience with them, and ROCm can play in that future of AI development?

Sharon Zhou
Co-Founder and CEO, Lamini

Okay, so maybe a bit of a spicy take. I think that GOFAI, good old-fashioned AI, is not the future of AI. I really do think it's LLMs, or some variant of these models, that can actually soak up all the general knowledge that is missing from traditional algorithms, and we've seen this across so many different algorithms in our customers already. Even those at the bleeding edge of recommendation systems, forecasting systems, and classification are using this because of the general knowledge it's able to learn. So I think that's the future. It's maybe better known as Software 2.0, a term coined by my friend Andrej Karpathy.

And I really do think Software 2.0, which means hitting these models time and time again instead of writing really extensive software inside a company, will be supporting Enterprises 2.0, meaning the enterprises of the next generation. And I think the AMD Instinct MI300 GPUs are critical to ubiquitously supporting this Software 2.0 of the future. We absolutely need compute to run these models efficiently, to run more of these models, and to run larger models with greater capabilities. So overall, I'm very excited not only about the direction of these AI workloads, but also about the direction AMD is taking in doubling down on the MI300s, which, of course, can take on larger and more capable models for us.

Victor Peng
President, AMD

Awesome. So, Ashish, we'll finish up with you, and I'll give you the same kind of question. What do you think about the future of AI workloads, what role can our GPUs and ROCm play, and how are you driving things at Essential?

Ashish Vaswani
Co-Founder and CEO, Essential AI

Yep. So I think that we have to improve reasoning and planning to solve these complex tasks. Take an analyst: if they actually want to absorb an earnings call and figure out how they should revise their opinion on whether to invest in a company, or what recommendations they should provide, it's gonna take reasoning over multiple steps.

It's gonna take ingesting a large document, being able to extract information from it, applying their models, actually asking for information when they don't have it, getting world knowledge, and also doing some outside reasoning and planning. So when I look at the MI300, with its very large HBM and high memory bandwidth, I think of what's going to be unlocked, which capabilities are going to be improved, and what new capabilities will become available.

I mean, even with what we have today, just imagine a world where you can process long documents, or where you can make these models much more accurate by adding more examples to the prompt. Imagine complete user sessions that you can maintain in model state, and how that would improve the end-to-end user experience, right?

You know, we're moving to a kind of architecture where what typically used to happen at inference, a lot of search, is now gonna go into training, where the models explore thousands of solutions and eventually pick the one that's actually the best for the goal. And definitely the large HBM and high bandwidth are gonna be important not only for serving large models with low latency for a better end-to-end experience, but also for some of these new techniques that we're just exploring, which are gonna improve the capabilities of these models.

So very excited about the new chip and what it's gonna unlock. Yeah.

Victor Peng
President, AMD

Great. Thank you, Ashish. Ion, Ashish, Sharon, this has been really terrific. Thank you so much for all the great insights-

Ashish Vaswani
Co-Founder and CEO, Essential AI

Thank you

Victor Peng
President, AMD

... you have provided us.

Lisa Su
Chair and CEO, AMD

Thank you.

Ashish Vaswani
Co-Founder and CEO, Essential AI

Thank you for joining us today. Thank you. Thank you, Victor.

Victor Peng
President, AMD

It's just so exciting to hear what companies like Databricks, Essential AI, and Lamini are achieving with our GPUs, and I'm super thrilled that their experience with our software has been so smooth, really a delight. So you could tell, they see absolutely no barriers, right? And they're extremely motivated to innovate on AMD platforms. Okay, to sum it up: what we've delivered over the past six months empowers developers to execute their mission and realize their vision. We'll be shipping ROCm 6 very soon. It's optimized for LLMs, and together with the MI300X, it's gonna deliver an 8x gen-on-gen performance improvement and higher inference performance than the competition.

We have 62,000 models running on Instinct today, and more models will be running on the MI300 very soon. We have very strong momentum, as you can see, in the ecosystem, adding OpenAI Triton to our extensive list of industry-standard frameworks, models, runtimes, and libraries. And you heard it from the panel, right? Our tools are proven and easy to use. Innovators are advancing the state of the art of AI on AMD GPUs today. ROCm 6 and the MI300X will drive an inflection point in developer adoption; I'm confident of that. We're empowering innovators to realize the profound benefits of pervasive AI faster on AMD. Thank you. And now I'd like to invite Lisa back on the stage.

Lisa Su
Chair and CEO, AMD

Thank you, Victor. And weren't those innovators great? I mean, you have to love the energy and all of the thought there. So look, as you can see, the team has really made great progress with ROCm and our overall software ecosystem. Now, as I said, we really want broad adoption for MI300X, so let's go through and talk to some additional customers and partners who are early adopters of MI300X. Our next guest is a partner really at the forefront of Gen AI innovation, working across models, software, and hardware. Please welcome Ajit Mathews of Meta to the stage.

Ajit Mathews
Senior Director Engineering, Meta

Good to be here.

Lisa Su
Chair and CEO, AMD

Hello, Ajit. It's so nice to have you here. We're incredibly proud of our partnership; Meta and AMD have been doing so much work together. Can you tell us a little bit about Meta's vision in AI? Because it's really broad and key for the industry.

Ajit Mathews
Senior Director Engineering, Meta

Absolutely. Thanks, Lisa. We are excited to partner with you and others, and to innovate together to bring generative AI to people around the world at scale. Generative AI is enabling new forms of connection for people around the world, giving them the tools to be more creative, expressive, and productive. We are investing for the future by building new experiences for people across our services and advancing open technologies and research for the industry. We recently launched AI stickers, image editing, Meta AI, which is our AI assistant that spans our family of apps and devices, and lots of AIs for people to interact with in our messaging platforms. In July, we opened access to our Llama 2 family of models, and as you have seen, we have been blown away by the reception from the community, who have built some truly amazing applications on top of them.

We believe that an open approach leads to better and safer technology in the long run, as we have seen from our involvement in the PyTorch Foundation, the Open Compute Project, and dozens of previous AI model and dataset releases. We're excited to have partnered with the industry on our generative AI work, including with AMD. We have a shared vision to create new opportunities for innovation in both hardware and software to improve the performance and efficiency of AI solutions.

Lisa Su
Chair and CEO, AMD

You know, that's so great, Ajit. We completely agree with the vision, and we agree with the open ecosystem really being the path to get all of the innovation from all the smart folks in the industry. Now, we've collaborated a lot on the product front as well, both EPYC and Instinct. Can you talk a little bit about that work?

Ajit Mathews
Senior Director Engineering, Meta

Yeah, absolutely. We have been working together on EPYC CPUs since 2019, and most recently deployed Genoa- and Bergamo-based servers at scale across Meta's infrastructure, where they now serve many diverse workloads. But our partnership is much broader than EPYC CPUs: we have been working together on Instinct GPUs starting with the MI100 in 2020, benchmarking ROCm and working on improvements for its support in PyTorch across each generation of AMD Instinct GPU, leading up to the MI300X now. Over the years, ROCm has evolved into a competitive software platform thanks to optimizations and ecosystem growth. AMD is a founding member of the PyTorch Foundation and has made a significant commitment to PyTorch, providing day-zero support for PyTorch 2.0 with ROCm: torch.compile, torch.export, all of those things are great.

We have seen tremendous progress on both Instinct GPU performance and ROCm maturity, and we are excited to see ecosystem support grow beyond PyTorch 2.0: OpenAI Triton, with today's announcement of AMD support in the standard distribution, that's great. FlashAttention-2 is great, Hugging Face is great, and so are other industry frameworks. All of these are great partnerships.

Lisa Su
Chair and CEO, AMD

You know, it really means a lot to hear you say that, Ajit. We also view it as an incredible partnership; the teams work super closely together, and that's what you need to do to drive innovation. The work with the PyTorch Foundation is foundational for AMD, and really for the ecosystem as well. But our partnership is very exciting right now on GPUs, so can you talk a little bit about your MI300X plans?

Ajit Mathews
Senior Director Engineering, Meta

Oh, here we go. We are excited to be expanding our partnership to include Instinct MI300X GPUs in our data centers for AI inference workloads. Thank you, Lisa. Just to give you a little background, the MI300X leverages the OCP accelerator module standard and platform, which has helped us adopt it in record time. In fact, the MI300X is trending to be one of the fastest design-to-deployment solutions in Meta history. We have also had a great experience with ROCm and the performance it's able to deliver with the MI300X. The optimizations and ecosystem growth over the years have made ROCm a competitive software platform.

As model parameters increase and the Llama family of models continues to grow in size and power, which it will, the MI300X, with its 192 GB of memory and higher memory bandwidth, meets the expanding requirements for large language model inference. We are really pleased with the ROCm optimizations AMD has done for the Llama 2 family of models on the MI300X. We are seeing great, promising performance numbers, which we believe will benefit the industry. So to summarize, we are thrilled with our partnership and excited about the capabilities offered by the MI300X and the ROCm platform as we start to scale their use in our infrastructure for production workloads.
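Ajit's point about 192 GB of memory and growing model sizes can be sanity-checked with simple arithmetic. A minimal sketch, counting model weights only (KV cache and activations need additional headroom): the parameter counts below are the public Llama 2 sizes, and the fp16/int8 byte widths are standard assumptions on my part, not figures from the talk.

```python
# Rough sizing: which model weights fit in a single 192 GB accelerator,
# counting weights only (KV cache and activations need extra headroom).
HBM_GB = 192  # MI300X memory capacity mentioned above

def weight_gb(params_billion: float, bytes_per_param: int) -> float:
    """Model weight footprint in GB: billions of params times bytes each."""
    return params_billion * bytes_per_param

for params in (7, 13, 70):
    for dtype, nbytes in (("fp16", 2), ("int8", 1)):
        gb = weight_gb(params, nbytes)
        verdict = "fits" if gb < HBM_GB else "does not fit"
        print(f"Llama 2 {params}B, {dtype}: {gb:.0f} GB of weights, {verdict}")
```

Even the 70B model in fp16 is about 140 GB of weights, under the 192 GB capacity, which is one reason the larger memory matters for single-GPU LLM inference.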

Lisa Su
Chair and CEO, AMD

That is absolutely fantastic, Ajit.

Ajit Mathews
Senior Director Engineering, Meta

Thank you, Lisa.

Lisa Su
Chair and CEO, AMD

Thank you so much. We are thrilled with the partnership, and we look forward to seeing lots of MI300Xs in your infrastructure. So thank you for being here.

Ajit Mathews
Senior Director Engineering, Meta

It's good. Appreciate it.

Lisa Su
Chair and CEO, AMD

So, super exciting. We said cloud is really where a lot of the infrastructure is being deployed, but enterprise is also super important. When you think about the enterprise right now, many enterprises are working out their strategy: they want to deploy AI broadly across both cloud and on-prem, and we're working very closely with our OEM partners to bring integrated enterprise AI solutions to the market. To talk more about this, I'd like to invite one of our closest partners to the stage, Arthur Lewis, President of Dell Technologies Infrastructure Solutions Group. Hey, welcome, Arthur. I'm so glad you could join us for this event. Dell and AMD have had such a strong history of partnership.

I actually also think, Arthur, you have a very unique perspective of what's happening in the enterprise, just given your purview. Can we just start with giving the audience a little bit of a view of what's happening in enterprise AI?

Arthur Lewis
President, Dell Technologies

Yeah, Lisa, thank you for having me today. We are at an inflection point with artificial intelligence. Traditional machine learning, and now generative AI, is a catalyst for much greater data utilization, making the value of data tangible and therefore quantifiable. Data, as we all know, is growing exponentially: 100 ZB of data was generated last year, more than doubling over the last three years, and IDC projects that data will double again by 2026. It is clear that data is becoming the world's most valuable asset, and this data has gravity. 83% of the world's data resides on-prem, and much of the new data will be generated at the edge. Yet customers are dealing with years of rapid data growth, multiple copies on-prem and across clouds, and proliferating data sources, formats, and tools.

These challenges, if not overcome, will prevent customers from realizing the full potential of artificial intelligence and maximizing real business outcomes. Today, customers are faced with two suboptimal choices: one, stitch together a complex web of technologies and tools and manage it themselves; or two, replicate their entire data estate in the public cloud. Customers need and deserve a better solution. Our job is to bring artificial intelligence to the data.

Lisa Su
Chair and CEO, AMD

You know, that's great perspective, Arthur, and that 83% figure on where the data resides is something that sticks in my mind. Now let's move to the technology a little bit. We've been partnering together to bring some great solutions to the market; tell us more about what you have planned from a tech standpoint.

Arthur Lewis
President, Dell Technologies

Well, today's an exciting day. We are announcing a much-anticipated update to our PowerEdge XE9680 family, the fastest-growing product in Dell ISG history, with the addition of AMD's Instinct MI300X accelerator for artificial intelligence. Effective today, we are offering a new configuration of 8 MI300X accelerators, providing 1.5 TB of coherent HBM3 memory and delivering 5.3 TB/s of memory bandwidth. This is an unprecedented level of performance in the industry and will allow customers to consolidate large language model inferencing onto fewer servers while providing for training at scale, all while reducing complexity, cost, and data center footprint. We are also leveraging AMD's Infinity platform, which provides a unified fabric for connecting multiple GPUs within and across servers, delivering near-linear scaling and low latency for distributed AI.
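The server-level numbers Arthur quotes can be checked with quick arithmetic. A small sketch under stated assumptions: 192 GB of HBM3 and roughly 5.3 TB/s of peak memory bandwidth per MI300X (the per-accelerator figures mentioned elsewhere in the event), times eight accelerators per server; the aggregate bandwidth line is my own multiplication, not a quoted spec.

```python
# Quick arithmetic check of the quoted 8x MI300X server configuration.
GPUS_PER_SERVER = 8
HBM_GB_PER_GPU = 192      # per-MI300X HBM3 capacity
BW_TBS_PER_GPU = 5.3      # per-MI300X peak memory bandwidth (TB/s)

total_hbm_gb = GPUS_PER_SERVER * HBM_GB_PER_GPU   # 1536 GB, i.e. 1.5 TB
aggregate_bw = GPUS_PER_SERVER * BW_TBS_PER_GPU   # aggregate TB/s across GPUs

print(f"total HBM3: {total_hbm_gb} GB (~{total_hbm_gb / 1024:.1f} TB)")
print(f"aggregate peak bandwidth: {aggregate_bw:.1f} TB/s")
```

Eight 192 GB accelerators give 1536 GB, matching the roughly 1.5 TB of coherent HBM3 in the announcement.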

And there's more. Through our collaboration with AMD on software and open-source frameworks, which, Lisa, you talked a lot about today, including PyTorch and TensorFlow, we can bring seamless services to customers and an out-of-the-box LLM experience. We talked about making it simple; this makes it incredibly simple. We've also optimized the entire stack with Dell storage, specifically PowerScale and ObjectScale, providing ultra-low-latency Ethernet fabrics designed specifically to deliver the best performance and maximum throughput for generative AI training and inference. This is an incredibly exciting step forward, and again, effective today, Lisa, we're open for business, we're ready to quote, and we're taking orders.

Lisa Su
Chair and CEO, AMD

I like the sound of that! Look, it's so great to see how this all comes together. Our teams have been working so closely together over the last few years, and definitely over the last year. There's a lot of co-innovation and differentiation in these solutions, though, so tell us a little bit more about that.

Arthur Lewis
President, Dell Technologies

Well, our biggest differentiator is really the breadth of our technology portfolio at Dell Technologies. Products like PowerScale, our OneFS file system for unstructured data storage, have been helping customers in industries like financial services, manufacturing, and life sciences solve the world's most challenging problems for decades, as the complexity of their workflows and the scale of their data estates increase. And with AMD, we are bringing these components together with open networking products and AI fabric solutions, taking the guesswork out of building tailored Gen AI solutions for customers of all sizes, again, making it simple. We have both partnered with Hugging Face to ensure transformers and LLMs for generative AI don't just work on our combined solutions, but are optimized for AMD's accelerators and easy to configure and size for workloads with our products. And in addition to that, with Dell Validated Designs, we have a comprehensive and growing array of services and offerings that can be tailored to meet the needs of customers looking for anything from a complimentary Gen AI strategy consultation all the way up to a fully managed solution for generative AI.

Lisa Su
Chair and CEO, AMD

That's fantastic, Arthur. Great set of solutions. Love the partnership, and love what we can do for our enterprise customers together. Thank you so much for being here.

Arthur Lewis
President, Dell Technologies

Thank you for having me, Lisa.

Lisa Su
Chair and CEO, AMD

Our next guest is another great friend. Supermicro and AMD have been working together to bring leadership computing solutions to the market for many years, based on AMD EPYC processors, as well as Instinct accelerators. Here to tell us more about that, please join me in welcoming CEO Charles Liang to the stage.

Charles Liang
CEO, Supermicro

... congratulations.

Lisa Su
Chair and CEO, AMD

Thank you so much. Hello, Charles.

Charles Liang
CEO, Supermicro

Well, successful announcement.

Lisa Su
Chair and CEO, AMD

Yeah, thank you so much for being here. Supermicro is really well known for building highly optimized systems for lots of workloads, and we've done so much together. Can you share a little bit about how you're approaching Gen AI?

Charles Liang
CEO, Supermicro

Thank you. Because our building block solutions are based on a modularized design, Supermicro is able to design products quicker than others, deliver products to customers quicker, better leverage inventory, and provide better service. And thank you for our close relationship and for all of your help; that's why we are able to get products to market as soon as possible.

Lisa Su
Chair and CEO, AMD

Well, I really appreciate that. Our teams also work very closely together, and, you know, everybody is calling us for AI solutions these days. You've built a lot of AI infrastructure; what are you seeing in the market today?

Charles Liang
CEO, Supermicro

Oh, the market continues growing very fast. The only limitation is-

Lisa Su
Chair and CEO, AMD

Very fast, right?

Charles Liang
CEO, Supermicro

Very fast. Maybe more than very fast. So all we need is just more chips, more solutions.

Lisa Su
Chair and CEO, AMD

I know.

Charles Liang
CEO, Supermicro

So today, across the USA, the Netherlands, Taiwan, and Malaysia, we have more than 4,000 racks per month of capacity. And customers are facing not-enough-power and not-enough-space problems. So our rack-scale building block solutions, optimized for hybrid air and free-air cooling and for liquid cooling, can help customers save up to 30%-40% on energy. That allows customers to install more systems within a fixed power budget, or run the same systems at the same power with lower energy cost. All of this comes together in our rack-scale building block solution: we install a whole rack, including CPU, GPU, storage, switches, firmware, management software, and security functions.

When we ship to a customer, the customer simply plugs in two cables, the power cable and the data cable, and it's ready to run, ready to come online. For liquid-cooling customers, of course, they also need the water tubing. That makes it easy for customers to come online as soon as chips are available.

Lisa Su
Chair and CEO, AMD

Yeah, that's fantastic. Thank you, Charles. Now, let's talk a little bit about the MI300X. What do you have planned for the MI300?

Charles Liang
CEO, Supermicro

Okay, it's a big product line. We have products based on the MI300X: an 8U optimized for air cooling, and a 4U optimized for liquid cooling. For air cooling, we support up to 40 or 50 kW per rack; for liquid cooling, up to 80 or 100 kW. And it's all rack-scale plug and play, so when customers need it, once we have chips, we can ship to customers quickly.

Lisa Su
Chair and CEO, AMD

That sounds wonderful. Well, look, we appreciate all the partnership, Charles, and we will definitely see a lot of opportunity to collaborate together on generative AI. So thank you so much.

Charles Liang
CEO, Supermicro

Thank you so much. Thank you.

Lisa Su
Chair and CEO, AMD

Okay, now let's turn to our next guest. Lenovo and AMD have a broad partnership as well, spanning from data center to workstations and PCs, and now to AI. Here to tell us about this special partnership, please welcome to the stage Kirk Skaugen, EVP and President of the Infrastructure Solutions Group at Lenovo. Hello, Kirk. Thank you so much for being here; we truly appreciate the partnership with Lenovo. You have a great perspective as well: tell us about your view of AI and what's going on in the market.

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

Sure. Well, AI is not new for Lenovo; we've been talking about and innovating around AI for many years. We just had a great Supercomputing conference, where we were the number one supercomputer provider on the TOP500, and we're proud that IDC just ranked us number three in AI server infrastructure in the world as well. So it's not new to us. But you were at Tech World-

Lisa Su
Chair and CEO, AMD

I was. Yes, yes.

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

... so thank you for joining us in Austin. We're trying to help shape the future of AI from the pocket, to the edge, to the cloud, and we've had this concept of AI for all. So what does that mean? Pocket meaning Motorola smartphones and AI devices, and then all the way to the cloud with our ODM+ model. Our collaboration with our customers is really about accelerating AI adoption, and we recently announced another $1 billion on top of the original $1.2 billion we announced a few years ago to deliver AI solutions to businesses of all sizes, from the smallest business to the largest cloud. So we believe that generative AI will ultimately be a hybrid approach, and fundamentally, we want to bring AI to the data.

I think one of the most exciting things for me is, like Arthur said, we'll see data doubling in the world over the next few years. 75% of that compute is moving to the edge, and today we're only computing on 2% of the data, so we're throwing away 98%. More data is gonna be created in the next few years than in the entire history of the world combined, and together, we're bringing AI to the edge with the ThinkEdge SE455 that we recently announced. We think there are three views of generative AI: public AI, private AI, and personal AI, and the key for us is protecting privacy and addressing data security.

So public AI, where you'd use, you know, obviously public data; enterprise AI, where you'd use only your enterprise data within your firewall; and then personal AI, on things like an AI PC, things that you choose to have only on your device, whether that's a phone, a tablet, or a PC.

Lisa Su
Chair and CEO, AMD

Yeah, no, no, it's a, it's a very comprehensive vision, and, you know, we, we see it very much the same way. Now, you talked a lot about your AI strategy at Tech World, and, you know, you had, you know, some key pillars there. Do you wanna just tell us a little bit more about that?

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

Yeah, so I think there are three fundamental pillars of our AI vision and strategy. First, we have an AI product roadmap that I think is second to none, from a rich smart device portfolio, and we'll talk about AI PCs probably more another day, smartphones and tablets. Then we have a huge array now of over 70 AI-ready server and storage infrastructure products, and we've recently launched a whole set of solutions and services around that as well. So, more than 70 products, and we'll talk about the new ones we're announcing today, which are very exciting. The second thing is, we have something called an AI Innovators Program. You know, what's really daunting to people is there's over 16,000 AI startups out there. So if you have an IT department of a few dozen people, how do you even start?

So we've gone and scoured the earth. We've found 65 ISVs and 165 solutions, which we've optimized on top of Lenovo infrastructure for some of the key verticals, and we're delivering simplified AI to the customer base. And then at Tech World, we launched a comprehensive set of professional services. Now, at Lenovo, more than 40% of our revenue is non-PC, so we're transforming into data center and services. So we're doing everything in AI, from basic customer discovery of what you can do if you're a stadium, what are the best-in-class stadium solutions, if you're a fast-food chain, if you're a supermarket, all the way to AI adoption, and then even, from a sustainability perspective, things like asset recovery services to make sure you have a sustainable AI journey as well.

Lisa Su
Chair and CEO, AMD

Yeah, no, it makes a lot of sense. And, you know, Gen AI and large language models are, like, you know, sort of the defining moment for us right now. You're spending a lot of time with customers. What are you hearing from them, and what are their challenges?

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

Yeah, so I think the key message is that customers need help in simplifying their AI journey. I mean, there's so much coming at them. So our investments in that $2 billion I talked about are really expanding our AI-ready portfolio to deliver fully integrated systems that bring AI-powered computing to everywhere data is created, especially the edge, and helping businesses easily and efficiently deploy generative AI applications. We're also hearing that customers want choice: choice in systems, choice in software, choice in services, and definitely large language models and model training are creating a lot of buzz. But over time, I think we all know inference is gonna become the dominant AI workload as data flows from these billions of connected devices at the edge.

So generative AI, from our perspective, like you said, I think, in your opening comments, needs high-performance compute, large and fast memory, and a software stack to support the leading AI ecosystem solution. So with that, I believe Lenovo and AMD are really uniquely positioned to take advantage of these trends.

Lisa Su
Chair and CEO, AMD

Yeah, absolutely. And, you know, our teams are doing a lot of work together, and working closely on MI300X. Tell us more about your plans.

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

Well, as a PC company and as a data center company, we have a long, proven track record of bringing things like Ryzen AI to our ThinkPads, and we're committed to being time to market on large language models and on inferencing. We're working with AMD to develop our next-gen AI product roadmap and our solution portfolios. So we're incredibly excited today about the addition of the MI300X to the Lenovo ThinkSystem platform.

Lisa Su
Chair and CEO, AMD

Thank you.

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

It's gonna be very exciting.

Lisa Su
Chair and CEO, AMD

Thank you.

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

So we're committed to being time to market with a dual-EPYC, 8-GPU MI300X system, and we have a lot of customer interest in that. So bottom line, from edge to cloud, you know, we are incredibly excited about what's ahead for us. We're gonna have all of this available as a service through Lenovo TruScale as well, so you only have to pay for what you need. So as we move to an as-a-service model, everything we talked about today will be available through that as well. So thank you very much, and we look forward to continuing the collaboration.

Lisa Su
Chair and CEO, AMD

Absolutely. Kirk, thank you so much. Thanks for the partnership.

Kirk Skaugen
EVP and President of Infrastructure Solutions Group, Lenovo

All right, thank you.

Lisa Su
Chair and CEO, AMD

So that's great. Big thank you to Kirk, and Arthur, and Charles for all the work that we're doing together to really bring MI300X to our customers. It really does take an entire ecosystem. We're very proud of actually the broad OEM and ODM ecosystem that we have brought together to bring a wide range of MI300X solutions to market in 2024. And, in addition to the OEM and ODM ecosystem, we're also significantly expanding our work with some of these specialized AI cloud partners.

So I'm happy to say today that all of these partners are adding MI300X to their portfolio, and what's important about this is it will actually make it easier for developers and AI startups to get access to MI300X GPUs as soon as possible with a proven set of providers who each have, you know, their own unique value and capabilities. So that tells you a little bit about the ecosystem that we're putting together for MI300X.... Now, we've given you a lot of information already, but, you know, what is very, very important is not just the hardware and the software and all of our customer partnerships, but it's also the rest of the system partnerships. So now let me welcome to the stage Forrest Norrod to talk more about our AI networking and high-performance computing solutions.

Forrest Norrod
EVP and General Manager, AMD

Thank you, Lisa. Good morning. So far, we've talked about the amazing GPU and open software ecosystem that AMD is building to power generative AI systems. But there's a third element that's equally important to the performance and scalability of these large AI deployments, and that's networking. The compute required to train the most advanced models has increased by a factor of 50 billion over the past decade. While GPU performance has also increased, what that performance demand means is that we need many GPUs in order to deliver the required total performance. Leading AI clusters are now tens of thousands of GPUs, and that's only going to increase. So the first way we've scaled to meet that demand is within the server. A typical server has perhaps a couple of high-performance x86 CPUs and perhaps 8 GPUs. You've seen that today.

These are interconnected with a high-performance, low-latency, non-blocking local fabric. In the case of NVIDIA, that's NVLink. For AMD, that's Infinity Fabric. Both have high signaling rates, low latency, both are coherent. Both have demonstrated the ability to offer near-linear scaling performance as you increase the number of GPUs, and both have been proprietary, effectively only supported by the companies that created them. I'm pleased to say that today AMD is changing that. We are extending access to the Infinity Fabric ecosystem to strategic partners and innovative companies across the industry. Doing so allows others to innovate around the AMD GPU ecosystem to the benefit of customers and the entire industry. You'll hear more about this from one of our partners in a few minutes, and much more on this initiative next year. But beyond the node, we still need to connect and scale to much larger numbers.

We need fabrics to connect the servers to one another, welding them into one resource. Now, there are usually two networks connected to each of these GPU servers: a traditional Ethernet network used to connect the server to the rest of the traditional data center infrastructure, and, more importantly, a back-end network to interconnect the GPUs, allowing them to share parameters, results, and activations, and coordinate in the overall training and inference tasks. When we're connecting thousands of nodes like we do in AI systems, the network is critical to overall performance. It has to deliver fast switching rates and very low latency. It must be efficiently scalable so that congestion problems don't limit performance. At AMD, we believe it must also be open, open to allow innovation. Today, there are two options for the back-end fabric, InfiniBand or Ethernet. At AMD, we believe Ethernet is the right answer.

It's a high-performance technology with leading signaling rates. It has extensions such as RoCE and RDMA to efficiently move data between nodes, a set of innovations developed for leading supercomputers over the years. It's scalable, offering the highest radix switching technology from leading vendors such as Broadcom, Cisco, and Marvell, and we've seen tremendous innovation recently in advanced congestion control to deal with the issues of scale effectively. And most of all, it's open. Open means companies can extend Ethernet, innovating on top as needed to solve new problems. We've seen that from Hewlett Packard Enterprise with their Slingshot technology, which powers the network at the heart of Frontier, the world's fastest supercomputer, enabling it to achieve exascale performance. And we've seen Google and AWS, who run some of the largest clusters in the world, develop their own Ethernet extensions.

Finally, and maybe most importantly, we've seen the industry come together to create the Ultra Ethernet Consortium and standard, where leaders across the field have united to drive the future of Ethernet and ensure it's the best high-performance interconnect for AI and HPC. We're proud to welcome to the stage today some of those networking leaders: Andy Bechtolsheim from Arista, Jas Tremblay from Broadcom, and Jonathan Davidson from Cisco.

Jonathan Davidson
EVP and General Manager, Cisco

All right, Andy.

Forrest Norrod
EVP and General Manager, AMD

Welcome, gentlemen. It's not often that we have such a panel of Ethernet experts on the stage. But before we jump right into Ethernet, perhaps we can talk a little bit about the work of enabling an ecosystem for AI solutions, and, you know, what that looks like, and why it is so important to have an open approach. And maybe Jonathan, you could start.

Jonathan Davidson
EVP and General Manager, Cisco

Sure, absolutely. Well, first of all, congratulations on all the announcements today. You know, we look at how Ethernet is so critical because I remember back in the day doing testing on 10 Mbps Ethernet interoperability. We're now at 400 Gbps, 800 Gbps, we have line of sight to 1.6 Tbps. It is absolutely ubiquitous across the industry, and it's also interoperable. It's a beautiful thing. So that open standard is really important for us to be able to make this successful.

Forrest Norrod
EVP and General Manager, AMD

Absolutely, and Jas, so your thoughts as well on open-

Jas Tremblay
VP and General Manager, Broadcom

Oh, I 100% agree. And Forrest, you and I share a vision of the power of the data center ecosystem. You think about a data center, you've got thousands of companies coming together to work as one, and this is really enabled by open standards and a code of conduct that we shall interop. We're gonna make things work together across companies, in some cases across competitors, and I'm especially excited about the work that, you know, you and I have been doing on the Infinity Fabric XGMI.

Forrest Norrod
EVP and General Manager, AMD

Right.

Jas Tremblay
VP and General Manager, Broadcom

We want to let the industry know that the next generation of Broadcom PCIe switches, which are used as the internal fabric inside AI servers, are going to support Infinity Fabric XGMI, and we'll be sharing more details around that over the next few quarters. But I think this is... It's important that we offer choices and options to customers, and that we come together and jointly innovate.

Forrest Norrod
EVP and General Manager, AMD

I completely agree. And Andy, you've long been a proponent of open.

Andy Bechtolsheim
Co-Founder and Chief Architect, Arista Networks

Yeah, well, open standards have been, you know, the driving force for a lot of innovation throughout the industry's history. But nowhere is this more true than in the case of Ethernet, where, you know, the incredible progress we have seen for the last 40 years would not have happened without the contributions of many, many ecosystem participants, including the companies that are represented here on the stage.

Forrest Norrod
EVP and General Manager, AMD

Absolutely. Well, okay, so since this is a panel of Ethernet luminaries, let's talk about Ethernet in particular. You know, what are the advantages of Ethernet for AI? What are the advantages of Ethernet in general? And, you know, how are customers using it today? And we'll talk about the future in a minute, but let's reflect on current state. Maybe Andy, you can start out.

Andy Bechtolsheim
Co-Founder and Chief Architect, Arista Networks

Yeah, so Ethernet is, at least to me, the clear choice for AI fabrics, and for a very basic reason: it doesn't have a scalability limit. It can truly support not just 10,000s of nodes today, but 100,000s, perhaps even 1 million nodes in the future, and there is no other network technology that has that attribute. And, you know, without that scalability, you're just boxing yourself in.

Forrest Norrod
EVP and General Manager, AMD

Yeah, very true. And Jonathan, I know you guys have been working quite a bit on AI networking systems as well. Maybe you could-

Jonathan Davidson
EVP and General Manager, Cisco

Absolutely

Forrest Norrod
EVP and General Manager, AMD

... amplify.

Jonathan Davidson
EVP and General Manager, Cisco

Well, for today specifically, we see the majority of hyperscalers, as you've had some of them on the stage today, are either using Ethernet for AI fabrics or there's a high desire for them to move to Ethernet for the AI fabrics. So that requires a lot of collaboration from the folks up here on stage to make that happen.

We also have been helping customers deploy in the past their AI networks for enterprise use cases globally, and it might have started more in the financial trading sector in the past, but we're seeing a tremendous amount of interest and use cases for that whole system and how you pull all those things together, from the network, the GPU, the NIC, the DPU, all the way to how you wrap the software around that to really make it simple and understand how things are working, and when they're not working, why, and making that simple for them to do that as well.

Forrest Norrod
EVP and General Manager, AMD

Absolutely. Jas, I know, well, all of us have been working together in deploying Ethernet-based solutions for AI leaders today. I mean, we've been working with-

Jas Tremblay
VP and General Manager, Broadcom

Yeah

Forrest Norrod
EVP and General Manager, AMD

... with the two gentlemen on the end on switching. But, but, Jas, maybe reflect on the, the NIC as well.

Jas Tremblay
VP and General Manager, Broadcom

Well, I think the NIC is critical. People want choices, and we need to move the innovation even faster in the NIC. And you'll see much more linkages between the NIC and the switch, where before you had a compute domain and a network domain, and these things are really coming together. And AI is a driving force of that because the complexity is going up so much.

Forrest Norrod
EVP and General Manager, AMD

Yeah, absolutely. Well, okay, so let's talk about the future a little bit. You know, with the Ultra Ethernet Consortium, all four companies on stage are founding members, and there's many others that have joined. You know, UEC is one of the fastest-growing, or maybe the fastest-growing, consortium under the Linux Foundation, which has been great to see. I think UEC is gonna shape the future of AI networking. And so let's unpack that, 'cause I think that's a critical topic for folks. So maybe, Jas, why don't you go ahead and start out?

Jas Tremblay
VP and General Manager, Broadcom

Yeah, so first of all, Ethernet is ready today for AI, but we need to continue to innovate. UEC started with a group of eight companies, including four of our companies here, cloud providers, system providers, and semiconductor providers, coming together around a common vision. The vision is: AI networks need to be open, standards-based, we need to offer choices, and we need to enhance them. With that common vision, you know, the engineers we've assigned from all our companies really got together and rolled up their sleeves, and the innovation happened extremely quickly.

Forrest Norrod
EVP and General Manager, AMD

Mm.

Jas Tremblay
VP and General Manager, Broadcom

So it's quite exciting, actually. One of the things that I'm most excited about this is we're not building something new.

Andy Bechtolsheim
Co-Founder and Chief Architect, Arista Networks

... we are jointly going to enhance Ethernet, which has existed for 50 years. So it's not starting from scratch, it's enhancing; it's recognizing that Ethernet is what people want. We just need to continue to enhance it and keep it open and standards-based.

Forrest Norrod
EVP and General Manager, AMD

Absolutely. And Jonathan, I know Cisco's been a huge proponent of UEC as well. Maybe you can reflect on your thoughts of where this is going.

Jonathan Davidson
EVP and General Manager, Cisco

Absolutely. Well, I think that UEC is absolutely critical for Cisco, everyone on the panel, and the whole industry, so that we can continue to drive that movement towards open. It always takes time. You've got to debate the right technical way to solve things, but I think that overall it's moving in the right direction. What I see happening here is that we're gonna have to have interoperability in more than just one area. You know, Andy, you might want to talk about LPO and all the things that we need to do there to make that actually happen. And what's happening with UEC is another important part, and what I see happening between now and when the first standard comes out is really a coalition of the willing.

Like, how do we get all of us together to drive towards those open interfaces, whether it be at the Ethernet layer, whether it be at things you need to plug into it, how the GPUs connect into that, how you're actually gonna spray traffic-

Forrest Norrod
EVP and General Manager, AMD

Right

Jonathan Davidson
EVP and General Manager, Cisco

... across a very broad radix, how you're gonna make sure you can reorder packets in a consistent way. These are all things that we need to make sure we are driving towards from an interoperability perspective. We've got our own silicon, we've got optics, but we're also in the component business at Cisco, and so we sell those things. Hyperscalers might want to just buy pieces from us, like the silicon, and enterprises may want the full system. But we want to make sure that it's absolutely 100% interoperable in every single environment.

Forrest Norrod
EVP and General Manager, AMD

Absolutely. And Andy, maybe you can hone in a little bit more. I mean, I think many people that aren't familiar at all with networking may think, "Hey, how hard can this be? We're just, you know, you're just shuffling bits around between systems." But there's a lot of problems to solve.

Andy Bechtolsheim
Co-Founder and Chief Architect, Arista Networks

Yeah. So, UEC is in fact solving a very important technical problem, which, the way I would describe it, is modern RDMA at scale, and this has not been solved before. To be clear, you know, RoCE exists today, but it has its limitations. It does take an ecosystem approach, and it involves in particular the adapter and NIC silicon vendors, but also the whole end-to-end interoperability of that architecture. You know, we're very excited to be part of this. We're not in the NIC business ourselves, but this is absolutely key to enable scaling of RDMA across, you know, 100,000s, if not 1 million, nodes.

Forrest Norrod
EVP and General Manager, AMD

Yeah, absolutely. And when you look at what's being predicted, you know, hundreds of thousands up to a million-node systems, I mean, we all have our work cut out for us. But working together, I know we can solve the problems. Well, guys, thanks so much for coming to talk to us today. I'd like to thank you all for your partnership in this journey, and thank you all for coming today. Thanks very much. Thanks, Jas. Thank you so much, John. I'd really like to thank our partners from Arista, Broadcom, and Cisco for attending and for their partnership in driving this critical third leg that determines the performance of AI systems.

Now, let's turn our focus to high-performance computing, the traditional realm of the world's largest systems. AMD has been driving HPC technology for many years. In 2021, we delivered the MI250, introducing third-generation Infinity Architecture. It connected an EPYC CPU to the MI250 GPU through a high-speed bus, Infinity Fabric. That allowed the CPU and the GPU to share a coherent memory space and easily trade data back and forth, simplifying programming and speeding up processing. But today, we're taking that concept one step further, really to its logical conclusion, with the fourth-generation Infinity Architecture bringing the CPU and the GPU together into one package, sharing a unified pool of memory.

This is an APU, an accelerated processing unit, and I'm very proud to say that the industry's first data center APU for AI and HPC, the MI300A, began volume production earlier this quarter and is now being built into what we expect to be the world's highest performing system. Now, Lisa already showed you what our chiplet technologies make possible with the MI300X. The MI300A takes those same building blocks in a slightly different fashion. Now, the I/O die is laid down first, as before, and contains the Infinity Cache and connections to memory and I/O. The CDNA accelerator chiplets are bonded on top, as in the MI300X.

But with the MI300A, we also take CPU chiplets, leveraged directly from our fourth-generation EPYC CPUs, Genoa, and we put those on top of the IODs as well, thus bringing together our leading Zen CPU and CDNA GPU technologies into one amazing part. Finally, 8 stacks of HBM3, with up to 128 GB of capacity, complete the MI300A. A key advantage of the APU is no longer needing to copy data from one processor to another, even through a coherent link, because the memory is unified, both in the RAM as well as in the cache. The second advantage is the ability to optimize power management between the CPU and the GPU. That means dynamically shifting power from one processor to another, depending on the needs of the workload, optimizing application performance.
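As a rough illustration of why eliminating that copy matters, here is a back-of-the-envelope sketch. The working-set size and the staging-link bandwidth below are assumed round numbers chosen purely for illustration; only the 5.3 TB/s HBM3 bandwidth is a figure quoted in this keynote.

```python
# Back-of-the-envelope: time to touch a 64 GB working set when it must first
# be staged over a discrete-GPU interconnect, vs. read in place from unified
# HBM. The 64 GB working set and 64 GB/s staging link are assumed round
# numbers; 5.3 TB/s is the HBM3 bandwidth quoted in the talk.
working_set_gb = 64.0
staging_link_gb_s = 64.0       # assumed PCIe-class staging link, in GB/s
hbm_bw_gb_s = 5300.0           # 5.3 TB/s unified HBM3, in GB/s

copy_time_s = working_set_gb / staging_link_gb_s   # one staging copy
local_read_s = working_set_gb / hbm_bw_gb_s        # one pass from local HBM

print(f"staging copy: {copy_time_s:.2f} s, local read: {local_read_s:.4f} s")
```

Even with generous assumptions about the link, the staging copy dwarfs a local read by nearly two orders of magnitude, which is the intuition behind the unified memory and coherent cache described above.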

Very importantly, an APU can dramatically streamline programming, making it easier for HPC users to unlock its full performance. Let's talk about that performance: 61 teraflops of double precision floating point, FP64, and 122 teraflops of single precision. Combined with 128 GB of HBM3 memory at 5.3 TB/s of bandwidth, the capabilities of the MI300A are impressive. They're impressive, too, when you compare it to the alternative. When you look at the competition, MI300A has 1.6x the memory capacity and bandwidth of Hopper. For low precision operations, like FP16, the two are at parity in terms of computational performance. But where precision is needed, MI300A delivers 1.8x the double precision (FP64) and single precision (FP32) floating point performance.
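The comparison ratios quoted above can be sanity-checked with a little arithmetic. The MI300A figures are from this talk; the Hopper figures below are assumed from public H100 SXM spec sheets and may differ by SKU, so treat this as a sketch rather than a benchmark.

```python
# Ratios implied by the quoted specs. MI300A numbers come from the keynote;
# the H100 SXM numbers (80 GB HBM3, ~3.35 TB/s, ~34 FP64 / ~67 FP32 vector
# TFLOPS) are assumed public specs, not figures from the talk.
mi300a = {"mem_gb": 128, "bw_tb_s": 5.3, "fp64_tflops": 61, "fp32_tflops": 122}
h100 = {"mem_gb": 80, "bw_tb_s": 3.35, "fp64_tflops": 34, "fp32_tflops": 67}

ratios = {key: mi300a[key] / h100[key] for key in mi300a}
for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
```

Under those assumed Hopper specs, the roughly 1.6x memory capacity/bandwidth and 1.8x FP64/FP32 figures from the talk fall out directly.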

Beyond simple benchmarks, the real advantages of an APU come with the performance of real-world applications which have been tuned for the APU architecture. For example, let's look at OpenFOAM. OpenFOAM is a set of computational fluid dynamics codes widely used across research, academia, and industry. With MI300A, we see 4x the performance of Hopper on common OpenFOAM codes. Now, that performance comes from several places: from higher performance math operations, as we discussed, the larger memory, and the increased memory bandwidth. Much of that uplift really comes from that unified memory eliminating the need to copy data around the system. That can mean, for tuned applications, truly transformative performance. I'm also proud to say that beyond performance, AMD has stayed true to its heritage, to its history of leading in power efficiency.

At the node level, the MI300A has twice the HPC performance per watt of the nearest competitor. Customers can thus fit more nodes into their overall facility power budget and better support their sustainability goals. With the MI300A, we set out to help our customers advance the frontiers of research, and not just running traditional HPC applications. One of the most exciting new areas in HPC is actually the convergence with AI, where AI is used in conjunction with HPC techniques to help steer simulations, thus getting much better results much faster. A great example of this is CosmoFlow. It couples deep learning with traditional HPC simulation methods, giving researchers the ability to probe more deeply and allowing us to learn more about the universe at scale.

CosmoFlow is one of the first applications targeted to be run on El Capitan, which we believe will be the industry's first true 2-exaflop supercomputer running double precision float when it's fully commissioned at Lawrence Livermore National Laboratory. Now, it's gonna be an amazing machine. So let's hear more about El Capitan and its applications for HPC and AI from our partners at LLNL and Hewlett Packard Enterprise.

Speaker 19

We expect El Capitan to be an engine for artificial intelligence and deep learning. We will recreate the experimental environment in simulation, generate lots of data, for example, and then train our artificial intelligence methods on that simulation data. El Capitan will be the most capable AI machine, and its use of APUs at this scale will be the first of its kind.

Speaker 18

As you operate these exascale level workloads, all of those nodes talk to each other. AMD and HPE have a long legacy of partnership, and it was only natural for us to partner again for El Capitan. The MI300A can be versatile across many different workloads, and we couple it directly with our Slingshot fabric to give it high performance as it operates as a system.

Speaker 19

We work very closely with AMD and HPE to deliver the hardware and the software that's actually used by the scientists in the machine itself.

Speaker 18

It's really that partnership together that can really go after and build these supercomputers.

Speaker 19

El Capitan will be 16x faster than our existing machine here at Lawrence Livermore. It will enable scientific breakthroughs that we can't even imagine.

Forrest Norrod
EVP and General Manager, AMD

We're proud to have partnered with Hewlett Packard Enterprise to design and now build this amazing system. So I'd like to invite to the stage Trish Damkroger, the Senior Vice President and Chief Product Officer for HPC, AI, and Labs from Hewlett Packard Enterprise. Thank you. Welcome, Trish.

Trish Damkroger
SVP and Chief Product Officer, HPC

Thank you.

Forrest Norrod
EVP and General Manager, AMD

You know, the AMD and HPE teams have been working closely together over the years to deliver some next-generation supercomputers. You know, most recently, of course, we've broken the exascale barrier. I gotta say that again. We broke the exascale barrier, with Frontier for Oak Ridge National Laboratory! And now we're looking forward to powering another exascale system, and another record with you, with El Capitan for Lawrence Livermore National Laboratory, another U.S. Department of Energy lab. Maybe you can share more with this audience about our journey together and the innovations that we've ushered in on this journey to exascale.

Trish Damkroger
SVP and Chief Product Officer, HPC

Sure. First, I wanna echo the long partnership that we've had with AMD. Frontier continues to be the fastest computer in the world. Many doubted our ability to actually reach exascale, but we were able to achieve this feat with industry-leading liquid cooling infrastructure, next-generation high-performance interconnect with Slingshot, our highly differentiated system management, and Cray programming environment software, along with the incredible MI250. With Frontier, exascale computing has already made breakthroughs in areas such as aerospace, climate modeling, healthcare, and nuclear physics. Frontier is also one of the world's top 10 greenest supercomputers. In fact, HPE and AMD have the majority of the world's top 10 energy efficient supercomputers. I know. I am very excited to deliver El Capitan to Lawrence Livermore. As you know, I worked there.

Forrest Norrod
EVP and General Manager, AMD

Yeah

Trish Damkroger
SVP and Chief Product Officer, HPC

... for over 15 years. El Capitan's computing prowess will fundamentally shift what the scientists and engineers will be able to achieve. El Capitan's gonna be 15x-20x faster than their current system. Supercomputing is truly essential to the mission of the Department of Energy. Lawrence Livermore has been at the forefront, driving the convergence of HPC and AI, demonstrated by work at the National Ignition Facility and other national security programs. I'm really looking forward to continuing our journey of bringing more leadership-class systems to the world.

Forrest Norrod
EVP and General Manager, AMD

Absolutely. I couldn't agree more, Trish. It's been a rewarding journey working together with HPE. But speaking of our shared success in building these record-breaking systems, can you tell us a bit more about El Capitan and how HPE is bringing the Instinct MI300A-powered APUs to El Capitan?

Trish Damkroger
SVP and Chief Product Officer, HPC

Great. Yes, El Capitan will feature the HPE Cray EX supercomputer with the MI300A accelerators to power large AI-driven scientific projects. The HPE Cray EX supercomputer was built from the ground up, with end-to-end capabilities to support the magnitude of exascale. El Capitan nodes include the MI300A-

Forrest Norrod
EVP and General Manager, AMD

Right

Trish Damkroger
SVP and Chief Product Officer, HPC

... coupled with our Slingshot fabric to operate as a fully integrated system. Supercomputing is the foundation needed for large-scale AI, and HPE is uniquely positioned to deliver this with our Cray supercomputers. El Capitan will be that engine for AI and deep learning for the Department of Energy. They'll be recreating the experimental environment in simulations, and training the AI models with all of that vast amount of data. El Capitan will be one of the most capable AI systems in the world. And beyond El Capitan, we're excited to have expanded our supercomputing portfolio with the MI300A to bring next-generation accelerated compute to a broad set of customers.

Forrest Norrod
EVP and General Manager, AMD

Yeah. So, Trish, that's fantastic, and I... You know, actually, let's double click into that a little bit more. I know that there are a growing number of supercomputing customers, not just at LLNL, that are really applying AI to their projects. Can you tell us a little bit more about that?

Trish Damkroger
SVP and Chief Product Officer, HPC

Sure. So AI undoubtedly will be the catalyst to transform scientific research. As I said earlier, supercomputing is the foundation needed to run AI, and HPE is the undisputed leader in delivering supercomputers. Some examples where AI will be fundamental in El Capitan include the National Ignition Facility, where they'll be using 1D, 2D, and 3D simulations, along with trained AI models, to develop a more robust design for higher yield fusion reactions. Just imagine fusion energy in our future. Another application is high resolution earthquake modeling, essential for understanding building structural integrity and also for emergency planning. And one more application is bioassurance, where simulation and AI models will be key in developing rapid therapeutics. Supercomputing and AI are tools that give engineers and scientists the ability to find the unknown.

I'm thrilled to be part of the journey of accelerating scientific discovery and the scale of impact it has on changing the way people live and work.

Forrest Norrod
EVP and General Manager, AMD

Fantastic. Well, Trish, thank you.

Trish Damkroger
SVP and Chief Product Officer, HPC

Thank you.

Forrest Norrod
EVP and General Manager, AMD

I'm so excited about the opportunities that researchers and scientists will have with the systems that we're bringing to the market together.

Trish Damkroger
SVP and Chief Product Officer, HPC

Right.

Forrest Norrod
EVP and General Manager, AMD

Thanks so much.

Trish Damkroger
SVP and Chief Product Officer, HPC

Thank you.

Forrest Norrod
EVP and General Manager, AMD

Yeah, on behalf of AMD and the entire team, I really wanna thank HPE and our customers for the opportunity to participate in the development of these massive systems. 'Cause El Capitan will be an amazing machine and a real showcase for the MI300A, which defines leadership at this critical juncture as HPC and AI converge. AMD is proud of the leadership systems powered by MI300A, which will be available soon from partners around the world. I can't wait to see what researchers and scientists are gonna do with these systems. And with that, I'd like to welcome Lisa back on stage to conclude our journey today. Thank you.

Lisa Su
Chair and CEO, AMD

All right, thank you, Forrest, and thank you to all of our partners who joined us. You've heard from Victor, Forrest, and our key partners. We have significant momentum, and we're building on that for our data center AI platforms. To cap off the day, let me now talk about another important area for AMD, where we're delivering leadership AI solutions, and that's the PC. Now, for PCs, we recognized several years ago that on-chip AI accelerators, or NPUs, would be very, very important for next generation PCs, and the NPU is actually the compute engine that will enable us to reimagine what it means to build a truly intelligent and personal PC experience. At AMD, we're actually on a multi-year journey. We have a strong roadmap to deliver the highest performance and most power-efficient NPUs possible.

We were actually the first company to integrate an NPU into an x86 processor when we launched the Ryzen 7040 Series mobile processors earlier this year, and we integrated the XDNA architecture that actually came from our acquisition of Xilinx. It actually took us less than a year to bring Xilinx's proven technology into our PC products. Let me tell you a little bit about XDNA. It's a scalable and adaptive computing architecture. It's built around a large computing array that can efficiently transfer the massive amounts of data required for AI inference. And as a result, XDNA is both extremely performant and also very energy efficient, so you can run multiple AI workloads simultaneously in real time.

Now, I'm happy to say that we've already shipped millions of Ryzen AI-enabled PCs into the market with all of the leading PC OEMs, and all of this provides the hardware foundation for developers to leverage this first wave of AI PCs. Now, if you look at some of the applications, today, Ryzen AI powers hundreds of different AI functions, things like advanced motion tracking and sharpening to deblur 4K video, and enabling professional digital production capabilities with unlimited virtual cameras, all in an ultra-thin notebook for the very first time. We're also working with key software leaders like Adobe and Blackmagic, and they're using our on-chip Radeon GPU to accelerate AI-enabled editing features so that you can dramatically improve productivity for content creators. And of course, we've worked very, very closely with Microsoft to enable Windows 11 Studio Effects on Ryzen AI.

Now, today, we're launching some additional capabilities. So Ryzen AI 1.0 software, it will make it easier for developers to add advanced Gen AI capabilities. So with this new package, developers can create an AI-enabled application that's ready to run on AI hardware... on Ryzen AI hardware just by choosing a pre-trained model. So, for example, you can choose one of the models that are available on Hugging Face, you know, quantize it based on your needs, and then deploy it through ONNX Runtime. So this is a major step forward when you think about the broad ecosystem that wants to run AI apps for Windows, and we can't wait to see what ISVs will do when they really capture the leadership performance that you can get from an NPU in Ryzen AI. Now, of course, we know developers always want more AI compute.
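The flow described here, take a pretrained model from a hub like Hugging Face, quantize it to your needs, and deploy it through ONNX Runtime, rests on quantization: shrinking a model's float weights to low-precision integers so they fit the NPU's fast integer math. As a minimal sketch of the idea in plain Python (symmetric per-tensor int8; the actual Ryzen AI package does this through its own quantizer and ONNX Runtime, so the function names below are purely illustrative):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # guard against all-zero weights
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

# Toy "weights" standing in for one layer of a pretrained model.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each weight now occupies one byte instead of four, and the dequantized values stay within one quantization step of the originals, which is what makes it practical to run a model like Llama 2 7B on a mobile NPU.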

So today, I'm very happy to say that we're launching our Hawk Point Ryzen 8040 Series mobile processors, and- Thank you. Hawk Point combines all of our industry-leading performance and battery life, and it increases AI TOPS by 60% compared to the previous generation. So if you just take a look at some of the performance metrics for the Ryzen 8040 Series, if you look at the top of the stack, so a Ryzen 9 8945, it's actually significantly faster than the competition in many areas, delivering more performance for multi-threaded applications, 1.8x higher frame rates for games, and 1.4x faster performance across content creation applications. But when you look at the AI improvements of Ryzen 8040, you really see some substantial improvements. So I talked about additional TOPS in Hawk Point, and what that results in is faster performance when you're running the key models.

So things like Llama 2 7B, we run 1.4x faster, and also 1.4x faster on things like AI image recognition and object detection models. So all of this, what does it do? It provides faster response times and overall better experiences. Now, I really believe that we're actually at the beginning of this AI PC journey, and it's something that is really gonna change the way we think about productivity at a personal level. So we've been working very closely with Microsoft to ensure that we're co-innovating across hardware and software to enable those next generation of AI PCs. To share more about this work, I'm pleased to welcome Pavan Davuluri, Corporate Vice President of Windows and Devices at Microsoft, to the stage. Hey, how are you?

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Thank you. It's great to be here.

Lisa Su
Chair and CEO, AMD

Pavan, thank you so much for being here. You know, we started the show with Kevin Scott talking about the great partnership between Microsoft and AMD, and, you know, all the work we're doing on the big iron and the cloud and Azure.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Yeah.

Lisa Su
Chair and CEO, AMD

It seemed fitting that, you know, we close the show with, you know, the other very, very important work that we're doing together on the client side. So can you tell us a little bit, Pavan, about all the great work and your vision for client AI?

Pavan Davuluri
CVP of Windows and Devices, Microsoft

For sure. As you and Kevin covered, Microsoft and AMD have a long partnership together across Azure and Windows, and it's incredible to see us moving that partnership into the next wave of technology with AI. As you shared, Lisa, for us, there are millions of PCs with Ryzen AI-enabled 7040 Series processors in market right now, and that's amazing because these are the first x86 PCs with integrated NPUs, enabling enhanced AI experiences.

Lisa Su
Chair and CEO, AMD

You told me everybody wanted NPUs.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Absolutely. I mean, you know, right now, we get to see some incredible AI features-

Lisa Su
Chair and CEO, AMD

Yes, yes.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Some we talked about, Windows Studio Effects coming to life across the scale of the ecosystem, absolutely fantastic, I would say. Now, for us at Microsoft and for the ecosystem, our marquee AI experience is really Copilot. Similar to how the Start button is the gateway into Windows, the Copilot for us is the entry point into this world of AI on the PC. It has a fundamental impact on everything we will do on a computer, from work and school and play and entertainment and creation.

Lisa Su
Chair and CEO, AMD

You know, I completely agree, Pavan. I think Copilot is so transformational. I mean, for everyone who's had a chance to experience it, it's so... It really changes the way we do work. So, let's talk about the tech that's underneath it, so to enable Copilot and everything that we wanna do on PCs.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

We are putting together new systems architectures that really power those experiences going forward, and they really pull together GPU, NPU, and certainly the cloud as well. And quite honestly, we're seeing customer habits change early at this point in time, and we believe, to your point earlier, we're early in the cycle of innovation that's coming. When we have these powerful NPUs, like the ones you're building, it gives us an opportunity to create apps that take advantage of both local and cloud inferencing. And to me, that's what the Windows AI ecosystem is about, and that's what we're building in partnership with you. It's designed to enable those scenarios, with the ONNX Runtime, of course, and the Olive toolchain to back this up.

Applications are gonna have many models, like the Llama models you mentioned and Phi-2, running, and they will run very capably on the TOPS that we will have. And, of course, not to mention the foundation models that are powered by the GPUs in the cloud.

Lisa Su
Chair and CEO, AMD

Yeah, I mean, I think this is an area where Microsoft and AMD really have a very unique position because, you know-

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Right

Lisa Su
Chair and CEO, AMD

... we have so much capability in the cloud. We also have access to the client and the local view.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Sure.

Lisa Su
Chair and CEO, AMD

Can you share a bit about how we're thinking across all of these, you know, the cloud and the local view?

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Yeah. With AMD, we're making it simpler to incorporate what we call the hybrid pattern or the hybrid loop into applications, and we want to be able to load shift between the cloud and the client to provide the best of computing across both those worlds. For us, it's really about seamless computing across the cloud and the client. It brings together the benefits of local compute, things like enhanced privacy and responsiveness and latency, with the power of the cloud, high-performance models, large datasets, cross-platform inferencing. And so for us, we feel like we're working together to build that future where Windows is the destination for the best AI experiences on PCs.
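The "hybrid loop" described here, load-shifting between the NPU on the device and models in the cloud, is at heart a per-request routing decision. A minimal sketch of that pattern, with stand-in callables for the two inference paths (every name below is a hypothetical illustration, not an Olive or ONNX Runtime API):

```python
def make_hybrid_runner(local_infer, cloud_infer, local_available):
    """Route each request to the on-device NPU when possible, else to the cloud.

    local_infer / cloud_infer are callables; local_available() reports whether
    the local path can currently serve the request. All names are hypothetical;
    the real hybrid loop lives in Microsoft's Olive and ONNX Runtime tooling.
    """
    def run(prompt):
        if local_available():
            try:
                return ("local", local_infer(prompt))
            except RuntimeError:
                pass  # local path failed mid-request; load-shift to the cloud
        return ("cloud", cloud_infer(prompt))
    return run

# Stubbed endpoints standing in for an NPU session and a cloud model API.
run = make_hybrid_runner(
    local_infer=lambda p: p.upper(),
    cloud_infer=lambda p: p.lower(),
    local_available=lambda: True,
)
```

In a real deployment the availability check would weigh battery, latency, and model size, and the cloud path would be a hosted endpoint, but the shape of the decision, local first for privacy and responsiveness, cloud for the heavyweight models, is the same.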

Lisa Su
Chair and CEO, AMD

Yeah, no, I think that sounds great. Now, one of the things, though, that, you know, you definitely are always talking to me about is more TOPS. Pavan asks for more TOPS all the time. So, look, we completely believe that to enable, you know, your vision for AI experiences, we've really thought about how to actually accelerate our client AI roadmap. So, I wanna share a little bit of our roadmap today.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Gotcha.

Lisa Su
Chair and CEO, AMD

You know, Ryzen 7040 and 8040, you know, we've already delivered those industry-leading NPU capabilities, but today I'm very excited to announce that our next gen Strix Point Ryzen processors will actually include a new NPU powered by our second generation XDNA architecture coming in 2024.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Congratulations, Lisa.

Lisa Su
Chair and CEO, AMD

Thank you. So, a little bit about XDNA 2. You know, it's designed really for leadership Gen AI performance. It delivers more than 3x the NPU performance of our current Ryzen 7040 Series. And Pavan, I'm very happy to share... I know your teams already know this 'cause you have the silicon, but today, Strix Point is running great in our labs, and we're really excited about it. Our teams have been working really closely together to make sure that all of those great future Windows AI features run really well on Strix Point, so I can't wait to share more about that later this year.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Lisa, that's awesome, and we will use every TOPS you will provide us.

Lisa Su
Chair and CEO, AMD

You, you promised, right?

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Absolutely.

Lisa Su
Chair and CEO, AMD

Good.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

You know, it's not just the size of the neural engines. We think the dramatic increase in efficiency, performance per watt, of these next-generation NPUs will bring a whole new level of capabilities to the market, enabling personalization in every interaction on these devices. Together with Windows, we feel like we're building that future for the Copilot, where we will orchestrate across multiple apps, services, and devices, quite frankly, functioning as an agent in your life that maintains context across entire workflows. So we're very excited about these devices coming to life for the Windows ecosystem. We're excited to see what developers will do with this technology and, quite frankly, at the end of the day, ultimately, what customers will do with all of this innovation.

Lisa Su
Chair and CEO, AMD

Thank you so much, Pavan. We are so excited about the partnership. We appreciate, you know, all the long-term work we're doing together, and look forward to lots of great things to come.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Thank you for having me, Lisa. I appreciate it.

Lisa Su
Chair and CEO, AMD

Thank you, Pavan.

Pavan Davuluri
CVP of Windows and Devices, Microsoft

Thank you.

Lisa Su
Chair and CEO, AMD

All right, so it's been such a fun day, but now it's time for me to wrap up a bit. We've shown you a lot of new products, a lot of new platforms, a lot of new technologies that are all about taking AI infrastructure to the next level. The MI300X and MI300A accelerators are all shipping today in production. They're already being adopted by Microsoft, Oracle, Meta, Dell, HPE, Lenovo, Supermicro, and many others. You heard from Victor how we're expanding the ecosystem of AI developers working with us, with ROCm 6 software and the open ecosystem, and that our goal is to make it incredibly easy for everyone to use Instinct GPUs. You heard from Forrest in our panel on the overall system architecture about our work with Arista, Broadcom, and Cisco.

We believe that to create this high-performance AI infrastructure, it has to be open, and that's what we're doing together for scale-out AI solutions. And then you heard what we're doing on the other side, the client part of our business, because we actually believe AI should be everywhere. So our latest Ryzen processors really extend our compute vision and our AI leadership. I hope you can see that AI is absolutely the number one priority at AMD. Our goal is to push the envelope, to bring innovation to the market, to do more than anyone thought was possible, because we believe, as wonderful as our technology is, it is about doing it together in a partner ecosystem where everybody brings their best to the market. Today, I wanna say on a personal level, today is an incredibly proud moment for AMD.

If you think about all of the innovation, everything that we bring to the market, to be part of AI at this time, at the beginning of this era, to work with these amazing people throughout the industry, throughout the ecosystem, and at AMD, I can say that I've never seen anything more exciting. A very, very special thank you to all of our partners who joined us today, and thank you all for joining us.
