Good afternoon, everyone. Welcome back to the 2024 Computex Forum. For those of you joining us for the first time, I'm Jeannie Chang, your MC for today. This afternoon, we'll dive deeper into the theme of Behind Generative AI. Let's get started with our first speaker, Mr. Praveen Vaidyanathan, the VP and GM of the Compute Products Group, Compute and Networking Business Unit at Micron. Together, we will explore the memory and storage technologies critical to AI's future across data center, edge, and client devices. Before the presentation, let's watch a short video together. Let's welcome Mr. Vaidyanathan.
Thank you. Thank you. Creative, magical, beautiful, inspirational. These are words that we use to describe the core of who we are as humans. These are words that we just used to describe technology. These are words that I will use today to connect humans and technology. I've been thinking about why you're all here today, and I believe that you're all here today to learn something new about technology, something new about products that you didn't know before you came to Computex this year. I believe you're here to make connections. To make connections with colleagues, industry leaders, to make connections with friends, some old, some new, that you will walk away with. I also believe you are here to take away something that you can go back and use in your life at work.
I also believe that whatever you do at work with what you take away will fundamentally build better products and fundamentally change the lives of millions of people in the world. Micron and you share a common mission. Micron's mission is to transform how people use information to enrich life for all. My name is Praveen Vaidyanathan, and I am the VP and GM of the Compute Products Group at Micron, and I am here today to share a little bit about memory and storage products that I believe will enable the art of the possible, and sometimes the art of the impossible. We want to build deeper connections with each one of you, our customers, our partners, users, so that we can reimagine how we architect the human-technology connection. Let's talk a little bit about the why of AI.
Why is AI everywhere around us? Why is AI being unleashed in the world? I'm sure all of us can come up with unique examples of our own, but I want to talk about three of these. I want to talk about healthcare, I want to talk a little bit about automotive, and I want to talk about what we generally call science. Healthcare. I live in San Francisco, very close to the hub and center of the biotechnology industry. There are so many stories that I've heard and read about how what used to take decades, years, to develop new molecules and new drugs, has been accelerated by orders of magnitude with the help of AI. A massive help to humanity. Automotive. Where I live in San Francisco, it's very common to have self-driving autonomous taxis.
We use these to ride to work, we use these to go meet friends and family, we use these just to avoid the traffic in San Francisco. Science. Massive data analytics enabled through machine learning algorithms. Not just analytics, but being able to convert this into intelligent insight that changes our lives is enabled through AI. On a more personal level, I think about languages. Here I am, talking to you, and we are able to communicate and understand each other through the power of language. We also live in a very global world with multilingual personalities and cultures, and we all know the challenges when we cannot communicate in the same language. Natural language processing and LLMs are changing this. Language is no longer a barrier to communication.
It enables how we relate to each other. The why of AI should immediately translate into the how. And at the very heart of enabling AI, the how of AI, is memory and storage. Let me start with the data center and the cloud applications of AI. If you think about generative AI in the cloud, you think about AI infrastructure in the data center. It's been a big part of our lives for the last two years. Memory and storage are trying to solve three problems in this space: bandwidth, capacity, and power. 30 TB of storage per SSD forms the cold and warm storage tiers of AI infrastructures.
Massive amounts of SSDs connected across data centers, at the rack level and within servers, form the backbone of the massive amounts of data that we are using to train AI models and develop AI inference solutions. When you want to move this data quickly into memory, you need high-performance SSDs. Gen 5 and Gen 6 SSDs continue to scale the capabilities of this hot storage tier. Last week, Micron announced a key milestone in the industry when we released our Compute Express Link memory modules, which take your existing AI infrastructure, allow you to scale its capacity, and give you additional bandwidth. Then we come to the workhorse of the AI infrastructure, which is DDR memory.
DDR5 is one of the most versatile solutions when you want to optimize bandwidth, capacity, and power in your AI infrastructure. Finally, you know, my favorite three-letter word used to be M-O-M, my mom, but that was last year. My favorite three-letter word now, and I will pause for dramatic effect, is HBM, high bandwidth memory. This innovation came to the industry in the mid-2010s, and it has scaled up to become a very important element of any AI infrastructure today. As we move from compute-bound workloads to memory-bound workloads, the bandwidth that comes from high-bandwidth memory is absolutely necessary to build AI infrastructure. So I told you what my favorite three-letter word was. Any guesses what my favorite five-letter word is? It's HBM3E. Micron was the first to release 8-high stack, 24 GB HBM3E into the market in July of last year.
We now have solutions with 24 GB and 36 GB capacities. The 24 GB is shipping in the market with an amazing speed of more than 1.2 TB/s per placement of HBM. Most infrastructures use multiple placements per GPU to scale the bandwidth. A complement to HBM3E is DDR5. Micron was again the first to develop, the first to validate on CPUs, and the first to ship to production a 128 GB module built on monolithic 32 Gb DRAM dies, in March of this year. Most server infrastructures couple GPUs with CPUs. CPUs that require 2-3 TB of memory absolutely require this 128 GB solution, which, with our advanced technology, provides over 45% improvement in bit density and is the most TCO-efficient way for you to construct an AI server.
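As a rough sanity check of those figures, here is a minimal back-of-the-envelope sketch. The pin speed and interface width are assumptions drawn from published HBM3E specifications, not numbers from the talk:

```python
# Back-of-the-envelope check on the HBM3E and DDR5 numbers above.
# Assumptions (not from the talk): a 1024-bit HBM interface and a
# pin speed of roughly 9.2 Gb/s, in line with published HBM3E specs.

HBM3E_PIN_SPEED_GBPS = 9.2    # Gb/s per pin (assumed)
HBM3E_BUS_WIDTH_BITS = 1024   # bits per HBM placement (assumed)

per_placement_gbs = HBM3E_PIN_SPEED_GBPS * HBM3E_BUS_WIDTH_BITS / 8
print(f"HBM3E per placement: {per_placement_gbs:.0f} GB/s")  # ~1178, i.e. ~1.2 TB/s

# Scaling bandwidth with multiple placements per GPU, e.g. 8 stacks:
print(f"8 placements: {8 * per_placement_gbs / 1000:.1f} TB/s")

# Capacity side: servers needing 2-3 TB of DRAM built from 128 GB modules.
for target_tb in (2, 3):
    modules = target_tb * 1024 // 128
    print(f"{target_tb} TB needs {modules} x 128 GB DIMMs")
```

The capacity math is why the 128 GB module matters: reaching 2-3 TB per socket with 128 GB parts takes 16-24 DIMM slots, which is feasible; with smaller modules it quickly is not.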
One of the other things I talked about as a problem we are trying to solve is power. You can scale performance, you can scale bandwidth, but we all know how much we need to focus, at all levels of components and infrastructure, on optimizing power. Micron's HBM3E was developed with this in mind. It has over 30% lower power than our competition, and I say that as somebody who actually helped develop the product. Our DDR5 was designed with the same thing in mind: using our advanced 1β node and our 32 Gb design, it delivers over 20% improvement in energy efficiency. Think about what you can do in the data center with this. You can drive lower power consumption, which lowers your operational costs. You can actually use that power to drive further performance improvements and efficiency gains.
This will continue to be a focus for Micron, in order to drive power optimization in the industry. And I don't know how many of you know this, but yesterday, 4 June, was declared National DRAM Day in the U.S. 56 years back, on 4 June 1968, the first DRAM patent was issued to Robert Dennard and IBM, a very important day in the industry. This is a recognition of how DRAM has become a major economic driver in the industry and is driving technological innovation. This week in Taiwan, there have been a lot of conversations about how Taiwan is now at the heart of AI. Micron's HBM3E is stacked, assembled, and built right here in Taiwan. Micron's DDR5 128 GB module is manufactured using 1β technology wafers right here in Taiwan.
We are very proud to be a core part of this ecosystem that is developing in Taiwan. We are very proud to work with the people of Taiwan to bring these innovations to the industry. Generative AI is not just in the cloud. It is transitioning to devices at the edge, things that we hold in our hands every day. But I will use the term Hybrid AI, because as we talk to more people and learn their use cases, we realize that it is really a combination of use cases across the cloud and the edge that will make this most valuable to all of us. The same challenges apply: power, bandwidth, capacity. The devices I'm talking about are personal computing devices, PCs, tablets, notebooks, laptops that we use every day.
Phones in our hands, everybody's got one, bringing AI closer, right in the palm of your hand. Automotive: self-driving cars, and the massive amounts of compute, memory, and storage that go into a car to drive real-time decisions and better, safer driving, are going to drive our choices on which cars we buy, which cars we drive, or maybe which cars we don't drive, just get into. This week at Computex, there's been a lot of information coming to us about what AI PCs are. PCs are central to Computex, and AI PCs have become central to Computex this year. There are lots of conversations on TOPS, CPUs, and NPUs. How many TOPS do you need? How much memory do you need? How much storage do you need?
Announcements by a lot of our partners on 45 TOPS, 48 TOPS, 50 TOPS, maybe 120 TOPS soon. As these improve, we are working very closely with our partners to ensure that the right memory and storage solutions are right there, designed to enable both compute workloads and memory workloads. So I want to talk a little bit about the products that we are building in this space, and I'll talk about three specific products. The first is the Micron 3500 NVMe SSD. This is an interesting product. It's a Gen 4 product, and it will fully saturate a Gen 4 PCIe bus.
When we designed this product, which we released in December of last year, we paid careful attention to developing firmware tuned to the workloads the SSD operates in within the AI environment. A lot of you in this room are technologists. You work in storage, you work in compute. You are familiar with a benchmark called SPECwpc, which looks at highly optimized workloads for very specific applications, and I'll take science as an example, life sciences. The optimization that we did at the firmware level on this product drove over 132% improved performance compared to other SSDs in this class of products at that time.
That's the kind of work that we need to be doing as a memory and storage company, and will continue to do, to ensure that we deliver the best products to the industry. Before I talk about the next product, I want to spend a little time on the history of PCs. The late 1970s brought the PC revolution. Since then, a lot of different memory solutions have come out: single inline memory modules, different kinds of DDR. In 1997, the most ubiquitous, commonly used form factor in PCs, the small outline dual inline memory module, which a lot of us very fondly call the SODIMM, was released, and it's been a workhorse for the industry since then. But there hadn't been any major innovations in memory form factors for PCs till 2024.
Collaborating with our partners and the industry, we released what is called LPCAMM2, a new form factor. We were the first to bring it to the market in January of this year, and one of our key OEM partners actually has it in a laptop released in the market in April of this year. LPCAMM2 is a modular solution based on low-power memory. It's called a compression attached memory module, and it drives some pretty significant value in a PC. I talked about three problems we are trying to solve, right? Bandwidth, capacity, power. In PCs, there's one more area that is important: form factor and modularity. This product, the LPCAMM2 as released today, gives you an over 50% bandwidth improvement over a DDR5-based SODIMM available at the same time. It's a dual-channel, 128-bit module that drives much better performance in a system.
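Where does that bandwidth improvement come from? A minimal sketch, assuming the LPCAMM2 runs LPDDR5X at 7500 MT/s and the contemporary baseline is a pair of DDR5-4800 SODIMMs; both speed grades are assumptions, not figures from the talk:

```python
# Peak bandwidth comparison: LPCAMM2 vs. DDR5 SODIMMs (assumed speed grades).

def peak_gbs(transfer_rate_mts: int, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: transfers/s * bits per transfer / 8 bits per byte."""
    return transfer_rate_mts * bus_width_bits / 8 / 1000

lpcamm2 = peak_gbs(7500, 128)  # a single LPCAMM2 exposes the full 128-bit interface
sodimms = peak_gbs(4800, 128)  # two 64-bit DDR5-4800 SODIMMs in dual channel

print(f"LPCAMM2: {lpcamm2:.0f} GB/s, SODIMMs: {sodimms:.1f} GB/s")
print(f"Improvement: {100 * (lpcamm2 / sodimms - 1):.0f}%")  # ~56%
```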
The LPCAMM2 will provide over 60% lower power in active use cases and over 80% lower power in standby use cases. We talked about how power is so important. AI PCs are hungry for performance, hungry for capacity, but we cannot increase power. This form factor is going to be a key part of how AI PCs evolve. In fact, we believe that this will be the primary solution as AI PCs scale to higher and higher performance over the next several years. The other thing that is important, I said, is space. This product is 60% smaller than the equivalent SODIMM of the same capacity and same bus width. Massive opportunity to innovate the motherboards and the PC design. I've been sharing a lot of my favorite things with you today. Two more favorite things.
My favorite music album is Led Zeppelin III. My favorite graphics memory is GDDR7. Today, at Computex, with you, we are announcing and introducing this product for the first time. We are sampling the industry's fastest graphics memory, GDDR7, a new standard, with our customers today. Think of graphics memory for two key applications: gaming and AI. The advent of ray tracing in gaming, resolutions going from 1080p to 4K and 5K and beyond, motion capture technology in gaming, the introduction of AI-driven non-player character behaviors in gaming, all of this is driving higher and higher bandwidth requirements.
GDDR7 launches today with a speed capability of up to 32 Gb/s per pin, which means at a system level, you're talking about 1.5 TB/s of bandwidth, more than 60% higher than the previous generation, GDDR6. Power is very important. Micron's GDDR7 has a 50% improvement in power efficiency compared to the previous generation, GDDR6. Our innovations in 1β DRAM technology and our design innovations allow us to deliver the highest-bit-density GDDR ever available in the market. We were pioneers in introducing what you would all recognize as PAM4 signaling with the previous GDDR product. GDDR7 uses an I/O interface based on PAM3 signaling, driven by the experience we gained innovating with that product.
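To see how 32 Gb/s per pin becomes roughly 1.5 TB/s at the system level, here is a quick sketch. The 384-bit bus width and the 20 Gb/s GDDR6 baseline are assumptions typical of high-end graphics cards, not figures from the talk:

```python
# System-level graphics memory bandwidth from per-pin speed and bus width.

BUS_WIDTH_BITS = 384  # assumed: typical high-end graphics card memory bus

def system_bw_tbs(pin_speed_gbps: float) -> float:
    """Aggregate bandwidth in TB/s across the full memory bus."""
    return pin_speed_gbps * BUS_WIDTH_BITS / 8 / 1000

gddr7 = system_bw_tbs(32)  # GDDR7 at 32 Gb/s per pin
gddr6 = system_bw_tbs(20)  # assumed 20 Gb/s GDDR6 baseline

print(f"GDDR7: {gddr7:.2f} TB/s")                      # ~1.54 TB/s
print(f"vs GDDR6: +{100 * (gddr7 / gddr6 - 1):.0f}%")  # ~60%
```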
These are the three products I'd like you to take away and think about as you consider how you want to architect your next PC solutions. I want to leave you with a view of the entire Micron product portfolio, developed for AI across all the applications of data center, PCs, mobile phones, and automotive. As I close out, I want to go back to where we started, the connections between technology and humans. Some of us, maybe not all of us, grew up reading the thoughts and words of one of the most imaginative people of our generation, Isaac Asimov. Asimov talks about technology, which he calls science, and he says that science accumulates knowledge while humanity accumulates wisdom. We have an opportunity to bring technology and humans, knowledge and wisdom, closer together through AI.
Let's strive to use AI to deepen the connections between us and our connection to the world around us. Thank you for spending your afternoon with Micron.
Thank you, Mr. Vaidyanathan, for that wonderful presentation.