
Oracle AI World 2025 Keynote: Building the Cloud for You

Oct 15, 2025

Speaker 4

Please welcome to the stage Chief Executive Officer of Oracle, Clay Magouyrk.

Clay Magouyrk
CEO, Oracle

Thank you all for being here. I don't know if you can see this stage is extremely large. It's not that I'm just very small and walk slowly. It's actually a large stage. First, I want to thank everybody for being here. I know that you've got other things you can do with your day than to come out to this event and to hang out with us. I also promise that if you haven't made friends with your neighbors yet, here's my advice. I did this yesterday. Make friends with your neighbor. Every 10-15 minutes, just kind of scoot back and forth between your seats. It keeps the blood flowing. If you have to sit here for a good two hours, as if it wasn't clear, this is a multi-hour presentation. It'll help you. You can kind of prevent blood clots and nerve pain.

If you need a moment to do that, do it now. This is a very serious presentation. OK. Why does OCI exist? I joined Oracle more than 11 years ago. Before that, I had spent my career working on cloud infrastructure. In case people don't remember, joining Oracle in 2014 to work on building a new cloud was not an obvious decision. Many people, in a polite way, were asking, do we really need a fourth cloud provider? Is this really what you should be spending your time on? Look, I've made a lot of good decisions in my life. The people that know me also know I've made some wrong decisions. Joining Oracle and building OCI is not one of those wrong decisions. I spent a lot of time thinking about why OCI needed to exist.

With that context of the previous experience, it was clear to me that the industry needed a different cloud. Our mission that we've been working on for those past 10 years has been consistent. It's actually quite simple. Our goal is to be the highest performance, lowest cost, and most secure infrastructure that we can be. Sometimes when I talk about this, people think that I'm talking about, hey, we want to be higher performance compared to our competitors, or we're trying to be lower cost than somebody else, or we want to be more secure. That's not what I'm saying. I'm saying something fundamentally different. We want to be the highest performance, lowest cost, and most secure infrastructure that we can imagine. Our goal is not to be better than competitors. Our goal is to be the absolute best that we can be. That's easy for us to say.

It's very hard to deliver. Now, one more thing to add to that is that it's not enough to be those things. You also have to be a cloud that's available where customers need it. That means more than just the country or the city where the cloud is located. It's also about the configuration it's delivered in, the specific location, as well as how that cloud is governed. Still, even with that little addition, this does not appear to be a very complex goal. Along the way in building towards this goal, it's actually quite easy to get distracted. It's easy to get distracted by new features. It's easy to get distracted by new services. Customers have infinite wants, and they are not shy about telling you exactly what you should be doing.

They think that you should be building the new things that they want that are going to help solve their problem right now. We love that feedback. Cloud infrastructure is constructed of many layers, and each one of those layers is utterly dependent on the layers below it. It takes a lot of skill and dedication to build an architecture that's resilient to the constant change required by cloud infrastructure. It's even harder to build something you can improve upon throughout the process. That's what you want. You want something that gets better with time, not worse. Great systems are extensible. To be extensible, you have to have an architecture that sees into the future, anticipates unknown improvements across both hardware and software. You need a system that improves across all the layers of the stack, not just at the top. That's what we're focused on every day at OCI.

To begin with, I want to review some of the most important choices we've made over the past few years and how they directly contribute to these goals. I don't know if any of you have been in the industry long enough, but I remember the time before virtualization. We didn't call things bare metal. We just called them computers or servers, and that's the way pretty much everybody did computing. Obviously, through the advent of virtualization and then the advent of the cloud, suddenly VMs became the standard, and so having an actual whole server really became exotic. When we designed OCI, we made a very conscious choice to focus on bare metal servers up front. Why did we do that? One major reason is security. By implementing bare metal first, it meant that us, as a cloud provider, actually have no software that runs on your machine.

When people provision a bare metal server, they have complete control of that. I can't see what's going on in your memory. I can't see what's going on in your CPU. We also did it because of extensibility. What I mean by that is that our virtual machine service, when you provision a VM on OCI, it's actually built on top of our bare metal server. I don't just mean the hardware. I mean the actual service. The reason that's important is because it actually enables other people to go out and then build extensible platforms on top of us. It's also about hardware flexibility. We knew when we started 10 years ago that there were going to be a lot of different types of hardware that we needed to plug in to our infrastructure. We knew we wanted different servers. We knew we had different storage appliances.

We knew we needed to do things like Exadata. This has proven extremely valuable to us as we look at how do we optimize for different hardware accelerators, especially in this AI era. Along the way, doing bare metal forced us to invent new security. We invested a lot of energy into our hardware root of trust. We had to invent off-box network and storage virtualization. The thing to remember here is that bare metal is, was, and always will be a first-class citizen in OCI, specifically because it fulfills all of our goals around performance, efficiency, and security. Now, Oracle Database, as you can imagine, at Oracle, is a very important piece of technology. Through a combination of things like RAC and Exadata, it was very clear to us that we would need to support RDMA networks early on. We took on that challenge.

We made a secure way to actually dynamically hard partition RDMA networks. That means you get the full performance benefits, but all of the security enhancements you expect from a fully virtualized cloud. That same design then enabled our HPC and our GPU networking environments. We also focused on the highest performance block storage service that we could create. We designed this service for the most extreme workloads. We're constantly adding more throughput, more IOPS, and more scalable volumes all the time. One of the big decisions we made early on was to make sure that all of our services are available in all of our regions. That may seem obvious as a thing to do. It turns out it's actually not how many other people do it. What they have is a giant Swiss cheese chart where you have to look up a decoder ring. There are multiple dropdowns.

You pick your region, then you pick to see what services are available and what hardware types are available. We found that to be far too complex. Instead, we have a simple solution: everything is available everywhere. That makes it easy for customers, and it actually makes it easy for us because we're also a customer of our cloud. Anything less than that just creates far too much complexity. Performance is also something that was important to us. We also wanted to commit to our customers, so we created performance SLAs that apply across all of our regions and all of our different region types. This gives customers the comfort that they can always rely on the best performance from Oracle. When pricing is complex, customers no longer understand it. Suddenly, there are entire job functions that are created to try to understand the complex pricing.

They invent new tools to be able to analyze and understand their bills. We found it's actually much simpler if we just have a single consistent price across all of our regions. Accessing your data should not be expensive. We made a decision early on that within an OCI region, across all of our data centers, it would be free to transfer data around. We also put a lot of effort into optimizing our internet transit costs and our backbone costs, which resulted in us having the lowest egress fees by a factor of 10. We went and worked with our partners like Microsoft and Google, such that within a cloud region, our multi-cloud interconnect enables zero egress fees between our cloud and their cloud. It is not enough to focus on the aggregate availability of your cloud.

We found it was really important to focus on the availability and performance of each individual VM. To do this, we focused on optimizing these availabilities through a combination of Ksplice, which supports zero downtime kernel upgrades, as well as implementing live migration, which enables us to do hardware maintenance without customer reboot intervention. We chose to make our infrastructure building blocks flexible, not brittle. What that means, say, about compute, is that you get exactly the cores and the memory that you want, and you're only paying for what you use. Our load balancer is infinitely flexible and scalable, and you don't have to pick between 15 different block storage volume types. There's a single volume type. You can dynamically change the performance in real time. How much more flexible is our compute offering? Compared to our competitors, we have 7,000x the configurations.

You might think that actually adds complexity. It doesn't, because it turns out that in those different configurations, each core costs exactly the same amount, and each gigabyte of RAM costs the same amount. Understanding what you're going to pay for is simple. Write down what you need, and you just pay for that. We designed our network, our hardware, and our services to scale down as well as to scale up. We designed our operations to scale out. We needed to deliver and operate in an immense number of regions. The combination of those two things enabled us to provide dedicated regions for individual customers. Each year, we continued to deliver more and more dedicated regions inside customers' data centers, bringing the best of OCI where they need it.
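The pricing model described here, where every configuration prices linearly by core and by gigabyte, can be sketched in a few lines. The unit prices below are hypothetical placeholders for illustration, not Oracle's actual rates:

```python
# Illustrative only: the unit prices below are hypothetical, not Oracle's rates.
# With flexible shapes, cost is linear in what you request: no shape lookup tables.

HYPOTHETICAL_PRICE_PER_OCPU_HOUR = 0.03  # assumed example rate, USD
HYPOTHETICAL_PRICE_PER_GB_HOUR = 0.002   # assumed example rate, USD

def hourly_cost(ocpus: float, memory_gb: float) -> float:
    """Cost of a flexible shape: pay only for the cores and memory requested."""
    return ocpus * HYPOTHETICAL_PRICE_PER_OCPU_HOUR + memory_gb * HYPOTHETICAL_PRICE_PER_GB_HOUR

# Any of the thousands of valid configurations prices the same way:
small = hourly_cost(1, 8)     # 0.03 + 0.016 = 0.046
large = hourly_cost(16, 128)  # 0.48 + 0.256 = 0.736

# Scaling both dimensions by 16x scales the price by exactly 16x.
assert abs(large - 16 * small) < 1e-9
```

Because cost is a pure linear function of the two dimensions, there is no cheaper or more expensive "shape" to hunt for; writing down what you need really is the whole pricing exercise.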

Our ability to scale down and our ability to build and operate so many regions also opened up new possibilities for us. Suddenly, we could put OCI inside of other clouds, and we could bring the full functionality of our data platform to the other cloud environments. What that means for customers is that they get the same exact service, the same great hardware, and it's available in all of the cloud environments. I don't know if any of you have tried to do this. I personally think it's actually really hard to start your own cloud from scratch. Don't ask how I know. We built OCI as an extensible platform, making it easy to extend with new services. By doing that, we enabled ourselves to create Alloy, which combines OCI, Fusion Applications, and other custom IP.

The combination of that enables others to become cloud providers themselves. That's kind of a whirlwind tour of some of the key choices we made along the way, but it's just a subset of many of the important decisions that we've made. The thing I want you to take away from that is that we're constantly focused on living up to our commitment to be the highest performance, lowest cost, and most secure infrastructure possible. We get closer to that ideal every single day. Let's take a moment to look at a customer that's taking advantage of a bunch of this technology.

With over a billion users worldwide, TikTok has transformed how people discover, create, and connect. What began as a platform for short videos has become a cultural force, shaping music, fashion, education, and more, turning everyday creators into global stars. Powering that experience is ByteDance's world-class infrastructure, built to empower creativity at scale. Through its partnership with Oracle, TikTok has been able to deliver a seamless experience to users around the world. Please welcome to the stage Head of Infrastructure Engineering of ByteDance, Fangfei Chen.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Thank you.

Clay Magouyrk
CEO, Oracle

You're in the book.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Thank you.

Clay Magouyrk
CEO, Oracle

Fangfei, it's been a journey, sir. Thank you for coming.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Thanks for having me.

Clay Magouyrk
CEO, Oracle

I took out my iPad, which makes sure I hit all my critically important talking points. It's really hard to memorize things. That's not a skill set I have. I don't have many skills. Memorizing stuff is not one of them. Fangfei Chen's much better at remembering than I am. You agree?

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Yeah.

Clay Magouyrk
CEO, Oracle

OK, good. OK. We've been working together for more than five years. Since the beginning of this partnership, we both know that every one of these videos is going to run on some of the most sophisticated infrastructure in the world. What I can tell you I didn't anticipate, and I don't think you fully anticipated, was how fast we had to scale this infrastructure up, both in a single location as well as globally, and all of the unique engineering challenges that came with that. For us, it's not just about providing a cloud infrastructure. We really had to come together and engineer this as a team to make sure we have the performance, the reliability, and the scale. Let's start there. Can you tell us, what is at the heart of TikTok?

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Sure. Again, thanks for having me. It's very exciting to be here after working on this for five years. TikTok's mission has always been to inspire creativity and bring joy. That means we want to be the canvas for people to create, the window to discover, and the bridge to connect, really doing that on a global scale. When we say global scale, today on the platform, we have well over 1 billion users globally. Within the U.S. alone, we're talking about over 170 million users who generate approximately 20 million videos every day. We're also proudly supporting 7.5 million small businesses on the platform. As the infrastructure guy, this really translates for me to a ton of infrastructure demand. We're really talking about millions of servers, zettabyte scale of storage, and hundreds and hundreds of terabytes per second of network capacity.

Today, even the smallest deployment we put together will require tens of thousands of servers. That really set a high bar for an infrastructure provider like Oracle.

Clay Magouyrk
CEO, Oracle

We're learning Fangfei's memorization skills are about as good as mine.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Let's see.

Clay Magouyrk
CEO, Oracle

Does that help, Fangfei?

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Yeah.

Clay Magouyrk
CEO, Oracle

I agree. You are really awesome. You are awesome.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

The example we have with Oracle here is that the way we integrate with OCI really is deeply at the network layer. Because of that, we need hundreds of terabytes per second of interconnection traffic. Integrating that way actually translates to thousands of FastConnect connections. I believe that's one of the reasons Oracle was the first to release 100G FastConnect and then 400G. Together, we're really pushing the boundary of infrastructure evolution. We need to continue to do that because the business grows. Since 2021, when we first deployed, we really have seen a 60% increase in our monthly active users. If you add the complexity and new features to the platform, our infrastructure needs to scale even beyond that. That's really at the core of the problem we need to solve.

Clay Magouyrk
CEO, Oracle

I could not agree more. I feel like we talk about these numbers these days. I remember when a terabyte was a lot. I remember when a petabyte didn't exist. Now suddenly we throw them around like, oh, it's just a few hundred petabytes of network traffic. It's fine. To do this, we really had to invest a significant amount of effort to design this network fabric together. I remember us going through multiple iterations of the JFAB in Virginia, for example. There have been a lot of learnings that I know we at Oracle have gained from working hard to maintain and operate consistently at this scale. What I would say here is that I want to thank you and the team. It was not always an easy road. Sometimes there are bumps. It has been amazing to work together to solve those problems.

I'd like to understand a bit more about what's actually driving the scale and that growth. You talked about 60% growth. I don't know if many of you remember, but TikTok was pretty big in 2021, so 60% growth in monthly active users over the past four years is huge. At some point, there are just no more people on Earth to use the service, which is a separate problem, and not my problem with Oracle Cloud Infrastructure yet. We're not quite reaching saturation of all people on Earth. One of the things that I remember was huge was when you actually launched TikTok Shop. What are some of the new ways that your users have used the platform that you just didn't expect early on?

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Yeah, TikTok Shop is definitely a great example. It's unique because it's a different type of shopping experience. It's different from the traditional web-based shopping. It's all about live streams. In the past three years, we have seen the number of live streams double on the platform. For shopping events like Black Friday, you can even observe the number of shoppers double on a single day. The good news is some of the shopping events are predictable. The way we handle the capacity is to basically plan them ahead of time. The way we work with Oracle here is we plan in terms of demand and supply, just like how you would plan for your supply chain. We do that in lockstep. The challenges come in when plans sometimes have to change, and very often on short notice.

Thanks to the flexibility the OCI team has provided to us, we can accommodate those changes. At the end of the day, we're making sure we have enough capacity at exactly the time we need it. Other types of events may not be so predictable. The way we handle those is we need to build a smart load balancing system so that we can take into consideration all sorts of information we can gather, including the telemetry we get from OCI. We're even sometimes tapping into something deeper, like data center information, temperature, and power capping. We can really respond to those spikes fast and precisely while maintaining efficiency and stability.
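The telemetry-driven load balancing described here might look something like the following sketch. The signal names, thresholds, and weights are invented for illustration, not ByteDance's or Oracle's actual policy:

```python
# A minimal sketch (assumed signal names and weights) of telemetry-aware load
# balancing: each backend is scored from health telemetry, and traffic shifts
# toward backends with the most headroom.

from dataclasses import dataclass

@dataclass
class BackendTelemetry:
    name: str
    cpu_util: float        # 0.0 - 1.0
    p99_latency_ms: float
    power_capped: bool     # data-center power capping signal
    inlet_temp_c: float    # data-center temperature signal

def headroom_score(t: BackendTelemetry) -> float:
    """Higher is better. Weights are illustrative, not a production policy."""
    score = (1.0 - t.cpu_util) * 100.0  # spare CPU capacity
    score -= t.p99_latency_ms * 0.5     # penalize slow backends
    if t.power_capped:
        score -= 50.0                   # strongly avoid power-capped hosts
    if t.inlet_temp_c > 32.0:
        score -= 20.0                   # thermal pressure penalty
    return score

def pick_backend(backends: list[BackendTelemetry]) -> str:
    """Route the next request to the backend with the most headroom."""
    return max(backends, key=headroom_score).name

fleet = [
    BackendTelemetry("dc-a", cpu_util=0.55, p99_latency_ms=20, power_capped=False, inlet_temp_c=27),
    BackendTelemetry("dc-b", cpu_util=0.30, p99_latency_ms=25, power_capped=True, inlet_temp_c=26),
    BackendTelemetry("dc-c", cpu_util=0.40, p99_latency_ms=22, power_capped=False, inlet_temp_c=28),
]
# dc-b has the most spare CPU, but its power-capping signal steers traffic to dc-c.
```

The point of folding in deeper signals like power capping is visible in the example: a backend that looks best on CPU alone is deprioritized once the data-center telemetry is considered.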

Clay Magouyrk
CEO, Oracle

Completely agree, right? I think, obviously, in the retail space, Black Friday is critical. One of the things I'm very proud of is that I think our teams have an amazing Black Friday readiness program that we implemented. We do that every year. If you didn't know, it's coming up again. It is really important, I think, that we do that, and we make sure your customers are happy again this year.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Look forward to that.

Clay Magouyrk
CEO, Oracle

Yes. When you manage such a large-scale infrastructure and you grow at such a fast pace, what do you have to do to maintain that user experience? Because look, if Fangfei runs all of this infrastructure, both cloud as well as a whole bunch of other infrastructure, you're kind of in that middle ground where we're here, and then you're there, and then you've got all your customers yelling at you. What do you have to do to make things work?

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

That's a great question. At TikTok, we're really obsessed with user experience. As a matter of fact, we even test almost all of our engineering work against user performance and user experience metrics. The metrics could be whether the user liked the video, whether they finished watching the video, or they just simply swiped through the videos. We do that even for the infrastructure layer. This is definitely beyond the typical network latency or response time type of metrics. Basically, we want to make sure whatever infrastructure solution we put together, users will stay engaged with our platform. Once you establish those user experience metrics, you really would consider stability as the most important factor contributing to user experience. You definitely want to avoid large-scale incidents, failures, outages at all costs. To me, very often the smaller things matter as well.

It could be a minor code bug or someone pulled the wrong cable in the data center. That happens, and it could cause a cascading failure to your infrastructure. For that, you cannot simply just rely on the SLAs because those are the minimum requirement. We need something more proactive and more day-to-day. The way we work with Oracle is we actually establish a set of joint stability goals. Those are top-down goals, which means those are sponsored and supported by senior leadership on both sides. The two teams, when they conduct daily work, have stability as the top priority in their mind. When they develop new features or they roll out changes, they will have stability as a priority. On top of that, we share full transparency at the infrastructure layer. We write our operation procedures together. The two teams work together seamlessly, as one team.

I have to say there's no secret here. What we have done together, there's a lot of hard work together, and you have to do that consistently every single day.

Clay Magouyrk
CEO, Oracle

I completely agree. Look, the reality is that when we are having problems, which we hope are rare, and they are, the key is to have both of our teams working together. I've seen it many times. There'll be something happening, whether it's a plan for the future or a game day, kind of like what we do with Black Friday, or an actual operational incident. We bring everyone together. We're looking at the same data. We have all access to the same whiteboards. We have definitely learned a lot. What I wanted to say to you, and I think to everybody at TikTok, I can tell you from personal experience, OCI and Oracle would not be where we are today without all of the opportunity and the learnings that we've had from serving you and your customers. Thank you very, very much.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

We really appreciate the partnership as well. I really appreciate the comment as well. Oracle has definitely played a vital role in TikTok's growth. I do want to take the opportunity to thank teams from both sides for their hard work, which made everything possible. I also want to emphasize that the work we do really matters because TikTok is not just a fun app. It's a platform that creates opportunities for millions of people and businesses. We're really trying to make the world a better place. For that, we're sharing a view of our economic and social impact report. I strongly encourage the audience to take a look.

Clay Magouyrk
CEO, Oracle

Look, Fangfei, it has been amazing to work with you over the past few years. I'm very excited for what comes next. Thank you for coming out here and sharing your story with everybody. Thank you.

Fangfei Chen
Head of Infrastructure Engineering, ByteDance

Likewise, thank you, Clay.

Clay Magouyrk
CEO, Oracle

We've covered a lot of the investments into our foundation and the very real impact that those investments are having on our customers. Something that may not be obvious, but it's important that you all understand, is that during this same time period, we've also been performing a major upgrade to the foundation of OCI. It was not enough to design OCI differently. We have to continually refresh our architectural core to take advantage of advancements in hardware, as well as from everything that we've learned, both in software and operationally. Today, I'm very pleased to announce Acceleron after many years of hard work. This project is directly focused on our core mission of performance, efficiency, and security. Acceleron is already today used by all of our customers in some fashion. With these new additions, we'll significantly improve the infrastructure experience for all of our customers.

What is Acceleron? It's a combination of our software and architecture for securing and accelerating all of our input output. It's a combination of host accelerators, fabric architectures, and fabric accelerators. We've been working on this for a while. What I want to do now is take a minute to cover some existing investments and today's new capabilities. First, let's start with dedicated network fabrics. I don't know how many of you have tried designing a network with zero networks. If you have, it turns out it doesn't work so good. Great for security. Security people love it. If you just take everything off the network, super secure. Does it improve performance or availability? Also, something to think about is that when we design cloud networks, they need to be designed to be non-blocking, not oversubscribed, to enable flexible placement in a multi-tenant environment. That's what we do.

At a minimum, you need one of those networks. You then get the choice, is one network enough? We made a very conscious decision to move from one network to two, specifically to enable RDMA fabrics for things like Exadata. We had to design a system that provided complete RDMA performance while also supporting multi-tenancy. That network served very well for Exadata and then became extremely important for our HPC business. AI also needs RDMA, but different. AI workloads need a much bigger cluster size. They care a lot more about total throughput than the absolute lowest latency. What we've done is we've created a unified architecture that allows you to scale up and down the size of your dedicated networks. They can be either latency or throughput optimized.

All of this is configured in a secure, hard-partitioned manner that gives you all of the performance you expect from a dedicated network, but all of the security enhancements you expect from a fully virtualized cloud. The next thing I want to talk about is disintermediation. Disintermediation really is the concept of removing something. First, I might need to explain what it is that we're removing. Anyone that's ever configured networks knows you have basic network functions, and then you have advanced network functions. Hey, I want to just make a connection, talk to somebody. That's a basic function. If you want to do things like network address translation or peering two networks together, traditionally that's done through what are called middle boxes. Middle boxes can be physical or virtual. The good thing is they add this functionality. The bad thing is that they add latency.

They can be performance bottlenecks, they can be difficult to scale, and they can result in overall reduced availability. The solution to this is to get rid of the middle boxes, and that concept is called disintermediation. It sounds easy, but it's actually quite hard to implement. To do this, you need a very flexible software architecture that allows you to seamlessly move network functions from one location to the other. We've been working on this for years, and it's actually already deployed across many of our different network systems. The net result is that you get significantly lower costs, and if you do it right, you can then pass on those savings to customers. That's a big part of the reason why before, when I was talking about our pricing, we have significantly better network fees. It's because we've done engineering work like disintermediation.
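As a toy illustration of the idea (not OCI's actual implementation), the same network function, here a NAT rewrite, can run either at a middlebox hop or inline where the packet already is. Disintermediation removes the hop without changing the function:

```python
# Toy model: the identical NAT function applied at a middlebox vs. inline.
# Addresses and the NAT table are invented for illustration.

def nat_rewrite(packet: dict, nat_table: dict) -> dict:
    """Rewrite a private source address to its public mapping."""
    return {**packet, "src": nat_table[packet["src"]]}

NAT_TABLE = {"10.0.0.5": "203.0.113.7"}

def deliver_via_middlebox(packet):
    # Packet detours through a dedicated NAT box: an extra hop that adds
    # latency and is a bottleneck and availability risk.
    rewritten = nat_rewrite(packet, NAT_TABLE)
    hops = [packet["src"], "nat-middlebox", packet["dst"]]
    return rewritten, hops

def deliver_disintermediated(packet):
    # Same rewrite applied where the packet already is (e.g., in the host's
    # network virtualization layer): one fewer hop to slow things down or fail.
    rewritten = nat_rewrite(packet, NAT_TABLE)
    hops = [packet["src"], packet["dst"]]
    return rewritten, hops

pkt = {"src": "10.0.0.5", "dst": "198.51.100.9"}
r1, hops1 = deliver_via_middlebox(pkt)
r2, hops2 = deliver_disintermediated(pkt)
assert r1 == r2                      # identical network function...
assert len(hops2) == len(hops1) - 1  # ...without the middlebox hop
```

The hard part in practice, as the talk notes, is the software architecture that lets the function migrate between locations seamlessly; the function itself is unchanged.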

Next, let's talk about converged NIC. This is something new that we're launching now and comes out with our next generation of hardware. Before I can talk about converged network interfaces, let's talk about the existing architecture we've had at OCI since its inception. We started by having a separate host NIC and a smart NIC. The reason for that is purely based on security. The host NIC is given to the customer. They have complete control over that. The smart NIC is controlled by Oracle. They only talk to each other over a network interface. Our original architecture optimized for security over ease of use and performance. The downside of having these two separate NICs is that it can be expensive because you have to pay for two NICs. You have a performance hit because you have to process the packet twice.

You can only expose a network interface. You do not have the ability to do things like expose an NVMe interface. We did this because it made bare metal possible. It's actually pretty easy to get rid of the host NIC and have only a single smart NIC. You just take the host NIC out, put the smart NIC in, you're done. To do that, though, you have to rely on compute VMs for isolation. You reduce your costs and you get better efficiency and latency, but this reduces your security posture. This is just not an option we were willing to accept. We went out and we designed a new architecture. What this architecture does is on a single smart NIC, we have a hard partitioning between the customer NIC and the provider NIC. What happens is we have dedicated cores and memory for the customer NIC.

We have dedicated cores and memory for the provider NIC. They still only communicate over network packets. Instead of having two separate cards and an Ethernet cable in between, what you have is two separate sets of dedicated hardware with a shared ring buffer that you process as network packets in between. We did all of this in collaboration with the AMD networking team. It's been an amazing job so far. I'm very excited for the benefits this provides for us and our customers. Now, with a converged NIC, you get all of the security benefits we talked about with two separate NICs. You also get an NVMe interface for block storage. You get line rate encryption for all of your traffic. You get seamless patching of even your bare metal host NIC. Suddenly, I can do bare metal but still patch your NIC for you.
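The handoff pattern just described, two isolated partitions that share nothing except a bounded ring of packets, can be sketched as follows (names and sizes are illustrative, not the actual hardware design):

```python
# A minimal sketch of the converged-NIC handoff pattern: the customer
# partition and provider partition keep dedicated resources and exchange
# packets only through a fixed-size shared ring buffer.

class PacketRing:
    """Bounded single-producer/single-consumer ring buffer."""
    def __init__(self, capacity: int):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.head = 0  # next slot to consume
        self.tail = 0  # next slot to fill

    def push(self, packet) -> bool:
        if self.tail - self.head == self.capacity:
            return False  # ring full: producer must back off
        self.slots[self.tail % self.capacity] = packet
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None   # ring empty
        packet = self.slots[self.head % self.capacity]
        self.head += 1
        return packet

# Customer side enqueues; provider side dequeues. Neither side touches the
# other's cores or memory, only the shared ring, so the isolation boundary
# is the same as with two physical NICs and a cable between them.
ring = PacketRing(capacity=4)
for seq in range(3):
    ring.push({"seq": seq})
received = []
while (p := ring.pop()) is not None:
    received.append(p["seq"])
assert received == [0, 1, 2]
```

The design point is that the communication channel is still "network packets in a queue," just without the second card and the wire, which is where the cost and latency savings come from.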

You can get twice the available throughput for compute because you're not processing the packets twice. Next, I want to talk about Zero Trust Packet Routing. Traditionally, network architecture and network security are intertwined. If you go look at any complex network, you've got a whole bunch of subnets and ACLs. You've got routing rules. The reason for those is both for connectivity as well as for enforcing security boundaries. With Zero Trust Packet Routing, which we launched last year, it enables you to write security policies in a security policy language for networks. Your network architecture is just about network architecture. The result of that is that now you can analyze your security policies individually. It's much safer and much easier to use. To illustrate the enhancements that we've added to Zero Trust Packet Routing, I'm going to walk through a simple example. That example is object storage.

How do we prevent it from being used to exfiltrate data out of a cloud environment? First, we enable private access from our database to object storage for things like backups. This ZPR policy at the bottom enables those database hosts to talk to object storage, but only through the private service access that's been created. Next, we use an IAM deny policy to prevent any usage of object storage except through private service access. The combination of those two policies prevents any usage of object storage from the internet. That means that even if someone were to steal credentials that would, in theory, give them access to that object storage bucket, they can't get the data out through the internet, because the combination of the ZPR and IAM deny policies simply does not allow it.
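
A toy model of how those two policies compose, with hypothetical names and a deliberately simplified evaluator (this is not OCI's actual policy engine): even a request carrying valid stolen credentials is denied unless it arrives over private service access from a database host.

```python
def object_storage_allowed(request):
    """Simplified composition of the two policies from the example:
    - IAM deny: any object storage use outside private service access
      is denied, even with valid credentials
    - ZPR: only database hosts may use that private path
    """
    # IAM deny policy: the path must be private service access, full stop.
    if request["path"] != "private_service_access":
        return False
    # ZPR policy: only the database hosts may reach object storage this way.
    return request["source"] == "database_host"


# Valid credentials, stolen and replayed from the internet: still denied.
stolen = {"source": "internet_client", "path": "internet",
          "credentials": "valid"}
# A database backup over private service access: allowed.
backup = {"source": "database_host", "path": "private_service_access",
          "credentials": "valid"}

print(object_storage_allowed(stolen))  # False
print(object_storage_allowed(backup))  # True
```

Note that the credential field never even enters the decision for the stolen request: the network-path check fails first, which is exactly the exfiltration defense being described.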

I want to talk a bit about multiplanar networks. Before that, I probably need to explain what I mean by that. Almost all of the networks we think about today exist inside a single plane. In fact, it's so common we don't really talk about it. Single-plane networks expose a couple of single points of failure, typically between the host and the next layer in the network. Oftentimes, your [T0] layer is also a single point of failure. Some networks are designed with two planes. Anyone that's used a traditional enterprise network has seen a separate Fibre Channel storage network and a front-end network, and some mission-critical workloads even have redundant front-end networks. The downside is that it can be expensive and hard to manage. What you can do is take a single-plane network and split it into multiple planes.

You get some benefits. It creates redundancy. The downside is that it's hard to use, because suddenly your computer goes from having one network interface to, say, eight network interfaces. On those eight, the maximum size of a single flow is also reduced. If you instead expose a single plane to the host, which is what we're doing now with Acceleron, and implement multiple planes behind the scenes, you get all of the benefits of a multiplanar network: higher overall availability, because you now have redundancy across those layers; lower costs, because you can build smaller networks with fewer switches; and better performance, because there are fewer hops in the network. You don't get the downside of it being hard to use.
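
One common way to hide multiple planes behind a single host interface is to hash each flow's 4-tuple onto a plane, so a flow's packets stay in order while different flows spread across planes, and a failed plane's flows rehash onto survivors. The sketch below illustrates that general technique; it is an assumption for illustration, not Acceleron's actual plane-selection logic.

```python
import hashlib

PLANES = 8  # planes implemented behind the single host-visible interface


def plane_for_flow(src_ip, dst_ip, src_port, dst_port, healthy=None):
    """Pick a network plane for a flow by hashing its 4-tuple.
    The host sees one interface; the fabric spreads flows across
    planes and reroutes around any plane that has failed."""
    healthy = healthy if healthy is not None else list(range(PLANES))
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return healthy[digest % len(healthy)]


# The same flow always maps to the same plane, so packets stay ordered...
p1 = plane_for_flow("10.0.0.1", "10.0.0.2", 40000, 443)
assert p1 == plane_for_flow("10.0.0.1", "10.0.0.2", 40000, 443)

# ...and if that plane fails, the flow lands on one of the survivors.
survivors = [p for p in range(PLANES) if p != p1]
assert plane_for_flow("10.0.0.1", "10.0.0.2", 40000, 443, survivors) != p1
```

This is also where the single-flow bandwidth caveat above comes from: because one flow pins to one plane, its peak rate is bounded by that plane, while aggregate throughput across many flows uses all planes.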

Hopefully, now you understand that Acceleron is a foundation for all of our I/O security and acceleration functions. With these new additions, customers will see significantly higher peak performance. They will see lower costs across our infrastructure, and they will have better ease of use and increased functionality while also receiving increased security. We're just getting started. There are many more exciting Acceleron enhancements coming soon. Instead of just talking about the technology, why don't we take a look at another customer that's actually using a lot of this technology already?

We've all been wowed by ChatGPT. The large language model is only the beginning of what OpenAI hopes to achieve with its cutting-edge AI technology. OpenAI's next installment of its flagship model, GPT-5, comes with improved reasoning, memory, and multimodal capabilities. New APIs will allow more businesses to integrate this powerful technology into their operations, for example, by developing and deploying custom GPT models. OpenAI is also driving innovations in robotics, language translation, and video generation, all with a crucial commitment to uphold human values and ethical standards. Please welcome to the stage Vice President of Infrastructure and Industrial Compute of OpenAI, Peter Hoeschele.

This is what I'm saying. I'm like, yeah, I'll go around the back.

Peter Hoeschele
VP of Strategy Infrastructure and Industrial Compute, OpenAI

Oh, I'm going to go around the back. That's what they told me.

Clay Magouyrk
CEO, Oracle

You did well. Peter and I were talking about how long the walk is. It really is quite long. Peter, thank you for coming. For those of you that don't know him, Peter does a lot. He is responsible for all of the infrastructure at OpenAI. All right, we've got a few minutes. As AI continues to accelerate, and you know this more than anybody else, the industry is facing a real challenge: how do we actually get enough compute capacity? How are you thinking about that problem?

Peter Hoeschele
VP of Strategy Infrastructure and Industrial Compute, OpenAI

Where I want to start is that I met Clay about a year ago. I think it was last May. I was freaking out because we were having another launch, and our researchers didn't have enough compute. Twitter was alight with complaints that we didn't have enough, and then this team comes along. Jay Jackson, Luis is out here somewhere. I found them online, truly through LinkedIn, and said, do you have capacity? These guys show up with 200 MW of capacity, which, watching your presentation, was really interesting. For those of you who've been in the industry for a while, 200 MW is passé now, but five years ago, 5 MW would have been a crazy amount to have. These people showed up with not only capacity, they showed up with an intelligent cluster design. They showed up understanding all of our needs.

They showed up understanding our security requirements. All the things that Clay went into from a technical perspective, they showed up with, and that blew my mind. I'm going to talk for a second about what it takes to do this. Do you guys deserve a round of applause? I'm very serious. It has been so inspiring watching Clay and the team build OCI. I'd love it if you all gave them a round of applause for what they built. It's incredible. Seriously, you deserve everything. When I think about where we're at, we are in the stage of what I call industrializing compute. It's not just a build-to-suit data center in Northern Virginia anymore. It is hitting on every single lever and trying to maximize them at every moment, from power all the way through to silicon.

That's where, when I think about the work that we've done together, it's the same thing. NVIDIA has done incredible work, but Clay and the team have also come to me and said, would you like to do anything else? When AMD and Lisa Su were saying, hey, we really want to push the next generation of AMD, we're not getting as much traction as we'd like, we went to Oracle. Oracle said, we will do anything to make this work with you. We had a huge announcement last week about this. We're super excited. We had our Broadcom announcement, all of the new product integration that these folks are going to do to help us continue to pull every single lever we need to hit another 10x over the next, hopefully, two years.

Clay Magouyrk
CEO, Oracle

OK, just so you know, I did not tell him to say those nice things about me. That's what I would say even if I had told him to say those nice things about me. Something I think people want to understand, and you know this better than anyone else: we've got these compute constraints. You've got capacity that can be used for training and optimizing new things. You've got capacity that can be used for developing new models and doing more research. You've got capacity that you need for actually serving your customers. How do you think about the need to balance across those? How does it change what we build?

Peter Hoeschele
VP of Strategy Infrastructure and Industrial Compute, OpenAI

Yeah, and even that's been a massive paradigm shift. I joined OpenAI in 2019, and that's when people were just hearing about what we call the scaling hypothesis, this idea that as you use more compute, you get more capable models. Our strategy was always to hit at the front end of whatever the newest generation of GPUs was, get the biggest cluster that we could find, design the biggest cluster, and pre-train one of these models, which just means lots of synchronous workloads happening all at once, and then see if we continued on that line, whether the model kept getting better. We launched ChatGPT. Some people have heard this story now.

I still remember sitting in a conference room in December of the year it launched, and the researchers said, no, no, no, no, no, don't worry about it, as I was freaking out about the amount of capacity that we had. They said it was just going to be a low-key research preview; we'll pull it down in two weeks. We just want to show the world this research that we've done. That was our first big learning in this idea that we have to have a fungible fleet. We have to have the ability to go from running these large pre-training models to saying, actually, we're launching Sora now, and for the next three weeks, we need to be able to flex into that. That wasn't a capability that we had internally.

That's where, again, all of the things that you were pointing out, from multiplanar networks to security, which is very different between these different workloads, we have to be able to build into that. I get asked all the time, how do you think about training versus inference? I'm telling you all, let's stop talking about training versus inference. We are in a new regime now where the models are ideally constantly running, constantly going through sampling and training, and getting better all of the time. That matters because for business use cases, for databases and other things, as we start feeding data back into the models, they are going to get better and better for each of these individual use cases. That's where the extensibility of the platform becomes really important as well.

Clay Magouyrk
CEO, Oracle

OK, you mentioned it earlier. We've been working together for a little over a year. It's funny that you thought you were calling Jay out in a good way when you said you found him on LinkedIn. I'm like, Jay, it's kind of your job to contact him. We'll talk about that later. Thanks, Peter. Did you expect us to make this much progress together so quickly? How does what we're doing compare to your expectations at the beginning?

Peter Hoeschele
VP of Strategy Infrastructure and Industrial Compute, OpenAI

I mean, that's truly why I called for the round of applause. There is nobody like you all out in the market. I think, I hope, everybody wins. I kind of take an ecosystem view of this. What you're helping us do is push some of the more traditional players to really think from a customer-centric viewpoint first, and about the flexibility that we're going to need around compute. When we think about co-designing things, we think about the clusters that we're working on together. We think about the trade-offs that you have to make to be able to provide us more capacity, some of the multiplanar stuff. There's something called Zettascale10. You all should look it up. There was a great press release about this.

I have been blown away not just by your willingness to co-design and co-engineer things, but by how everybody across the OCI stack understands these things. We joked about Jay; I won't mention him again. Very seriously, down to people like Jay who understand these problems deeply from a technical perspective, which means that our cycle times drop. We don't have to negotiate a contract for multiple weeks. We don't have to rehash the engineering multiple times. I don't even have to call you to escalate things very often. Your team just understands our problems and brings us the capacity that we need so that we can continue scaling, so that when Sora comes up or a new product comes up, we're ready to go, versus spending multiple months on contracts, or hearing, oh, this doesn't fit in OCI, or whatever it is.

Clay Magouyrk
CEO, Oracle

OK, you've talked a lot about the things that have gone very well. What are some of the hurdles that you feel like we had to overcome that people should understand?

Peter Hoeschele
VP of Strategy Infrastructure and Industrial Compute, OpenAI

I mean, Abilene came together in 11 months. A data center of that scale (and I've worked on other ones at a similar scale) would normally take four years to plan. When we go through that, you deal with questions of, let's say, fungibility, where your finance team might say, and my comms team is going to kill me for this, but they might say, OpenAI is a startup. How can you possibly lean into them for this amount of investment? The point is that the technology and infrastructure that you all have developed are so fungible that we want to use them, and your finance teams can get comfortable that somebody else could buy it. We're using the same stuff TikTok U.S. is using. We're using the same stuff that I'm sure Milwaukee Tool is going to use. That's pretty incredible in this space.

Clay Magouyrk
CEO, Oracle

OK, you guys are expanding globally very quickly. I won't talk about the numbers, but in terms of users, in terms of revenue growth, everything, I've never seen a company grow like your company has grown. What is the most difficult thing you're dealing with as you manage that growth while also focusing on security and efficiency?

Peter Hoeschele
VP of Strategy Infrastructure and Industrial Compute, OpenAI

That's the $1 trillion question. Maybe I'll give one example here. There's a press release every week right now about a new Stargate. I think Stargate Argentina was announced last week. We had Stargate UAE, all these other things. Some of those difficulties are in the current policy environment. Let's talk bluntly about export controls and other restrictions on chips. There is always a question of, can we go serve here? Can we run capacity here? What are the requirements that both the government and our own security teams are going to have? It's been really interesting working with you all, because Oracle has become our one-stop shop for any country: go in and say, OK, you understand our internal security standards, you understand the policy requirements, just please go make this work. We've seen this now across many different countries in a very rapidly evolving environment.

It's a huge win for me that I don't have to think about that. I can rely on you as we do expand 10x year-over-year at this point in some of those locales.

Clay Magouyrk
CEO, Oracle

It's a great example. Peter, I really appreciate the work that you and your team have done. Working with your team has been incredible. It wasn't just that we moved fast; I think our pace and urgency are matched across both of our companies. Thank you for coming out here today and telling everybody your story. I'm excited for what comes next. Thank you very much.

Peter Hoeschele
VP of Strategy Infrastructure and Industrial Compute, OpenAI

Thank you.

Clay Magouyrk
CEO, Oracle

I promise I'm almost done. AI is clearly extremely useful, but it is only as good as the data it has access to. Public data is public, and it's exposed through standard, self-describing interfaces. Private data is none of those things. It's not public, and it's very rarely self-describing, but it's also where the most value is for each of us. One thing you could do to take advantage of this technology is put all of your data on the internet in a self-describing way and wait for the next training run, and the model will have the answers to all the questions you want to ask. That has some downsides. Let's assume you don't want to go with that option. What do you do instead? You need to bring secure, controlled access to the leading models next to your data.

We enable that by integrating the latest AI models into our GenAI service, and we're committed to always having the best and newest models available. Once you have the models, you need a way to bring all of your private data together. While you could do that by copying the data into a single location, that's not the only way. You can also create a shared index of all of your private data but leave the data in its system of origin. You then need fine-grained access control. This is critical because you need some way to control what the user of the AI has access to. As an example, it's great that the models can answer questions about your customers and your financials, but should all of your employees be able to get those answers from the model?
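
As an illustration of that kind of fine-grained control, here is a hypothetical sketch in which rows are filtered by a user's entitlements before any of them can reach a model's context. The tables, permissions, and function names are all invented for the example; they are not Oracle's APIs.

```python
# Hypothetical row-level filter applied before an AI model sees any data.
RECORDS = [
    {"table": "customers",  "region": "EMEA", "row": "Acme, $1.2M"},
    {"table": "financials", "region": "US",   "row": "Q3 rev $80M"},
]

PERMISSIONS = {
    "analyst_emea": {"tables": {"customers"},
                     "regions": {"EMEA"}},
    "cfo":          {"tables": {"customers", "financials"},
                     "regions": {"EMEA", "US"}},
}


def visible_rows(user):
    """Only rows the user is entitled to ever enter the model's context,
    so the model cannot leak data the user could not query directly."""
    grant = PERMISSIONS.get(user, {"tables": set(), "regions": set()})
    return [r["row"] for r in RECORDS
            if r["table"] in grant["tables"]
            and r["region"] in grant["regions"]]


print(visible_rows("analyst_emea"))  # ['Acme, $1.2M']
print(visible_rows("cfo"))           # ['Acme, $1.2M', 'Q3 rev $80M']
print(visible_rows("intern"))        # []
```

The design point is that the check happens in the data layer, per user, rather than trusting the model or the application to withhold answers.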

Once you can answer queries about your private data, that's a great start, but it's not enough. You then need to perform actions that influence that data, creating a virtuous cycle: ask questions, make a plan, take action, and repeat. AI is so much more useful when it can do things for us. Our AI database performs two of those key functions. In addition to being a repository for your data, it also acts as an index for external data stores, and it provides a single place to enforce access control. This is only possible because of our tireless work to improve our database. We unify all of your data types and access patterns into a single system. Only then is it possible to solve complex problems like integrating your private data with AI.
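
That ask, plan, act, repeat cycle can be sketched generically. Everything here (the `run_agent` driver and the toy ticket backlog) is hypothetical; a real agent would call a model for planning and real tools for acting.

```python
def run_agent(goal, ask, plan, act, max_steps=5):
    """Generic ask -> plan -> act loop: query the data, decide the next
    action, take it, and repeat until done or out of steps."""
    state = {"goal": goal, "done": False, "log": []}
    for _ in range(max_steps):
        facts = ask(state)          # ask questions of your (private) data
        step = plan(state, facts)   # make a plan for the next action
        state = act(state, step)    # take the action, updating the data
        state["log"].append(step)
        if state["done"]:
            break
    return state


# Toy wiring: work down a backlog of open tickets, then stop.
backlog = {"open_tickets": 3}
ask = lambda s: backlog["open_tickets"]
plan = lambda s, n: "close_ticket" if n > 0 else "stop"


def act(s, step):
    if step == "close_ticket":
        backlog["open_tickets"] -= 1
    else:
        s["done"] = True
    return s


result = run_agent("clear backlog", ask, plan, act, max_steps=10)
print(backlog["open_tickets"], len(result["log"]))  # 0 4
```

Each pass through the loop changes the data the next pass observes, which is the "virtuous cycle" being described: the actions feed back into the questions.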

Our AI database does this by enabling you to pull all of your data assets together by mounting external catalogs. You can use Real Application Security to put access control at the table, row, column, and even cell level, directly in the database. It keeps up-to-date vector indices of the data inside your Oracle database, as well as the data you have stored externally. It also supports open-source external formats like Apache Iceberg for seamless integration into your existing data flows. We've also created a new GenAI agent platform with the goal of easily integrating tool usage into your AI workflows. That platform comes pre-built with agents for common tasks like retrieval-augmented generation and coding. It's compatible with open-source frameworks, allowing you to reuse everything you do here in other locations.
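
To illustrate the idea of one index over data that stays where it lives, here is a toy unified vector index in Python. The character-histogram "embedding" and all the names (`UnifiedVectorIndex`, the source labels) are stand-ins invented for the example; a real system would use model embeddings, and the data references would point back to the originating store rather than copying its contents.

```python
import math


class UnifiedVectorIndex:
    """Toy unified index: embeddings of rows from the local database and
    from mounted external catalogs live in one searchable structure,
    while the data itself stays in its system of origin (we keep only a
    reference back to it)."""

    def __init__(self, embed):
        self.embed = embed
        self.entries = []  # (source system, reference, vector)

    def add(self, source, ref, text):
        self.entries.append((source, ref, self.embed(text)))

    def search(self, query, k=1):
        q = self.embed(query)

        def cos(v):
            dot = sum(a * b for a, b in zip(q, v))
            norm = (math.sqrt(sum(a * a for a in q)) *
                    math.sqrt(sum(b * b for b in v)) + 1e-12)
            return dot / norm

        ranked = sorted(self.entries, key=lambda e: cos(e[2]), reverse=True)
        return [(s, r) for s, r, _ in ranked[:k]]


# Stand-in embedding: a letter histogram (a real system would use a model).
def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    return vec


idx = UnifiedVectorIndex(embed)
idx.add("oracle_db", "orders/1042", "late shipment refund request")
idx.add("iceberg:lakehouse", "clickstream/77", "homepage banner click")
print(idx.search("refund for a late shipment"))  # [('oracle_db', 'orders/1042')]
```

The search result is a pointer (source system plus reference), not the data itself, which is what lets external rows stay in their system of origin while still being findable from one place.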

It's pre-integrated with the Oracle application ecosystem, making it easy to build agents to integrate with other Oracle applications. We are announcing our AI data platform that brings together the best models, the power of our AI database, and our new agent platform. The AI data platform solves problems for developers and business users. It makes it easy for developers to write custom applications. It also makes it easy for a data analyst to perform analytics across all of your data. It's great that we have this new platform. As we talked about earlier, location matters. This new platform only helps you if it's available where you need it. All of this is available in our cloud, in other clouds, and inside your own data center. Customers tell us they love the ability to get our data services in all of their clouds.

They want it to be easier to make financial commitments with more safety. We are launching multi-cloud universal credits. They enable a customer to contract with Oracle once, knowing they can deploy their database services in any cloud at the same price and with the same functionality. We're also announcing the general availability of Dedicated Region 25, our latest footprint for Dedicated Region. We've been continually shrinking the size of our regions. Our first Dedicated Region was actually more than 50 racks. Today, you get far more functionality in just three racks. This makes it easier for customers to take advantage of our cloud where they need it. OCI is in a constant state of reinvention. We have to always be looking for ways to improve our fundamentals so we can deliver on our promise of the best in performance, efficiency, and security.

Often, these improvements are subtle. You can't see them because they're behind the scenes. The accumulation of hundreds of thousands of these improvements results in a significantly better cloud. You can see our commitment to being better every single day, and you can see the value that has for customers, as we heard from Fangfei and from Peter. Tomorrow is another day, and you will find OCI continuing to focus on improving our performance, our efficiency, and our security. Thank you very much.
