Ladies and gentlemen, the program is about to begin. Reminder that you can submit questions at any time via the Ask Questions tab on the webcast page. At this time, it is my pleasure to turn the program over to your host, Tal Liani.
Hi, good morning. Thanks very much for rejoining us. Earlier this morning, you heard from the contract manufacturers that make the products for the big cloud companies. Then you heard from cybersecurity companies, and you heard from others. Now I'd like to welcome both Eyal Dagan and Rakesh Chopra to our conference. Eyal is Executive VP of the Common Hardware Group at Cisco. He is responsible for delivering silicon, optics, and hardware across Cisco's switching, routing, optical, and IoT portfolios. Eyal has an extensive background in the industry. Prior to joining Cisco, he was the co-founder and CEO of Leaba Semiconductor, a semiconductor company that was acquired by Cisco in 2016. Rakesh is a Cisco Fellow in the Common Hardware Group and has been with Cisco since 1997.
Rakesh runs a system architecture team that focuses on hardware platforms and also owns the business and customer engagements for selling Cisco Silicon One to external partners. So thank you, Eyal and Rakesh, for joining us today, and maybe before we start, I know that the distinguished IR team of Cisco has a forward-looking statement to read to us.
Thanks, Tal. This webcast is educational in nature, with no new financial information being given. We will be making forward-looking statements. The actual results may differ materially from those forward-looking statements and are subject to the risks and uncertainties found in our SEC documents, the 10-Q and the 10-K. I'll turn it over to Tal and Rakesh.
Great. So maybe, Rakesh, before you start, I'll just set the stage. When we think about AI deployments, we think about the underlying infrastructure, and there are many questions around it. Will the infrastructure change? What about InfiniBand versus Ethernet? Do a company's investments need to go up because the capacity and the data you need to deal with go up? We always say, and Cisco has been saying it for 25 years, "You need to have a good network." It's at the bottom of everything: if you want to deliver services, you have to have a great network. So having Cisco today is very important because we want to discuss the network.
We want to discuss the underlying demands of generative AI, as well as the different architectures, different deployment schemes, et cetera. So without further ado, I'll pass it on to Rakesh. We're gonna have about a 20-minute presentation and then open it for Q&A. As always, send me the questions via the portal, please. Rakesh.
Awesome. Thank you very much for the introduction, Tal, and as usual, you've sort of captured the entire essence of what we're talking about today. So with that, why don't we go ahead and jump directly into it? If we could advance the slide, that would be wonderful. One more, please. So I wanted to start off before we get into talking about the networking infrastructure, to just sort of normalize this all. AI/ML is, of course, the buzzword of the day. It's what everybody is out talking about, and for a company like Cisco, that's actually quite an interesting proposition because at the end of the day, unlike many companies in the industry, Cisco is a very, very large company, and we build a bunch of different types of products.
So when we're talking about AI/ML, I like to break it down into sort of two basic categories. The first is, you can imagine, that Cisco uses AI to improve our products and services that we offer to our customers. So for example, using Desk Pro or Webex, there's noise canceling that is sort of AI-powered. That right now is sort of filtering out the huge amount of construction noise that is happening directly by my ears. It does an amazing job. This is a really important piece of AI for Cisco, but that's not really what I'm here to talk about today. I'm here to talk about the right side of this picture, which is we also sort of sell our products to enable others to build AI networks. And that's again, where we're sort of focusing today.
So if we could sort of advance to the next slide. When we think about different networking architectures, you can think about trying to understand the data center in terms of various roles and responsibilities, and we've tended to focus on what's called the front-end network. The front-end network is designed to take general-purpose compute, x86 or ARM, interconnect it together through top-of-rack switches and spine switches, and then connect to the outside world via DCI, or data center interconnect, routing topologies to the wide area network. This is really where Cisco has played historically, and this is really all powered by Ethernet. So if I break that network down into two roles, there are roles within the network that are primarily switching, and silicon, systems, and optics are technologies that Cisco sells into that area.
Towards the upper end of this picture is the routing roles. Here, again, we sell silicon, system, and optics. Now, this area here is not what we consider the sort of AI/ML network. Clearly, this network needs to connect to AI/ML infrastructure, but when people typically talk about AI/ML networks, they're actually talking about the bottom part of this picture, the back-end network. So what actually ends up getting created is a dedicated network that is built to connect AI and ML compute infrastructures. Some people might call these GPUs, some people might call them specialized compute, but it's a network designed to allow these devices to communicate at very high bandwidths. That has historically been InfiniBand-based technologies, and what we strongly believe here at Cisco is that this technology will migrate towards Ethernet. Now, we've seen this play out in the past already.
So it used to be, for example, that storage was all done on back-end networks, but as technology evolved, storage moved to actually the front-end network with RoCE or RDMA over Converged Ethernet, riding the bandwidth over the Ethernet infrastructure. Now, what's interesting, though, about this back-end network for us here at Cisco, is this is actually a brand-new market opportunity for us. So it's a very high bandwidth network, a very critical network, and we at Cisco, we believe we're very well positioned here to sell silicon systems and optics here. This, again, is an additive network TAM for us, rather than replacing the front-end TAM. So if we could advance to the next slide, please. So why do I make the claim that it is moving towards Ethernet? At the end of the day, it is simply because it is built today on InfiniBand for historical reasons.
There's infrastructure built for high-performance compute, which is based on a sole-source GPU, sole-source switching technology, and InfiniBand interconnect. As we all know, AI/ML is exploding. There are multiple companies now building GPUs, whether that's vendors like Intel or AMD, or end customers like Meta and Google, who are all public about the fact that they are building their own GPU infrastructure. As they do that, they will end up having to use somebody's interconnect. Are they gonna use a sole-source interconnect brought in by somebody else, or are they gonna use something like Ethernet, which is widely available? So if you ask me, Ethernet is actually the inevitable answer in this transition. We could advance to the next slide. So as we move towards Ethernet, the question becomes: What's interesting about Cisco?
Why would somebody pick Cisco rather than somebody else for Ethernet-based technologies? At the end of the day, if I oversimplify it for a second, Cisco has the silicon technologies, we have the systems that we build around it, and we have the optics technology. So we're actually uniquely positioned in this market, in that we have all the key building blocks to enable an AI/ML-based network. And I'll go into more detail on this as we go along. If we could advance to the next slide. The other key point here is that it's not just about technology; it's about how we engage our customers. So you've always been able to buy full systems from Cisco, that's the silicon, the hardware systems, and the operating systems together, shown on the right of this.
What's interesting is that back in December of 2019, we announced disaggregated business models, and so now customers can buy components directly from us, whether it's silicon, gray optics, or coherent optics. You can buy that equipment directly from Cisco, build your hardware, write your software on top of that, or you could also buy white boxes, where we build the hardware platforms on top of the silicon, and you bring your own software. So again, another unique thing here about Cisco is that we have all the different business models allowing us to engage with our customers on the terms that they want to engage with us on. So if we can move to the next slide, please. Now, jumping into a bit the silicon piece, I want to talk about Cisco Silicon One. So we announced Cisco Silicon One back in December of 2019.
We made a big splash about it then, and we've been iterating on that technology ever since, coming back to the market twice a year or so with new advancements of the technology. At its heart, the Cisco Silicon One proposition is to erase the hard boundaries that exist in a network and focus on having one converged architecture that can be deployed across your network, across form factors. But we realized that convergence isn't enough. We have to be the absolute best technology in every single one of these roles. So regardless of whether it's a top-of-rack switch or a core router, we think about the key priorities for those individual roles; we want to be the best at each one of those things.
We then take this converged architecture with incredible performance, and we offer it to our customers in multiple different business models, giving our customers sort of one network and one experience, regardless of how they consume it. We could jump to the next slide, please. So as we think about Cisco Silicon One, the way we sort of think about driving this innovation strategy is, first of all, we have to have differentiated products. We have to have the best products in the industry. The second thing is convergence not only helps our customers deploy our technology, it actually allows us to leverage our investment, and sort of double down in terms of driving innovation at a lower cost, allowing us to do further innovations. But also, how we build silicon today is very different than how we built it in the past.
Cisco's always built silicon in what's known as the ASIC model. That's us doing the design and then working with a back-end partner to finish it. The other way that we ship products is by using third-party silicon providers, which, of course, has a margin stack on top of it. What we do now inside Cisco is we're a true fabless semiconductor. It's what's called a COT, or customer-owned tooling, model. That allows us to get our development cost down significantly. We own all of the IP, we drive our own roadmaps and our own technology transition points.... We then take this technology, we go to the market, and we try and win customers. Now, we have engaged very heavily in web scale over recent years, and the reason for that is web scale drives technology transitions at the high end.
And so we have to engage with these customers to make sure that we have the best products, both for the web scalers, but also for the rest of the market. They also have an amazing ability to drive very large volume, and that maximizes our revenue. Now, we go to those customers, and we sell them either silicon only, white boxes, or full systems, as we talked about before. We're the only vendor in the industry who offers all of these business models. Taken together, this drives our volumes way up, which drives our costs way down, and then we can take those additional margin dollars and reinvest them into this innovation cycle and continue the process forward. If we could advance to the next slide, please.
So as we continue to sort of think about how this plays out, we have recently announced our Silicon One G200 device. Now, we're very well aware that we're not the only 51.2 terabit Ethernet switch on the market, but to reiterate what I said before, we are the best 51.2 terabit piece of switching silicon on the market, and it is built specifically to optimize AI/ML networks. We have a lot of technology in this device in terms of load balancing and link failure avoidance, we've managed to halve the latency of our devices, and we've doubled the performance while keeping the power exactly the same. So we are doing things in this industry that nobody else is doing, and importantly enough, we've also announced to the industry that we're building our own 112G SerDes.
This is an incredibly important piece of IP. It takes a huge amount of effort to do, and we're very, very proud of the performance. To the end customer, this means that you can build cheaper networks, specifically for AI. We have capabilities in this device to allow for long passive copper cables, linear-drive optics, and co-packaged optics; all are possible with the G200 SerDes that we've invented. Now, if we could advance to the next slide. The other piece of G200, and this goes back to the notion of how efficient our Silicon One architecture is: everyone else is starting to throw things overboard in order to fit in the silicon die area that the process technology allows. One of the things people are throwing overboard is the Ethernet MAC.
That might not sound very exciting or very interesting, but if you look at how to build an AI/ML network connecting up to 33,000 400-gig GPUs, you can do that with two layers of networking with Silicon One, where others require three layers. Now, why does that matter? It matters simply because it requires 50% fewer optics, 40% fewer switches, a third of the network layers, and that saves a megawatt for every one of these clusters. Very, very, very significant savings as a consequence of the efficiency of Cisco Silicon One. If we could jump to the next slide, please.
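To make the two-versus-three-layer point concrete, here is a rough back-of-envelope sketch of endpoint capacity in a non-blocking folded Clos fabric. The formula and the 512-lane-of-100G figure for a 51.2 Tb/s chip are our own illustrative assumptions, not Cisco's published math:

```python
# Endpoint capacity of a non-blocking folded Clos, as a function of switch
# radix (port count) and number of layers. Each layer below the spine
# splits its ports 50/50 between downlinks and uplinks.

def max_endpoints(radix: int, layers: int) -> int:
    """Maximum endpoint-facing ports in a non-blocking folded Clos."""
    if layers == 1:
        return radix
    # Every extra layer below the spine multiplies capacity by radix/2.
    return (radix // 2) ** (layers - 1) * radix

# A 51.2 Tb/s switch exposes 512 lanes of 100G, i.e. 128 ports of 400G.
lanes = max_endpoints(512, 2)   # 131072 lanes of 100G in a 2-layer fabric
print(lanes // 4)               # ~33K endpoints at 400G (4 lanes each)
```

Under these assumptions, a chip with half the usable radix tops out around 8,000 400G endpoints in two layers, which is why a lower-radix device forces a third layer, with all the extra optics and switches that implies.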
Now, thinking about this one level more: our notion of convergence with best-of-breed performance, which started outside of AI/ML, applies very much to AI/ML as well. So as the industry moves to Ethernet, what we're seeing is that different customers value different things. What we can give our customers is that flexibility of choice, because we have a converged architecture. Customers can use fully generic Ethernet, giving them the ultimate in compatibility: whichever Ethernet switches they use within their network, Cisco or other, are all fully interoperable. The far end of that spectrum is number two, what we call Fully Scheduled Ethernet. This guarantees non-blocking network performance, giving you the ultimate performance with a very low job completion time for AI/ML networks.
And then we also have the middle ground of Enhanced Ethernet, which takes IP that we created for Fully Scheduled Ethernet and layers it on top of generic Ethernet, giving people a middle ground. So again, what we're finding here is that our customers each want a different answer, and we're here to meet them where they want to be. If we could advance again, one more slide, please. Now, here's just a summary slide to give you a sense of how we stack up against InfiniBand, and how Ethernet, Enhanced Ethernet, and Fully Scheduled Ethernet play out. InfiniBand was great for high-performance compute. It was great for non-multi-tenancy, or single-job, performance.
But as you think about AI ML infrastructure, you've got to worry about multi-job performance, and you have to worry about the pace of bandwidth improvements. As you move to the right in this picture, you're getting better and better performance. And again, because we have a technology that can do Ethernet, Enhanced Ethernet, or Fully Scheduled Ethernet, we work with our customers in an open and honest way for them to understand how these things actually relate and compare. Everybody else in the industry is picking one or two of these and trying to sort of convince customers that that's the right answer. If we could advance one more time, please. Now, all of this wouldn't matter at the end of the day, if the performance gains between Ethernet, Enhanced Ethernet, and Fully Scheduled Ethernet weren't significant.
If we were talking about a 1%, a 5%, a 10% mover, nobody would care. But at the end of the day, the performance difference between Fully Scheduled Ethernet and generic Ethernet is incredibly significant. There's a parameter called Job Completion Time, or JCT, which measures the amount of time that jobs take to complete on AI/ML infrastructure. And what you see is that Fully Scheduled Ethernet is about 2 times faster than generic Ethernet.
Now, what that means is you can complete your jobs quicker with the same network using Fully Scheduled Ethernet, or you can build half the size network at the same job completion time. And again, what we've done is we've taken a bunch of those technologies, layered them on top of generic Ethernet for what we call Enhanced Ethernet, and that gets about a 1.5 times improvement over generic Ethernet. So again, very, very significant movement in performance of AI ML workloads based on sort of Cisco technology. And if we could advance one more, please. Now, one of the questions we get a lot is: How have we been doing in the web scale business with Cisco Silicon One, and what's the impact of what we've done? Today, Cisco Silicon One is available in the Cisco 8000 with IOS XR.
It's available in our Catalyst 9500X and 9600X with the IOS XE operating system, and it's available on Nexus via NX-OS, as well as multiple third-party hardware builds with multiple operating systems like SONiC, FBOSS, and others. Today, we're happy to announce that we've actually penetrated five of the six global tier one web scalers with Cisco Silicon One. Now, some might assume that we're talking about routing roles when we make that claim. What we're talking about here is actually deployment within actual data centers. So these are some of the hardest customers to penetrate, some of the longest evaluation cycles, and we've had exceptional results based on Cisco Silicon One. It's a real testament, I think, to what we've managed to do as an organization.
And then the second piece I wanted to highlight is, it's very easy to talk about power efficiency, which we do a lot with Cisco Silicon One. It's much better to have an external customer write a reference about how much power they're saving. There's a pointer here towards a press release that DT has announced. They managed to drop their power bill by 92% by adopting Cisco Silicon One. So again, these are not 5% or 10% movers. These are huge needle movers that we're actually talking about. And if we could advance the slide one more time. So as we've been sort of evolving this, we are moving faster than anybody else in the industry. Since December 2019, we've come out with 14 different devices.
It's about a pace that's 11 times faster than any other competitor in the industry, and we're continuing to push that forward because we have a converged architecture, because we've invested so much in this. We're now just sort of enjoying the fruits of all of that work that we've done over the last eight years. And one more time, please. So at the end of the day, why do we end up winning with Cisco Silicon One? We end up winning because we have the right technology, the right investments, the right scale, the right cost points, and the right business models. And if you think about us versus others, we're the only company which has all of the technology, from silicon, hardware, optics, and software.
That allows us to innovate across all of these hard, dividing lines to come up with optimal final solutions. And we're also the only company who's got all the business models from silicon-only, white box to full system. It's about meeting our customers where they want to be met and enabling them to be successful. And so with that, that is my last slide. Tal, over to you for questions.
Yes. Perfect. Thank you. So I'm just gonna start with a question that I got from the audience, and then I'll go back to my questions. I'm just gonna read from the screen. "Assume AI clusters use Ethernet. In an AI cluster, when scaling up the number of GPUs, does the cost of networking go up in a linear fashion, or higher, or lower?"
So, Eyal, do you want to take that one, or do you want me to take that one?
Please take it.
So-
I will do the calculation in my head.
That's good. So several statements. The way these networks are built out is what we call a Clos topology. That is, layers of networking that get aggregated by the layer above. If you're scaling the network within layers, so let's say, for example, you have a 2-layer Clos network and you're scaling it horizontally, that is a perfectly linear scale of cost, right? But when you run out of the radix of the chip, you have to add a layer. That's sort of what I was talking about a little bit before: because we have a higher-radix chip, we can build wider, flatter networks. As you add a layer, there's a cost discontinuity, and then from that point, it's linear again. So you can think of it almost as a step function of increasing cost.
I think if we compare it to InfiniBand, as a statement, those are also built out of similar topologies, and so they have a similar step function. The differences being Ethernet gives you better radix, gives you higher port speeds, gets you better cost per bit. And so although they're both step functions, they're different levels of step functions. Eyal, do you wanna add anything to that?
Just in terms of port calculation, i.e., the number of switching ports or optics: in most of the AI clusters, it's about 1.5x, okay? You can think about what Rakesh just described: 1x for the first layer. As you add more GPUs, you need to connect them, so that's the 1x. And then you need another layer to connect those top-of-rack switches, and that gives you the 0.5x. So it's more or less 1.5x. In computer science terms, it's linear. It's linear.
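Eyal's 1.5x figure can be sketched in a few lines. The 2:1 uplink oversubscription below is our assumed reading of his "0.5x for the second layer", for illustration only:

```python
# Rough link count per GPU in a 2-layer AI cluster: 1x for GPU-to-leaf
# links plus ~0.5x for leaf-to-spine links (assuming 2:1 oversubscription
# on the uplinks; a fully non-blocking design would make this 1x + 1x).

def cluster_links(num_gpus: int, oversub: float = 2.0) -> float:
    gpu_links = num_gpus              # the "1x": one link per GPU into a top-of-rack switch
    spine_links = num_gpus / oversub  # the "~0.5x": links from leaves up to the next layer
    return gpu_links + spine_links

print(cluster_links(1024) / 1024)     # links per GPU -> 1.5
```

The per-GPU ratio stays constant as the cluster grows within two layers, which is the linear scaling both speakers describe; the step up happens only when a third layer is needed.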
Another question is: Is there a risk with the fabless business, or what is the risk with a fabless business? Do you have diverse geographic and commercial partners for manufacturing of your silicon?
Okay.
Go ahead.
Okay, maybe I will take it. The question calls it out correctly. As Rakesh said, we moved from an ASIC model to a fabless semiconductor business, and that gives us a lot of benefits. But whether you work in an ASIC model or as a fabless semiconductor, eventually there are fabs that manufacture your devices underneath. Those fabs today are mostly the TSMCs of the world, although we experiment with others as well. You know, TSMC has its own geographical distribution, and we're exposed to that, as is most of the industry, I believe. If I may add one thing here, Tal, I would like to relate to the fact that we are a fabless semiconductor. It's not an easy thing.
Up until 5 or 6 years ago, as Rakesh said, Cisco was an ASIC producer, but it was not a fabless semiconductor. So, for example, people ask us why we didn't have those business models, silicon-only components, white box, and full systems, 10 or 15 years ago, because that's really what the web scalers wanted. It's not just a decision about the business model. Cisco could have decided at the time, 10 or 15 years ago, to open that up. But to offer a semiconductor business model or a white box is not just a decision. You have to have the right technology. You have to have silicon that you do in a fabless way, as Rakesh said, and maybe I will just give an example.
Let's just say that a piece of silicon costs $1,000 out of TSMC. If you work directly with TSMC, like the fabless guys, or like we are doing today, it costs you $1,000. If you work in an ASIC model, where you still define the IP and the code, but you have a partner providing the IPs, some of them important ones like the SerDes, doing the back end, and working with the fab, you will pay $2,000 for that silicon, more or less. Okay? And if you buy an off-the-shelf silicon that a fabless merchant vendor did, you're going to pay $3,000 for it. Okay? Now, when you go to those web customers, they are buying their silicon directly from fabless vendors.
If Cisco would like to compete there, we have to compete with a cost structure of $1,000. We cannot use an ASIC model, and of course we cannot use off-the-shelf merchant silicon. So we had to keep that in mind, and what we did, and it's a big transformation for Cisco over the last 6 to 7 years, is we transformed the team and built those muscles, those capabilities, and there are a lot of them, starting from the back end, the manufacturing, the testing, the IPs. Cisco today, as Rakesh mentioned, is developing its own critical IP, the SerDes. We developed our own 112-gig SerDes. I think it will be no surprise if we say that we are developing our own 224-gig SerDes, and it's considered to be a best-in-the-industry type of product. So that's, that's on that.
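For clarity, the cost stack in Eyal's example can be written down directly; the $1,000 base and the 2x/3x multipliers are his hypothetical figures, not actual prices:

```python
# Hypothetical cost of the same die under the three engagement models
# from the example above (illustrative ratios, not real prices).
FAB_COST = 1_000              # fabless / COT: buy directly from the foundry

asic_cost = 2 * FAB_COST      # ASIC model: a partner supplies IP (e.g. SerDes),
                              # does the back end, and fronts the fab
merchant_cost = 3 * FAB_COST  # off-the-shelf merchant silicon: another margin layer

for model, cost in [("fabless", FAB_COST), ("ASIC", asic_cost), ("merchant", merchant_cost)]:
    print(f"{model}: ${cost}")
```

The point of the example is simply that each intermediary layers its margin on top of the foundry cost, so competing at the $1,000 level requires owning the whole flow.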
Mm-hmm. You touched on the issue of Ethernet versus InfiniBand, and the question is: What would drive the migration? In your view, Ethernet, or types of Ethernet, eventually provides a solution that is as good or even better. But what drives the change? Because right now, you know, we have NVIDIA presenting later, and we know, at least by reports, that right now InfiniBand is the way to go for the back-end data centers. What drives the migration to Ethernet? What kind of benefits? How can Ethernet, or what you offer, replace what InfiniBand is providing today?
Yeah. A few thoughts from my perspective. So I think you're absolutely right, Tal, that if you look at the market today and you look at the amount of back-end interconnect that is InfiniBand versus Ethernet, I think you'll see a percentage that is quite heavily weighted towards InfiniBand rather than Ethernet. I think at the end of the day, it comes a little bit in terms of where technology has grown up from, i.e., for a long time, high-performance compute was sort of the big thing that was being built in terms of disaggregated computing infrastructure. So you would put GPUs, you would connect them together with InfiniBand. You would use a large distributed compute structure to run, for example, a weather prediction algorithm on top of it to figure out where the next tornado is going to strike.
As a sort of gold rush towards AI ML has happened, these GPUs have been identified as being a key piece of technology in terms of solving AI ML infrastructure. There's already at-scale deployments with InfiniBand for high-performance compute. It is actually the natural thing to just migrate a proven entity and try and scale it up. I think that's what we're sort of seeing today. I think the question I would ask is maybe a slightly different one than you asked, Tal, which is: If you forward project, why is the interface actually InfiniBand? What advantages does InfiniBand have over something like an Ethernet or Enhanced Ethernet or Fully Scheduled Ethernet that we talked about before? And I would contend that except for the backwards legacy of where it came from and being deployed, from a technology perspective, it actually struggles, i.e., it's a sole source technology.
What does that really mean? Sole source technology means slightly limited investment structure. It means pace of innovation comes down, which means that the radix, the bandwidth of the switching infrastructure comes down. And to the question that came in earlier about the cost linearity of the scale, as the radix and the bandwidth of your chip shrinks, you need to add more layers of the network in order to connect all of those GPUs, and you start magnifying any sort of cost differences. Third is about resiliency, similar to the question that was asked before about fab technologies, which is if there's a single vendor providing the infrastructure, and all of a sudden it has to scale to the number of AI/ML-based deployments, are you really gonna risk all of that infrastructure on a single vendor?
Or do you want multi-vendor to give yourself supply chain resiliency? And fourth is what we talked about a little bit before, which is that you start having multiple people build GPUs, which we're already seeing today, right? They're no longer going to tie themselves to InfiniBand-based interconnect. So this transition to Ethernet seems really quite inevitable to me. And actually, the fact that you see even vendors who build InfiniBand-based switches coming out with Ethernet AI interconnect alternatives is again a sign that that trend is happening. I think what we need to realize is that it's not an overnight transition, right? It will take a few years before Ethernet becomes the sort of de facto standard for AI/ML interconnect, but from where I sit, from a technology perspective, it seems quite clear.
Eyal, I don't know if you want to add anything to that.
No, I think you said it well. If I try to summarize in my mind: being deep in the technology, we don't see any advantage of InfiniBand versus Ethernet. In fact, with the pace of innovation that is currently happening on Ethernet, I will contend that Ethernet will surpass any InfiniBand technology in terms of performance. Second is the ecosystem and the competitive nature of where we are. I just showed a 51.2T switch; it was just introduced this year, and everybody's already working on the next generation. And I will contend again that that will be presented next year. So within one year, everybody's in a race to go from 51.2T to 102.4T, and it's not going to end there.
As long as there is an application that consumes bandwidth, we love bandwidth, okay? The pace will be so rapid that any technology with a single source will find it very hard to compete. You can see it in the cost structure even today when you compare the two. The competitive landscape is just too hard to beat.
Got it.
Who is the customer for the infrastructure equipment for AI? Is it just the cloud, the cloud titans, or are we seeing large enterprises, telecom, and service providers kind of also deploying AI infrastructure?
It's a great question, Tal. Several thoughts on that one. So the short answer is, I think you'll see AI ML-based deployments across the board, from large-scale enterprises to service providers, to web scale or web titans or hyperscalers, depending on your terminology. But I think part of the problem that we all struggle with is AI ML is such a high-level term and means many different things. So if I sort of break it down one level more, I think if you look at the largest training infrastructure, the things which are required to train something like a ChatGPT style model, right? The amount of infrastructure necessary to train a model like that, I think aligns well in terms of cost to deploy, cost to maintain, facilities to manage.
That type of thing, I think, will live in the large-scale hyperscalers or web scalers. So the largest training models, I think, will live there, and I think a lot of inference will happen there as well, along with smaller training models and multi-tenancy, where people use it as a paid service running on that infrastructure at hyperscale. But that's not to say that enterprise data centers or service provider data centers aren't going to do smaller-scale training, or retraining of models they acquire for their specific use cases and infrastructure, plus lots of inference. Now, one of the other interesting points that plays into Cisco's strengths here is that each one of those customers ends up needing a slightly different mixture. In the hyperscaler, they might just want silicon and optics; as we talked about before, we offer that as a components business. They might want white box hardware, where they write their own OSes on top of it; we offer that infrastructure. If they want to buy full systems with an OS, whether it's our own OS or an open-source operating system like SONiC, we offer that as well. But they're definitely going to do at least their own orchestration software on top of that. As you move down towards enterprise, campus, and service provider, I think they're going to want more of a canned solution. I suspect most of them are not buying silicon only, but full system infrastructure along with network orchestration software to manage the full solution, right?
They want something that's easier to deploy, in a sort of self-contained fashion.
Got it. My next question is a combination of two questions, but they're connected, so I'm going to ask them together. Historically, many years ago, Cisco didn't have a good position with hyperscalers, but recently you announced $500 million in orders from AI. So first, what changed? How did you manage to change your position? And then, can you dig a little deeper into the $500 million, not from a numbers perspective, but just to understand its composition, what kinds of applications, and more generally where Cisco is positioned now with hyperscalers?
Tal, do you want me to take that?
I can start, and then you can add. First, let's start with the $500 million. The composition is both silicon and Cisco fabric, and it's being deployed directly for AI clusters. So it's not the front-end network serving the AI cluster; this is really the back-end of the AI clusters, where the GPUs are connected. That's number one. Second, I will say this is just the start. Let's be completely honest about what we see in the market: the guys that are really starting are the hyperscalers, just like I said, and in 90% of the cases, what they're buying today is an NVIDIA cluster with everything in it: InfiniBand, optics, and the GPUs from NVIDIA.
And there's a saying that everybody talks about AI, but the people really making money are NVIDIA. So why are we optimistic? Because on top of buying those closed, out-of-the-box clusters, all the big hyperscalers are playing and piloting, and some of them are already in initial deployments of, clusters where the fabric and the networking is Ethernet. Okay? So we certainly see them experimenting, and we are part of it; the $500 million is part of that. But the real deployments for Ethernet, we believe, are going to start in 2024 and 2025 and pick up from there. And that's why we believe, per the previous discussion, that 75% or 80% of the market will be Ethernet in three or four years.
Now, similarly, the same thing will start happening in enterprises and service providers. People asked me three months ago how much time it's going to take for them to catch up on AI training and all that, and I said, "It's going to take probably a year or two." I was a little bit surprised, in the last two weeks, to see deals where banks and service providers, okay, are buying out-of-the-box clusters, smaller ones, on which they're going to train their own data. Based on that pace, I believe they will open it up and move to Ethernet probably much faster than we expected. That's on that. Rakesh, do you have anything to add?
No, I think you covered the second part of the question. I just want to roll back to the first part of Tal's question: What changed, and how is Cisco all of a sudden relevant to the hyperscalers? Again, it all plays on top of each other, right? None of these things are in isolation. But I think it's similar to what we talked about before: Cisco, if we're honest with ourselves, did miss the transition a little bit in terms of web scalers' bandwidth growth and their desire to move to a components-based model. Right?
What we started investing in, back in the 2016-2018 timeframe, was getting the right set of technologies inside of Cisco, whether that is silicon with Cisco Silicon One or optics, gray or coherent: making sure we had the right technology, technology a hyperscaler would want to consume. That was case one. Case two is, once we have the right technology, we have to get the cost points down to where we can engage them on the terms they want from a components-only business. That goes back to the COT model we talked about before: buying wafers directly, removing margin stacks, and getting our costs down. Once we have the right technology and the right cost points, it allows Cisco to begin offering these business models of silicon only, white box, or full systems.
We go in, and we engage these customers to talk with them about what our technology is. We offer flexible business models, and what we're finding at the end of the day with our engagements with these customers is a real appetite for alternatives, and the fact that Cisco Silicon One really gives them something that nobody else really can. Now, why do I say that? These customers end up writing their own operating systems. They end up building their own hardware. If they adopt Cisco Silicon One, they can adopt a technology that can be deployed everywhere in their network. Everybody else is coming to them with point solutions, so they do all of this work, and then they deploy it in one location. Then they have to do all this work again and deploy it in another location.
We offer them this notion that once you adopt Cisco Silicon One, you can deploy it across your entire network, and we have actual hyperscalers who run it from the top-of-rack switch, all the way through their data center network, all the way through their WAN, and across their edges, all based on Cisco Silicon One, end to end. That is huge leverage when you think about deploying things at scale, with easy maintenance and lower cost. Everything about a web scaler and their desires aligns with the value proposition we have with Cisco Silicon One, and that's why it's so appealing, I think, to that customer base.
What's fascinating about offering all of these business models is that as we engage these customers, we work with them to expose what the business models are, and they pick the set that is right for them and the deployments they're thinking about. But what we're finding is migration up and down. They might start with silicon only in one location, buy white boxes from us in another location, and buy full systems from us in a third location. Conversely, they might start with full systems in one location and end at silicon only, right?
And so we're seeing this migration: once they get used to Cisco Silicon One and understand that it really does what we say it's going to do, they get quite excited and start finding ways to put it in other portions of their network. So I think that's really how we've become relevant to them. And because it's not just silicon, it's silicon and optics and systems, we can have a conversation with them that other companies, frankly, can't.
Got it. I got two questions on the same topic, so I'm just going to combine them. The question is about the competitive landscape for Silicon One, with Broadcom, with AMD, even the smaller companies out there. What are you competing on, basically, and how?
Yeah, do you wanna take it or want me to take it?
No, I can take it. Let's just look at the marketplace. With Silicon One, we are competing for the networking for AI. There are two other fabless semiconductor players providing that solution, Broadcom and Marvell. NVIDIA/Mellanox are offering something hybrid: in most cases, they are offering a box, not the silicon directly. And that's the landscape. AMD doesn't have the networking; for us, they are a partner, and I believe we are partners for them as well. They are providing the GPUs. Similarly with Intel, okay? So our competitive landscape is Broadcom, Marvell, and NVIDIA to some degree. And that's it.
Oh, great. Another question that came up: when you think about the specs, and you compare Broadcom's Tomahawk 5 and Silicon One, can you speak to how it stacks up against the competition for the cloud titans' use cases?
Yeah, as I sort of mentioned before, we're very well aware that there's other 51.2-terabit silicon on the market. You mentioned Broadcom; there are others as well. So there are multiple people vying for the same business at end customers. If I think about what our value add is, why do we win? It breaks down into a few different things. One is the converged architecture we already talked about: once you adopt the Silicon One architecture, you can deploy it in multiple different places in your networks. So there's a huge efficiency benefit to our end customers in deploying the technology. That's one statement.
The second is, I think most people, if not everyone else, in the 51.2-terabit realm are, what I would call, struggling a little bit to fit in all of the complexities of high-bandwidth silicon and are starting to drop things overboard. We talked a bit before about the radix of the chip; I think that's something we have in Silicon One that is unique. It goes back to how efficient our underlying architecture is. The fact that it's so incredibly efficient gives us space in silicon to do other things, and it also allows us to have very, very good power efficiency. If you think about what that means for deploying networks at scale, that is a huge lever.
If you ask me, I actually think power is a fundamental limit of networking as a whole, as an industry. We took power efficiency to heart when we came up with the Silicon One architecture, and it goes into every decision we make in the silicon. The other piece, again going back to the efficiency vector, is that we have a very flexible, capable 51.2-terabit device: we have programmability while still being incredibly efficient. We've taken latency as a high-order bit in the G200 architecture, and we've actually managed to halve the latency of our 51.2-terabit device. We believe we have an industry-leading latency offering with the Silicon One G200, and that matters when you think about inference clusters.
It matters when you think about determinism for large inference clusters. Finally, because we're also a systems company, the way we think about the problem when we design our silicon is quite different from a standalone silicon company: we optimize the final deployment and the final product, not just the silicon. What that actually means is, if somebody takes our piece of silicon and somebody else's piece of silicon and builds a box around each, and if we pretend for a second that the two chips have identical specs, by the time you look at the full system, you'll end up with a lower-cost, more power-efficient final solution with ours.
Again, when you think about the scale of the networks that are being built for AI/ML, it's incredibly impactful if you can end up with a cheaper and more efficient final solution. Eyal, anything else you want to add to that?
Just a different perspective. If you take a step back, Cisco, four or five years ago, had close to zero market share inside the data center. Yes, we were selling systems to connect data centers, but inside the data center, to connect the servers, not to mention AI today, we didn't have the right ingredients. Rakesh mentioned five out of six. It's gradually growing, and for us, it's all additive. Add to that the AI clusters, which is a new opportunity, and currently we are well positioned, because we transformed ourselves, in some respects, into a fabless semiconductor company, with white boxes, with systems. We believe we can capture that as well, and that's going to add more to the growth engine we currently have. And one last point, since people ask me about the competitive landscape.
Cisco is in a unique position because of the different models I mentioned: with COT, silicon is $1,000; with the ASIC model, it's $2,000; with off-the-shelf merchant silicon, it's $2,000. If you sell silicon, of course, you cannot do the ASIC model or off-the-shelf, because if the chip costs $1,000, the web scalers expect something less than $2,000 to go to them. So you cannot do it with an ASIC or a merchant model. But even with a white box, when you look at a white box, the silicon content is about 25%-40%.
So if you work with an ASIC model or a merchant, your penalty is so big that I doubt you're going to win any mid-term or long-term deal based on that. If you look at systems, the same economics apply: anyone selling systems mid-term and long-term is going to see their gross margin impacted. That's why I believe we are in a unique position for the mid-term and the long term.
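[Editor's note] The margin-stack arithmetic described here can be sketched quickly. This is a minimal illustration using the hypothetical numbers from the discussion ($1,000 COT silicon versus $2,000 ASIC/merchant silicon, with silicon at 25%-40% of a white box's bill of materials); the function name and the assumption that the non-silicon BOM stays fixed are ours, not the speakers'.

```python
def white_box_penalty(cot_price, merchant_price, silicon_share):
    """Cost penalty on a full white box when swapping COT silicon for
    merchant silicon, holding the non-silicon BOM constant."""
    baseline_box = cot_price / silicon_share   # box cost with COT silicon
    other_bom = baseline_box - cot_price       # non-silicon components
    merchant_box = merchant_price + other_bom  # same box, merchant silicon
    return (merchant_box - baseline_box) / baseline_box

for share in (0.25, 0.40):
    penalty = white_box_penalty(1_000, 2_000, share)
    print(f"silicon at {share:.0%} of BOM -> box cost penalty {penalty:.0%}")
# silicon at 25% of BOM -> box cost penalty 25%
# silicon at 40% of BOM -> box cost penalty 40%
```

Under these assumptions, the penalty on the full box equals the silicon's share of the BOM times the silicon markup (here 100%), which is why doubling the silicon price at 25%-40% content inflates the entire box by 25%-40%.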
Great. Eyal and Rakesh, I feel we just started to warm up, but we ran out of time. Thank you so much for the presentation; I learned a lot, and thanks very much for the Q&A session. For the audience, if I didn't answer your question, please send it to me in an email, and if I don't know the answer, I'll ask the Cisco crew to help me with it. Thanks so much. Have a great day.
Thank you. Appreciate the time. It was fun. Bye-bye.
Bye.