Please welcome Head of Investor Relations, Jimmy Sexton.
Good afternoon, everyone. Thank you for joining us here at Snowflake Summit in Las Vegas. For those tuning in, thank you for joining the live stream. I've heard it's been a nightmare getting here from New York. I hope all of you had a chance to attend and/or listen in to the kickoff keynote last night with Frank and Jensen from NVIDIA. I think it laid out, you know, our roadmap and partnership with NVIDIA very cleanly, and you know, we're gonna dive a little bit into that today. It is also now available for replay on our website. If you didn't get to see it last night, please, you know, give it a listen.
We also had an incredible main stage keynote this morning. I think, you know, we're gonna double-click on a lot of the product announcements that we made, but that is also available. Outside of today, we also have a week full of jam-packed customer sessions that we encourage all of you to attend. We can tell you one thing, but I think hearing it from our customers is, you know, more important. Before I dive into the agenda, I want to acknowledge our safe harbor. Here's our lineup for today. First, you know, you're gonna hear Frank Slootman, our Chairman and Chief Executive Officer, talk about, you know, really our North Star in mobilizing the world's data.
You've heard us talk about this dating back to 2020, during the roadshow. I think today it is even more relevant with everything you're hearing around the different generative AI use cases. Next, we're really excited to show a prerecorded video with Frank and Satya Nadella from Microsoft, talking about, you know, the next phase of our partnership, coming off the heels of that press release you saw yesterday. Next, we're gonna invite Christian Kleinerman, our Senior Vice President of Product, to talk about kind of our pillars of innovation and kind of how, you know, generative AI plays into this, as I realize that's, you know, top of mind. Christian's gonna invite Sridhar Ramaswamy up from Neeva.
That is the deal we announced last month, and Sridhar is the cofounder of that company. They'll talk about how this next phase of Snowflake and Neeva can work well together, and how you engage, you know, with that data. Next, we're gonna have Chris Degnan, our Chief Revenue Officer, come up with Rob Smedley, the Vice President of Data and Technology from Disney Parks. They're gonna talk about kind of the foundation that Disney is building on Snowflake and, you know, kind of the roadmap for them in the future.
Lastly, we're gonna have Mike Scarpelli come on stage to talk about, you know, our long-term opportunity, how we get to our long-term model, and hopefully answer questions with some new metrics and disclosures to help you better understand our comfort around that model. Then, Mike's gonna invite Frank and Christian up for audience Q&A, and at that time, we will have mic runners, so you'll be able to ask, you know, your question on the broadcast. With that, I will invite Frank Slootman, our Chairman and Chief Executive Officer. Thank you.
Hello, everybody. How we doing?
Maybe my most important thing today is to tell you that I'm still here. I just stepped up on the podium. I wasn't stepping down, so I just wanted to make sure you heard that loud and clear. It's always amazing how these things seem to air on the first day of our conference, you know, even as our board member, Mark McLaughlin, and myself and the company sort of emphatically denied the rumors. That's the world we live in. You know, I hope you had an opportunity to hear our session with Jensen last night, which I think was super cool. The partnership's very meaningful, you know, to us.
If this morning didn't convince you that we have absolutely a ton going on in the world of AI, not much will. You can take a shot now if you'd like, because there's a drinking game going on. If you need more conviction, we're here today, and I also recommend that you sort of wander around the partner pavilion, because you can see large language models, you know, running in Snowflake containers all over the place, and you'll see a bunch of our partners, you know, literally running large language models on Snowflake already. The train has left the station and, you know, we're running really, really hard to fill in behind that.
I think Christian Kleinerman's session this afternoon is really important, because he's gonna really lay out the whole framework, all the different ways that we are going to enable these language models. This is really important because, you know, we cannot really predict how customers are going to do this. There's loosely coupled, tightly coupled. We see a lot of people using Microsoft's hosting of OpenAI, an Azure-to-Azure type of model, so that's sort of in the Microsoft world; that's one model. There are other models being attempted at the same time. It's very early going.
The great thing is, we just have this wonderful architecture and, you know, a lot of the things that we just delivered are really intersecting with all the possible options and configurations that people are gonna wanna entertain. It's not quite as simple as Jensen said last night, I mean, you just put this AI factory on top of your data, and poof, you know, you just get pelted with insights and intelligence. I like his vision, by the way. I just absolutely love it. You know, it's gonna take a little bit of work to make that all go.
Jimmy just said, you know, when we went public, which is now almost three years ago, it feels like dog years to me, but, you know, our tagline at the time was, "Mobilizing the world's data," and that is as relevant or more relevant today than it even was then. You know, when you see this avalanche of AI interest, you know, coming at us, it's like this is exactly, you know, what we wanted to do, what we're trying to do, and we're just getting a huge help, you know, from the world of technology to do things that were just unimaginable not too long ago.
As most of you know, I've been around a while. I told a story about, you know, learning COBOL, several decades ago. People then, you know, referred to COBOL as a common business-oriented language, which is like, "You got to be kidding me," because it was just, you know, not very business-oriented, and it certainly was, you know, very hard to learn even in those days. Compared to assembler and machine code, you know, it was readable syntax, and it was a step up. In the '80s, and I was certainly very much alive in the '80s, the movement towards structured data, structured query language, I mean, that was a huge step up.
People then said, "Hey, you know, laypeople, relatively unsophisticated people, can now query these relational databases." Obviously, SQL became much more complicated over time. The fact that we're now seeing that we're going the last mile, where we're completely obliterating these limitations that we've lived with since the early days of computing, is just incredible, you know, to me. You don't even need to be literate these days, because you can just talk to a database and come back with very meaningful feedback and insight. Data, you'll hear about that a lot, is the foundation of AI, and we're sitting on a ton of it, and we're sitting on proprietary data.
We're sitting on public data, every combination thereof, every different data type, from structured to semi-structured to unstructured. You heard about it this morning. This is just a great place to be. Now we're just racing to enable, you know, every possible configuration to derive these benefits. Good time to be alive. We said over and over again that if you want to have an AI strategy, you need to have a data strategy. If you saw the talk this morning from Mihir, the CTO at Fidelity, he had a bigger title than that, but I always call him the CTO, I think that was actually very important.
You know, I don't know whether everybody picked up on that, but Fidelity has a textbook strategy and implementation, not just of Snowflake, but in general, of how you run data strategy in large institutions, large enterprises. I mean, it's incredibly curated. It is incredibly consolidated, it's trusted, it's sanctioned, and they have full control. I cannot begin to tell you, these guys are just primed, you know, for AI, and they know it, right? That's why this is so important. You cannot just take, you know, a data lake, and I sometimes refer to data lakes as landfills, because it's just, you know, full of files. If you think you're gonna just, you know, stick a model on top of there and flip the switch, and everything will be okay, I think you'll have another thing coming.
It's very, very important, you know, to get your data strategy in order, in order to position yourself for a lot of the benefits that we're envisioning we're gonna get from AI. Machine learning is well underway. That's a whole separate chapter that we're absolutely, fully engaged in. I said this morning, you know, that 70% of our large consuming customers, which are customers that consume more than $1 million a year, are using Snowflake for machine learning, and I see it and hear it all the time, that people are running dozens and dozens of models now. Years ago, in the early days of Snowflake, when I was there, it was still a rarity.
We were mostly doing 24-hour-cycle, you know, big batch analytical processes. I mean, now this kind of data science is really becoming mainstream. The data strategy part, this is why we talk about it so much. It's such a key enabler to AI strategy, but it's an enabler to everything. A lot of the conversations I have with customers, I mean, there's just immense frustration with people not trusting data, and this is sort of one of the key benefits that data warehousing brought to the world, right? I mean, we kind of poo-poo it now, because data warehousing was incredibly capacity-constrained. We always had the running joke, you know, that it was sort of begging for a 2:30 A.M. time slot three months from now. It was just impossible to get anything done.
What they did bring is this sense of curation and trust and sanctioned data. I will tell you, you know, there's one large financial institution, a customer, they have a very large data lake, which is not Snowflake, but they have Snowflake as well, and they have trouble getting people to use their data lake. They're all going to Snowflake. Why? It's trusted, it's sanctioned, right? Understood. The data lake is like anybody's guess, and that's really, really important. As we go forward to these higher levels of function, we have to have a data strategy that is working. I think the Fidelity presentation this morning was like a poster child for how to do that. We're sitting on a ton of data. Data has gravitational pull.
It's coming. This is an avalanche that's coming towards us. You know, in the beginning, people are gonna be looking for, you know, simple stuff. I call it augmented query. You know, I just wanna be able to ask some natural language questions. I don't know whether you saw the Forbes article about State Street. You know, they have added to their dashboards questions like, "Why is my portfolio down today?" It just, you know, gives you a nice, intelligent textual answer to that question. That's low-hanging fruit. You know, everybody's gonna do that, right? We've just changed our query methods from whatever method we had before to natural language.
You know, it's great to go to your board of directors, show that stuff, show that you're part of the in crowd and you've got it down, and all this kind of stuff. That's just far, far from the end game that people have in mind. You know, when we talk to the large enterprises out there, I mean, they're talking about collapsing their call centers. They're talking about collapsing their sales infrastructures. They're looking for massive redefinitions of the economics of their businesses, pricing optimizations in retail, right? A lot of it is cost- and economics-oriented. These are the demands that are gonna come at our data, because if you heard Jensen last night, the data holds the key.
It holds the key to the intelligence, right? We just need AI, the large language models, to be able to extract, you know, what is in that data. That is really the key thing that we have to do. It starts with having the data. I mean, in the chat I had with Satya Nadella last week, we talked a lot about this, and they are keenly aware that data is where the whole game starts. We have a healthy flow of data. You know, we're doing over 3 billion queries a day. Those are Google-esque, you know, types of volumes, and those numbers are growing robustly.
You know, our footprint in the Global 2000 is expanding quarter to quarter, and we are campaigning hard all over the world, continuing to invest, because we think that is the game: to really establish ourselves. Once we establish ourselves, you know, we grow. You can see that from our net revenue retention rates. Those are really the two selling motions that we have. One is the acquisition of new accounts; the other one is growing in the ones that we have. That will continue unabated, and that sort of opens up the future AI opportunity, in addition to the current one that we have.
One thing that we talk about when we talk about the Data Cloud is, like, the fundamental philosophy, the fundamental conviction that underlies it: when you have a Data Cloud, you bring the work to the data. Historically, I mean, we lose our conviction very easily, because since the beginning of computing it's been so easy, you know, through FTP and APIs, to just send the data over here, over there. It's very difficult to maintain that conviction. Our whole Data Cloud strategy, right, that's sort of the big sell to the world, is like, "Hey, you know, all the work that we're doing in terms of workload enablement is aimed at that." That is full alignment with our business model, right?
That's why we're investing so many resources in, you know, not just being able to do data warehousing, but expanding through Iceberg open table formats, which really open up the data lake opportunity, transaction processing, you know, global search, cybersecurity. All the different categories of workload are incredibly important to this strategy succeeding. We have historically never had data platforms that were as multi-capable as Snowflake already is, and what it is to become, and it's working. You know, you see the workloads growing and growing and growing in leaps and bounds, and that's what we have to see, because that means the strategy is working.
The other part is, you know, because we come from an on-premise world, we have always kinda looked at the enterprise perimeter as sort of, you know, our playground. That's our scope. That's sort of what we live with, and anything outside of our enterprise perimeter is really difficult and scary and untrusted and all these kinds of things. Having data connections outside of your enterprise perimeter was, like, almost impossible to get done. We've really redefined, you know, databases and database ecosystems. You know, I refer to them as orbits, and really data universes, and they're really defined by your business models. They're defined by your business relationships, they're defined by your ecosystems, and they're dynamic. You know, you will grow connections, you will terminate connections.
You know, it'll be different from a day-to-day standpoint in terms of what you have, but also, you know, how much data is flowing this direction, that direction, both directions, all this kind of stuff. There is much more fluidity to data. For AI, this is really, really important, because you wanna train. You're not just gonna train on your enterprise data, you're gonna train on relevant data, right? I mean, the hardest thing in my conversations with Jensen, and I've had several chats with him over the last couple of weeks, is like: Look, you know, we're gonna come up against harder and harder questions that people wanna ask of their proprietary data. How we enable that is really what the partnership is about, because we can't possibly know.
We're super optimistic, everybody's high-fiving and all this kind of stuff. The reality of software, the way we've lived it, you know, over the last 30 years, is that things are always harder than we think they will be, and they take longer, and all these kinds of things, like building a house. We know we're gonna have very demanding requirements to be able to ask hard questions, so having a lot of the right resources matters. You know, last night I brought up the example that I've seen in the business of DoorDash and Instacart and so on. You know, people that spent enormous marketing resources to drive their top line, and then they have churn. You know, churn is sort of like inflation.
It just evaporates the value, you know. I onboard a customer, and they place an order, and then they don't come back, or they come back a month later, and then they're gone for three months. It drives them insane, because their models are not working, their marketing investments are not making sense. It's like, how do I grow my business if I can't get to a model? Not understanding why churn happens is just maddening. You know, being on the board of Instacart myself, I hear about it every quarter, and it's one of those things, like, where does data begin to answer that question? We know that it can, right?
You're getting to the hardest questions that people have in their business. In the world of auto insurance, we have a lot of auto insurance customers between GEICO and Progressive and Liberty Mutual and so on. I mean, they live on telemetry data, right? Because auto insurance, I mean, they're driving their costs down, they're driving their prices down, but it's all about pricing risk. Well, the way they price risk is by understanding, you know, what the claims risk and the claims profiles are for people's driving histories. They barely need any other data than that to be able to run their businesses. Some of these insurance companies are doing really well, comparatively, driving their prices down, and yet their profitability is going up. That's all data, right?
You cannot even conceive of running these businesses anymore without having that kind of data and the analytics and intelligence to be able to extract the value from it. We see this happening all over the place. You know, data used to be an assist to running a business. Now it is really the core of running businesses. As we get more and more sophisticated with things like large language models, it'll be inconceivable. I mean, we will be running everything through these models. That's really where we're headed. I'm gonna switch gears on you, because we're gonna talk a lot more AI, so there are opportunities for taking another shot here the rest of the afternoon. As you know, we signed a contract with Microsoft, and yesterday we went public with that.
I think we more or less doubled the size of the contract that we previously had, which was done two and a half years ago. There's good velocity in that relationship. The conversation I have with the CEO of Microsoft is like: "Hey, you're about half the size in the Snowflake business that you should be, you know, based on your market share." I know why it is, and the reason is, you know, we don't have incentive alignment in the field. It doesn't matter what you think or what I think; what matters is who gets paid on what, right? That directs and drives behavior. He fully agreed with that, and I said, "A, we have to change it, and B, we have to codify that in the agreement.
Otherwise, this won't change, right? We will not sign up for big numbers, you know, unless we have really strong assurances that we're gonna get alignment in the field." That's really how our Amazon relationship has become so productive, because the behaviors and the dynamics in the field really dictate how these things work, and it has very little to do, you know, whether the CEOs like each other, or not. They've been very forthcoming on a number of fronts. Maybe Mike will talk about it, you know, later, more details if he wants.
I also asked Satya, I said, "Hey, you know, I wanna record a clip, because I get asked about the Microsoft relationship all the time, and they hear me talk about it, but I want them to hear you talk about it, okay?" To his credit, he readily agreed to that. We recorded that clip, so we're gonna play that for you now, and I'll let it speak for itself.

Satya, it's great to be with you today, having the opportunity to talk about our partnership. You know, we've been working together for quite a number of years, but this is a new chapter, and we thought it was a good time to create a little bit of context and clarity for all our respective stakeholders.
I'll quickly rattle off what we think are sort of the major dimensions, and then you can sort of weigh in with your unique perspective in terms of what it means to Microsoft, what it means to you, what it means to our shared customers. You know, we've been running, you know, on Azure for years now. We share thousands of customers already. Azure is really a big part, you know, of our world. This is a dramatically extended relationship. I mean, we did our last contract just a few years ago, and now we're back and we're doubling up the ante here. The velocity is tremendous. It's very exciting.
Azure has also been the fastest-growing part of Snowflake's business, and as a share of our business, I think you'd agree, we just have room to go up, and we're really looking at what it takes to do that. The second aspect is, you know, greater field alignment, and this is really an important part of the partnership. You and I can, you know, have great alignment, but when you get 14 layers down, it's a whole different thing. I'm very excited about the progress we're making in that regard. Finally, and I think this is really great, we're working on joint solutions, the ability to co-sell joint solutions, especially in the areas of data science, you know, Power BI, obviously with Snowflake, machine learning, and AI, Azure ML, Azure AI.
We're really excited about, you know, how the relationship is coming together, certainly from the Snowflake perspective. I think our audiences are gonna be super interested in hearing, you know, your perspective as well. Go ahead.
Absolutely, Frank. First of all, it's so great to be able to sort of, you know, launch this next phase of our partnership. As you said, we've been working together, but I do think that this is, you know, one of those points where you are already such a mainstream part of the core enterprise, and so Microsoft and Snowflake coming together to address the pressing needs of our mutual customers, so that they can do more, is fantastic to see. I think you captured it well, right? I mean, it starts with really both the Azure platform and the Snowflake platform coming together, because they're two mission-critical things for any enterprise.
We can work all over the world, across all the different types of configurations for mission-critical applications, and we are excited about the next leg of work that we will do in bringing together the best of Snowflake and the best of Microsoft Azure at the infrastructure layer. As you said, at the end of the day, it's about field alignment, so that your frontline and our frontline can go in, in an aligned fashion and with shared incentives, to solve some of these critical problems and critical challenges that customers have. We're really looking forward, in this next phase, to really taking out some of the friction and aligning ourselves more at the face of the customer. I think that's fantastic.
Perhaps the third dimension is what's most exciting, right? In some sense, what's the value add? What does the real coming together of Snowflake and Microsoft represent? I think you said it well. Snowflake is where a lot of the most critical data of an enterprise is, and data is the most critical asset, even in this new age of AI. In fact, AI doesn't happen without data. Therefore, Azure AI and some of the OpenAI models coming together with the data in Snowflake, I think that's a place where there's so much demand. I'm very excited about what we can do for our customers. And Power BI on the other side, right, as an experience layer on top of this data.
We already have, I think, a significant amount of Power BI usage on top of Snowflake. Us taking the next step on that will be exciting. Even Purview and some of the data governance capability beautifully integrated into Snowflake, or even some of our ETL, like, you know, Data Factory. These are all things we can bring together, I think, in a very customer-centric way. What excites me about this next phase is, you know, every customer in a time like this is looking for what's that edge they can get on top of their digital investment. I think Microsoft and Snowflake coming together to help customers get more out of what they're doing with their infrastructure, with their data, with their AI, is something that the two of us can, I think, uniquely contribute to, and I'm really looking forward to that.
Yeah. Thanks, thanks for saying that. We think the AI angle has sorta put additional context on the relationship, you know. As we talked, this is not the same thing as scheduling your next trip to Yellowstone. This is about asking really hard questions, you know, of your own business, and how do we, you know, collectively enable that. I think these are exciting times, and, you know, thank you for the support, but also thank you for your leadership in the world of AI, because the whole world is moving forward, and this is a great time to be expanding our partnership.
I look forward to it as well, Frank. You said it well: in some sense, we now have a new reasoning engine, and with the data engine, when you bring those two things together, you can really fuel the next generation of productivity for every business process and every enterprise. Really looking forward to this next expanded phase of our partnership.
Terrific. Thanks for taking time today.
Thank you so much, Frank.
Thank you, Frank, and virtually, thank you, Satya. I hope all of you are picking up on this theme: Snowflake is really at the intersection of this revolution in AI. Hearing Frank speak with probably two of the most important people in the midst of this revolution is, we believe, really powerful. To lean in more on what Snowflake is doing with our product, I would like to invite up Christian Kleinerman, our SVP of Product.
Hi, everyone. Good to see some familiar faces. I'm sure I have not met all of you. Hopefully, many of you were able to see Frank and Jensen last night. Hopefully, many of you got to see the keynote this morning. I will recap a subset of the announcements from this morning and contextualize why they matter for you as investors. Very important to say, I am not covering everything we covered this morning, and this morning we didn't cover everything we have at the conference. The innovation runs broad and deep, and this is a great opportunity for all of you that are here in person to talk to our customers, talk to our partners, and hopefully you can sense the excitement that we have at Snowflake, but most importantly, the excitement that the ecosystem around us has.
With that, the talk this morning framed our innovations in three different themes. One is the concept of a single platform, and we will not get tired of emphasizing that Snowflake is Snowflake is Snowflake, a single product. Once an enterprise integrates with their security system, their identity system, their overall enterprise infrastructure, all of the additional capability we build fits right in there, and that's a big part of the value prop that we have. We're also simplifying the cognitive load. We like asking our customers: Can you even tell us how many services your favorite cloud provider launched last week or last year? Think about it: if you want to embrace a multi-cloud world, it is really, really difficult just to keep up with the number of products and the complexity that comes from the cloud providers.
Whereas we're very focused on taking a lot of the effort and complexity onto our side, and simplifying, simplifying, so that it is easier to adopt Snowflake. That's the single platform. Theme two was around how we help our partners and customers distribute, deploy, and monetize data products. It could be a data set, it could be an app. We'll talk about it. A lot of what you heard Frank talk about, a lot of what you've heard us talk about for the last two or three years, is how we do away with a trade-off and dilemma that exists in most organizations today, which is: I want to do more with data, but I also want to be able to govern it and secure it. Go and ask CIOs, "Are you at odds? Do you have tension with your data science team?" Invariably, there is some tension, and sometimes it gets resolved towards the data scientists can do less, and sometimes it gets resolved towards we are taking some risk on security. Our whole value prop of bringing computation to the data is that there should be no trade-offs. Go get amazing value out of the data; don't give up on security or privacy or the type of analytics you can do. Let's dive into these three themes in a little bit more detail. The first one I already alluded to, and it's very important for us: a single platform. You wouldn't believe it, but it's been 12, almost 13, years since Snowflake came out to the market, and our architecture continues to be a massive differentiator relative to most of the solutions out there.
Our architecture has three tiers: core storage, as much compute as you want, and a global services or cloud services layer that coordinates all of this. There's a fourth element of our architecture, which is what enables us to provide cross-region and cross-cloud experiences for our customers. I've mentioned it in prior conversations with some of you. A lot of people can easily say, "Oh, I am cross-region and cross-cloud," because they took some VM or some container and made it run. We are truly cross-cloud, where we can give our customers and partners an experience that transcends any one region, any one cloud. Some of the stats you see here, Frank alluded to: the number of queries we're running on a daily basis is quite large at this point, 3.1 billion queries. We included the number of rows in the largest customer table that we found, a single table with 50-plus trillion rows. We took our five largest customers by data volume, and compressed, they have 177 petabytes in Snowflake. I cannot emphasize "compressed" enough, because our compression ratio could be 5x all the way to, say, 20x. This could mean five customers have more than an exabyte of data inside of Snowflake. Going to the specific announcements that we shared this morning, I'm gonna keep it at a high level without going into the details, but the way I think all of you are gonna see the data platform wars, for lack of a better name, play out at the largest organizations is: what are the engines that control the rights and the governance of what data?
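As a quick back-of-the-envelope check on the compression arithmetic above (the 177-petabyte figure and the 5x-to-20x ratios are from the talk; the decimal petabytes-per-exabyte conversion is my assumption), the range works out roughly like this:

```python
# Back-of-the-envelope: what 177 PB of compressed data implies uncompressed,
# at the 5x-to-20x compression ratios cited in the talk.
compressed_pb = 177            # top-five customers' compressed footprint, per the talk
low_ratio, high_ratio = 5, 20  # compression ratio range cited

raw_low_pb = compressed_pb * low_ratio    # uncompressed size at 5x  -> 885 PB
raw_high_pb = compressed_pb * high_ratio  # uncompressed size at 20x -> 3,540 PB

PB_PER_EB = 1000  # assumption: decimal units, 1 exabyte = 1,000 petabytes
# At 5x the total is just under one exabyte; at 20x it is roughly 3.5 EB.
print(raw_low_pb / PB_PER_EB, "EB to", raw_high_pb / PB_PER_EB, "EB")
```

So the "more than an exabyte" framing holds anywhere above roughly a 5.7x average compression ratio, which sits comfortably inside the 5x-to-20x range quoted.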
All of you understand assets under management really, really well, better than I do. Think of the concept of data under management, or bytes under management. That's how we'll see it play out, at least for the largest organizations: there is a concept of open file formats and open table formats that anyone can query, but who is the custodian of that data? The announcement this morning is the introduction of a single unified Iceberg table type, which will let customers choose: do you want Snowflake to just be a read agent of the data, or do you want Snowflake to be the custodian of that data? The words that we use here are unmanaged and managed. The most important thing, and the reason why we shifted our plans from a year ago a little bit, is to make sure that there are no trade-offs on performance.
We will give customers that choose to use open file formats and open table formats performance comparable to what they see with Snowflake internal tables. We announced, it was October last year, someone should fact-check me, the acquisition of Applica. At the time, GenAI and LLMs were not this singular topic that we discuss all day long, but we had already seen a glimpse of the value that this type of technology can bring to certain problem spaces. In particular, we said, "This is pretty amazing. You can ask a natural language question of a document and be able to get answers." Those answers become structured data, which you can then use for applications or pipelines, or maybe even store in a traditional table.
What we have announced this morning is something that has been in preview for a few weeks, so we have some positive customer feedback already. It's the ability to take documents that are stored in Snowflake, use this Document AI technology, ask questions in natural language, and extract values for those questions. Very important: it's not a language model, it's an image and text model. If you haven't seen the demo from this morning, there was handwriting recognition involved. If a document has some parts that are text and some parts that are image, the image can be run through OCR, optical character recognition, and we will be able to extract those values out of it.
Think of the use cases of a legal department that wants to ask questions like how many contracts have these types of terms, or a limitation of liability above some number. Those are the use cases, and we're quite excited about where this is at, which is in private preview. One of the early debates that we had at Snowflake, when we were saying we wanna do more on AI and ML, was: are we catering to people that know ML, data scientists that understand which algorithm to use, which function, training, loss functions, or do we cater to analysts? Over the last couple of years, our answer has been very simple: we're gonna appeal to both, 'cause both are personas that are really keen on and really close to the value prop of Snowflake.
What we announced this morning, and these are functions that are now in public preview, are ML-powered functions, machine learning functions targeted at analysts. If a SQL analyst wants to run a forecast on, say, sales data or any time series, they can do so without having to know the underlying machine learning technology. We announced forecasting, we announced anomaly detection, and the third one is a personal favorite of mine. We call it Contribution Explorer, and what it does is help you answer: what are the conditions that may have contributed to a metric changing? Typical example: maybe my same-store sales, quarter-over-quarter, are down. Why is that? This can run through a number of dimensions and say, "Maybe it's this product line, or maybe it's this specific location," something like that.
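To make the Contribution Explorer idea concrete, here is a minimal sketch of the underlying concept: given a metric observed in two periods, attribute the overall change to the values of a chosen dimension and rank them. This is an illustration of the idea only, not Snowflake's actual implementation, which is invoked from SQL and uses a more sophisticated algorithm; all names here are illustrative.

```python
from collections import defaultdict

def contribution_explorer(prev_rows, curr_rows, dimension):
    """Rank dimension values by how much they contributed to the metric change.

    Each row is a dict like {"store": "NYC", "sales": 100}.
    """
    prev_totals, curr_totals = defaultdict(float), defaultdict(float)
    for r in prev_rows:
        prev_totals[r[dimension]] += r["sales"]
    for r in curr_rows:
        curr_totals[r[dimension]] += r["sales"]
    deltas = {k: curr_totals[k] - prev_totals[k]
              for k in set(prev_totals) | set(curr_totals)}
    # Largest negative contributions first: these "explain" a drop.
    return sorted(deltas.items(), key=lambda kv: kv[1])

prev = [{"store": "NYC", "sales": 100}, {"store": "SF", "sales": 80}]
curr = [{"store": "NYC", "sales": 60}, {"store": "SF", "sales": 85}]
print(contribution_explorer(prev, curr, "store"))  # NYC's drop dominates
```

The same question, "which slice of my data moved this metric?", is what the analyst asks in SQL, with the platform doing this attribution across many dimensions at once.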
Important for us, this is all a driver of consumption. These are SQL functions. They can be called from within SQL or from within Snowpark, and they drive quite a bit of CPU consumption. Unistore was covered in Frank's section of the keynote this morning. I'll be the first one to say that, relative to what we shared with you last year, it has taken a little bit longer. I think I said this is the Holy Grail and this is really hard, and yeah, it is still the Holy Grail, and it's still really hard, but the progress is amazing. You see here some of the logos of companies that are leveraging or using the preview of Unistore. Excuse me.
The most interesting thing is we have had five customers send us a note saying: "We are putting this thing in production. We know that you're not condoning it, we know that you have production rules, we know that this is covered by preview terms." Doesn't matter. They liked it enough that they've gone live with it. We continue making progress. The next milestone here is for us to be in public preview, likely towards the end of the year. In reality, all of these milestone transitions are a function of customer feedback and where the technology is, but it's making really, really strong progress, and we have people waiting in line to leverage it. Category, or chapter number 2, or innovation theme number 2, is how to distribute, monetize, and deploy data products.
We frame these a lot as applications, which is where the heavy lifting happens, but everything here is applicable to data products. The other interesting thing about data products is that many of our data providers have started saying, "Here's a data product with a Streamlit UI," and I don't know anymore if you call that a data set or an application, which is why, in our mind, it's all one and the same. One of the announcements that I think is most meaningful to the adoption of the Snowflake Marketplace, or of products in the Snowflake Marketplace, is the ability for our customers to purchase products from the Marketplace by drawing down from the commitment that they've made to us.
If any of you are thinking, "Oh, yeah, this is what you do with a cloud provider marketplace," yes, same concept, but the beauty of this is that it's now a cross-cloud ability to draw down. If a customer commits to us, say, $100,000, they could draw down for some compute on Azure, maybe some compute on AWS, and maybe an application is gonna run on a different deployment. We think this is gonna dramatically lower the friction it takes to leverage the amazing data products that are in our marketplace. This was announced today as generally available in the U.S., with some exclusions that you see in the footnote there.
The largest announcement we made last year was the Native App Framework, which is the mechanism by which we bring rich applications to run close to the data. The whole reason why we did this is to accelerate the time to value for all parties involved, particularly anyone building an app. Today, they spend 80% of their go-to-market cycles going through legal, procurement, and security reviews, and that same cycle is experienced by the consumer: I like an app, I like some machine learning technology, I want to use it, and yet I'm spending a lot of time just making sure that that vendor can conform with my security, privacy, and compliance needs.
The Snowflake Native App Framework turns that on its head. It brings the computation to the data, and as long as an application can vouch that the data is not getting copied out, hopefully that whole go-to-market cycle gets accelerated dramatically. Probably the single biggest announcement today is that the Native App Framework is, as of this week, now in public preview, and we showcased 25-plus providers and 40-plus applications already published in the marketplace. I will emphasize for all of you, the philosophy that we have on allowing someone to publish in the marketplace is that these have to be finished, real products. We don't want demos. We don't want, like, a toy app. No.
The analogy we've used internally, and I'm happy to share here, is that we want to be the Netflix of data products, not the YouTube of data products. I have lots of respect and appreciation for YouTube, but we want every product in our marketplace to be curated and known to work well. We run security reviews, we run assessments, we validate sample queries, 'cause we want the best possible experience for our customers. Here are some of the logos that I mentioned we shared this morning. I don't know if there's anyone I wanna call out. DTCC, what they're doing with us is completely amazing.
The Depository Trust & Clearing Corporation is trying to bring settlement data into the financial system faster, and we have already seen customers come to Snowflake and say, "I am interested in Snowflake because of such an app." That's the beauty of what we're trying to do: not just bring value to our existing customers, but create the Data Cloud, where other customers will feel compelled to come and join. Bloomberg is another recent addition that I'm very excited about. They are now, through a Native App, enabling the ability to bring data to a Snowflake account. The third innovation theme, again, is how do our customers get value out of the data without trading off security, compliance, or capability?
Very important for us: every so often we hear, "Well, but Snowflake doesn't do streaming." Snowflake has been doing streaming since Summit a little bit over five years ago. That's when we introduced Snowpipe and streams and tasks. Streaming has been in the product for a long time. What we've been working on is how to lower the latency of that streaming. Where we used to take one minute to ingest data, now with Snowpipe Streaming our customers can bring data into Snowflake in a matter of single-digit seconds, one to five seconds. The other thing that is very important to lower latency and make things simpler is how you transform data in Snowflake.
Again, five years ago here at Summit, we announced streams and tasks as the way to do it. What we've announced now, in public preview at the conference, is Dynamic Tables, a much simpler way to express data transformations, with Snowflake executing all of it behind the scenes. Streamlit, the external, open-source framework, is very healthy. The growth is amazing, and the number of applications being created is great. It has always been the fastest way to productionize machine learning models, and of course, in the age of LLMs and GenAI, it has become a tool of choice to productize and showcase generative AI and ML.
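The contract behind Dynamic Tables can be sketched in a few lines: instead of hand-wiring streams and tasks, you declare a derived result as a query over base data, and the platform keeps it refreshed. This local stand-in only mimics that contract; real Dynamic Tables are declared in SQL with a target freshness (lag) and are refreshed incrementally by Snowflake, so everything below is an illustrative assumption, not the product's API.

```python
class DynamicTable:
    """Toy stand-in: a derived result defined declaratively over base data."""

    def __init__(self, base, definition):
        self.base = base              # mutable source data
        self.definition = definition  # declarative definition of the result
        self.result = definition(base)

    def refresh(self):
        # In the real product, the platform schedules this to meet the
        # declared freshness target; here we trigger it by hand.
        self.result = self.definition(self.base)

orders = [{"status": "open", "amount": 5}]
open_total = DynamicTable(
    orders,
    lambda rows: sum(r["amount"] for r in rows if r["status"] == "open"))
print(open_total.result)  # 5

orders.append({"status": "open", "amount": 7})
open_total.refresh()
print(open_total.result)  # 12
```

The point is the division of labor: the user writes only the definition (the lambda above); orchestration, scheduling, and incremental recomputation are the platform's job.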
We shared at the keynote this morning that we counted over 6,000 Streamlit apps that are front-ending large language model and generative AI use cases. The big thesis behind our acquisition of Streamlit was not just to have the open-source framework, which we love and continue to invest in, but to be able to securely run Streamlit inside the security boundary of Snowflake. That's what we've internally called Streamlit in Snowflake. It's been in private preview with a large number of customers. We've counted over 2,000 applications in use, and this will be going into public preview in the next few weeks, maybe a month or two at most. It's in the final stages. We're finishing up some performance and fit-and-finish work, but what we're hearing from our early adopters is quite exciting.
A lot of what you have heard from us around bringing the computation to the data, the technology, or collection of technologies, that enables it, is what we call Snowpark. As a quick recap or reminder, Snowpark is the secure hosting of Python and Java runtimes, with a number of libraries that make it easier to program against those runtimes. The most popular library we have right now is the DataFrame API. It went generally available on November 7th last year. That is why you've heard Frank, myself, and others say, "Dear customers, do you wanna save money on what you're doing with Spark?" It's because we see the performance of our Snowpark DataFrame API be easily 2 to 3 to 4 times faster than Spark.
Depending on which distribution and which pricing you're comparing, maybe you are only 10% cheaper, all the way to 2x cheaper. There is an extreme case: one customer ran a POC and ended up being 12x cheaper than what they are running currently, and we're now in the process of planning the migration. This is probably one of the fastest, actually, I'll say the second fastest, adopted technologies we've put out there. The first one was SQL-based stored procedures. We are continuing to invest in Snowpark because we like a lot of what we see. As part of that investment, we announced this morning two new libraries, in addition to the DataFrame library that I mentioned, that are in public preview as of the conference.
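The shape of a DataFrame API like the one described above can be sketched with a toy lazy frame: operations only build up a plan, and nothing executes until `collect()`. In real Snowpark the plan is compiled to SQL and pushed down into Snowflake's engine, which is where the speedups over client-side execution come from; this local stand-in with made-up names is purely illustrative.

```python
class LazyFrame:
    """Toy lazy DataFrame: chained ops build a plan, collect() executes it."""

    def __init__(self, rows, ops=()):
        self._rows, self._ops = rows, list(ops)

    def filter(self, pred):
        return LazyFrame(self._rows, self._ops + [("filter", pred)])

    def select(self, *cols):
        return LazyFrame(self._rows, self._ops + [("select", cols)])

    def collect(self):
        # Only here does anything run; a real engine would optimize the
        # whole plan (and push it down to the server) before executing.
        rows = self._rows
        for kind, arg in self._ops:
            if kind == "filter":
                rows = [r for r in rows if arg(r)]
            else:
                rows = [{c: r[c] for c in arg} for r in rows]
        return rows

df = LazyFrame([{"region": "EU", "amount": 10}, {"region": "US", "amount": 7}])
result = df.filter(lambda r: r["amount"] > 8).select("region").collect()
print(result)  # [{'region': 'EU'}]
```

Deferring execution like this is the design choice that lets the engine see the whole pipeline at once, rather than shipping data back and forth per step.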
One is around feature engineering: how do you prepare the data to feed it into AI, ML, GenAI, LLMs? The other one is: how do you do training in Snowflake? Frank alluded to Mihir from Fidelity being on stage this morning. They were one of the early adopters of this technology, and they completely loved it. They ended up asking us for early permission to go into production, even though the technology is not generally available. We also announced this morning the introduction of a Snowpark Model Registry. One of the challenges organizations have is that everyone is training models, and what do you do with these models? How do you organize them? How do you version them? How do you deploy them? How do you discover them? That's the capability that we're building right now.
This is in early private preview, but again, early customer feedback tells us we're on a great trajectory. Probably the single most consequential announcement, not only for our customers and partners but, I would say, for all of you, is the introduction of Snowpark Container Services, which is the continuation of our drive to have more computation running closer to the data. If we had done one programming language runtime at a time, we'd be doing Snowpark for C# and Snowpark for C++ for the next, I don't know, five or 10 years. We decided to just accelerate all of that. What we're doing is Snowpark for a Docker container: now either our customers or our partners can bring a Docker container and run it within the security context of Snowflake.
That's the foundation of what you heard Frank and Jensen talk about. How are we gonna bring this amazing framework that NVIDIA has? They had it all containerized. As Frank just mentioned, you can go to the partner pavilion and see very interesting technology running inside of Snowflake. Here are some of the logos of partners, and a couple of customers, that have already done the work to integrate with Snowpark Container Services. I'll emphasize, these are not people that said, "Oh, I would love to do something, and if you put my logo on the slide, someday we'll do something." No, they've done the work. We've seen solutions from every one of these partners running on Snowpark Container Services. If I say there are another 30 or 40 in progress, maybe that's undercounting it.
The excitement from both partners and customers for this technology is, I would call it, through the roof. Snowpark Container Services is very important. It's in private preview, so it'll take some time to get through the full cycle of public preview and GA, but the early signals we get are encouraging. This is how all of this comes together. I think many of you walked into this room with questions like, "Okay, what are you guys doing on GenAI? Are you getting out-competed?" The answer is that the whole thesis that we've been on for the longest time is entirely applicable to generative AI. All along, what we said is we want to tear down silos for customers, and we wanna avoid re-siloing.
One of the biggest reasons why people re-silo is because they're sending data to all sorts of third-party systems. It's only that now there are even more GenAI third-party systems you can send data to, and what we wanna do is turn that on its head. You already did a lot of work as a customer on organizing your data, governing your data, setting policies, setting users, role-based access control. We wanna honor and leverage all of that, but still let you get value out of GenAI. We showed at the keynote not only our own first-party models but also third-party models. Nothing precludes any of our customers from taking a Hugging Face model, or a model from whomever you want, and running it safely and securely, close to the data.
Most important is not just running scoring, but being able to fine-tune the models with enterprise data. We showed a couple of times this morning how you can take a base model and improve its results by fine-tuning with enterprise data. Of course, we have the ability to call into third-party APIs. If someone wants to call into OpenAI or Azure OpenAI, we've had extensibility for a long time, and we've improved the choices and extensibility. The message for all of you is: customers want to be able to do GenAI on enterprise data securely, safely, and with compliance. That's exactly the platform that we showed this morning. This is not a "hey, someday" slide. No, we were showing many of the building blocks.
I'll also be the first one to say we have more work to do, but the vision is clear. The down payment on the technology is in the hands of our customers and partners, and we are very excited about what can be done, and Streamlit obviously plays a very important role. We announced partnerships with... have we mentioned that we're partnering with NVIDIA? The partnership runs deep. To be very clear, we're hosting their RAPIDS library, and we showed this morning ML training running, I think it was four or five times faster than running on CPUs.
We're leveraging their enterprise AI technology to run inside our Snowpark Container Services. Probably the single biggest aspect of the partnership is their NeMo generative AI framework, which has both models and a lot of software that helps you train and fine-tune large language models. All of that is getting pushed through Container Services to run inside Snowflake. This is what Frank was alluding to; it's literally up and running here at the conference, if any of you want to go and geek out. Two other partnerships: Reka. We're quite excited about the partnership with them. It's a small startup building a foundation model. So far they've been very aligned with the use cases that we wanna enable, and we always like companies that are aligned with our use cases.
AI21 Labs is also a partner. They're gonna be bringing their models into Snowflake, and of course we're having conversations with others to continue expanding these LLM partnerships. I'll end today where we started last year, for those of you that joined us last time. The most important thing is that we've started these three massive, let's say ginormous, waves of innovation. We decided this year to label the first one not just 2014 but 2014-plus, 'cause the plus means it's not that we're done with analytics. That disruption is still playing out. Many of the organizations of you in the room here know that you still have 80-90% of your data on-prem and are still trying to break down silos, so that disruption continues. At the same time, the disruption in collaboration continues.
This is where data sharing, function sharing, and application sharing come in, and you're gonna be able to have Native Apps and Container Services as part of it. And of course, app development: how do we bring in all of the interesting use cases, whether it's cybersecurity or supply chain or marketing analytics? We wanna deliver a different platform, where data is not copied over and over and siloed and re-siloed all the time, because we know that simplifies the time to value for all of these use cases, vertical and horizontal. We're extremely excited about all three waves of innovation. They're all making progress, and hopefully, at the conference, you'll get validation of the progress we're making on all of these.
That's what I had in terms of a quick recap of the announcements from this morning. What we want to do now, and I think Jimmy already introduced this, is that we have Sridhar from Neeva here with us. If we can get a couple of chairs, comfy chairs, for Sridhar and me to sit down. I'll give Sridhar a chance to introduce himself, but I'll tell you, in many of the conferences that I go to, Sridhar is well known. I don't think anyone ever uses his last name. You say Sridhar, and you know who it is. We're incredibly excited to have Sridhar here. Maybe what we do is start with that background.
I said some of us know very well who you are, but I don't think everyone knows, so we can start with your background.
Yeah. I joined Google early as an engineer, in fact, and grew with the team. I ran ads and commerce at Google for over five years, and it was quite the ride for any company. Google revenue grew from $1.6 billion in 2003, when I first joined, to over $100 billion in 2018, when I left. An incredible wave of technology changes was pioneered by the teams in search and ads that I ran. Many of the planet-scale machine learning systems were built as early as 2004, 2005. About four years ago, I left Google and started Neeva with a mission to humbly rethink search.
Little did Vivek, my co-founder, and I know that a revolution in AI was coming. A lot of being a start-up is about taking opportunity when you see it. Early last year, we could see that generative AI was going to have a profound influence on how we consume information, how we talk to machines, how we talk to programs. We retooled our entire search stack, which we had built with a 50-person team, to release the first AI-powered search engine early this year. Satya famously threw down the gauntlet for search early this year, and it became clear that Google and Bing were going to be putting billions of dollars into consumer search.
Vivek and I came to the conclusion that we really would be better off taking our skills, in data processing and in large language models done inexpensively, elsewhere. You know, we were a well-funded start-up, but we could still only spend $10 million a year, not $1 billion a year. When we looked around for who could be a great partner to take the journey forward with, Snowflake was head and shoulders above everybody else we talked to. We talked to a lot of folks, but we had amazing conversations with Benoit, Thierry, Christian, and Greg, and of course with Frank. Here we are. We are four weeks in.
Yeah.
Very excited for what has been done and, much more importantly, for what is possible. You know, for me, the non-hype part of AI is really about fluid language interaction between humans and computers. I think it is really hard for us to understand how profound an influence that's going to have on all of the software that we're going to be using. But it very much feels like right place, right time, Sridhar.
Yeah. You were at the forefront. I think you built a search engine leveraging an LLM before anyone else. Is that right?
That is absolutely correct. We set a high bar. We didn't know ChatGPT was coming. When we played around with language models, hallucination was a real problem. At one level, by the middle of last year, it was clear that you could give a 1,500-word blog post to one of these models and have it do what's called an abstractive summary: a summary done not by picking out words, but by actually abstracting the concepts in the post, turned around in, like, a second. It's real magic, because as humans, we just can't read that fast. On the other hand, if you asked models questions about things they were not really sure of, they would just hallucinate. They would make up stuff.
The bar that Vivek and I set for ourselves was that the AI answers we provided had to be referenceable, meaning we needed to tell our readers where the information was coming from. They had to be real time; the world keeps changing, and you can't have 2021 as the limit of your knowledge. They needed to understand authority. The way we did that was by using a technique that's now, I'm sure, familiar to all of you. It's called retrieval-augmented generation. A way to think about it is LLMs combined with tool use.
Just like humanity and culture exploded when we developed not only language but also tool use, and the ability to convey that tool use across generations, similarly, there's a revolution happening with large language models. They have incredible linguistic power, but they're also savants. When you combine them with tools, like search and the ability to call APIs, there's going to be a big revolution in how we get information, how we consume information, and hopefully also in many real-world things, like how we go about purchasing things, but that's a whole other story.
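The retrieval half of retrieval-augmented generation can be sketched in a few lines: retrieve the most relevant documents for a query, then build a prompt that asks the model to answer only from those sources and to cite them, which is what makes answers referenceable. The generation step itself (calling an LLM) is omitted, the token-overlap scorer is a deliberately naive stand-in for a real retriever, and nothing here reflects Neeva's production system.

```python
def score(query, doc):
    """Naive relevance score: fraction of query tokens found in the doc."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def build_rag_prompt(query, corpus, k=2):
    """Retrieve top-k docs, then assemble a grounded, citable prompt."""
    ranked = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(ranked[:k]))
    return (f"Answer using ONLY the sources below, citing them as [n].\n"
            f"{context}\nQuestion: {query}\nAnswer:")

corpus = [
    "Snowflake Summit 2023 took place in Las Vegas.",
    "Raga is a melodic framework in Indian classical music.",
]
prompt = build_rag_prompt("Where did Snowflake Summit take place?", corpus, k=1)
print(prompt)
```

Because the model is constrained to the retrieved context, its answer can be checked against numbered sources, directly addressing the hallucination and freshness problems described above.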
Yeah.
Multiple sectors are going to be disrupted in big ways, most definitely enterprise data, which was part of what excited Vivek and me.
Yeah. I think you and I were chatting recently that in the hype of GenAI, which, as Sridhar just said, is a technology that is gonna disrupt and change things, this notion is often lost that these things sometimes make stuff up. As you and I were commenting: yeah, people ask ChatGPT and then go and verify with Google Search.
I mean, even now, it's a common thing with me. I pay for ChatGPT. I have it on my home screen. By the way, I forget who said it, it might have been our friend Brad Gerstner: "Hey, listen, if you don't have ChatGPT on your home screen, you're, like, behind the times." I have ChatGPT on my home screen. But I have to go through a process of trying to figure out, "Hmm, is this question mainstream enough that I can actually ask it of ChatGPT and get an answer that I can trust? Or, if it's obscure, do I have to worry that it'll just manufacture something, and I have to go and verify?"
Yeah.
Just the other day, I was asking it for... This is just a random thing that happened. I listened to a song, and I was like, "Oh, tell me the characteristics of this rag that was playing." It completely manufactured an essay about what this rag was. It had nothing to do with the structure of the music. These are solvable problems, and I feel we are just on the threshold of being able to solve them. It's really language models, tool use, and how we orchestrate them together. It's a little bit like the early internet: you're on the threshold of something magical, and you can't wait for it to show up.
Okay. Maybe tell us a little bit more about the team?
Yep.
The Neeva team and the technologies that you bring, 'cause it's a very unique, world-class team and technology that you've put together.
This was a team of 50 that built a search engine from scratch. We crawled the web at a scale that we imposed on ourselves, 200 to 300 million documents. We crawled many petabytes of data as a single team, paying for S3 storage. We ran an index of 6 to 8 billion pages on an incredibly low-cost system, one that cost on the order of a few hundred thousand dollars to serve web search traffic. We also had to pick up really, really important skills from last year through this year. This was driven by the fact that, you know, we did not have access to 5,000 A100s.
That was beyond even our budget. As I said, we were a well-funded startup, we were spending $10 million on OpEx, but fleets of 5,000 A100s cost a bit more than that.
A100 GPUs, which are in short supply.
Roughly think of them as $8K a pop per year. If you want 5,000 of them, do the math: that's $40 million. We were like, "That's not going to happen." What we did was begin to understand things like what you can do with fine-tuning, what you can do with human feedback, and which problems can only be solved by much larger models. We pioneered a set of techniques in what is called transfer learning, where you really can use the output of large models to train small models. That is, again, a profound change in how models are trained, because I've done evals for, you know, the last 20 years.
A lot of Google search quality and ads quality has been based on humans very painstakingly labeling stuff. What a lot of companies, certainly Neeva, but also Google and OpenAI, are doing is taking the very largest models and using their output as training data for smaller models, which get to be just as good as the big models for specific classes of problems. We also had to pioneer a whole bunch of techniques to make them fast. I'm sure some of you have used Sydney, and I've openly expressed my frustration to Kevin Scott, the Microsoft CTO, that Sydney is so slow. That's because it runs on a very large model. We also did a whole bunch of techniques to speed up inference.
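The distillation-style transfer learning described here has a simple data flow: run the big "teacher" model over unlabeled inputs to produce pseudo-labels, then fit a much smaller "student" model to those labels. The toy below uses an arbitrary function as the teacher and a one-feature least-squares line as the student; real systems distill large LLMs into smaller networks, but the pipeline shape is the same, and every name here is illustrative.

```python
def teacher(x):
    # Stand-in for an expensive large model's prediction.
    return 3.0 * x + 1.0

# 1. Run the teacher over unlabeled inputs to produce pseudo-labels:
#    no human labeling needed, which is the whole point.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [teacher(x) for x in xs]

# 2. Fit the small student model to the teacher's outputs (least squares).
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def student(x):
    return slope * x + intercept

print(student(10.0))  # matches teacher(10.0) == 31.0 on this easy task
```

The student is far cheaper to run at inference time, which is exactly the trade the speaker describes: teacher-quality behavior on a specific class of problems at small-model cost.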
These are, again, the building blocks for how you go about using large language models in practice. Part of what is really exciting now about things like Snowpark Container Services is that we took our original fine-tuning scripts for some of the problems we were solving and worked with the Snowflake team over the last two weeks, mind you, to push our fine-tuning script into a Snowpark container, where we can download a model from Hugging Face, fine-tune it, and have the output stored somewhere. We'll be figuring out how to run inference on those models next. We have shown some demos to Christian already. We want to turn these into recipes so that all of our customers can natively use language models.
All of us are very enamored by things like chatbots and tool use, and yes, those are the sexy applications, but there is incredible value to be created in things like understanding documents, sentiment analysis, and translation of all kinds of feedback that comes into your product. There are a lot of boring business problems that carry incredible value for enterprises. Those are the things we wanna make sure we enable. Of course, you also have to be where the cutting edge of technology is, making sure that we can use the same technologies to power copilots in a first-party sense, where you get assistive experiences for everything from writing SQL, to writing Streamlit, to producing dashboards.
Plus, also making sure this technology is usable by our customers as they think about what transformative experiences they wanna build. These are all on our roadmap. You know, we are like kids in a candy store, pretty excited.
Yeah. By the way, that's super important for all of you. Even though, on the surface, Neeva was consumer search, this is not about us getting into consumer search, no. Everything Sridhar is describing is the techniques, the technology, the approach. How do you put one of these language models into production? Easy to say, really hard to do, and to do inference in 100 or 200 milliseconds.
100%. You know, for something to be an interactive app, you have to finish the search to retrieve context, and then a language model has to run to produce a fluid answer. You don't really have more than 800 to 900 milliseconds before people are like, "Eh, this is sort of slow." Creating high-quality experiences needs a lot of deep technical expertise. Again, something like Snowpark Container Services is the ideal vehicle for us to take some of these and, as I said, turn them into recipes that our customers can use. There's a lot here.
Yeah. One last comment on the applicability of what Neeva built: think of use cases like search for a marketplace. Sridhar's team already pointed their crawler at the Marketplace. Tell us some of the things that you showed me, which were incredibly amazing.
Yeah. I mean, as I said, language models have hallucination problems, and language models have authority problems. An increasingly important layer of software that is going to exist between us humans and the language models is this piece that sits on the side and provides context for the language model. In the context of Neeva, the consumer search engine, this meant that we would run search on whatever the user typed, but we would also make sure that we took care of, like, intent disambiguation. A word like Swift, for example, has all kinds of meanings. Obviously, it means fast, but it's also a banking code. There's a laptop with that same name. It's a programming language.
Taylor Swift.
Huh?
Taylor Swift.
Well, there's also Taylor Swift. Disambiguating so that you can feed the right context into the language model becomes important. Expertise in doing really good retrieval and clustering is a fairly important part of how you make a language model work. In fact, you know, all of the cool kids doing startups these days operate on what's called a LangChain, Pinecone, OpenAI kind of stack, where Pinecone is used to do vector retrieval, LangChain is the orchestration layer, and OpenAI is the model. It's a cheap way to get a nice demo going, but you need industrial-strength versions of this.
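The retrieve-then-prompt pattern described here can be sketched in miniature. The toy bag-of-words scorer below stands in for a real vector store such as Pinecone; the document set, the Swift example, and the scoring are illustrative only, and the actual model call is omitted.

```python
# Minimal sketch of retrieval-augmented prompting: score documents against the
# query, pick the best match, and place it in the prompt as context. A real
# system would use dense embeddings and a vector database, not word counts.
from collections import Counter
import math

# Toy corpus illustrating the "Swift" ambiguity from the discussion above.
DOCS = {
    "swift-lang": "Swift is a programming language from Apple",
    "swift-bank": "SWIFT is a banking code network for payments",
    "swift-music": "Taylor Swift is a musician",
}

def vectorize(text: str) -> Counter:
    # Bag-of-words term counts, lowercased.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    qv = vectorize(query)
    ranked = sorted(DOCS, key=lambda d: cosine(qv, vectorize(DOCS[d])), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    # The disambiguated context is what actually gets fed to the model.
    context = "\n".join(DOCS[d] for d in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The disambiguation happens implicitly in retrieval: the query's other terms pull the right "Swift" document into the context window.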
On the Marketplace, for example, we just hooked up a retrieval engine to crawl all the entries off the Marketplace, combined it with the language model, and all of a sudden you have an interactive app where, you know, to quote Jensen from yesterday, you can basically talk to the Marketplace. But we want to enable this on other sets of docs, to do the same thing for analytical questions that you might have on your data. It's the same technique: you set context, and the language model uses the context to answer your questions.
Yeah. We are incredibly excited to have Sridhar, the broader Neeva team, the technology, the expertise they have. We wanted to have Sridhar here just to give you a glimpse of what's in front of us that has not been shared at the conference, and it's super exciting, and thank you for joining us.
Thank you Christian. Thank you all for your time.
Thank you, everyone. Now, I'll introduce Chris Degnan, Chief Revenue Officer. Yeah, I'll let Chris introduce our customer guest. Thank you.
Good job. Perfect. All right. Hey, everyone, my name's Chris Degnan. I'm the Chief Revenue Officer of Snowflake. I've been with Snowflake for about 10 years. Rob, why don't you introduce yourself?
Yeah. Hi, my name is Rob Smedley. I'm the Vice President of Technology at Disney Parks, Experiences and Products, responsible for all things data. Data platforms. We'll have a seat, right?
Yeah, have a seat.
Data platforms, data products, data management, data governance. Parks, Experiences and Products, if you don't know, is our global parks and resorts business. Also includes Disney Signature Experiences, which is Disney Cruise Line, Disney Vacation Club, Adventures by Disney, and it also includes our consumer products business, which is our Disney Stores, our shopDisney, e-commerce business, as well as licensing, publishing, and games.
Awesome. Well, Rob, thanks for doing this with us today. Appreciate it. Maybe just give the audience a sense of what motivated you to move to Snowflake in the first place. What other solutions did you look at when you were doing that?
Yeah, we were at this kind of crossroads. We had a large Teradata warehouse on-prem in our data center in Orlando, and we had just reached capacity. You know, as data got bigger and more complex, and our use cases got more advanced, being constrained by compute and storage, we were just stuck. We were in the midst of this attempt at a data migration into a Hadoop platform. It wasn't going particularly well, and we got accelerated pressure to get out of our data center. We had to migrate to the cloud. We started looking for a solution that allowed us to really just lift and shift what we had in Teradata into the cloud. You know, tremendous cost savings associated with getting out of the data center.
It still, you know, also opened up, you know, scalable compute and got rid of some of the problems that we had, and that's where we found Snowflake. That was, I don't know, maybe 3 years ago or so, and we completed that migration in the last year, I'd say.
That was kind of my next question. You've gone through some of the migration; you've got some of your legacy systems. What phase are you in, or how far along are you in your enterprise migration?
Yeah, I mean in terms of moving to Snowflake, we've retired all of our kind of legacy platforms that we had, our Teradata platform, our Hadoop platform. We had some data in Redshift. All of that's been migrated to Snowflake. Due to the speed at which we were moving, we really didn't have any other option but to just lift and shift that into Snowflake. Great, new, modern platform, really old patterns and data architecture. In terms of really being able to advance our use of data, we needed that refactor. That was kind of the next step. We started that.
In the past year, I'd say we've started that refactor, taking our most valuable data and saying, "All right, we're going to modernize our data architecture, and get that ready, really for that next phase." Our initial move to Hadoop was motivated by AI and ML readiness. We didn't really get there. I think this, where we are in terms of modernizing our data architecture, that's, I think, really what it's about now for us.
Awesome.
Yeah.
What business use cases does Snowflake help you support within Disney Parks?
Yeah, I'd probably categorize that in two ways. I'd say, first of all, I mean, data drives everything that happens at Disney Parks, right? You can't walk into a park with a ticket, order food, ride a ride, or stay in a resort without data behind it. Every decision in the business is powered by data. The day-to-day running of the business, for sure, is coming out of Snowflake. The next piece of that, though, is that we're a little bit in the present and a little bit in the future.
This is something I saw a few years ago and started talking about at Disney: we used to see the big data analytics world as an analytics function, and then there was this operational function, and it felt like, I don't know, maybe back in 2018, those lines were starting to blur, and now they've started to disappear. That's kind of the next wave of Snowflake capabilities for us. Yes, all of the analytics, where I can wait five minutes, 15 minutes to get my data, but now really powering, you know, customer-facing use cases. That's really the next frontier for us.
Awesome. You're obviously a large customer of ours, so how do you evaluate getting and spending more budget with Snowflake?
Yeah, I mean, every time there's a new system or a new thing that generates data, you know, we're very project-driven at Disney; we fund tech almost the way we fund construction projects sometimes. Every time there's a new system that generates data, or every time we have some new use for data, that's a project for us. We evaluate what the benefit of that project is, revenue or cost efficiency or whatever, and balance that with the costs of implementing. We take that into account. I'd say the other side of that, though, is there's a clear recognition that data is a differentiator for us, that data strategy is extremely important to us.
We tend to forgive a little bit the difficulty that you sometimes have tying data directly to revenue. You know, it's hard to make those jumps sometimes. We are investing in our foundation. It's not always a matter of, you know, okay, I have a top-line or bottom-line benefit.
Probably a lot of the folks in the room have this question, because, you know, you've been on this journey for a while. You've been using the consumption model. How do you manage the consumption model at Disney?
Yeah, I mean, I think, you know, first and foremost, you have to put the effort into optimization. You know, when we did our lift and shift from Teradata to Snowflake, and we did this eyes wide open, you know, we had code that was probably written in 2005.
Yeah.
That we're lifting in. You know, if you expect it to run efficiently, it's not going to. You're going to have to do something to make it run efficiently. You do have to put that effort in. When we first did that migration, we were like 2, 3x what we thought we'd be. We had no idea how to estimate, and there was maybe a slight panic, but also reassurance, because we knew this was going to happen. It didn't take long for us to make incredible gains in optimizing and controlling those costs. In the process, we got very mature in our modeling of how expensive a new thing is going to be. It's very easy for us to now predict.
I know what this is going to cost. You know, we're able to make those decisions, you know, pretty well informed.
I mean, it's important, I think, as you guys-
Yeah
You know, scale your business, you have to be able to budget accordingly.
Yep.
You guys have done a good job of that.
Yep.
Okay, where does Snowflake actually sit in your data management stack?
Yeah, I mean, right in the center. Snowflake is kind of the core of what we've designed. We run Snowflake in AWS, so we tend to use some native AWS components to get data in. We use, you know, Kinesis and Amazon's Database Migration Service, things like that. We have invested in Alation as our data catalog for data governance. I'm very, very bullish on dbt. As someone who comes from a very traditional software engineering background, the shock that I had when I got into the data space years ago was: wow, the engineering practices that I'm so used to, I don't see those in the data engineering space.
Tools like dbt have really enabled us to use Snowflake and also have CI/CD and test automation and all these things that were second nature to us. You know, we have other tools that we look at. We're looking at dbt and BigID to help us with, you know, compliance and governance, things like that.
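The assertion-style data tests Rob is describing, the sort of thing dbt automates in CI, can be sketched minimally like this. The rows and column names below are invented for illustration; a real setup would run these checks against Snowflake tables as part of the deployment pipeline.

```python
# Tiny sketch of data-test automation in the spirit of dbt's built-in tests:
# simple predicates (not null, unique key) evaluated over a transformed table.
# Rows are plain dicts here; a real pipeline would query the warehouse.

def not_null(rows: list[dict], column: str) -> bool:
    # Fail if any row is missing a value in the given column.
    return all(r.get(column) is not None for r in rows)

def unique(rows: list[dict], column: str) -> bool:
    # Fail if the column contains duplicate values (a broken key).
    values = [r[column] for r in rows]
    return len(values) == len(set(values))

# Hypothetical output of a transformation step, used as the table under test.
rows = [
    {"guest_id": 1, "resort": "Polynesian"},
    {"guest_id": 2, "resort": "Contemporary"},
]
```

In CI, a failing check blocks the deploy, which is exactly the software-engineering discipline being brought into the data space.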
It's great. All these are Snowflake partners.
Yeah
Which is great. You just heard Sridhar talk about, you know, GenAI. What's your perspective at Disney Parks around generative AI?
Yeah, I mean it's real. I mean, there is a ton of hype around it, there's a lot of work to, I think, be done to see the big revolution come. I do believe it is one of the most disruptive innovations that we've seen, you know, top three disruptive innovations in my lifetime, for sure. It's a, it's a very real thing, and we're absolutely - I mean Disney is a company, especially parks, that was built on the idea of innovation. We're absolutely looking at GenAI and hope to be a leader in that space. First, you know, focus on the data. You know, I talked about how we're in the middle of that kind of refactoring of the data, that cleansing of that data, so it's ready.
It's not ready today. You know, we could go and do some GenAI use cases, and it'd be flashy, and everybody would be pretty happy with it, but we're going to hit a wall really, really quickly if we don't first focus on the data. We kind of ran over here from base camp. I just finished a talk myself, where I said that if you haven't already started to invest in cleaning up the data, you have no chance. Every day that you wait, you're going to get farther and farther behind in that race. That's, you know, I think, really where our focus is today.
Similar story that we heard today from Mihir at Fidelity.
Yeah.
He said the same exact thing, so.
Yeah.
It's great to hear. The final question we have is, do you have any, you know, plans to adopt, you know, newer technologies from Snowflake, like Snowpark, Iceberg tables, et cetera? Anything that's coming up for you?
Yeah, a lot. And the list is probably a little bit longer than it was maybe on Sunday evening.
Yes.
There's some pretty interesting things coming.
Yeah.
I think Iceberg is important to me. That external storage of data, that helps me mitigate risk. You know, Snowflake, as a choice for our warehouse, helped me mitigate risk because I didn't have to lock myself into GCP, right? I can go where I want in terms of cloud, having, you know, cloud flexibility. External storage of data gives me flexibility and risk mitigation in the storage of my data, that allows me then to take bigger bets on some of these new products, some of these new features that are coming out on Snowflake. I can go harder into some of those, knowing that I'm mitigating my risk elsewhere. That, that's really big for me. Streaming, for sure.
You know, as those operational use cases get more and more important, it used to be that 15-minute latency was okay, then it was 5, then it was 1. Now we've got to move faster, so we'll be early adopters of Snowpipe Streaming. Snowpark, for sure. We are very big on data applications and data products, and I think Snowpark gives us a ton of capabilities there. I'll be very interested in Unistore when we're ready for it, because I think that's going to be a big innovation, especially as we're going towards those operational use cases.
Christian, we have to type faster.
Yeah.
Yeah. Rob, thank you. You know, at Snowflake, we love customers like you, who are willing to try our new technologies and give us feedback early. Thank you for the partnership. We look forward to continuing to partner with you in the future.
Yeah, likewise.
Thank you. Thank you.
Thanks.
Thank you, Rob. Thank you, Chris. Thank you, Christian, and thank you, Sridhar. All right, for our final segment before Q&A, I'd like to welcome Mike Scarpelli, our Chief Financial Officer.
Good afternoon, everyone. I'm glad I'm here in person this year, having not made it last year. I apologize I didn't get to talk to many of you after our last earnings call, as I think I was dying of pneumonia. Anyway, hopefully, you guys are getting a really good view of what we're doing. I know a lot of people have had questions about generative AI and what we're doing. This is not something that we just thought about, and I hope you've seen that. We've been thinking about this for many years. That's one of the things that I've said in the past, too: when Snowflake does something, we don't just roll out a new feature tomorrow.
Many of the things we work on take years to develop and we have other things we haven't talked about, and we're not gonna talk about them until we know, technically, we've proven it, we're gonna have a product. What I wanna do today is I really want to give you some of our highlights from 2023. You know, we're really proud of 2023. We grew our product revenue by 70%. Not many companies at that scale, to go to $1.9 billion, have been able to do that. We expanded our non-GAAP operating margin by 700 basis points in 2023. We generated more than $500 million in free cash flow on a non-GAAP basis in 2023.
Those are pretty spectacular numbers with the growth that we've had. We're gonna continue to show you guys leverage in the future. I'll show you. Let's talk about where our growth opportunity is from here. You know, Gartner has come out with this new market sizing, and our market is growing. What I really want to get across here is that whether it's $290 billion or $270 billion or $300 billion, it's a massive market opportunity. This is not winner-take-all. There are gonna be many successful players in this market, and we think we are gonna get more than our fair share of it. I want to remind you, too: it's also a very competitive space, with the three largest hyperscalers in the world, Google, Microsoft, and Amazon.
You can see from our announcements, including our recent ones with Microsoft and AWS, that we have very good partnerships there, very good go-to-market alignment, which is getting better. Google is the one we still need to work on, and we're open to that. They're just not as open to it. You know, we do think this market is just gonna continue to grow with the proliferation of data, and I'm sure generative AI is gonna create even more data, so I bet these numbers are even understated. This is what gives us the confidence in our long-term model, and I'll talk about that a little bit more later on. We're really focused on landing the largest organizations in the world. You know, three years ago, we had 16% of the Global 2000.
In Q1, we had 30% of the Global 2000. We're really looking to land these large, quality customers. It's all about quality of customer, and we have a long runway in front of us with these customers. Many of them are in the very early innings. Some are still in the warm-up stage, before even going into the game, in terms of migrating data to Snowflake. I talked about quality, and I've said this before: many people ask about, "Oh, your customer adds." I really don't focus on our customer adds. I focus on our large customer adds, those quality customers. This is what I really focus on. Who are the customers that are gonna be able to pay us $1 million a year, $5 million a year, $10 million a year, $20 million a year?
We will have customers and many customers paying us north of $50 million a year, and we're at that stage right now with a few of our customers. As of the end of 2023, we had 330 customers paying us north of $1 million a year. You can see that was up from 80 in 2021. We have 60 customers paying us more than $5 million a year. That was up from 13 in 2021, and 20 customers are over $10 million, and we have 10 customers that are north of $20 million a year. These numbers will continue to grow with Snowflake, and you know what's surprising to me?
Even our largest customers, which I thought two years ago were probably as big as they were gonna get, are still identifying opportunity and continuing to grow with us. You know, I really wanna focus on this: a lot of people don't understand that when we land customers, we land them small. It's rare that customers start off at $1 million a year. If you look at all of our $1 million customers, the average size these customers land at is $150,000 a year, and then they grow. That's really why you see our net revenue retention continue to grow as well, because most customers start small, and that's the beauty of a capacity business. When you talk to our customers, they're really just buying capacity.
They may sign a three-year contract, but they're buying capacity, and whether they spend that in six months, a year, two years, or three years, it doesn't really matter to us. I actually don't get that hung up on what the bookings are. I get more hung up on what the revenue is from those customers. You know, migrations. This is one of the things that gives us confidence that our customers are going to continue to grow. We started tracking this. It's really self-reported by our top GSIs, not all of our GSIs; they tend to be our bigger ones. In 2022, they self-reported $847 million in services revenue associated with Snowflake work. Last year, $1.34 billion in services work.
Customers don't spend this type of money on services around Snowflake if their intention is not to grow with Snowflake and consume Snowflake. When customers make a decision to go with Snowflake, these are multi-year decisions, not one-year decisions. We're replacing systems. Some of the Teradata systems we're replacing are 15, 20 years old. When customers are looking to do these migrations, they're making technology decisions for the next five to 10 to 15 years on Snowflake. That's what gives us the confidence that we're going to continue to grow and hit our longer-term numbers. In terms of time to migrate, this is the one thing I get asked a lot of questions about: what are you doing to accelerate the time to migrate? You know, the biggest challenge in migrating customers is the customer's timeline.
How quickly do they want to go? Many of the new customers that we're landing now want to go at their own timeline because they want to make sure they do it right. You heard Rob from Disney Parks talking about how they first just migrated all their data and didn't really focus on the quality of the data and re-architecting it. Most of our customers now, when they're doing a migration, put a lot of work into ensuring that the data, when it goes to Snowflake, is architected properly, so they can take full advantage of that data in their business. That's one of the things that takes time. It's still taking us, on average, 240 days for a customer to actually ramp to their initial contract value in what they're consuming.
Once they do that, then they start to grow. I also want to stress that it's labor-intensive. You can have the best migration tools to convert code, but it's billions and billions of lines of code getting translated, and you've got to make sure everything is right. There's still manual time that goes into making these migrations successful. You know, our expansion patterns. If you look at our recent NRR in the 150 range, that's actually back to where it was in 2021. I would say that in 2022, there was probably a little bit of euphoria in customer spending. Now we're seeing it back to what we would call normal. Not saying it's not going to go down.
I don't think it's going to stay at these rates forever as our customer base continues to grow, but it will be above 130, and I've said this many times, for a very, very long time. What's interesting is when you look at our Global 2000, our large customers, this is what I like. Their net revenue expansion rate has been very stable. These are the ones that are a lot more disciplined on doing these big migrations. They take their time, but they are growing, and they will continue to grow. I really like the net revenue retention within these Global 2000. There's still a massive growth opportunity in front of us. If you look at the G2Ks alone, our average spend today for a Global 2000 is $1.4 million.
Our customers, those 330 customers I talked about that are paying over $1 million a year, have an average spend of $3.7 million. There is no reason why a Global 2000 can't spend $10 million-plus on Snowflake. I'm not going to say it's going to happen overnight, but these companies have massive IT budgets, and what they're spending today is not a big dollar amount for a Global 2000. Our top 25 customers, on average, spend $18 million a year. These numbers will continue to grow over time. We believe the G2K customers will get to a similar or larger size than our average million-dollar customer between now and 2029, in our longer-term model.
When you talk to these customers, you talk to Rob, who was just up here from Disney, they have aspirations to do more on Snowflake. That's what we're really working on, and that's one of the reasons we talk about all these new product announcements: to get our existing customers to grow more on Snowflake. It's really our existing customers today, with some new customers coming on, that are going to get us to that $10 billion target we put out. I also want to say, too, that as of the end of last year, only 17 of those top 25 customers are G2K. There are still some non-G2K, relatively smaller customers that are big spenders on Snowflake.
We do see our largest customers from a revenue standpoint shifting toward those large enterprises, more so because they're just growing faster than some of the smaller customers that were significant consumers of Snowflake. In terms of our go-to-market strategy, I think this is really important. We're aligned by theater, and this is something we've done over the last few years. Within theaters, we have US verticals, and we have our enterprise. Our verticals team is focused on the largest customers within the verticals we play in; think of financial services, healthcare, and others. The enterprise team is focused on accounts with 500 or more employees. Then you have our corporate team, focused on those sub-500-employee accounts, and our corporate sales is really an inside sales motion.
We are now also aligned by industry, the eight industries that we have lined up here. And we're aligned by workload. That's something relatively new that we're really focused on within sales, because it's really important, we understand, to go in and sell customers on workloads. It really is a workload-by-workload sale when we're going into our customers to get them to grow. It's so important. We really sell one product at Snowflake, with four versions, you could say: our Standard, Enterprise, Business Critical, and VPS editions. One thing I do notice is that very few customers actually start out today on VPS, Virtual Private Snowflake. It used to be that most financial institutions wanted that.
Now we're finding they're comfortable with the security of our regular Snowflake multi-tenant environment, because with VPS you do sacrifice some things, like data sharing. I really want to get across, too, that our product supports this full spectrum of workloads, from data engineering to data science, AI, ML, and applications. We're super excited about native app development within Snowflake and the things you're going to be able to do with Streamlit. One of the things we've really been focused on the last few years is ensuring that our salespeople are equipped to sell the specific workloads. A lot of our sales enablement in the last year has been around Snowpark. How do you sell Snowpark going into a customer?
Because many times that's a very different conversation than a data warehouse migration. It took time to train people, and we're learning, and sales enablement for these specific workloads is really key; it's where we're investing a lot of money. When organizations land on Snowflake, they arrive with complex data estates. It was interesting, the gentleman from Fidelity, if you saw him when he was up there, talked about how they had 170 different data silos that they want to migrate. What's interesting is they're three years into that journey, and he still said they have another 18 months to go. These are long. They'll be five years into this journey before they fully migrate all this stuff, and I want to get that across to people. These take time.
They take a lot of time. Some customers, I think, are gonna take 10 years. The first thing most customers want to do is consolidate their data silos. As Frank talked about, to really get the benefit of AI, you need to have your data structured, architected properly, in one location. That's what our customers are focused on doing. And I also want to stress that we migrate a variety of legacy vendors into Snowflake, whether it's Teradata or your traditional data warehouses, Hadoop, a lot of SQL, a lot of things we're migrating. When we win, we're really winning workload by workload, and these expansions take years, each opportunity at a time. This is important, too.
We may have won one workload, but when we're trying to do another workload, we're competing with someone else for that workload. It's not just given to us. Our salespeople have to be involved in those customers. That's really important. A lot of people think, well, it's kind of like an annuity. Once you sell a customer, why do you need to have salespeople? You always need to have salespeople in there because you're always gonna be attacked workload by workload, whether it's by the hyperscalers or some other technologies out there. Yes, I do listen to you, Chris. That's why we need to pay salespeople. It's actually really important. We do a lot of migrations. You know, in 2023, we did 2,410 migrations. We replaced over 3,000 vendors within our customers, and these will continue.
This is why it's super important to us that we have relationships with GSIs. We get them to build practices because we can't do all this work. Yes, we have a PS organization, but we need the partners that are in at our customers to help us; you need those partners to actually advocate for you so that you're getting the work and not someone else. Snowpark. We've talked about Snowpark for a while, but in November of 2022, it became generally available for Python. This new capability brings a new competitive landscape. It's really unlocking new workloads for us, and it is taking off with our customers. We're now replacing Spark, EMR, and Databricks. I have the data; you can see that.
These are not legacy solutions; we're competing against each incumbent vendor in this space, and Snowpark is taking share. What this graph is showing you is two Spark technologies running within our customer base. The blue at the bottom is Snowpark, and you can see how Snowpark consumption, looking at daily credits, is now outpacing Spark number one, and it's gonna surpass Spark number two. While those were growing within our customer base, we're growing much faster than them. We feel that Snowpark is being very successful, and it will continue to grow. Once again, we're still in the very, very early innings. You know, Christian talked about the price benefit of Snowpark, and it's anywhere from 2x to 12x.
We have one financial services customer that actually replaced Databricks, and they save $4 million a year by doing this, a huge cost benefit. You may ask, "Well, why does it save those customers that much money?" You have to remember, these Spark workloads running at our Snowflake customers take the data out of Snowflake. There's a cost to move that data. They do the Spark work outside of Snowflake, then they push that data back into Snowflake, and there's a cost to do that, too. Plus, they're incurring compute and storage costs while it's out of Snowflake. If we can show customers you can do it at the same performance or better, running it in Snowflake at a fraction of the cost, why would you move that data outside of Snowflake?
Not to mention, you have much better governance and security on that data. You know exactly where that data is. We are seeing a lot of really good traction with Snowpark, and we're pretty excited about that as well. What I will say is, we've done a lot of successful POCs, and we've had a few customers migrate into production. Those are in the very early innings, and there's a lot planned over the next six to 12 months. Once again, it is a migration. It takes professional services to do these migrations of Spark workloads to Snowpark, and customers need to prioritize it and have their people involved, so they take time.
In terms of early signs of adoption, if you look at our customers over $1 million a year, 100% of those customers are data warehousing customers. 85% are using Snowpark, though I would say many of them are still in the POC stage, trying it, and aren't fully deployed with Snowpark. 65% are using some AI/ML capabilities, and 70% are using data sharing. Data sharing will continue to grow, and data sharing, I want to stress, is probably one of the key differentiators of Snowflake.
Data sharing drives new customer adoption for us, because when we become the standard in certain industries and you want to get data, we have customers telling their vendors, "You must be a Snowflake customer, because that's how we want to get our data, through data sharing." If a company like DTC is successful in what they want to do, that is driving financial services firms to Snowflake because they want to be part of that. This whole network effect created by data sharing is unique to Snowflake. If you look at all of our customers, 95% are data warehousing, only 35% are using Snowpark today, 20% are using AI, and 25% are doing some type of data sharing. We do expect, as those customers grow, you're gonna see more of these other workload adoptions within those customers. You know, M&A.
M&A is not something we just do haphazardly. I spend a lot of time on M&A with Christian, Benoit, Greg, and others, and when we're doing a deal, it's all about: how is this going to accelerate our product roadmap? When we're looking at M&A, we're looking for things that will accelerate our product roadmap. You can see in 2022, we did two acquisitions. Those were more talent acquisitions. We made the decision back in 2022, actually 2021, that we're gonna build up an engineering focus in Poland. Most of those people are focused on connectors. Think of the ServiceNow connector and other things, but they're doing other things, too. To accelerate the opening of that office, we did two acquihires, two deals in Poland.
In 2023, we acquired Streamlit. That was an unbelievable acquisition, and we're super excited about that. Christian talked about Applica. That was a year ago when we did Applica. It was at the end of Q2 when we did that, and it's all about the Document AI. It's not something we were just thinking about today when AI's been talked about. We've been thinking about this for a long time. Recently, Neeva, Tuk, SnowConvert. SnowConvert is Mobilize. One of the reasons we bought Mobilize was to help with our customer migrations, whether it's off of Teradata, Netezza, Exadata, whatever; they help a lot with that. By the way, we're gonna continue to do M&A, and it's always focused on how it's going to help accelerate our product roadmap.
The most important piece of all M&A is: does the team have the right DNA to fit into our engineering organization? That is super important. I gotta tell you, Christian, Benoit, and the team do unbelievable diligence on the people. They always ask, "Would I hire these people if they were standalone candidates?" Really important. Forecast visibility. You know, this chart is showing the early journey of a top 10 customer. Over the course of a few years, this customer signed multiple contracts to support their consumption. The dark blue line is the contract they signed, what the contracted rate is. The blue line is their product revenue. You can see here that some quarters the ACV is below the revenue, and other quarters it's above the revenue.
The important piece here, what I'm trying to show people, is we don't predict our future revenue based upon bookings. We predict future revenue based upon what our customers are consuming today, looking back historically at how we expect them to grow. If you look at our models, they're not driven by bookings at all. Yes, it is driven by the number of new customers, but it's not the dollar amount or the bookings amount of those new customers. That's a really important thing, and it's different in a consumption model like Snowflake versus a SaaS company. That's why we never talk about bookings or billings. Yes, we disclose RPO because you're required to do that, but that's not how we forecast our business. Optimizations: a big topic that a lot of you have been talking about recently.
It's not something new to us. We've talked about optimizations all the time. I will tell you, optimizations will continue to happen a year from now, three years from now, five years from now, just like they've happened since day one of the history of Snowflake. There are really three buckets of optimizations that happen at Snowflake. We control two of them; the third we don't control. The first one is the CSPs: Amazon, Azure, Google. These are hardware improvements, and I want to stress, too, as I've said in the past, not all hardware improvements benefit our software. Yes, Graviton2 had a big impact on our software performance. Our customers get the benefit of all of these things.
Snowflake software improvements: probably last year, or two years ago, one of the big ones was the storage compression that Christian talked about. I want to remind people, every two years, we generally have new storage compression that comes out, and our customers get the benefit of that. There are also things like our warehouse scheduling service. There's going to be another big one coming out next year, which is auto warehouse sizing for customers. That's one of the biggest challenges for customers today: how do you size the right warehouse so you're not paying for a bunch of capacity you're not using? That will be a software improvement. We control those. We control how they get rolled out.
If you were at the session this morning, one of our engineers, Allison Lee, talked about how last year they built this tracker, and this is gross, not net. They estimate that there was a 15% performance improvement for our customers because of the improvements that happened in software, but the hardware comes into that as well. On the net benefit, I want to remind people, we expect a net 5% revenue headwind every year due to the hardware and the software improvements. The third bucket is customers, and customer optimizations are very different. We do expect they're going to continue to happen. You heard Rob from Disney talk about how they just loaded all their data into Snowflake, and then they had to kind of really clean up that data.
That was inefficient spend they had on Snowflake. That was an optimization they went through. Those are typical things that customers do. I've talked about in the past that another thing customers do is, when you load your data into Snowflake, you index your data. When you have that data indexed, it's easier to search. It runs more efficiently. If you're loading data and you don't index it, or your indexing gets out of date, your queries don't run as efficiently, and it's not as accurate. We have customers that are going in and reindexing their data. These are normal things, and they'll always happen within customers. They're not new to today. We talked about on the last earnings call, we had one large customer. It was another division of maybe the person who was up here.
They decided to reduce their amount of storage, from five years of retention down to three. That was 7 petabytes of data. In that customer's mind, that was an optimization. They're still doing the same amount of work on Snowflake. They're just not running those queries against the same amount of data, so it saves them a lot of money, both on the storage and the compute side. That is a choice that a customer makes, but we don't control that stuff. I really want to stress that customer optimizations are always going to happen. We don't control them, but they will always happen, and we expect them to happen. FY 2029 targets. You know, we still feel very confident that we will reach $10 billion in product revenue in 2029.
We expect our non-GAAP product margin to be 78%. What I will say there: for the year we're in right now, we guided to 76%. You may say, "Well, aren't you going to see more expansion?" Yes, we got better pricing out of Microsoft. Frank talked about that new contract. We're getting good pricing out of AWS. I will say Google is almost 50% more expensive on egress, storage, and compute right now with us. One of the reasons why I'm not forecasting it to go above that is there are things like Unistore that are going to come out. Unistore requires double the storage within a customer. We expect there may be new features that come out that could be a headwind to expanding that product margin more than that.
I don't feel comfortable right now going above 78% longer term. Non-GAAP operating margin, we do expect that's going to expand. We're taking that from 20% to 25%, and we're taking free cash flow margin from 25% to 30% for 2029. One of the things here that I think is important is net revenue retention. We've been spending a lot of time looking at net revenue retention across various maturity cohorts. Customers in their second year at Snowflake, what we call the year-two cohort, grow at a faster rate than those in the six-year cohort. In 2024, we've seen both young and old cohorts expand below their historical rates.
We do feel this is something of the environment in 2024, where customers have been looking to save money because of uncertainty in their business. That's what's driving some of the optimizations. I don't see migrations slowing down within customers. It's really some of the expansions of workloads, and customers trying to figure out how to use Snowflake more efficiently. We do view that there's gonna be a slower ramp than we've seen historically, but it's gonna be a stable expansion. We still feel comfortable that we can get to that $10 billion with a lower net revenue retention rate. We think that's reasonable based upon the customers that we have and what we're seeing.
Shifting to margins. I talked about margins already, so I'm not gonna spend more time here; we have seen dramatic expansion in the product margin. I think I told people at the time we were going public, you're never gonna see product margins the same way you see in a traditional SaaS business. Think of a Salesforce or a ServiceNow. Why? Storage is a big component of what is in our COGS, 10% to 12% on average, depending on the customer. We don't make much margin on storage. It's pretty much a pass-through. We make a little margin on it, but it will be a headwind to our product margins.
This is an important piece here, too. A lot of people ask: how do you make the determination of when to invest in your salespeople in particular? I gotta tell you, sales is what drives a lot of our budgeting when we look at productivity. When we see productivity above 1x, the way we define it, we add more people. When we see it drop below 1x, we slow our hiring down. You can see in 2023, we came out at 0.9x, below 1x, and we have slowed our hiring down. That doesn't mean we're not hiring people, but we're not hiring at the same pace because we really want to focus on getting our salespeople productive.
In terms of free cash flow, this is an important thing, too, and we've been getting the benefit of this in our free cash flow. I've expected from the time I joined the company that customers are going to push back on paying upfront, but it's surprising: most customers still want to pay upfront. If you look here, 80% of our customers in 2023 paid annually in advance, and 20% pay on other payment terms, either quarterly, monthly, or monthly in arrears. I do expect more customers are gonna move over time to quarterly or monthly in arrears. Why? Because that's how Microsoft, Google, and Amazon charge their customers. So far, customers haven't pushed back.
We're willing to do that with customers, but customers generally like to do annually in advance to get the benefit of a bigger discount. It's a trade-off. If the payment terms change, it's gonna have a positive impact on the product margins. I'm not seeing that pushback yet from customers, but I do expect it to happen, and that will impact free cash flow. That's a really important piece. I want to remind everyone, as we're talking about free cash flow, the quarters with the highest cash flow are always going to be Q4 and Q1, and that's really the timing of most of our customer contracts.
In terms of modeling considerations, important thing, we assume in our 2029 targets, less than 5% contribution from Snowpark based upon current consumption patterns. That could go up, it could go down, but that's based upon what we see today. We assume insignificant contribution from Iceberg, Streamlit, and Unistore. As I said, when we do our forecasting, it's based upon our historical consumption patterns of our customers, based upon the products they're using. I would say that's upside. We're also not assuming any additional tailwinds for public sector or China. We are looking to launch into China. We will be in China next year, but it's not China for all customers. It is China for our global 2,000 customers that are outside of China, that have operations in China, that want to be able to leverage Snowflake in China.
It's not a massive subset of customers, but they happen to be some of the largest in the world that are asking us to be there. The other thing, too, is, as most of you know, we've been working on FedRAMP High and IL5. These take time. We will have FedRAMP High this year. We haven't forecast anything for that because we don't have the historical data to forecast that. We do expect that we will have more than $150 million in interest income this year. That is factored into our cash flow. We expect to earn interest above 3% through 2029. I'm not an economist. I can just forecast based upon the data today, where interest rates are for us, and we feel good about that.
You know, I also want to stress, too, that a lot of our interest income is not cash. Like, you'll buy commercial paper at a discount, and when you get the amortization, that flows through as interest income, but it's not all cash to us during the quarter. Don't just take that interest income in the quarter and back it out of cash flow to figure out what the true cash flow from the business is. We will give you that number, the cash and non-cash pieces, in our Qs.
We forecast an effective tax rate on a non-GAAP basis of 26%, but our cash tax rate will be below 5% because we have so many NOLs in the U.S. We do pay cash taxes in countries around the world where we have a significant presence under a cost-plus reimbursement. Dilution. This is one of the biggest topics people always want to talk about. You know, when we look at dilution, we really don't focus on SBC. I look at dilution, and the reason is that dilution is the best indication of current grant behavior, whereas SBC really reflects the trailing four years of grants coming through. You know, in 2023, the vast majority of our grant amounts went to new hires and R&D employees. I'll just show you.
You've got our R&D grants, which accounted for 61%, our grants to other new employees at 14%, and then our refresh grants to existing employees, which were 25% of that pool. I do think in the future we will continue to skew grants more towards R&D. You know, salespeople tend to get compensated more in cash on the sales side. You perform, you get paid. It's a really important thing, though, that engineers all want equity, and we're competing with the largest technology companies in the world, who are all granting equity, too. Dilution is not gonna go to zero unless we do a massive buyback to offset that.
I do think dilution, and I'm just gonna skip through these things, will come down over time, and it is coming down this year, partly because we're hiring fewer people. As we slow our hiring down, the grant expense will decrease. I will say we do use RSUs, and we compensate heavily with equity on M&A transactions. We will continue to do that. We are benefiting from fewer grants and lower grant amounts. As our stock price goes up, we give less equity. As it goes down, we'll end up having to grant more equity, and the reason is that when you grant equity to an employee, it's not about the number of shares, it's the value of those shares on that date. In terms of where we're investing, you can see here, R&D.
I talked about how R&D is gonna continue to grow, and I do expect in 2024, 50% of our grants are gonna go to net new employees in R&D. Dilution target: we're reiterating our long-term target of 2%. You can see in 2023, we're at 3%. We're forecasting it to be 2%, and we have introduced a buyback to help manage dilution. To be transparent about this quarter: we did buy $193 million worth at an average cost of about $136 in Q2. With that, I'm going to invite Frank and Christian up, but before we do that, we're gonna bring chairs up here, and we'll go into Q&A. I need a pillow.
Am I supposed to take out the pillow? Am I supposed to take this out?
Brad, since he was keen to sit in the front row.
He's brave.
Good afternoon, guys. Awesome couple of days. Ton of innovation on display. I've got one for you, Christian, and one for Mike. Christian, you put up a simple yet powerful slide showing in 2014, disrupting analytics, then disrupting collaboration, and now disrupting development of AI/ML apps. How do your prior disruptive innovations in separating storage from compute and data sharing uniquely position Snowflake for this next wave, and what are the milestones and metrics we should look at to appraise your progress in AI apps? For Mike, on your long-range guide: in the past, when you've talked about it, you've explicitly said that you expect to be growing 30% in fiscal 2029. Just wanna clarify if that expectation has changed and if that at all impacts the margin update that we see as well. Thank you.
Wanna go first?
Sure.
Oh, you want me to go first?
Oh, I'll go first. I removed that because I don't need 30% growth to get to the $10 billion in 2029. I don't expect it right now to be 30% in 2029. It'll be close to it.
Some of the architectural innovations from the early days of Snowflake, you mentioned separation of storage and compute, are foundational to everything else we're doing, to the three waves of innovation. Data sharing, which is the collaboration disruption, would not work as well, or be as magical as it is, without that separation. Same thing happens for applications. Our vision is we wanna have a common data substrate and all sorts of apps working on it. That would not be possible without that same abstraction. For us, it's very important that we have a very clean architectural blueprint that enables all of these waves of disruption. In terms of metrics, you asked how we track it. Mike shared some of the numbers that we use on...
We have perfect telemetry on what customers are doing. We will be able to report and share with you progress on workloads, data warehouse, data science. Also specifically, is it LLM or not? We have visibility into all of that.
Whoa, Catherine, you can just decide.
All right, thanks. Kirk Materne with Evercore ISI. I'll echo Brad's thanks for having us out here. Maybe for you, Frank, there's a lot of discussion, obviously, about AI, but it's interesting listening to you all that without data preparation, you know, the ability to take advantage of a lot of the things that are associated with LLM and AI just won't be the same. You know, when do you think your customers sort of make that connection more fully? Meaning, you know, it seems like there's gonna be a FOMO element of this, where if one company falls behind another from an AI perspective, they're gonna have to catch up. If they haven't been doing the right things around data, that could be really difficult for them to catch up, say, 1 year from now or 2 years from now.
Can you just talk about that a little bit in terms of are the customers making that connection right now in terms of data preparation, data cleanliness, and AI? Just really quick for you, Christian, you know, long presentation today, I'm sure you're tired of talking, but, you know, we're getting a lot of questions about vector databases, and I was just kinda curious if you could just touch on that really quickly about how that fits into the architecture.
Yeah.
Thanks.
I'll go first. First of all, you know, you heard the reference from Fidelity: they're on a five-year journey, you know, to basically clean up the mess from however many previous decades. It's not a quick journey, okay? They're not gonna wait for that. What they will do is, you know, they will take certain business segments, you know, with certain data sets, and they will enable those with large language models, and I'm already seeing that, okay? Now, you know, hopefully, along the way, you know, they get some religion around having a Data Cloud, because that's the reason why you hear us talk so much about it, because everything gets harder when you have, you know, a siloed environment.
It's not going to stop people from lighting up specific segments, specific businesses, specific data sets. You know, as we said earlier, people wanna go to their board meetings and show what they're doing, you know. That's why I said augmented query and things like that is sort of my term for the low-hanging fruit. You're gonna see a lot of that stuff, and people are gonna be high-fiving, like, "Yeah, we're doing it. We're part of the party." Great, right? You know, we're looking much harder at the ability to ask very hard questions of the business, and we wanna sorta look, you know, towards what that challenge entails, and that's what our partnership with NVIDIA is about.
I know Jensen is super interested in that because it's the final frontier, right? Where, you know, you get a copilot who's, like, way smarter than you'll ever be, you know, in terms of understanding your business. There's no end game in this. It goes on and on and on, but that's sort of the state that we're aiming for here.
Yep. Vector databases: we subscribe to the notion that it is not a different database. It's a specific representation of a vector or an embedding, and then you run some operations on those vectors. If you look at how we've done everything around data science at Snowflake, we first do extensibility to enable all the use cases, and then we go first party to enable or simplify the use cases. We showed this morning the extensibility version, and I'm not ready to announce anything else, but we're looking at how to simplify the use of vector databases.
Thank you. Mark Murphy with JPMorgan. Frank, Microsoft recently commented that it expects its AI services will drive about 1 point of growth in Azure in the current June quarter. I'm just wondering, given all the work that you're doing with NVIDIA, with Microsoft, Satya Nadella commented on this, with the OpenAI services and the linkages there, and the fact that, you know, most of the Forbes Global 2000 is going to have to prepare its data estate, right, for training these models, can you come up with any kind of rough estimate? For instance, what percent of all the compute that's happening today at Snowflake do you think is related or tied in with these generative AI models or large language models?
If there is no way to approximate that, is there a way to step back and say, given your favorable positioning in this arena and all the developments that you're launching here, can generative AI be, you know, sort of a tangible tailwind on the growth of Snowflake, you know, going forward several years into the future?
I think Christian might be able to comment on whether, you know, we can see what workloads are of that sort versus everything else, but I absolutely expect it to be a tailwind. Just by virtue of democratizing access, it will be much easier for many more people to ask many more interesting questions, you know, of the data. I think that alone, you know, it's what Sridhar talked about earlier: this natural language interaction really redefines our relationship with data. I think we're just going to benefit from that, you know, positively. In terms of, you know, how we separate that out, I mean, the things are running inside containers.
You know, we wanna be able to, you know, identify that, but why don't you -
We have really good visibility into what type of activity the compute is being spent on. The number Mike just showed on the percent of customers doing AI/ML: we see which Python libraries are being used, and we classify them as AI/ML. There's gonna be a direct number, and then at some point, we can give you an update on what Mike shared today. The hard thing to piece apart is, people are gonna run to try to cleanse the data and to do what Frank was saying, this data strategy that is the input into AI. I think that's gonna be harder to correlate, and we're already hearing from customers.
Someone told me last week, "I want to start tearing down silos," because there's no having a conversation with your data if your data is in 5 different database systems. That second part is hard to correlate, but the direct impact, yeah, we have visibility, and we'll be able to share.
Hello, Kash Rangan with Goldman Sachs here. Congratulations on an amazing summit. One, I guess, for Frank and one for Mike; maybe Christian, you can chime in as well. We've listened to a lot of software companies saying how they are uniquely positioned to take advantage of generative AI, and all those explanations are very compelling. Frank, what, in your view, makes Snowflake very unique in this era of generative AI? Follow-up for Mike. You laid out your long-term projections, and it didn't seem like you were incorporating aggressive assumptions for Snowpark. There's a string of initiatives: Iceberg, Streamlit, Unistore, et cetera. Let's say you do hit a home run in one or two of those non-core adjacencies. What is your best possible outcome, upside outcome, to the $10 billion? There's a third question, sorry.
If optimization ends, can you reaccelerate your top line? Thank you.
Kash, the answer to your question is, you know, we're sitting on exabytes of proprietary enterprise data, structured enterprise data. That's number one of the answer, you know, to your question: the amount of proprietary data that we're hosting and managing, you know, on behalf of our customers. It has gravitational pull. People are going to wanna, you know, as Jensen said last night, you just turn on the AI factory, and then we have natural language interfaces, and then we're gonna be asking all kinds of interesting questions.
The second thing is, this proprietary enterprise data holds the key to levels of intelligence about institutions and businesses that are far more interesting than, you know, what I call planning your next trip to Yellowstone. Not that I have anything against that, or summarizing The Great Gatsby and all that stuff, you know. This is about redefining the economics of industries, right? Think about what can happen in healthcare and in call centers and in telco. We think that the potential economic impact of these models, by virtue of the fact that we're on proprietary structured enterprise data, can be extraordinary. I don't want... Are we unique?
No, because we're not the only people that live in this world. Are we extraordinarily positioned for the opportunity? Yes, absolutely, yeah.
Yeah, I'll say in a consumption model, just as quickly as a customer can slow down consumption, they can increase their consumption. It is very possible for revenue to reaccelerate. You've seen that historically with us. There's nothing to say it could not happen in the future. In terms of what is the upside, your guess is as good as mine. I'm not going to speculate how big Streamlit, Unistore, or Iceberg tables can be. As I said, we forecast our business based upon historical consumption. I don't have any data to support that yet, so I'm not going to forecast it. If we see some uptick in that next year, expect that the model will be updated, and I would expect it will be on the upside, if that is the case.
One thing, just a very quick follow-up. Customers are already taking off without us, right? Because, you know, as I said earlier, in Azure accounts, Microsoft is hosting its own OpenAI instance, and people are just, you know, querying those services. They don't even need us to do that level of implementation. This is already, you know, happening while we're sitting here talking. I do expect there's gonna be a lot of push, you know, for people to adopt these kinds of services.
Yeah.
Hey, thanks. Michael Turrin with Wells Fargo. Appreciate all the content and the time the past couple days. I wanted to spend time on the industry cloud strategy. You had the eight clouds up in the go-to-market segmentation. That's been a clear point of focus on the product side over the past year. I think the question is: do those help in terms of current positioning around some of the AI conversations that are coming up? I'd imagine customers are looking for industry-specific ways to solve their data problems, so I'm wondering if already having those industry-focused cloud products is at all helping jump-start some of those conversations? Thanks.
I'll comment. You know, first, look, the industry clouds really shape the contours of the data estates, okay? Because you have, you know, like in financial services, the S&P and FactSet and all these different people that are part of it, and, you know, through Cybersyn and people like that, we're really augmenting, enriching, even weaponizing data. That all becomes, you know, part of the contours that people can then, you know, enable with these large language models. I think it will have an effect on that.
I think the industry data clouds are really important because of the network effects, which are incredibly powerful, and obviously we've seen that in financial services, you know, very pronounced. I'm expecting this in supply chain management. It really means all the manufacturers, all the retailers. That is a huge part of the backbone of the economy. We're gonna see tons of network effect there just to get visibility and understanding of supply chains, which we historically, you know, have not had. You know, I think the way we're shaping the data estates in these industries, and then the incremental opportunity of driving incremental intelligence from that, it's gonna be great. I mean, that is the strategy. It's not a sort of horizontal, abstract thing.
I mean, industries, you know, are really coalescing around their unique issues. Every industry is a totally different conversation, you know?
Hi, it's Brent Thill at Jefferies. Mike, I think you said in the last earnings call that things were flat, you weren't seeing a big inflection. When do you expect those customer behaviors to change? Are you seeing any signs of light out of the tunnel here in terms of optimization? Just give us a sense of what you're seeing.
Well, as I said, optimizations are always going to happen, and they're gonna continue. I'm not seeing any big optimizations now, but if we're not involved on the PS side, we don't find out about a lot of them until they're happening. What we commented on the call was literally the month of April, starting about day 10 or whatever: pretty much flat week-over-week growth in revenue. Coming into May and into June, consumption is back where we'd expect it to be. We look at it on a daily basis, week-over-week growth. I will add, too, in talking to the sales team, and once again, this is more from a bookings perspective, this isn't how we forecast revenue, the sentiment with customers over the last 30 days seems to have improved a lot.
Over here. Ittai Kidron from Oppenheimer. Thanks again for the past couple of days, very interesting. Question for you, Christian. From talking to customers over the last couple of days, the benefits that you bring to the table in data warehousing are absolutely clear from a cost and performance standpoint. Some of the concerns that were raised, however, with respect to what you're trying to do with analytics and machine learning, is that the cost and performance benefits don't quite translate in the same way in those types of use cases. Maybe you could talk about the ROI for the customer in those types of implementations. Nobody doubts the technical capability of the platform to do it all, but does something in the math change in that respect, on cost and performance?
It's a very interesting question. When we did Snowpark, the bigger benefit that we believed in was the ability to eliminate these trade-offs between doing data science and potentially compromising privacy or security. As I mentioned in the session this morning, organizations oftentimes are at odds: the data science team trying to do cool things, downloading libraries from the internet, and then the team that is in charge of governing it. The whole thesis of Python on Snowpark was to remove that trade-off, and that continues to be the headline benefit. It just so happened that our processing engine is so much better than Spark, which is what most people use, that we then saw massive performance benefits and, by implication of the business model, cost benefits.
I don't think we would have ever started by saying, "We're gonna do a cheaper version." No, it's all about governance, and frankly, Spark is not very good, and that's where those benefits come from. Those two still stand. Even for machine learning and training, we showed this morning something that was 5 times faster on Snowpark, 2 times faster than Spark. Economics matter, but probably most important is governance.
The other thing I would add is operational simplification. People are not looking for more complexity; they're looking for less. If you can run it in the same platform, you're eliminating a lot of the steps, not just cost, you know.
Yeah.
Thank you. Glad that you're still here, Frank. Mike, glad you're feeling better. Couple of questions. The first question is maybe for Mike. You had an interesting slide talking about how some of the customer cohort expansions this year have been a bit slower than typical seasonality. I think as you look towards your FY 29 targets, you're expecting that there is still a bit of a headwind on that initial consumption, but then it sort of normalizes in years two to four. I was just wondering if you could talk about what's driving that confidence, and should we expect a return to that normal type of consumption expansion starting next year?
Second question, just related to the marketplace capacity announcement: any more color you could provide on the potential impact for the financial model, be it bookings or margins?
Actually, I'll deal with that first. When you allow customers a capacity drawdown on a contract, you can't include that amount in your RPO. That will have an impact on RPO, even though we have a contract and it's a firm commitment. We will limit the maximum amount that a customer can apply towards that marketplace drawdown. As an example, if it was a $1 million customer, and we said $100,000 of that could be applied against buying data or applications through the marketplace, we'd only be able to record, assuming no revenue was recognized, $900,000 in RPO. The $100,000 doesn't show up in RPO.
Then if the customer consumes, say, $50,000 of that $100,000, the only thing that hits our revenue is the fee piece that we take, because the rest is a pass-through to the partner. Say we made 4% on it, we'd only record $2,000 in revenue. We wouldn't get the $50,000 in revenue. That's the way the accounting works for those marketplace deals, because we're actually not reselling; we're just being the intermediary and collecting the cash on behalf of the partner. Now I'm going blank on what your first question was. Repeat it, please.
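Mike's marketplace example is simple enough to work through. A back-of-the-envelope sketch, using the hypothetical figures and 4% take rate from the example above (the actual accounting treatment under revenue recognition rules is more involved than this):

```python
def marketplace_accounting(contract_value, drawdown_cap, consumed, take_rate):
    """Sketch of the RPO / revenue treatment described above.

    contract_value: total committed contract amount
    drawdown_cap:   maximum amount the customer may apply to marketplace purchases
    consumed:       portion of the cap actually spent through the marketplace
    take_rate:      Snowflake's assumed fee on the marketplace pass-through
    """
    rpo = contract_value - drawdown_cap           # the drawdown cap is excluded from RPO
    recognized_revenue = consumed * take_rate     # only the intermediary fee hits revenue
    pass_through = consumed - recognized_revenue  # remitted to the data/app partner
    return rpo, recognized_revenue, pass_through

rpo, rev, passthru = marketplace_accounting(1_000_000, 100_000, 50_000, 0.04)
print(rpo, rev, passthru)
```

At those inputs, $900,000 lands in RPO, only the roughly $2,000 fee is recognized as revenue, and the remaining $48,000 is cash collected and remitted to the partner.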
Yeah, just around the normalization of consumption patterns.
You know, one of the things that we would see historically is that in year-two cohorts, consumption just spiked. A lot of that was customers just trying to move their data into Snowflake as quickly as possible, and there was a lot of inefficient usage of that data. What we're seeing now is customers are a lot more disciplined; they have been hiring people that have lived through Snowflake migrations to be with them. I don't see quite that same ramp in the early years. Yes, it's still ramping a lot, but not at that same rate. What gives me the confidence in the net revenue retention expansion: you saw the Global 2000, that stayed pretty consistent.
I expect the Forbes Global 2000 to be a much, much bigger percentage of our overall revenue by 2029 than it is today. Those guys will continue to grow for years.
Let me add just one thing to this, because I sort of see this happening in, for example, large financial institutions. Some of you are representative of these institutions, so you probably know how this works, right? They plan over three, four, five years. They have an extremely disciplined, regimented rollout. It frustrates the hell out of our salespeople, by the way. They're like, "Holy shit, how am I gonna make money here, right?" And the customer is like, "No, we're on our plan," and they are methodically marching down the field, and nothing is going to distract them. I will tell you one thing: it is going to materialize. It is just not a wham-bam quick push into the end zone.
You see less and less of that with large institutions; that is not how they operate, you know?
Thank you. Brent Bracelin, Piper Sandler. I actually wanted to drill down into this push into apps and AI. Mike, for you, specifically on monetization: should we think about Snowpark as the primary vehicle to monetize apps and AI, or is it broader? For Christian, can you talk a little bit about, and clarify, the app ambitions? I think of Snowflake as an app development platform; it's unclear if Snowflake might want to build its own apps as well. Just clarification there. Thank you.
I think they're both questions for Christian.
Okay, I'll take it in reverse order then. Will we build our own apps? We've had that conversation in many contexts. Right now, the horizontal opportunity is so large that we would much rather have partners go and develop that for us. Maybe at some point there's a category where we wanna go and do more ourselves, but right now it's partners primarily, and literally the three of us have this conversation every so often. What was the first part? I'm sorry.
Monetization.
Yeah. Snowpark is the runtime of our app stack, whether it's Python, Java, or containers. That truly is what drives the bulk of the compute. There are other parts that play a role in our app stack, like Streamlit, which is how you build a UI, but Streamlit itself runs on Snowpark. It is fair to say that Snowpark is the core engine for the app platform. It is, at the end of the day, what we will monetize, and I think we've said in different forums that if all of this plays out the way we think, you could see at some point the revenue from the app runtime, Snowpark, be comparable to or larger than what we see on the data side.
One thing I would add to that. It was kinda interesting to listen to Mihir, the CTO from Fidelity, this morning, because he said: "Look, there's data engineering, and there's software engineering, right? Data engineering is Snowflake. Software engineering is Snowpark. There's function, and there's data, and there are two hemispheres, right?" We have, in my opinion, beautifully integrated these spheres.
It is beautiful.
It is beautiful, and we like beautiful. It's important that it's beautiful. We're not hackers, okay? I think it's important for you to have an appreciation that we're addressing the function in addition to the data, which is a massive scope expansion for us as a company. We think this is really important in the cloud because, when we were living on premise, we'd access databases via ODBC, JDBC, because you had a security perimeter around it. People weren't worried, right? In the cloud, who's managing that? It has to be you, right? You can no longer say, "Oh, it's not my problem." Well, sooner or later, it will be your problem. We took a highly secure, high-trust posture towards that.
It's not just the function; it's the way that we allow that function to be delivered that this is about. My belief is that software engineering is a much bigger deal than data engineering. I mean, if we set off this renaissance in software development that I talked about this morning, because that's sort of how I think about it, because we have so lowered the bar in terms of investment, in terms of all the things that have to happen for you to build and sell a software application, you're gonna get this orders-of-magnitude software explosion, because you just now can. You don't have to put any money up front.
Two men and a dog, you can build something in four days, put it in the marketplace, sell it, monetize it. All they gotta do is cash the check, right? That's really what we're trying to do. We're trying to redefine the software engineering sphere from what it historically has been, because we think the cloud enables and allows that. I just want to give you a little bit more background on how we're coming at this. I don't want to say it's revolutionary, but it's definitely a really different take on software engineering from what it historically has been.
Hey there, Patrick Colville from Scotiabank. Thank you so much for hosting us. I thought the most interesting slide you put up was the product adoption by customers. 95% of customers using data warehouse, 35% using Snowpark, 20% AI/ML, and 25% data sharing. I mean, do you think Snowpark, AI/ML, and data sharing are gonna reach that, like, 90% penetration? Or do you think, you know, the mid-market customers, and I guess smaller enterprises, won't adopt those products? I guess my second question if possible, is we didn't hear too much about Unistore. Is there any update there in terms of when that might go GA?
On your question there: I broke those customers into customers consuming less than $1 million and customers over $1 million. Just because they're consuming less than $1 million doesn't mean they're not a big customer.
No.
There are many of the Forbes Global 2000 that we've landed are not million-dollar customers yet because they're in the very, very early innings. I don't see any reason why that won't be closer to the million-dollar-plus, and I still think even in those million-dollar-plus, there's a lot of upside with getting more adoption, especially on data sharing.
Yeah.
I do think data sharing is gonna be a norm for all of our customers.
Yeah. Yeah, I think data sharing is gonna go to 100%. Snowpark is gonna go to 2,000%, okay? I've said this publicly over and over, right? If you read and/or write to Snowflake, we're gonna own that work. I guarantee it. We will stop at nothing, because it's cheaper, it's faster, it's safer, and it's simpler. There's no damn reason in the world why you wouldn't do that, right? Because a lot of the things that people are doing are unnatural acts. It's incomprehensible what's going on.
Yeah.
In fairness, we didn't have all the primitives to support that, but we do now, you know?
Yeah. The prior question answers part of your question, which is if Snowpark is a runtime for applications, customers are gonna end up using Snowpark, maybe directly or indirectly. Like, if you're a bank and you wanna use DTCC's new native app, you will be doing Snowpark, even though you may have not set out to do that. Your second question was on Unistore. We will be in public preview, give or take, at the end of this year, and feedback ultimately informs general availability, but we've front-loaded all the big architectural changes, and we expect to be no more than six months from public preview into GA, plan for roughly a year from now.
Hello, I'm Andy with Hypergrowth Blog. Excuse me. What price structure are you guys considering for Snowpark Container Services? Can you discuss the option for GPUs? Is it gonna be billed through NVIDIA, or is Snowflake gonna manage that and pass it through?
Yeah. I'll take it.
Christian.
We actually did a lot of research on what the margins for Snowpark Container Services should be, and we found two extreme perspectives. For enterprise customers that want to consolidate infrastructure and simplify data governance, for those use cases, our traditional margin structure is like, "That's easy. You're simplifying my life so much. There's a lot of value." For application developers that will compare Snowpark Container Services to similar container orchestration products from the cloud providers, they needed a much, much smaller margin structure. We ended up going into preview with an in-between margin, and we will have to adjust and figure things out as we go, which is one of the reasons why Mike was saying new products may have different margin structures.
In terms of the GPUs, those will be managed through us, so we will procure them with the hyperscaler, and then they'll just go through the normal billing through us. The interesting thing, though, is that with our buying power at AWS, we get easier access to GPUs than many people, and they're really hard to get today.
You know, that's actually important, because if you're, let's say, Databricks, right? They rely on their customers to be able to procure the GPUs. What if they don't have them? I mean, we're the largest ISV that Amazon has, and we're obviously one of the largest that Microsoft has as well. That changes the relationship.
Hey, guys. Brad Gerstner with Altimeter. Great event. Thanks for taking us so deep in all of this. Mike, I really wanna drill down on the $10 billion. I think you said, "We believe we can still get there." When I listen to the drill-down, I think less than $500 million of that, so less than 5%, is gonna come from Snowpark, and de minimis from the other places in the forecast. We think that AI is gonna be a general tailwind, because every corporation has to cleanse their data and get their data into the cloud, so generally, we think that's a tailwind. Christian Kleinerman says Snowpark could be as big as the underlying data warehouse itself.
We all have to leave here and try to build a forecast, and the hardest thing I struggle with is: if we thought core data warehouse was $10 billion before, and if we think AI is an accelerant, and you've got all these concentric rings now coming together around the core data warehouse, why still $10 billion? And why do you think perhaps the exit run rate is not as high as you did before? If you'll allow me a second question: Frank, I thought the segment with Satya was amazing, and the piece you both appropriately focused on was frontline alignment, getting that sales force alignment. He talked about it publicly, which was great. Can you give us any more detail on what your objective is?
What's success for you in terms of sales force alignment? Thank you.
I'll start with, I think it was three years ago, we said that we'd do $10 billion in revenue in 2029, and felt very good about that, and there was a lot of cushion in that. The next year, I said we'd do $10 billion, and there was a lot more cushion in that. This year, we're saying $10 billion. There's still cushion in that, but once again, we forecast based upon the historical consumption patterns we have. I don't have any support for how much is Unistore, how much is gonna be AI, and that's upside. I just don't have that, and I'm not gonna guide to that. Once we start seeing that, we'll update that model every year. I feel good about the $10 billion based upon what we have today.
Just to your second question. Brad, we really wanted to get Microsoft to a similar place to where AWS is in terms of field alignment, incentives, how people are getting paid, and so on, because we know that model works. When I say that model works, look, we compete, okay? We win the technical wins a lot, okay? Most of the time. What happens with people that do not get paid at all on such losses is that they're gonna try and double and triple down any way they can, and it ends up in a really ugly mix, and that creates headwinds in the relationships, and the distrust in the field builds, and partnering becomes almost impossible, right? I just want to get to a state where...
I told Satya, literally, I said, "Look, when you lose, you guys throw millions and millions of dollars of free services at it, right, to try and reopen the conversation. If that doesn't work, you're gonna trot out Databricks as a first-party product. You just don't stop when you lose. That can't happen, okay, when we have to sign up for a much bigger relationship." He agreed with that. He's like, "Yeah, we need to normalize that, because we compete, and either we win or you win, and then after that, we partner." By the way, this happens with Amazon. Amazon loses plenty of times.
They don't lose their mind over that. They don't. That's where we need Microsoft to be. Don't lose your mind. We're still consuming Azure here. We're gonna consume all the other Azure products. This is not a bad thing. Snowflake on Azure is a win for Microsoft. Will we get there on day one? I mean, I just had a Microsoft guy take a selfie with me today. Shit, that has never happened before, you know? Little baby steps, I guess, but we codified these things in the agreement, right? That's real. We can win technically against Microsoft. We do it all the bloody time.
As long as we have a normal posture in these relationships, it's gonna work. That's really what we wanted to convey to you today. I wanted you to hear Satya talk. You don't need to hear it from me; it means very little. But you heard it from him. He's aligned with us on that sentiment, and was willing to step up to that.
Hi, Derrick Wood at TD Cowen. Frank, you've mentioned Databricks a couple times, so I thought I'd ask about them. I mean, you guys started in the cloud data warehousing market with the relational database. You're now, you know, going into unstructured data and AI, ML, and they've kind of started at the Apache Spark side of things and are trying to get into SQL. Obviously, you talk about this market is very big. It's gonna support a lot of players. You do have roadmaps that are starting to kind of converge a little bit. Just curious how you think about, you know, your philosophy of the market versus theirs, and what your advantages are gonna be going forward?
Yeah, look, you're correct. They're different worldviews, first of all, right? I mean, I've said before that Databricks is great for people who wanna adjust their carburetor with a screwdriver, you know? The rest of us are driving EVs, or at least they have fuel injection, okay? It's just a different type of thing. We view Databricks really as the descendants of Hadoop and that whole generation of platforms and technology. We're descendants of Apple and Tesla. We're trying to abstract people from complexity, right? It's a very different choice.
I sometimes talk to public sector organizations, and they're like, "I'm so glad we're running Snowflake; we can just get SQL engineers. We can get those all day long. We couldn't lay a finger on a Python guy to save our lives. They won't stay; we can't afford to pay them." It's a very different approach. Will there always be Python guys that like Databricks in accounts? Yes, I think there will be. That is just part of the makeup of our industry. Databricks has also lived much more on the function side, the software engineering domain, rather than the data engineering domain, and I think, yeah, you're right.
They're trying to come into the database domain, and we sure as hell are coming into the software engineering domain with a vengeance. I mean, Snowpark is aimed at that, and we have the advantage that starting with data is a really, really good starting point, okay? I'm feeling good about that dynamic, you know.
If I may add, I would say structurally, many of the choices Snowflake made early on have been strongly validated. The model of running in our VPC, on our hardware, so that we can normalize the service across cloud providers. How we extract metadata from data. All of those things: go look at where Databricks started and how they're slowly coming around and following our model. Competition is good, but we think that we have a better foundation, and we continue leaning on that.
Yeah, single product, right? I mean, how many engines do they have over there? I've lost count, you know. It's hard to build a single product, but it's incredibly powerful, and it benefits the customer greatly. Those are convictions that we have. The product does have to be good, rather than, "I'm just gonna throw something out there with a small group of engineers and check a box." That's not Snowflake's style, and I love that about the company. I didn't bring that; that was here when I joined, and I really admire that, you know?
Hi, it's Greg Moskowitz from Mizuho. Christian, you know, when we were sitting here a year ago, you had talked about Iceberg tables and the belief that many customers would standardize on that over time. Curious to hear how the uptake has been versus your expectations over the past year, and more importantly, does the addition of unified Iceberg tables, you know, with two modes, right? Managed and unmanaged. Do you think that accelerates the adoption curve? Then for Mike, you know, circling back to your comment, I think it was in response to Brent's question, about, you know, customer sentiment seeming to have improved a lot over the past 30 days. Anything you're hearing from Chris and his teams, anecdotally speaking, that might shed some more light on that? Thanks.
I'll start with Iceberg. Yeah, the point is very accurate: we think that the unmanaged mode will accelerate adoption. What we learned in the last 12 months was that if I already had all my data in a data lake in Parquet files, and I wanna go to Snowflake with Iceberg, we had created, frankly, too steep a step to get to a point where all operations need to be coordinated by Snowflake. The announcement from today is that we're introducing a step in between, this unmanaged mode, which meets customers where they are, lets them leverage their existing Parquet files, and then, if and when they choose to, they can graduate to a managed mode where Snowflake takes on more. It was entirely driven by accelerating adoption. We're quite excited about it.
My answer for Brent, and I'll reiterate: the daily consumption patterns we've been seeing in June have been very good. I would say back to more where we'd expect them to be, unlike in April and into May, where we didn't see very much growth week over week. As I said, customer sentiment in talking to salespeople seems to be good from a bookings perspective, and deals are shaping up. We've been closing deals. We closed a big deal with a financial institution. We closed another one with a big healthcare tech company today. I feel good about bookings, but that's not revenue. Sentiment is shaping up in terms of the sales calls.
Yep.
When I sit in on that on the weekly call.
Right. I don't think he likes you very much.
Hi, it's John DiFucci from Guggenheim. Snowflake was the pioneer in data warehousing in the cloud, and you took advantage or, not took advantage. You leveraged the architecture of the cloud, and you did it first, and you did it better. Now, as you've acknowledged, there's competition out there. Frank has spoken, I think Christian too, about data gravity, and we know that's real. So you've expanded beyond data warehousing to data adjacencies, and now even to app development, which is, frankly, pretty cool, and it's actually what good companies do. They expand for their customers and make it better.
If you stay at the current rate of monetization for all these opportunities, and the overwhelming part of it is data warehousing, do you think Snowflake can continue its success as measured by growth, which we know is also reflective of customer satisfaction, in the medium to long term, and I guess as it relates to that $10 billion target out there? I know this is a little bit like Brad's question over there, but it's really: if you stay where you are, doing what you're doing, but it's still data warehousing, is that opportunity big enough to allow you to hit that target?
Well, I'll start first, right? Data warehousing, in most companies, is really the foundation of data engineering. This is the only place where they have trusted, sanctioned, optimized data. I wouldn't sneer at data warehousing, like, "Oh, it's not big enough." It is foundational to the world of institutions and enterprises. The problem historically has been, and I think I've said this a few times, that it's been a business of begging for a 2:30 A.M. time slot 3 months from now, because on-premise, it was extremely capacity constrained, because you would consume a cluster in no time, right?
The growth you've seen from Snowflake, which has been extraordinary, thank you very much, has been created because of that enormous pent-up demand that has built up literally over decades. I've been in the world of analytics, not nonstop, because as you know, I've been in other places, but I've seen this problem in the eighties. I've seen it in the nineties. It's been excruciating. We're now finally in a place where data is starting to become a real thing rather than reporting yesterday's news. You need to get some context on what this is really about, rather than, "Oh, it's data warehousing. That's yesterday's news." We look at data warehousing really as a starting point for customers, right?
They need to be able to report what happened yesterday and update their data. Of course, we're going to streaming and observability and all these kinds of things. These are all natural things that are gonna happen, because now they can. Before, we didn't have any prayer of really getting beyond reporting what happened the day before, the week before. Closing the books monthly was extremely hard. We couldn't even focus on the more elaborate, more sophisticated pieces. I think cloud computing as a foundational platform has opened up everything, has opened up the opportunity for Snowflake. Snowflake wouldn't be here without cloud computing fabrics. Now, I think it's...
To your question, I take a little exception to the premise of what's really going on here, you know.
Yeah, I was gonna say exactly the same thing, which is, and you touched on all of it, that I don't even know if traditional data warehousing exists anymore.
Right. Exactly.
Frank and Jensen talked about, you're gonna have a natural language conversation with your data. You tell me if that's data warehousing or BI or a new thing, but it's the new normal.
Yeah.
Okay, we have time for one more question, I'm being told, because we unfortunately have a customer event. Then Jimmy and the IR team will stick around for questions.
He's owed a question, for sure.
Yes.
Thanks very much. I hope it's worthy of the last question. Brad Reback from Stifel. Mike, I think you had a slide up there that showed 58% growth in your GSI business year-over-year on a self-reported basis. A couple of weeks ago, you announced a new head of alliances, Tyler Prince. What's the opportunity there? Obviously, he comes with tremendous background on the app side from Salesforce. How should we think about that playing through? Thanks.
Sure. You know, Taylor has, or Tyler has, very good relationships with the large GSIs, whether it's Accenture, Deloitte, EY, and others. We think the GSIs are going to be very important, those GSI practices. We're already seeing kind of an inflection with Accenture. I think in Q1, we booked over $300 million, and Accenture was our number 1 or number 2 in Q1. Once again, these are self-reported numbers. There's no reason why our top GSIs can't have half-a-billion to $1 billion annual practices, and that's what Taylor is really -
Tyler.
Tyler, yes, is what we're bringing him on to do. It's not just the GSIs; it's also resale partners as we move into Asia more, and alliances he will own as well in that group.
I do think the GSI relationships matter, and Accenture is a good example. I mean, I personally wrestle with these guys, okay? Especially Accenture, because they're growing like a weed. I mean, they're doing incredibly well, but they do it because they just bump into it. They back into it because we are spawning all these projects, and because of their high level of presence, they go, "Put me in, coach," and the business just takes off. After a while, even the people at Accenture are like, "Christ, are we organized here for Snowflake?" Well, they weren't, right? I mean, there was nobody in Accenture who owned Snowflake as a business.
It's like pushing a rope, right? Now, they have gotten to the point where they have taken a very, very senior person off another line of business, and this is Snowflake. This is Snowflake only. It's becoming a business group. I cannot tell you how important that is in your relationships with SIs, to get to the point where they become a business group, where they have targets, they have provisioning, they have to report on it on a weekly, monthly basis. Everything changes at that point. It's very, very nonlinear. You gotta do hundreds and hundreds of millions with them, for them, before they start paying any serious attention to you.
We are now reaching these thresholds with these SIs, and I think it's really important, you know?
Okay. With that, thank you, everyone. I really appreciate you guys making the trip here today. For those of you on the call, thank you for joining us and we will probably see you around later on tonight.
Thank you.
Thank you.