Morning, everyone, and thank you for joining the H.C. Wainwright 27th Annual Global Investment Conference. My name is Kyle Mieri. I'm an analyst on the H.C. Wainwright Corporate Access team. We're very excited that all of you could join us today for a productive day of one-on-one meetings, corporate presentations, and panels. For this session, we are thrilled to welcome Ryan Steelberg, CEO and President of Veritone, trading under the ticker VERI. Ryan, take it away. Thank you.
Thank you, everybody. Good morning. Kate, why don't you introduce yourself, and then I'll take over.
Sure. Hi, everyone. I'm Kate Goldsmith. I work on Veritone's investor relations efforts, so I'm excited to share a bit about management's background and the company today, as well as a few of our exciting growth levers.
Thank you. Veritone is a pure AI company, which sounds strange because we've been around for over 10 years. We actually took the company public in 2017. I'm going to go right into what we do for those who aren't familiar with the story, but let me give you a little bit of background on myself first. I've been in the ad tech space since 1994. For those who were in the markets way back when, the company we started was AdForce. We took it public in 1999. So I've been in the internet advertising and media ecosystem for a long time. The reason that's important, why it's relevant, is that leveraging AI for ad tech became a very important aspect when we were, frankly, trying to understand more context inside audio and video.
For years, we were serving banner ads based upon structured HTML. I'm going to skip ahead two slides because my entire presentation and everything we do at Veritone is all about the explosion of unstructured data. That's why we started the company. That's why we exist. The name of the business is a little play on veritas and tone, truth in the signal, the signal meaning messy data everywhere. Why did Veritone come from an ad tech company? Because ad tech is a data problem. When you have to serve 50, 100, a billion ads per hour to millions of people around the world, you have a few milliseconds to ingest an ad request, figure out which ad to deliver back, and do it quickly. I was leveraging neural networks at previous companies, my own businesses; I've taken several businesses public.
I sold my last business to Google, where I headed up all of their offline ad efforts. But it is a data problem. Specifically, when the iPhone phenomenon dropped and user-generated content started to explode, we in the ad tech space had an issue. We didn't understand what was inside the audio and video at scale. Meaning, if you're on ESPN and you are watching a certain type of programming, we need and want to know what you're looking at. Why? Candidly, because I want to serve more targeted ads to you. Whether you're watching daytime television, sports, or news, it is important for us to understand that. The problem with audio and video at scale is that, historically, it has been hard for machines to understand what's inside it.
The great opportunity for cognitive AI, where we really started this business, was: could we assemble or orchestrate hundreds of different AI-based models to ingest and interrogate huge volumes of audio and video for our customers, to understand what's in it? You can't target against something, and you can't search against something, unless you've ingested it and started to have an understanding of the audio and video. Think of the world of web search in the old days, where you went from Infoseek and AltaVista and Excite@Home, and then Google finally comes along. The text-based web without a proper index would be very challenging to use, and let's be very clear, Google Search is way more than just a search engine. Because of that indexing, they created the ecosystem, if you will, for ad tech.
Taking unstructured data and creating structure around it creates a tremendous amount of opportunity. Because of our background, the areas we went into were primarily the same types of relationships and customers I've been working with my entire career. We went to the biggest media and entertainment customers, the Disneys of the world, the iHeartMedias, the CNBCs, the NCAA, the companies that generate either ad revenue or, now obviously, subscription-based revenue. The largest programmers and producers of content are our customers. When we started this business, for the first several years we were licensing our technology, our aiWARE technology, to service the largest media and entertainment customers to, again, ingest all their data, leverage our AI to index it, understand it, and help them monetize it. The scale that we deal with is massive.
We ingest and process hundreds of thousands of hours a day of audio and video from the largest companies you know, media and entertainment customers. We now service the federal government; we're going to get to that in a second. But again, to set the baseline: Veritone has hundreds of the leading media and entertainment customers worldwide whose audio and video assets we ingest, index, and help them understand and monetize, and we've been doing this for over 10 years. Our primary product offering, the base level of everything we do, is aiWARE. aiWARE is our proprietary stack, and it is platform agnostic. We can deploy this entire stack, or modules of it, in really any environment: commercial cloud, such as AWS Commercial or Azure, as well as AWS and Azure Gov.
We are also FedRAMP compliant for our work with state and local law enforcement and the federal government. The key here is scale, our ability to ingest and orchestrate hundreds of different AI models, now including the large language models and Gen AI models I'll talk about in the next segment. But the key is the efficiency of our ability to ingest, index, and understand data, and to help our customers generate either cost savings or intelligence from their data sets, or help them monetize it. I'm going to give you two end-to-end customer examples of exactly what we do. Let's take ESPN. Everybody knows ESPN. ESPN is both a creator of content and an aggregator of content. Every night, if you're watching SportsCenter or you're going online, you're seeing the speed at which they ingest, prepare, and distribute content to all of us.
How do they do that so fast? In the old days, they would have hundreds of interns literally sitting through all these different inputs of content being produced all around the world, humans literally sifting through and tagging these different content elements. For ESPN, which has been a customer of ours now for several years, we've automated that almost entirely. We ingest their primary video feeds, what you and I see on television, ESPN, ESPN2, their actual linear programming feeds; all their audio programming, AM and FM, what you hear literally on local New York and LA radio stations; their podcast content; and also their features, 30 for 30, et cetera. All of that content gets ingested into an aiWARE instance for ESPN, which we manage as a hosted service.
Then, depending on what type of audio and video it is, we will orchestrate and act upon it with different recipes of AI. Let's do some simple ones. If it's video, we will always be running face detection, object detection, and transcription. Think of it as creating a metadata layer. I have this unknown blob of content, and I want to know every second that Michael Jordan's face is on screen, if the Nike logo is in the background, and, within an aperture of five minutes, whether they're talking about the GOAT, the greatest of all time. Does that make sense? So base one is ingest and apply AI cost-effectively. Anybody can build a POC and throw $1,000 at an hour of content, but when you're processing literally tonnage of content, you've got to be very cost-effective. And then there are the different models.
In our orchestration, some models are very good at what's called general person or celebrity pickup, meaning if Michael Jordan's face comes on screen, we can invoke and run a model that will generally pick up Michael Jordan's face. Does that make sense? But often there are unknown individuals, or their hosts. We also help their research team: they didn't know, every millisecond, when the very expensive talent they're paying to host their television shows was actually on screen. So we also hypertrain the models to make sure we know exactly when their highly expensive talent, Dan Patrick and others, has their face on screen or their voice speaking. Mission critical. So how do they get value out of this? Why are they paying us hundreds of thousands of dollars a year to ingest and index all this content?
For this type of customer, an ESPN that generates revenue through subscriptions, like OTT, and also advertising, there are really three main use cases. Number one, advertising and sponsorship optimization. Meaning, if we watch any game, if you were watching the US Open tennis last night or the Buffalo Bills game, the hosts themselves are part of the ecosystem. You'll see logos in the background. The commercial breaks are sponsored by Geico. All of that goes into a package; it's not just about the 30-second commercial break anymore. So how do you then look at it? If you're trying to sell a $100 million a year package to Geico, you have to present to them all of the value and the metadata about all of the brand exposures.
Meaning that when it's a timeout and everybody's still tuned in, and you see the Geico logo in the background and the host is talking about it, that's some of the richest real estate for advertising. Makes sense? Because when the commercial happens, that's when you go grab a beer and get a pizza. So understanding, with our technology, the minutiae and the accuracy of those mentions is one of the many use cases our clients have. Another one would be the research department. I touched on making sure they understand the efficacy of their hosts. They're spending tens of millions of dollars a year investing in these celebrities who are now doing play-by-play. They actually monitor ratings in real time. They want to know if that person is performing on Nielsen.
Now they can look at first-party data from YouTube or secondary coverage online. So another huge use case is that when you have structure and intelligence in your audio and video, not just about the general programming but down to the individual who's in the programming, it provides great insight for research. You may have heard that ESPN several years ago did a huge overhaul of their talent; I think it was something like a third that was "turned over." Let's just say that was a data-informed decision, and we were part of that ecosystem. So again, that's one of a couple of different examples. A third example is speed. When ESPN is ingesting all this data, they can't afford the expense of an army of manual taggers, humans tagging the content, but the content still needs to be tagged.
And there's so much content now that the amount they have to ingest and start to understand is an order of magnitude more than it was just a decade ago. AI is the perfect solution for that. Now I can index it and package it. It's almost incredible; it seems like we're getting targeted content the second it happens. An event happens and boom, it's being distributed on your social media feed. If you have great structure, and we've built it cost-effectively, the speed of indexing this content allows you to do things like that. So that's one of many. Our customers in this general category are the biggest brands you know: CNBC, CNN, the NCAA, the Masters Tournament, the United States Tennis Association. Our CFO, Mike, here was at the US Open last night.
We're not just a cloud provider; we're actually on-site. A local version of aiWARE is running at the US Tennis Association right now. It's not just indexing the content that you and I see go over the air; it's every single camera feed, 24/7, nonstop, for days. Think about that. Imagine all the great content that used to fall on the editing room floor. Because before the web, and before it was cost-effective, what would you do with hundreds of cameras running 24/7? That is gold. That is gold right now, because we can now personalize it on the web. This has been the staple. What I just described, we've been doing for over 10 years. Again, we have hundreds of the biggest studios and broadcasters that do this. Where has the business gone from there? About five-plus years ago, DARPA reached out to us from the government.
And they said, "This is really interesting. What you've done at scale, leveraging AI on huge volumes of audio, video, and unstructured data for the media and entertainment industry, can you help us out? We have major data problems across our federal government and state and local law enforcement. We produce drone footage, satellite footage. Officers, everybody, are now wearing body cameras, dash cams. We are a security state. We may not have the politics behind it, but there are cameras; we're producing data everywhere. But unlike ESPN and unlike media and entertainment, it's not their core product." Imagine if you have a data problem. You would imagine a Disney or an ESPN would have some understanding of their primary data offering; their primary data asset is their audio and video.
But when you shift and look elsewhere, and I'll be specific, let's take the Beverly Hills Police Department. Their primary function, their primary product, is not the security camera footage being recorded in the city. It's not the body cameras. Their primary product offering is obviously sworn officers. However, they have a data problem. There's not one case or investigation happening now where they're not pulling evidence that includes some form of audio, video, or other data set. And you're like, "Well, wait, that's really..." Think of DNA. DNA wasn't structured, at least not in a way we could process in the old days. We had to create machines that could break down DNA. And by the way, back in the day, it would take months to do trace sampling of DNA. Now it's incredibly fast.
So think of what we've become experts on and honed our skill set with at the largest media and entertainment companies. We're now applying that technology to state and local law enforcement and the federal government. Today, Veritone has taken this exact same stack, the exact same stack, and we are now servicing hundreds of local police departments and sheriff's departments, including some of the biggest in the United States. And as we've expanded, we're running into some other players like Axon and others, which you may know. But the big difference between a lot of these legacy systems and what we do is that Veritone is an open platform.
You can throw any type of unstructured data, the kitchen sink, into our platform, and I will deliver you structured, clean, indexed, I'll say AI-ready data for whatever your initiative is: saving money and efficiency like ESPN, or a police officer trying to solve a case faster. Period. Look at the Boston bombing, something local here. It's a well-known case: they had to rent out warehouses and hire thousands of people to sift through videos. We can do that in minutes now. So our expansion from a very successful starting point in media and entertainment has now allowed us to go into both state and local law enforcement and the federal government. We are now doing a lot of similar work with the Air Force, the Defense Logistics Agency, and multiple groups.
I've got three minutes left, and then I'll turn it over quickly for questions. Lastly, I've been doing this a long time; this is my seventh company. I've had some great success, and sometimes, when you achieve a moment of scale in a certain area, a derivative business emerges organically. This happened to us about a year ago. I'm going to go quickly; we could spend hours on this one. It's called VDR, Veritone Data Refinery. Let me explain what that is. We have now ingested and processed tens of millions of hours, almost 100 million hours, of premium audio and video. Everybody knows the insatiable appetite that these new mega large models have, whether it's a classic large language model or the new JEPA models that the superintelligence team at Meta is building. We know them as ChatGPT; let's just call those the foundation models.
These things are voracious animals with an insatiable need for data to train upon. The models you use the most were historically trained predominantly on text-based data, ripping off the open web to build GPT-3, for example. But over the last year, they've gone from pure text training to image training, to now audio and video. We're all blown away by Veo 3 and tools like Showrunner, where you put in a prompt and it creates a video.
They're like, "How are they doing that?" Veritone now has an organic line of business, with very little CapEx, taking these huge corpuses of indexed audio and video and working directly with the largest model developers, the biggest names you know. We are working with their data science teams, using our proprietary indexed data for video, to help them train their next-generation models. It's a spectacular, fast-growing new line of business for us, and at scale. We've publicly disclosed a lot of this: in effect, we thought this would contribute, frankly, a couple of million dollars for the year, and I think we've updated that our immediate pipeline is already $20 million for this. So let's go right to this VDR slide. Yeah, Data Refinery. I love this business line.
It's actually using the exact same aiWARE technology stack, and a lot of the clients I've mentioned are participating in this. So it's a whole new revenue line. Right now, the NCAA pays me for everything I've been doing in the past, all the ingestion, the AI indexing. Now I'm actually doing a revenue share back to the NCAA, because I'm helping broker, clean up, and now sell their data sets. Is everybody familiar with Scale AI? Scale AI became the 800-pound gorilla for manual data labeling, helping prepare legacy data sets for the biggest AI companies. Veritone is in that business now, and it will be a very large contributor. I'll wrap up quickly for questions. So think of it as: we have a stable business, and we're still operating with customers that have been with us for years.
Our retention rate is high; they just don't leave us, because it's very sticky. Once they've given me all their data, I can just continue to iterate. We've expanded into the federal government and state law enforcement. And then the exciting new line, for which we expect spectacular growth over the next few years, is Veritone Data Refinery, which is very organic and germane to our business. Any other questions? No, I think that's it. Questions? Yep, please.
How was the journey getting into the police force and possibly the military? Are there talks for you to partner with one of the big players, whether it's a defense contractor, the Palantirs of the world, to get into that market instead of trying to do it on your own?
It's a great question. Thankfully, we've gone direct, and we have so far.
Our big master service agreement contract with the Air Force was direct. Our contract with the DLA is direct. So we are a prime contractor on a few of those. That being said, historically, before we got to a certain scale and got the right clearances, we were going through Deloitte and others who had long-standing contracts, and it was a disaster going through some of those indirect channels. So from a technology-integration standpoint, yes, you will absolutely start to see us do partnerships with some of the, I'll call them, platform players in that space. I live in Southern California. I'm not going to drop names, but there's a gentleman who wears, let's say, colorful pink Hawaiian shirts; that's an up-and-comer in munitions and things. We don't need to, to be clear.
We do think that Veritone will achieve a lot of its great success and plans by being a direct contractor, not a subcontractor to some of these groups. But I'll give you a parallel. In state and local law enforcement, we have done technical partnerships with many of the suppliers of cameras: Getac, Motorola, and others. So again, we want to be the generic open system and be, I'll call it, device agnostic or munitions agnostic. So it does stand to reason that we will be doing partnerships with certain groups. Palantir at times can be a little competitive with us, to be clear. They're huge. They have legacy tech. Their AIP solution kind of sounds like something I built 10 years ago called aiWARE, and it's kind of newer to the ecosystem. But right now, we've been successful competing for and winning some of these contracts.
But again, we're not against it at all; I just think that, at least with the more pure software-type players like Palantir, we'll be going direct for a while.
I'm guessing that over your trailing 12 months, you've done something like $100 million in revenue, which, if you start getting into this market, could easily be a weekly revenue number.