Lawyers, haven't you had enough of this? Who wants to pull another all-nighter? You? You? Didn't think so. Who's gonna review a ton of evidence? You? You? No, Cecilia is. Behold, DISCO's new AI assistant. It's a legal machine. Hear it hum. This is change. This is freedom. This weekend, power weekend!
Yeah!
Beast.
That escalated.
Awesome. Good morning or good afternoon, depending on where everyone is calling from. My name is Kiana, and I'm so excited y'all could join us today for Using Gen AI for Doc Review: Skills and Best Practices. Before I kick it off to our fearless speakers, a little housekeeping. If you're having any trouble whatsoever during this webinar, whether with sound, being able to chat, Q&A, anything like that, please feel free to reach out to me, Kiana Millican. There will be a "(Host)" label next to my name in the chat, so feel free to message me with anything you may need. A couple of reminders: this webinar is being recorded, and we will send you the recording tomorrow afternoon, about twenty-four hours after the webinar.
So you can share it or re-watch it on your own. Feel free to ask questions anytime throughout the webinar in the Q&A box in the toolbar at the bottom of your screen. We'll try to address some as we go, but we will also set aside time at the very end to answer any questions we weren't able to get to. Lastly, if you wanna see more of Disco, feel free to scan the QR code. We'll have it on a couple of slides throughout this presentation as well. Scan it if you wanna chat with your CSM, if you're a customer of ours already.
And if not, if you just wanna have a chat about what Disco can offer, feel free to scan that as well. Now, I'm done with all my talking and all my housekeeping, and I'm gonna go ahead and kick it off to our fearless speakers today. Go ahead and take it away, James.
Sure. Thanks, Kiana. Hi, everyone. Thank you for joining. My name is James Park. I am the Director of AI Consulting here at Disco. I've been in the e-discovery industry now for sixteen years, my entire legal career since passing the bar exam, and all of it at the intersection of technology and law. I've supported clients through all types of matters: second requests, CIDs, antitrust matters, employment matters, all of it using analytics and TAR, and now I'm starting to advise our clients on how to best leverage their Gen AI software suite. So thank you for joining.
Hi, everyone, I'm Robert Harrington. I'm the Senior Director of Machine Learning and AI. I've been at Disco for a little over five and a half years, and I now head the AI team that researches ways we can use the new AI on the market for you, for attorneys. I work very closely with our engineering teams to implement that AI in the app. And of course, I have the pleasure of working with James on all the work that he does. My background is actually physics. I got my PhD in 2007 and did physics for about 17 years before switching over to AI.
I've been doing AI for pretty much all of the last seven years. So very happy to be here. Very excited to talk to you guys about Gen AI.
Cool. So before we dig in, I'd like to start with a quick poll to gauge your thoughts on the areas that you think Gen AI will be the most impactful. So, here are the choices, and please select three: document review, fact investigation, finding case law, analyzing case depositions, and finding evidence. And we'll give you all a few seconds to enter your votes before we share the results. I know what I would vote for.
I'm gonna vote.
Yeah, I tried, and it won't let me. Okay, so, wow! So, 92% of you think document review, followed by analyzing case depositions. Interesting. And then, kind of the other three, similar, about 30% or so. So very, very interesting. So, when we conducted this survey independently, if you go to the next slide, somewhat similar, so we saw document review as being the number one impactful area, and I think that is consistent with what you all said today. So 81% of our respondents in the poll identified document review as the most likely to be impacted by Gen AI.
This shows that the legal profession anticipates significant efficiency gains in a workflow that has traditionally been slow, tedious, and burdensome for the humans conducting document review. So what are the big challenges that impact document review, and where can Gen AI be really impactful? We see three primary challenge areas. One is volume and complexity, right? I think this is probably the number one challenge. We know that data volume continues to expand, and at an accelerating rate. I saw a 2022 data point claiming that roughly 4 million emails are sent every second. People generate an estimated 1.7 megabytes of data every second, and that was in 2020.
And the prediction is that by 2025, the amount of data made each day is expected to reach 463 exabytes. And I actually had to look up what an exabyte was. Exabyte is 1 million terabytes, so 463 million terabytes each day by 2025, right? So data sources continue to evolve. You know, back in the day, when it was just emails and maybe Word documents or Office documents, it was manageable, right? With the explosion of volume and different data sources like, you know, text messages, social media posts, cloud-based files, and with those cloud-based files comes a host of other challenges, like drafts and links to attachments on cloud storage sites, things like that.
Each of those sources presents its own unique considerations, which leads to the second challenge: cost and time. All of this exploding data means that someone is going to have to review it when a document review, litigation, or investigation happens, and as data volume increases, that costs time and money. Even with TAR, or technology-assisted review, you're talking about a lot of data, right? So even with the savings of time and money that TAR affords, cost and time continue to grow, and more review means more time to complete a project. The third challenge, I think, is accuracy and quality. All of this data means you have to use a lot of human reviewers with our current technologies.
As has been demonstrated through a number of studies, humans are inconsistent, not only across multiple people, but even for the same person, depending on external factors: how long they've been working, whether they got enough sleep, whether they're hungry or not feeling well. Those factors all contribute to the inconsistencies we see across different people and even within the same person. So how can AI help, right?
Okay, so I just wanna talk a little bit about AI. I wanna raise the hood and give you some insight into what's happening underneath. And don't worry, I'm not gonna go too deep. Basically, AI is either unsupervised or supervised. Unsupervised learning means the models make evaluations or predictions without the user having to go in and do a bunch of labeling or tag a bunch of documents. Some examples of unsupervised learning, which you may have seen: anything that does clustering, like the topic clustering we have at Disco.
Anomaly detection is something that is relatively straightforward to do without giving the models a bunch of examples of what to look for, as are some types of analytics. This is all unsupervised, meaning the models work more or less out of the box; you don't have to give them a lot of input to train on. On the other hand is supervised learning, where you have to give the model examples. The common examples of this that some of you may be familiar with are TAR review, TAR 1.0 and TAR 2.0. The way these models work is, behind the scenes, as you're tagging documents, the models are learning from your tagging to make predictions.
Whenever the model makes a prediction that disagrees with what you tagged, it says, "Oops, I screwed up there," and it learns. It keeps learning through iterations until it starts making predictions in line with what you're expecting. When we reach that point, we say, "Hey, this model is trained," and we can use it to reliably make predictions. Now, generative AI is a little bit strange because, at its core, it's also based on supervised learning: these huge models are trained on essentially the internet, and on any other information that the people who created them could get their hands on.
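To make that learn-from-disagreement loop concrete, here is a toy sketch in Python. It is purely illustrative, not how any real TAR engine is implemented: a keyword-weight model nudges its weights whenever its prediction disagrees with the human tag, which is exactly the correct-and-learn cycle described above. The example documents and tags are invented.

```python
# Toy "supervised learning" loop: a keyword-weight model that adjusts its
# weights whenever its prediction disagrees with the human-applied tag.
# Purely illustrative -- real TAR engines use far richer features and models.

def predict(weights, doc):
    score = sum(weights.get(word, 0.0) for word in doc.split())
    return 1 if score > 0 else 0  # 1 = responsive, 0 = not responsive

def train(examples, passes=10, step=0.5):
    weights = {}
    for _ in range(passes):
        for doc, tag in examples:
            if predict(weights, doc) != tag:   # "Oops, I screwed up there"
                delta = step if tag == 1 else -step
                for word in doc.split():       # ...so nudge every word's weight
                    weights[word] = weights.get(word, 0.0) + delta
    return weights

examples = [
    ("merger negotiation term sheet", 1),
    ("office party lunch menu", 0),
    ("acquisition due diligence memo", 1),
    ("fantasy football standings", 0),
]
w = train(examples)
print(predict(w, "draft merger memo"))  # → 1 (looks responsive)
```

After a couple of passes the model stops disagreeing with the reviewer on the training examples, which is the "this model is trained" point the speakers describe.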
Most of the LLMs available now are general-purpose large language models, and they can be used in lots of different ways. The beauty of these large language models is that you can use them in ways they weren't specifically trained for. When they train these models, the models are basically just learning to understand language. We call it artificial intelligence, and while the models aren't really intelligent, they're coming pretty close, because what they're trained on is this: given a certain amount of text, predict the text that would come next. Or you can block out some text, and the model predicts the text that was blocked out.
The better it gets at doing that, the more closely it approximates human intelligence. Next slide, please. So, a little bit more about large language models. It's called a large language model because it's a large model, and it learns language, okay? And it learns from basically human-generated data. Now, it's less common, but there are some LLMs that are able to train on synthetic data, and I would guess that, as time goes on, there will be more and more of these. But at its core, even with synthetic data, the model that generated that synthetic data was itself trained on human-generated data.
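The next-word training objective described above can be shrunk down to a toy you can actually run: a bigram model that, given a word, predicts the word that most often followed it in its "training data." Real LLMs do this with transformers over internet-scale corpora; the corpus here is a single made-up sentence.

```python
from collections import Counter, defaultdict

# Tiny "training corpus" -- invented for illustration.
corpus = "the court granted the motion and the court denied the appeal".split()

# Count which word follows which (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    # Predict the word seen most often after `word` during "training".
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # → court
```

The model predicts "court" after "the" simply because that pairing occurred most often, which is the same statistical idea, vastly scaled up, behind LLM text prediction.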
So, to the extent that humans understand or don't understand various concepts and put that information out on the internet, that information is what these LLMs learn. They're going to have basically the same level of understanding that the internet has. Of course, the internet is very broad; there are some very technical documents that these LLMs train on, so they will have some understanding of those areas. What LLMs are really good at is extracting and processing text, so to the extent that I'm able to, I try to focus our use of LLMs on exactly that. This can be seen as a strength or a weakness: they really require specificity. If you give an LLM ambiguous instructions, it will still do what you're asking, but the ambiguity leaves it a lot of room to interpret, and it's a little hard to know how it's going to interpret what you're asking it to do. We say that it has the contextual understanding of the internet, but it may not understand something very specific if you ask a very detailed question, say in the medical field or maybe in the legal field. It might understand it, but it may not have seen a ton of documents about that specific field.
Okay, so now, under the hood, we're using large language models for Auto Review, which James will tell you more about shortly. Auto Review uses the descriptions you pass in to tell the large language model what to look for for specific tags. For these descriptions, it's important to be clear and to keep the instructions as simple as possible, because then you can be more confident that the LLM will do what you want it to do. LLMs tend to behave fairly literally, and they tend to read instructions and respond to them sequentially.
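As a sketch of what "passing in descriptions" might look like, here is a hypothetical prompt builder. The tag names, descriptions, and prompt wording are all invented for illustration; this is not DISCO's actual implementation or API.

```python
# Hypothetical tag descriptions -- clear, literal, and content-driven.
tag_descriptions = {
    "Responsive": (
        "Apply if the document discusses the proposed merger, including "
        "negotiations, valuation, or due diligence."
    ),
    "Privileged": (
        "Apply if the document contains legal advice from counsel, or a "
        "request for such advice."
    ),
}

def build_prompt(document_text, tags):
    # Order matters: LLMs tend to read and answer instructions sequentially.
    lines = [
        "Review the document below. For each tag, answer yes or no based",
        "only on the text within the four corners of the document.",
        "",
    ]
    for name, desc in tags.items():
        lines.append(f"- {name}: {desc}")
    lines += ["", "Document:", document_text]
    return "\n".join(lines)

prompt = build_prompt("Attached is the revised valuation model...", tag_descriptions)
print(prompt)
```

Note how each description stands on its own and refers only to document content, the two properties the next slides recommend.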
So just a little information to help you understand how LLMs work: the more you understand them, the better sense you can make of how to get the best results. Next slide. Okay, so on to the next poll. How soon do you expect your organization to integrate Gen AI into its routine legal processes? A, you've already done it. B, less than six months. C, six months to two years. D, never. Or E, you're not sure. We'll give a few seconds for you to respond. Okay, so it looks pretty even. About 20% have already adopted. Another 20% say less than six months. A little over a quarter say six months to two years. 2% say never, and 29% are unsure. Okay, can we move on to the results of our poll that was done earlier in the year? Next slide, please. It's actually very close. According to that poll, 20% had already adopted, and 19% said less than six months. We separated out six months to a year and one to two years, but the combined total is about 30%. 1% said never, and 26% were unsure. That's very close to what you all just answered, so it's a nice check on our poll. Next slide.
Those of you that said never: challenge accepted. For me personally, even beyond just being in this space, using generative AI for my daily tasks has really helped me, so I'd really love to see organizations continuing, or starting, to adopt Gen AI tools in their toolbox. So, let's look at the two main flavors of document review: human-powered review versus Gen AI-powered review. We'll talk about TAR in a minute, but I put TAR in the human review bucket. What you see here on the screen are essentially the high-level processes, right?
A lot of people think Gen AI is a completely different thing, that it's brand new and untested, things like that. But if you break down the process, the two processes are actually fairly similar, right? They both start with a review protocol, which is really just a set of instructions about what tags to apply when. You review some samples based on that review protocol; in Gen AI-powered review, these are specific samples. In human review, you start reviewing documents, then you go through some QC, review those QC samples, and see where humans are applying tags correctly and where they may be erring.
And then you course correct, right? You do the same thing with generative AI: you review samples, have Gen AI review the same samples, compare the results, and course correct. Course correcting typically means providing clarifying instructions, and that's the same for both human and Gen AI-powered review. With humans, if there's a reviewer who is consistently getting something wrong, you might have a one-on-one and provide some additional instruction. If the problem is more systemic, you issue updated or clarified instructions to the entire group, right? For Gen AI, you update your tag descriptions and run them over the same data to see whether those descriptions now give you the results you expect based on the human review of the samples, right?
So up to this point, the processes are fairly similar. The big difference, of course, is who does the bulk of the review. In human review, humans do the bulk of the review; in Gen AI-powered review, the AI does. That difference leads to another one: in Gen AI-powered review, you also validate the results by calculating recall and precision metrics, right? Once you complete the review, you can defend those results by showing that the process was reasonable and proportional, and you have the recall and precision metrics to back up your claims. Now, you might be thinking, well, you validate TAR too, right? So that's not new. And you'd be right.
We've been using recall and precision metrics for over a decade now, so this should sound familiar to many of you. At this point, it's a generally accepted way to validate a search and review methodology for identifying documents to produce and meeting your discovery obligations. But how you leverage AI in TAR versus in Gen AI-powered review is quite different. If you go to the next slide: TAR leverages machine learning algorithms that learn the features of relevant documents from the examples you provide, right? That means, at the beginning, it doesn't know very much, but as you provide examples, it continues to learn. This is the supervised learning that Robert mentioned earlier.
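For reference, recall and precision are simple ratios over a validation sample: recall is the share of truly responsive documents the review actually found, and precision is the share of documents marked responsive that truly are. A minimal sketch with made-up sample labels:

```python
def recall_precision(human, ai):
    """human/ai are parallel lists of 1 (responsive) or 0 (not responsive),
    treating the human call as ground truth."""
    tp = sum(1 for h, a in zip(human, ai) if h == 1 and a == 1)  # true positives
    fn = sum(1 for h, a in zip(human, ai) if h == 1 and a == 0)  # missed documents
    fp = sum(1 for h, a in zip(human, ai) if h == 0 and a == 1)  # false alarms
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

human = [1, 1, 1, 0, 0, 1, 0, 1]  # reviewer tags on a validation sample
ai    = [1, 1, 0, 0, 1, 1, 0, 1]  # AI tags on the same documents
r, p = recall_precision(human, ai)
print(f"recall={r:.2f} precision={p:.2f}")  # → recall=0.80 precision=0.80
```

The same two formulas apply whether the "AI" column comes from TAR or from a Gen AI review.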
In a sense, this is like Netflix, and you may have heard this analogy before, but I think it's very apt; really, it applies to any platform that suggests what you might like based on things you've previously liked. Netflix has the thumbs-up button, and something I noticed recently is that now they have a double thumbs up, right? To give an even stronger signal that you really like something. The more thumbs up, double thumbs up, or thumbs down you provide, the better Netflix becomes at predicting movies and TV shows that you might like. And that's how TAR works: you provide examples of responsive and not responsive documents.
The algorithm gets better at predicting which other documents it thinks will be responsive or not responsive. So TAR works from examples that you provide. With Gen AI, on the other hand, you don't need to provide any examples. You use natural language to write instructions to the AI about what the contours of a responsive document look like. And this works for any substantive tag, so it doesn't just have to be responsive or not responsive; it can also be issue tags, particularly hot-button issues, et cetera. As long as you can describe the contours of what you are looking for, the LLM can find those documents for you. So in a sense, where TAR is like Netflix, Gen AI is like having a...
If you're looking for a very particular type of article, and you have thousands or even millions of articles to sift through, it's like having a very knowledgeable editor look through all of them and pull out only the ones you care about. Another, maybe more personal, analogy that might resonate with some of you: growing up, my mom would cook Korean food for me, right? And if you've had Korean food, you know that there are all these little side dishes called banchan that accompany every meal, which are phenomenal. But I don't like all the banchan, right?
When I was very young, my mom would cook a variety of banchan, and over time she would learn what I liked and disliked by seeing what I ate and what I didn't, right? And then, because she's amazing (hi, Mom), she would make more of the things that I liked and less of the things I didn't. This is like TAR, learning from examples, right? Now, switching gears: as I got older, rather than her learning what I liked and disliked from what I ate, I could just tell her, "These are the flavor profiles I like, these are the kinds of things I like, and these are the things I don't like." And, hint, I don't like anchovy, right? Then she could just make more of what I asked for and less of the things I told her I did not like. This is like Gen AI, learning from instructions. So what are the key considerations when creating these instructions? Well-
Oh, yeah. For the next couple of slides, I just want to give you some suggestions on how to tell the AI what you want it to do when you're using Gen AI. Basically, the Gen AI will predict tags based on the text found within the four corners of the documents. So if you give it instructions that rely on, say, metadata, like the date the file was written, and that's not actually in the file, it's probably not going to do that very well. The other thing is, it's generally better if the definitions of the tags are independent of each other, so that they stand on their own.
Don't assume the LLM will understand something in one tag because of information you gave it in a separate tag. The tags should always refer to the content of the document. You also don't want to feed Auto Review tags that are purely administrative, like tagging a document "needs further review" or "ready for production." Those are not the kinds of tags the Gen AI is going to understand very well. Stick to tags that are actually driven by the content of the document.
Now, you know, we talk a lot about how to use AI in the best way, but really, I want you guys to all, you know, take a deep breath and realize you guys are already good at this. Because it's just a matter of using language, and y'all use language, you know, for a living. In fact, in the early days, and even still, whenever my team was doing development of these new gen AI models, if we were struggling to get the AI to do what we wanted it to do, we would go to one of our attorneys on our product team, and we would say, "Hey, look, the AI is not listening to us.
How can I reword this so that it will work better?" And I found over and over again that the attorneys were so much better than the scientists at just using language. So you guys have got this. It's just a matter of playing with it and figuring out what works well, and we give you everything you need in the app to do exactly that. You basically try something, and if it doesn't work the right way, you tweak it until you get something you're happy with. The language you're already using in your RFPs and review protocols, you can pretty much take and map to the relevant descriptions for each tag, and use those directly as your initial attempts, the same way you would working with human reviewers, as James mentioned a moment ago. So, next slide, for a few more hints. When you're writing your descriptions, be explicit. Consider the order of the information you're giving the model, because, as we said, the LLM does rely on order. Stay within the four corners of the document. Keep your descriptions short and simple, and if you find that the description for a tag is becoming too long, break it up into two or more tags; you might have more success that way.
But really: test it, tweak it, and try it again. And trust your skills. You've been doing this already; instead of working with a human reviewer, you're working with an LLM, which in some ways is just different, and you'll get used to it. So, next slide, James.
Yeah. So, I won't go into a lot of detail about Disco's Gen AI-powered review, but if you're interested, there is a QR code on the screen, and we'd be happy to provide you with more information. Briefly: Cecilia, if you haven't seen it, is our Gen AI suite of products. We currently have five Cecilia skills, and Auto Review is our newest release, the skill that uses the large language models Robert so helpfully described to tag documents. What you see here on the screen is the app where you enter tag descriptions and get an output, right?
Cecilia Auto Review allows you to essentially perform that first-pass review using just a tag protocol, like the one you would use to run the review with contract attorneys. You write tag descriptions like you see on the left-hand side of the screen, and your entire population can be categorized within hours, depending on volume, obviously. Not only that, but after you run Auto Review... go to the next slide.
Thank you. Each tagging suggestion comes with a specific explanation as to why Cecilia did or did not apply that tag, right? You can see that explanation in the overall document list, which is on the right-hand side of the screen, or when you're looking at an individual document, which is on the left side. And you can also export these tag suggestions and their explanations, right?
So imagine, on a human review, trying to get this level of detail from each reviewer for every document, right? One of the things I've heard a lot about TAR in the past is that it's a black box. We can guess why certain documents are scored a certain way, but we don't really know. Here, the LLM brings more transparency into its decision-making process. Now you can actually understand why the AI is applying or not applying a tag, which is super important for understanding the reasoning it's using to make those decisions.
More importantly, if Cecilia is getting a decision wrong, how do we course-correct, right? How do we do the iterating and tweaking that Robert discussed effectively, so that on the next run Cecilia gets it right? These explanations are the key that lets you do that. But this isn't magic. In order to get good, defensible results, you also need a solid process. So what is that process? Go to the next slide. This is our process. It's by no means the only one, but we think it's a good framework to ensure that you have good instructions and defensible results, right? We first start with what we call an alignment sample.
It's a small sample: you have humans review a set of documents, have the LLM review the same set, and look for where the humans and Cecilia disagree. Find those conflicts and see who's right. If Cecilia is right, you update the human tagging, right? If the human is right, you need to understand why, so you use those tag explanations and adjust the instructions. You continue iterating on that sample, tweaking and checking the results, until the instructions work well on it, and then you run a second sample. The reason we run a second sample is something you may have heard of called overfitting. Typically it's discussed in terms of a model being overfitted to a specific sample.
That means it works really well on a small sample but doesn't work well on the larger population. It's the same idea here. We don't want our instructions so narrowly tailored to that first sample that they work really well there but no longer work when you run them over another set of documents or the full population, right? So we mitigate that risk by pulling a second sample and seeing how the instructions perform against it. If they work well on that second sample without any tweaks, chances are good that the instructions are general enough to work on the rest of the population. One thing you'll want to do, even on the second sample, is again look for human-Cecilia conflicts, right?
Before you start iterating on tweaks, look at those conflicts and see who's right. What we found, at least in our testing, is that when you have these human-Cecilia conflicts, more often than not the human will actually end up agreeing with Cecilia's judgment and Cecilia's explanations, right? And it's really those explanations that are key. Oftentimes you look at an explanation, see why Cecilia tagged the document, find that passage, and say, "You know what, that's something I missed," or, "I can see that point." In our testing, about 80% of the time, humans ended up agreeing with Cecilia's results. So once you go through those conflicts, see what the metrics are.
If the metrics are good without any tweaking of the tag descriptions, then the tag descriptions are probably good. If, on the other hand, you go through the conflicts on the second sample and your metrics are not at an acceptable level, you rinse and repeat: you iterate on your tag descriptions against that second sample, and once they're working well there, you pull a new, fresh sample. You continue that process until the tag descriptions work well on a fresh sample. Once that happens, you can run it against the full corpus and validate your results, right? I mentioned the validation methodology before; we use two sampling methodologies to ensure that precision and recall are within acceptable limits.
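The adjudication step in this process, comparing human and Cecilia tags on the same sample, reduces to finding the disagreements. A hypothetical sketch, with document IDs and tags invented for illustration:

```python
# One row per document in the alignment sample (all values made up).
sample = [
    {"doc": "DOC-001", "human": "Responsive",     "ai": "Responsive"},
    {"doc": "DOC-002", "human": "Not Responsive", "ai": "Responsive"},
    {"doc": "DOC-003", "human": "Responsive",     "ai": "Not Responsive"},
    {"doc": "DOC-004", "human": "Not Responsive", "ai": "Not Responsive"},
]

# Conflicts are the documents a reviewer needs to adjudicate.
conflicts = [row["doc"] for row in sample if row["human"] != row["ai"]]
agreement = 1 - len(conflicts) / len(sample)
print(conflicts, f"{agreement:.0%} agreement")  # → ['DOC-002', 'DOC-003'] 50% agreement
```

Only the conflict list goes to a human adjudicator; the agreement rate gives a quick read on whether the tag descriptions need another iteration before you pull the next sample.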
And we can go into more detail about that methodology if there are questions about it. So what does Gen AI-powered review with a rigorous process get you? Like I mentioned, Cecilia Auto Review lets lawyers do first-pass review via generative AI. Cecilia never sleeps, so it's as fast as 120 contract reviewers working an 8-hour day. We often see precision and recall at or above published levels for human reviewers. It's consistent, it's fast, and it provides explanations for every single tag.
During our pilot with an Am Law 50 firm, a senior e-discovery attorney told us that "the recall and precision rates are very satisfactory and defensible." We'll have another quote in a little bit, but this is pretty high praise from an e-discovery expert. On speed, we generally see throughput of about 3,800 documents per hour, which is pretty powerful, right? So, if you go to the next slide: on one particular leanly staffed case, an Am Law firm looked to a Gen AI tool for their first-pass review. It was a small population, but on a set that would have taken six hours to review, Auto Review got through the documents in ten minutes, right? It takes a day of review down to something like a coffee break.
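As a back-of-the-envelope check on how the 3,800-documents-per-hour figure lines up with the "120 contract reviewers" comparison: the per-reviewer pace below is my assumption for illustration, not a figure from the webinar.

```python
ai_docs_per_hour = 3800                              # cited Auto Review throughput
ai_docs_per_day = ai_docs_per_hour * 24              # "Cecilia never sleeps"

reviewer_docs_per_hour = 95                          # assumed human first-pass pace
reviewer_docs_per_day = reviewer_docs_per_hour * 8   # an 8-hour reviewer day

print(ai_docs_per_day / reviewer_docs_per_day)       # → 120.0 reviewer-equivalents
```

At an assumed 95 documents per reviewer-hour, a round-the-clock 3,800-per-hour engine matches 120 reviewers working 8-hour days; a faster or slower human pace shifts that equivalence proportionally.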
And the results were pretty incredible. For this particular matter, we had 81% recall and 93% precision, again higher than published rates for human review. This type of result is why we, and, as you saw in our earlier poll, many in the industry, are super excited about the capabilities Gen AI brings to document review. If you go to the next slide, here's the other quote, from David Charney at Orrick. I won't read the whole thing, but note the key phrases: "game-changing," "fundamentally improve the practice of law," "the value we deliver for our clients." Right? This is the excitement. This is why we're all here with you today.
Okay, so in conclusion, what do you need to succeed in the new landscape of AI? First and foremost, trust yourself. You already have what you need to do this. You already understand language, and you already know how to describe the tags, because you've been doing it with your human reviewers. The process of iterating with an LLM to get the tag descriptions right is very similar to what you've been doing with humans. Just keep in mind two key differences. One, the LLMs are much faster, so you need to do more of the work upfront, before all the documents have been reviewed, to make sure the descriptions are right.
With human reviewers, if a day or two goes by before you realize the descriptions are not quite what you want them to be, well, in one or two days your human reviewers probably will not have gotten through the entire corpus, whereas the LLM may well be done by then. And two, the LLMs will be more literal than human reviewers, but at the same time they'll tend to be more consistent. So be sure that you have tag descriptions you trust before you push the big green button to review your entire data set. That's the biggest advice I can give you. So I think... yeah, James?
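That iterate-then-commit advice can be sketched as a loop: grade the LLM's tagging calls on a small sample against your own calls, and only push the big green button once agreement is where you want it. The three functions passed in below are placeholders for your platform's review call and your own QC process, not a real DISCO API:

```python
import random

def refine_description(description, documents, review_fn, grade_fn, edit_fn,
                       sample_size=50, target_agreement=0.9, max_rounds=10):
    """Tune a tag description on small samples before reviewing the full set.

    review_fn(description, sample) -> the tool's tagging calls on the sample
    grade_fn(results)              -> agreement rate vs. your human QC calls
    edit_fn(description, results)  -> the revised description you write next
    """
    for _ in range(max_rounds):
        sample = random.sample(documents, min(sample_size, len(documents)))
        results = review_fn(description, sample)
        if grade_fn(results) >= target_agreement:
            return description  # trusted enough to run the whole corpus
        description = edit_fn(description, results)
    raise RuntimeError("Description never reached target agreement; keep iterating")
```

The point of the loop is the one the speakers make: because the LLM is so fast, this validation has to happen before the full run, not during it.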
Yeah. And I know we're going to have a Q&A session, but I do see a couple of questions in the chat. One is: Does client data go back into the LLM training set? John Shim, thank you for the question. The answer is no, right? We know there's been a lot of talk about violations of confidentiality, of people accidentally feeding confidential information into tools like ChatGPT that then went into training the LLM. None of that happens here: we're SOC 2 compliant and ISO 27001 certified.
We do not have the AI train on any of our data, so your data stays completely secure within our AWS environment, and none of the client data or prompts goes back into training the LLM. Michael Sylvain asks: How does the LLM deal with foreign-language documents? Robert, I don't know if you want to...
Yeah, I answered in the chat, but generally, the LLMs are able to understand the languages that commonly occur on the Internet. We haven't done any detailed study specifically for AutoReview, but we've found that for, say, Cecilia summaries, you can give it a document in a foreign language and it's actually able to do the summary in English, and we've tried this on a handful of languages. One of my researchers was in Ukraine a couple of years ago, and he found that the models worked really well with Ukrainian documents. So it's anecdotal, but so far we haven't found a language that the models don't understand.
Usually, when you look at the number of languages these models are able to understand, it's over a hundred. The vendors aren't really willing to tell us exactly which languages, though, because it's just whatever languages they saw on the Internet.
Yeah. Thanks, Robert. I'm just gonna go through this slide real quick, and then we'll answer some of the existing questions. So, tips for GenAI purchasing. If you're deciding whether to purchase GenAI products, here are some things you might want to consider. There are a lot more, but at a high level, these are some basic tips. First, have an explicit set of requirements. What problem do you want it to solve? What are the musts versus the nice-to-haves? And what are the success criteria? Having that explicit set of requirements will help you decide whether a tool meets them. Second, know your baseline cost and quality. Are there trade-offs between costs and benefits? It doesn't always have to be cheaper.
There might be other benefits, like speed, performance, consistency, or downstream and upstream cost savings. Those are all things you can consider, but you need to know your baseline first before you can judge whether a GenAI tool, whatever the tool is, is generating sufficient ROI for you. Third, set expectations for transparency. How will you know whether it's performing as expected? How much input will you have? You want to set expectations for how much transparency you want from your tool. And finally, it's so important to have a real-life pilot or proof of concept. Everything sounds great, right? All the benefits sound amazing until you actually try it.
So seeing how the tool works for you in the wild is going to be the best barometer for whether a GenAI tool will work for you. There are other things you'll want to consider, like how to drive buy-in, and that could be a separate webinar on its own. But at a high level, we think these are some basic tips to consider when you're thinking about purchasing GenAI. And obviously, if you want to speak with us about anything we shared with you today, feel free to scan that QR code, and we'll be happy to talk with you. So I know there are a couple of other questions, so we can address them now.
José Manuel González asks: "Does this mean that we should be able to create descriptions based on RFPs and have GenAI automatically pull documents responsive to each request?" It's a great question. The short answer is yes, but the long answer is a little more nuanced than that. Think about an RFP: for one, when you enter your prompts, we at least have character limits, and RFPs can be very repetitive. So what we actually suggest is, first, outlining all of the RFPs into specific topics. This is similar to how you would identify the tags you want people to apply.
And then, for a given tag, you might say RFPs one, five, eight, twenty-two, and twenty-five are relevant, and write a tag description based on a combination of those RFPs. That streamlines the information you're sending to the LLM, and it also minimizes the number of characters, or tokens, that you're sending to it.
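As a concrete sketch of that outlining step: group the overlapping RFPs under topic-level tags, then draft one combined description per tag and check it against the character limit. The tag names, RFP numbers, and the 2,000-character limit below are all hypothetical, chosen only to show the structure:

```python
# Hypothetical grouping of repetitive RFPs into topic-level tags.
tag_to_rfps = {
    "Pricing communications": [1, 5, 8, 22, 25],
    "Product safety complaints": [3, 9, 14],
}

MAX_DESCRIPTION_CHARS = 2000  # hypothetical; check your platform's actual limit

def combined_description(tag: str, summary: str) -> str:
    """One streamlined description per tag, citing its RFPs, instead of
    pasting every overlapping RFP verbatim into the prompt."""
    rfps = ", ".join(str(n) for n in tag_to_rfps[tag])
    description = f"{summary} (responsive to RFPs {rfps})"
    if len(description) > MAX_DESCRIPTION_CHARS:
        raise ValueError(f"Description for {tag!r} exceeds the character limit")
    return description

print(combined_description(
    "Pricing communications",
    "Documents discussing pricing, discounts, or rebates for the products at issue",
))
```

The mapping doubles as documentation of which requests each tag covers, which helps when you later have to defend the review protocol.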
Kevin Daly asked a question I want to make sure we respond to, because we get it a lot: "Does Disco use client documents to train the LLM, and would it ever become part of the aforementioned public databases for LLMs?" The answer is absolutely not. When I mention that these LLMs are trained, I mean they are trained on huge datasets before we use them. But once they're given to us and we start using them, there's no additional training happening. The model is not learning from the iterations you're doing; the only thing that's happening is that you're writing better and better tag descriptions. There's absolutely no learning. So I think that's the short answer.
Now, we talked about TAR 1.0 and TAR 2.0. That technology is learning from your documents, but it's learning to train a model that's only used to score your documents. So there are no use cases where we're using your data to train models that are going to be used on any other customer's data. Well, okay, putting aside Cross-Matter AI, which allows you to use TAR-type models across an organization. So that's one little caveat.
Yeah.
Did anybody-
I know we're out of time, and there are a couple of questions we did not get to. We apologize, but we'll be sure to follow up with you to answer them, and obviously, if you have any additional questions, feel free to reach out. Jules shared a link in the chat, and you can scan the QR code, so please get in touch with us, and we'll be happy to speak with you more.