All right, well, welcome. Thank you for joining us for a live demonstration of Zebra's deep learning-based OCR tool found in our Aurora Vision software. Before we begin, allow me to take a moment to introduce our main presenter for today, Mr. Armando Lopez. Armando holds more than 15 years of experience in machine vision, acting as one of our Primary Sales Applications Engineers here at Zebra. He holds an electronics engineering degree from Monterrey Institute of Technology in Mexico and also carries the Certified Vision Professional certification from A3 Vision and Imaging. Chances are, if you have a question about machine vision, Armando is going to know the answer. Welcome, Armando. Thank you for joining us. To introduce myself, my name is Timothy Labrie-Cameron. I serve as Inside Sales for North America here at Zebra.
I've worked in machine vision for almost five years now, having joined Zebra from the team at Matrox Imaging. I act as the first point of contact for incoming machine vision inquiries. We may have spoken before. I help with marketing initiatives, such as this webinar. Let's jump right in. We're, like I said, looking at Deep Learning OCR today. Conventionally, optical character recognition can be an extremely challenging application. There's a lot of variables at play, as you probably know. Well, we here at Zebra have essentially solved that challenge with our Deep Learning OCR. It's a pre-trained CNN that allows for easy setup and deployment of almost any optical character recognition project, essentially saving you the time and cost that would typically be associated with a project of this potential complexity.
Today we're going to show you the power, the versatility, and the ease of use of our Aurora software while using the Deep Learning OCR module on, I think, some pretty challenging images. After the demo today, we are going to be conducting a live Q&A with the audience, so please feel free to use that Q&A board you see to the right of your screen, should you have any questions during the presentation, and we're going to do our best to answer as many of those questions as possible, given our short time together this afternoon. With that, I'm going to hand the mic over to Armando. Armando, teach us why Deep Learning OCR is such a powerful machine vision tool.
It is. It is. Well, first of all, Tim, I'm going to invite you to all of my meetings with that fantastic voice. I don't have that voice. Sorry about that, right?
Just a fancy microphone. It has nothing to do with my actual voice.
I need that microphone.
Yeah.
Yeah, we have a lot to cover today. We're going to just jump into... I'm going to turn off my camera. I just wanted to say hi, right? I'm a real person, right? I'm not a ChatGPT robot, right? I'm going to talk and show you why this is cool and why this is nice and why this is going to help you, right? Nobody buys smart cameras or visual systems because they are cool, right? We buy them because they're going to help us, right, somehow. Before I actually stop showing my face, or before I turn off my webcam, I'm just going to let you know what kind of equipment I'm using. I'm using this smart camera right now.
This algorithm, this super cool OCR algorithm that we have, we can use it in any of our smart cameras and also in our PC-based platform. I just wanted to show you, right, what we're using today. Okay, well, I'm going to turn off my video for a second. You don't need to see me. And let's just jump into it. Just like Tim said, I've been doing machine vision for many years, almost 15 years, and I've seen a lot of different applications, right? One of the least favorite applications of engineers is usually OCR, right? Why OCR? Because it's kind of picky. It's very strict, right? You're going to see that through the presentation, but you need to have very controlled scenarios.
Deep Learning OCR opens the door for more things. Okay, let me just change screens in here. There you go. This is a quick overview of what we're going to cover, right? A quick history lesson about reading, right? Is reading easy for humans, right? What is OCR? A super cool demo. Not just a demo, a super cool demo. What is deep learning, and what are the differences, right, between Deep Learning OCR and conventional or teachable OCR? Are they friends or are they enemies? What do you think, Tim? Are they friends or enemies?
Um...
We'll see.
Yeah, we'll see. We'll see in the upcoming bout. Yeah.
All right, let's start. Right? It's like, oh, my God, let's just start. I can't wait anymore.
That's my actual picture, by the way.
Is reading easy for humans? No. Right? I don't know if you remember how it was learning how to read, but it took you years, most likely, right? Or maybe months, if you're a savant or genius, but most likely it took you years to learn how to read. It's very difficult, right? It requires a lot of skills, a lot of focus, a lot of trying, a lot of going over and over and over, right? Decoding, language structure, vocabulary, background, word recognition. You need to see what you're trying to read. You need to understand that comprehension. To be able to read, right, us humans, to be able to read, it takes a lot of time. It takes a lot of effort, right? Actually, it's very difficult, right? Going again, just a quick lesson, right?
I mean, talking or speech, that's just natural for humans, right? A baby would talk just by being exposed to adults, right? The human brain has mechanisms or built-in systems to handle that, right? As opposed to talking, reading is a cultural invention, right? Reading is obviously related to writing. We invented writing maybe 4000 or 5000 years ago, right? Reading is a complex, focalized skill that needs to be taught, a skill that takes time to master. I have a two-year-old daughter, and I'm trying to teach her how to read already, and it's not an easy task, right? How is this related to machine vision? It's like, well, you know, I don't wanna talk about humans, right? Let's talk about cameras, right? How is this related to machine vision or image processing?
Machine vision, or vision system algorithms, is trying to mimic humans; it's an abstraction of humans, right? If you think about it, you know, eyes and brains are cameras and processing, right? A camera is trying to do what a human is trying to do, and specifically OCR, right? A camera is trying to read. What is OCR, right? For 100 Zebra points, what is OCR? Tim already spoiled it, right? It's optical character recognition, right, no points for anybody. Optical character recognition. What is that? It's a processing technology that extracts text content from an image, right? If we put it in different words, it's a camera being able to read from a picture, right? We have that cute little robot. My daughter liked that robot. That's why I chose it, right?
Is OCR new? No. OCR has been around for many years, right? Around 50 years, right? Someone invented OCR roughly 50 years ago. Who invented OCR? We don't have time for that, right? You can google that on your own time. Does OCR work well, conventional OCR? Yes, it works well if what you're trying to read is consistent or if it's the same as what you were expecting. Meaning, what are you talking about, Armando? Conventional OCR is gonna work well, or traditional teachable OCR is gonna work well, if you're trying to read maybe a basic standard format, like OCR-A, for example, and if the image always looks the same, then conventional OCR is gonna work just fine.
If you take the time to train in the vision system software, the characters that you're trying to read. If you say, "Hey, I have a lot of time, yes, I'm just gonna start saving all these images, and I'm gonna start teaching all these different letters with different lighting and different positions and different angles and different backgrounds and whatever," right? Conventional OCR is gonna work fine if what you're trying to read is very consistent: same font, same size, same color, same background. If the contrast is the same or if the contrast is something you're expecting, the contrast, same light, same reflectivity. If the new image that is coming has the same focal plane, sharpness, that is not blurry, that is not, you know, like, it has to be exactly the same.
Conventional OCR is gonna work well if everything is exactly the same as you were expecting.
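One way to picture why a taught, conventional reader is so sensitive to exposure is the small Python sketch below. It is an illustration under simplifying assumptions, not how Aurora Vision or any Zebra tool is implemented: the 5x3 glyph, the fixed threshold of 128, the gray levels, and the names render, binarize, and match_score are all made up for this example.

```python
# Hypothetical illustration only; this is not the Aurora Vision API.
# A glyph is "taught" as a binary template, then each incoming image is
# binarized with a fixed threshold and compared pixel by pixel.

TEMPLATE_Z = [
    "###",
    "..#",
    ".#.",
    "#..",
    "###",
]

def render(template, ink, background):
    """Build a grayscale image (0-255) of the glyph with the given gray levels."""
    return [[ink if ch == "#" else background for ch in row] for row in template]

def binarize(image, threshold):
    """Fixed-threshold binarization: a pixel is ink when it is darker than threshold."""
    return [[pixel < threshold for pixel in row] for row in image]

def match_score(image, template, threshold=128):
    """Fraction of pixels whose binarized value agrees with the taught template."""
    binary = binarize(image, threshold)
    total = 0
    agree = 0
    for bin_row, tpl_row in zip(binary, template):
        for is_ink, ch in zip(bin_row, tpl_row):
            total += 1
            agree += int(is_ink == (ch == "#"))
    return agree / total

# Same glyph, two lighting conditions (gray levels are assumed values).
expected = render(TEMPLATE_Z, ink=40, background=220)      # like the taught image
overexposed = render(TEMPLATE_Z, ink=190, background=250)  # washed out, low contrast

good_score = match_score(expected, TEMPLATE_Z)
bad_score = match_score(overexposed, TEMPLATE_Z)
print(good_score, bad_score)  # prints: 1.0 0.4
```

Under the taught conditions every pixel agrees with the template. Once the image is washed out, the fixed threshold no longer finds any ink, so only the background pixels agree and the score collapses, even though a person would still read the same letter.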
But we don't-
Uh-
We don't live in a perfect world, do we?
Not anymore, no. Now, that's a lot of ifs, right? That's a lot of ifs. Like, I don't live in that world. Let me give you a visual, right? It's like, it's like, what do you mean by it has to be the same settings or the same conditions? Well, let's try this. Let's say you're with your friend, OCR, you know, having a drink. "Hey, OCR, read this," right? Your friend, the OCR algorithm, is gonna say, "No problem. I got you. That says Zebra. I'm already trained to understand those letters and the way they look. I know those letters. I know that contrast. I know that size, that font. That says Zebra." Perfect! Conventional OCR is amazing. What happens if something changes? "Hey, conventional OCR, now read this." "Uh-oh, I don't like it.
I don't like it. Oh, I'm not sure. It kind of looks similar to what I know, but, no, it's just, it's too different. I don't know what that is." Conventional OCR is gonna struggle. What's the difference between the first image and the second image, for 100 Zebra points? The difference is... Oh, who knows? Tim, do you know?
Yeah. Well, exposure, contrast.
Of course.
Yeah.
The difference is that the second image is overexposed, right? Less contrast. Our friend, OCR, is struggling with that one. He doesn't like it, right? Just in summary, if the material changes, the reflectivity, the contrast, the background, if something changes, if something is out of control, if something is unexpected, conventional OCR is not going to like it, right? Now, if you think about it, I just put some examples in here. These are very different scenarios, right? We have the same part but with different contrast. We have maybe this, which is, like, dot printing, right? Maybe, now, you know, a consumable product, expiration date. This is ink printing. We have maybe images that have very little contrast, or we have parts that are very noisy. This is a terrible image.
We have things with inconsistent backgrounds, right? There's a diagonal across this zero and this nine. If you are a machine vision expert, you're gonna say, "Hey, I can make it work." Of course, I can make that work. I just have to save all these images and create a very rich, very powerful library for fonts. I just have to, you know, teach all this every single time. Yeah, okay, yeah, you can do that. Good luck with that, right? I mean, how much time is it gonna take you to do all of this? You need to have skills. Maybe you have the skills, perfect, but still, it's gonna take you time. It's gonna take you time to teach your library to take all of this. What if the next credit card? This is my credit card.
Don't write down this number. What if the next credit card has a different background? It might, it might not work anymore. What do we do? It's like, Oh, my God, what do I do? Well, what if I told you, this is from The Matrix. What if I told you that Zebra has a deep learning-based OCR algorithm that can read all of that just out of the box without doing pretty much anything, right? Like, Armando, you're crazy. Yes.
Ooh, ah.
Yes. Sorry, is someone clapping? Yes. We are going to do it, right. How? Show me Deep Learning OCR. That's a movie also. Okay, let's do it. Let's do a super cool demo, right? Please let me know if you can see my screen. Can you see my screen?
We see it.
Now, these images might look familiar, right? It's what I had in the PowerPoint. Is this a live image? Yes, this is a live image. That's number two, that's number three, right? That's my only hint. Now, this is Aurora Focus. You can also use this algorithm in any of our platforms, like Design Assistant, but today we're using Aurora Focus. Now, let's say my application is I need to read all these different types of words, and I just have two minutes because I want to go for lunch, right? My friends are waiting in the parking lot. I have two minutes to make all of this work. Well, here in the identification tool bucket, we have deep learning-based OCR. I'm just gonna drag and drop, right?
I'm gonna move this box somewhere on the screen, right? For example, if I move it here, you might see that I'm already reading. I haven't touched any of the settings. I'm gonna say that again. I haven't touched any of the settings. Can I be specific and just read some type of letters or some type of color of the letter or some type of size? Yes, of course, you could be specific, but, hey, I want to read just out of the box. I don't want to do anything. Okay, read debit. We're reading the word debit here. I'm gonna zoom in because it might look small. Is this a beautiful image? No, it's kind of ugly, right? I intentionally made this very challenging, right? We're reading this.
Now, what if I want to read now a zip code, for example, this zip code over here? Yes, we are also reading the zip code over here, right? Now, what if I want to read this ink printing in here? Now, if you have tried to read this before, you, with whatever vision system you're using, you might know that this is usually complicated, right? Again, is this a clean, super cool image? No. Could this image be better? Yes, I intentionally did it kind of ugly. Now, we read debit, we read this zip code number, we read this Videojet ink printing or whatever with the same settings. Now, let's try to read the word Zebra down here, right? Can we read that? Well, let's try, right? If I put it here, Zebra, no problem.
Can we read this one in here? What does it say? I can barely see it. Well, let's see. I'm gonna put it here. I'm just gonna make this a little larger, and I'm just gonna move it, and I'm gonna put it here. We are reading. I'm gonna zoom in. I don't believe you, Armando. Well, let me zoom in. Don't believe me. Don't believe anybody. We can read it, right? Can we read this one in here that says lens? This one is extremely ugly, right? When I was trying to do this and showing this to people, like, "I'm gonna show this demo," they say, "No, don't do it, man. No, this is crazy." It's fine, right? Can you see now?
Again, if you have tried to do machine vision OCR in the past, you do know that this is extremely ugly, extremely complicated. I'm gonna do one more. I'm gonna move this one here, right? If you notice, we are reading right away. This one, for me, is particularly interesting because we have different scenarios here. We have here on the left a bright background, we have here a dark background, and we have a diagonal going across the zero and the nine, and it reads right away, right? Again, the beauty of Deep Learning OCR is that it's gonna read pretty much anything that is readable on it without you having to touch any of the settings in here, right? Remember, maybe I said this already twice. I'm gonna say it one last time.
We've read all of these with the default settings, pretty much without touching anything, right? Can you be specific? Yes, of course. You can just say, "I just want to read this type of thing or that type of thing, or only numbers or only letters," but I just wanted to show you that we can do something like this with default settings. I'm gonna show you one more thing, then we're gonna go to the PowerPoint. One of my managers told me back in the day, "Armando, if the demo went well, just stop demoing, please." Let's just do one more, right? You want to see one more? Now, what I'm gonna do here is, I just took the poker chip that says Zebra, and I'm handholding this, and, you know, handholding is kind of a never-do for OCR.
Let me just show you this very quickly, right? Try to do this. I will give you the homework of trying to do this with whatever system you're using right now. I would bet, well, maybe betting is not good. I would challenge you to get this type of results with default settings again. You would know that introducing this type of distortion or variability is never good, right? Conventional OCR is gonna say, "That's not cool, man. Don't change anything. Don't change the lighting. Don't change. Don't introduce optical distortion." This is just gonna work well. What do you think, Tim?
I am shocked and amazed.
Yes, me too.
I mean, really, if you've seen and know conventional OCR, yeah, exactly. That should be your reaction when you see that because it's taking all of those variables out of the equation that would normally stump conventional OCR, so amazing.
I like that.
I think it was a worthy gamble as the last hurrah there. Good job.
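During the demo Armando mentions that the read can be narrowed to only numbers or only letters. The webinar does not show how that setting is configured in the tool, so the Python sketch below does not reproduce it; it applies the same idea as a post-processing step on a string that an OCR read has already returned. The function name restrict_read and the mode names are hypothetical.

```python
import string

# Hypothetical helper, not part of any Zebra API: it filters a string that
# an OCR read has already returned, keeping only the expected characters.
ALLOWED = {
    "any": None,                           # keep the read exactly as returned
    "digits": set(string.digits),          # 0-9 only
    "letters": set(string.ascii_letters),  # a-z and A-Z only
}

def restrict_read(text, mode="any"):
    """Return text with every character outside the chosen class removed."""
    if mode not in ALLOWED:
        raise ValueError(f"unknown mode: {mode!r}")
    allowed = ALLOWED[mode]
    if allowed is None:
        return text
    return "".join(ch for ch in text if ch in allowed)

print(restrict_read("ZIP 90210", "digits"))   # prints: 90210
print(restrict_read("ZIP 90210", "letters"))  # prints: ZIP
```

Filtering after the read is the simplest version of the idea. A tool that restricts the character set during recognition can do better, because it never has to consider the excluded characters as candidates in the first place.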
Going back to what I just demoed, I mean, what I showed you were, like, really ugly images, right? Of course, I mean, if you, in your production line or in your scenario, if you help the algorithm a little, right, if you try to honor the basics of machine vision and try to get a decent, better image, it's gonna work even better, faster, right? What just happened? It's like, "Oh, my God, what did I just see?" Right? What you just saw was a deep learning-based OCR. What is deep learning, right? Let's take a step back. What is deep learning? Well, deep learning is a method in artificial intelligence, or AI, right, that teaches computers to process data in a way that simulates the behavior of the human brain. Oh, wow, that's a...
There's a lot of fancy words in there. Basically, think about it this way, right? You are an adult human, right? You're a human. You can read pretty much anything, right, any type of text with your eyes because your brain has been taught already to be prepared for something like that, different fonts, different conditions, different lighting conditions, different positions, colors, backgrounds. Your brain has been trained with hundreds of thousands of images over your lifespan, right? You have seen a lot of different pieces of paper or different words, right? Your brain already has a model, right? Your brain already has a mesh of options or a convolutional neural network, right, in your head. Your brain is awesome, right?
If, if you think about it, what we can achieve, right, as a machine vision, right, if you would, is remarkable, right? That's how deep learning works, right? It's trying to mimic the way humans read letters, right? Armando, then, are you saying then that Zebra Deep Learning OCR reads without training, as an adult would, because the algorithm already has a convolutional neural network model that has been trained with hundreds of thousands of images to accommodate for different scenarios? That is correct, right? We did all the heavy lifting for you already, so you just have to draw a box and go for lunch, right?
If that's true, and I tell the truth even when I lie, if that's true, conventional OCR is like instead of asking an adult to read, it's like asking a kid that is just starting to learn how to read. That kid, he will just be able to understand the few letters that he knows and the type, color, or fonts that he has seen, so he has a limited spectrum of opportunities, right? That is correct. That's, that's the fundamental difference. Conventional OCR is like asking... If I ask my daughter, Sophia: "Sophia, can you read this?" If she has seen that word 1 million times, she's gonna tell me, right?
"Yeah, that says Zebra." If I show her a different cardboard with a different word or with a different size of letters or a different font or a different color, she might say, "I don't know what that is," right? That's the difference between deep learning-based OCR and conventional OCR.
Good analogy.
If that is true, and everything I say is true, right, even when I lie, if that is true, Armando, why is conventional OCR still around, right? If Deep Learning OCR is so amazing, and it is, why is conventional OCR still around? Are they friends or are they enemies? Well, neither, right? They are more like far cousins, kind of. They are different, right? They both read. The ultimate product or the ultimate result for any of those two algorithms is a readout, right? They both read. Really, which one to choose depends on what is beyond reading, right? Not just the reading aspect. What is beyond that reading, right? Well, yeah, if you could give an example, that would be great, right? That's a meme, right? A famous meme.
Well, for example, if you need to read something out of the box, and you don't want to go over training a library or having to know about lighting techniques or having to worry about the contrast or what type of font it is, or what if something changes, what if the position of the part changes, if you just want to do something plug and play out of the box, Deep Learning OCR is what you want to use, okay? Quick and easy, right? If you need to read ABC, and you want to make sure that it is ABC and not ABC in a different font, then you should use conventional OCR, right? Why? The last ABC is visually different from the first ABC, and maybe that is relevant for your application. Maybe you want to make sure that you're reading a specific type of font, for example.
That's just one example. Are there more examples or differences between Deep Learning OCR and conventional OCR? Yes, there's much more to discuss, but we don't have time for that. Okay, I hope this was useful for you. I just tried to show you very briefly. I mean, this topic is very big. We could talk about this for hours, but we don't have time for that, right? I'm gonna pass it along to Tim, and we are happy to answer any questions that you may have.
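The rule of thumb from this part of the talk can be written down as a small Python sketch. It is a deliberate simplification for illustration: the Application fields and the suggest_ocr function are hypothetical names, and a real choice would weigh more factors than the two captured here.

```python
from dataclasses import dataclass

@dataclass
class Application:
    """Hypothetical description of a reading task (field names are illustrative)."""
    must_verify_font: bool       # the look of the characters matters, not just the text
    conditions_consistent: bool  # same font, lighting, contrast, background every time

def suggest_ocr(app: Application) -> str:
    """Encode the talk's rule of thumb for picking an OCR approach."""
    if app.must_verify_font:
        # Only a reader taught on one specific font can confirm that font.
        return "conventional OCR"
    if app.conditions_consistent:
        # Either works; conventional OCR needs teaching, deep learning does not.
        return "either"
    # Variable conditions with no time to teach a font library.
    return "Deep Learning OCR"

print(suggest_ocr(Application(must_verify_font=False, conditions_consistent=False)))
# prints: Deep Learning OCR
```

The font check comes first because it is the one requirement in the talk that the out-of-the-box reader is not meant to satisfy: it reads the text regardless of how the characters look.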
Yeah. Yeah. Amazing, Armando. As always, you manage to make all of this wizardry seem so easy. From the looks of our Q&A section here, we do actually have quite a few raised eyebrows in the group, so that's great. Let's go ahead and move on to the Q&A component of our webinar today. Don't worry, we are gonna be following up with everybody after the show. If there are questions that come up afterwards, we're gonna be in touch, and you can feel free to reach out to us as well. Ready to get grilled, Armando?
Yes.
Yep. Okay. I wish I had, like, Jeopardy music to play in the background. Can Deep Learning OCR be deployed using a smart camera and PC-based vision? I know you were using a smart camera in this, so I think I know the answer to this one.
Yes, it can. We are offering deep learning-based OCR in both our smart camera platform and our PC-based platform.
Awesome. Pretty versatile there. Does the Deep Learning OCR tool work on color images? That's a good question.
Yes. Yes, it can.
Okay. Can deep learning OCR read handwriting?
Ay, yi. It depends.
Did I just open a can of worms?
It depends. I mean, if you think about it, me or you or anybody, humans, let's say humans, sometimes we struggle to read handwriting, right?
Yep.
That being said, up to a certain extent, the answer is yes. Again, it really depends, right? For example, my handwriting is kind of ugly, so I totally understand why some people cannot read it, right? The short answer is maybe.
Yeah. I'll agree with you there. I definitely wouldn't put it in the real use case column because it really depends on how messy somebody's handwriting is, right?
Correct.
Is there a way that I can test this out with my own images? Absolutely. Armando, how do they do that?
I'll send you my PayPal account. No, no, that's a joke. No, absolutely, the answer is yes. You can start testing this today. If you want, you can either contact us, right, and we can help you, or you can download Aurora Focus today, which is the software that we use for some of our smart cameras, and you can use the emulator and test Deep Learning OCR with your images.
Yep, free download.
Yes.
Everybody should be downloading this after the demo.
Yes.
What if I want to train my own CNN model? Okay, somebody's familiar with artificial intelligence. Armando?
Okay, it is possible, yes. Now, remember, with this algorithm that we have, we already did that heavy lifting for you. This Deep Learning OCR that we are giving you is already trained with literally hundreds of thousands of images. But if you want to create your own CNN, yes, in our PC-based platform, or let me rephrase that, in our Design Assistant software, which you can deploy to a smart camera, you can create your own model. The answer is yes. If you want to create your own OCR model, we have that option.
Awesome. Okay, there's a few that are a little technical that I'm gonna save for when we're offline. I think a good one to kind of cap us off here: "Is my conventional OCR setup now obsolete?"
No. No, no. We still sell conventional OCR ourselves. It's just different, right? Again, if you have an application where everything is very consistent, lighting is very consistent, the font that you're trying to read is very consistent or expected, then conventional OCR is gonna work perfectly fine. If everything is nice, and repeatable and consistent, conventional OCR still has a space.
Good. We can still hire your daughter to read for us, then?
Correct.
Yeah. All right. With that, and by the way, I'm glad I had a mute button on my microphone because I, like, spat up water laughing at your presentation throughout. Great job. We're gonna bring today's webinar to an end. If you, as we mentioned earlier, if you want to conduct testing using Deep Learning OCR at your facility, please reach out to us, let us know. We'll be happy to set that up for you. Feel free to reach out at any point should you have any further questions about the capabilities of our Deep Learning OCR module. We're eager to show it off at every opportunity. All right, thank you for joining us, and keep an eye out for future webinars. Thanks, Armando.
Thank you all.