Tesla, Inc. (TSLA)

Autonomy Investor Day 2019

Apr 22, 2019

Sorry for being late. Welcome to our very first Analyst Day for Autonomy. I really hope that this is something we can do a little bit more regularly now to keep you posted about the development work we're doing with regard to autonomous driving. About 3 months ago, we were getting prepped for our Q4 earnings call with Elon and quite a few other executives. And one of the things that I told the group is that from all the conversations that I keep having with investors on a regular basis, the biggest gap that I see between what I see inside the company and what the outside perception is, is our capability in autonomous driving. And it kind of makes sense, because for the past couple of years, we've been really talking about the Model 3 ramp and a lot of the debate has revolved around Model 3. But in reality, a lot of things have been happening in the background. We've been working on the new full self driving chip. We've had a complete overhaul of our neural net for vision recognition, etcetera. So now that we finally started to produce our full self driving computer, we thought it's a good idea to lift the veil, invite everyone in and talk about everything that we've been doing for the past 2 years. So about 3 years ago, we wanted to find the best possible chip for full autonomy. And we found out that there's no chip that's been designed from the ground up for neural nets. So we invited my colleague Pete Bannon, the VP of Silicon Engineering, to design such a chip for us. He's got about 35 years of experience building and designing chips. About 12 of those years were for a company called PA Semi, which was later acquired by Apple. So he worked on dozens of different architectures and designs, and he was the lead designer, I think, for the Apple iPhone 5 chip just before joining Tesla. And he's going to be joined on the stage by Elon Musk. Thank you. I was going to introduce Pete, but Martin has done so. Pete is just the best chip and system architect that I know in the world, and it's an honor to have you and your team at Tesla. Take it away, and just tell them about the incredible work that you and your team have done. Thanks, Elon. It's a pleasure to be here this morning and a real treat to tell you about all the work that my colleagues and I have been doing here at Tesla for the last 3 years. I'll tell you a little bit about how the whole thing got started, and then I'll introduce you to the full self driving computer and tell you a little bit about how it works. We'll dive into the chip itself and go through some of those details. I'll describe how the custom neural network accelerator that we designed works, and then I'll show you some results. And hopefully, you'll all still be awake by then. I was hired in February of 2016. I asked Elon if he was willing to spend all the money it takes to do full custom system design. And he said, well, are we going to win? And I said, well, yes, of course. So he said, I'm in. And so that got us started. We hired a bunch of people and started thinking about what a custom designed chip for full autonomy would look like. We spent 18 months doing the design. And in August of 2017, we released the design for manufacturing. We got it back in December, powered it up, and it actually worked very, very well on the first try. We made a few changes and released a B0 rev in April of 2018. In July of 2018, the chip was qualified and we started full production of production quality parts.
In December of 2018, we had the autonomous driving stack running on the new hardware and we were able to start retrofitting employee cars and testing the hardware and software out in the real world. Just last March, we started shipping the new computer in the Model S and X. And just earlier in April, we started production in the Model 3. So this whole program, from the hiring of the first few employees to having it in full production in all three of our cars, is just a little over 3 years, and it is probably the fastest system development program I've ever been associated with. It really speaks to the advantages of having a tremendous amount of vertical integration that allows you to do concurrent engineering and speed up deployment. In terms of goals, we were focused exclusively on Tesla requirements, and that makes life a lot easier. If you have 1 and only 1 customer, you don't have to worry about anything else. One of those goals was to keep the power under 100 watts so that we could retrofit the new machine into the existing cars. We also wanted a lower part cost, so we could enable full redundancy for safety. At the time, we had a thumb-in-the-wind estimate that it would take at least 50,000,000,000,000 operations a second of neural network performance to drive a car, and so we wanted to get at least that much, and really as much as we possibly could. Batch size is how many items you operate on at the same time. So for example, Google's first TPU has a batch size of 256, and you have to wait around until you have 256 things to process before you can get started. We didn't want to do that, so we designed our machine with a batch size of 1. So as soon as an image shows up, we process it immediately to minimize latency, which maximizes safety. We needed a GPU to run some post processing. At the time, we were doing quite a lot of that, but we speculated that over time, the amount of post processing on the GPU would decline as the neural networks got better and better, and that has actually come to pass. So we took a risk by putting a fairly modest GPU in the design, as you'll see, and that turned out to be a good bet. Security is super important. If you don't have a secure car, you can't have a safe car. So there's a lot of focus on security and then, of course, safety. In terms of actually doing the chip design, as Elon alluded to earlier, there was really no ground-up neural network accelerator in existence in 2016. Everybody out there was adding instructions to their CPU or GPU or DSP to make it better for inference, but nobody was really just doing it natively. So we set out to do that ourselves. And then for other components on the chip, we purchased industry standard IP for CPUs and GPUs. That allowed us to minimize the design time and also the risk to the program. Another thing that was a little unexpected when I first arrived was our ability to leverage existing teams at Tesla: system software, firmware, board design and a really good system validation program that we were able to take advantage of to accelerate this program. Here's what it looks like. Over there on the right, you see all the connectors for the video that comes in from the cameras that are in the car. You can see the 2 self driving computers in the middle of the board. And then on the left is the power supply and some control connections. And so I really love it when a solution is boiled down to its barest elements.
You have video, computing and power, and it's straightforward and simple. Here's the original hardware 2.5 enclosure that the computer went into and that we've been shipping for the last 2 years. Here's the new design for the FSD computer. It's basically the same, and that, of course, is driven by the constraints of having a retrofit program for the cars. I'd like to point out that this is actually a pretty small computer. It fits behind the glovebox, between the glovebox and the firewall in the car. It does not take up half your trunk. As I said earlier, there are 2 fully independent computers on the board. You can see them there highlighted in blue and green. To either side of the large SoC, you can see the DRAM chips that we use for storage, and then below left, you see the flash chips that represent the file system. So these are 2 independent computers that boot up and run their own operating system. Yes. And if I can add something, I mean, the general principle here is that any part of this could fail and the car will keep driving. So you could have cameras fail, you could have power circuits fail, you could have 1 of the Tesla full self driving computer chips fail, and the car keeps driving. The probability of this computer failing is substantially lower than somebody losing consciousness. That's the key metric, by at least an order of magnitude. Yes. So one additional thing we do to keep the machine going is to have redundant power supplies in the car. So one machine is running on one power supply and the other one is on the other. The cameras are the same. So half of the cameras run on the blue power supply, the other half run on the green power supply. And both chips receive all of the video and process it independently. So in terms of driving the car, the basic sequence is to collect lots of information from the world around you. Not only do we have cameras, we also have radar, GPS, maps, the IMUs, ultrasonic sensors around the car. We have wheel ticks, steering angle. We know what the acceleration and deceleration of the car is supposed to be. All of that gets integrated together to form a plan. Once we have a plan, the 2 machines exchange their independent versions of the plan to make sure they're the same. And assuming that we agree, we then act and drive the car. Now once you've driven the car with some new control, you of course want to validate it. So we validate that what we transmitted was what we intended to transmit to the actuators in the car. And then you can use the sensor suite to make sure that it happens. So if you ask the car to accelerate or brake or steer right or left, you can look at the accelerometers and make sure that you are in fact doing that. So there's a tremendous amount of redundancy and overlap in both our data acquisition and our data monitoring capabilities here. Moving on to talk about the full self driving chip a little bit. It's packaged in a 37.5 millimeter BGA with 1,600 balls. Most of those are used for power and ground, but plenty for signal as well. If you take the lid off, it looks like this. You can see the package substrate and you can see the die sitting in the center there. If you take the die off and flip it over, it looks like this. There are 13,000 C4 bumps scattered across the top of the die. And underneath those are 12 metal layers, which obscure all the details of the design. So if you strip that off, it looks like this. This is a 14 nanometer FinFET CMOS process.
It's 260 square millimeters in size, which is a modest sized die. For comparison, a typical cell phone chip is about 100 square millimeters, so we're quite a bit bigger than that, but a high end GPU would be more like 600 to 800 square millimeters. So we're sort of in the middle, in what I would call the sweet spot. It's a comfortable size to build. There are 250,000,000 logic gates on there and a total of 6,000,000,000 transistors, which, even though I work on this all the time, is mind boggling to me. The chip is manufactured and tested to the AEC-Q100 standard, which is the standard automotive qualification criterion. Next, I'd like to just walk around the chip and explain all the different pieces of it. And I'm going to go in the order that a pixel coming in from the camera would visit the different pieces. So up there in the top left, you can see the camera serial interface. We can ingest 2,500,000,000 pixels per second, which is more than enough to cover all the sensors that we know about. We have an on chip network that distributes data from the memory system, so the pixels travel across the network to the memory controllers on the right and left edges of the chip. We use industry standard LPDDR4 memory running at 4,266 megabits per second, which gives us a peak bandwidth of 68 gigabytes a second, which is a pretty healthy bandwidth. But again, this is not ridiculous. So we're sort of trying to stay in the comfortable sweet spot for cost reasons. The image signal processor has a 24 bit internal pipeline that allows us to take full advantage of the HDR sensors that we have around the car. It does advanced tone mapping, which helps to bring out details in shadows. And then it has advanced noise reduction, which just improves the overall quality of the images that we're using in the neural network. The neural network accelerator itself, there are 2 of them on the chip. They each have 32 megabytes of SRAM to hold temporary results and minimize the amount of data that we have to transmit on and off the chip, which helps reduce power. Each one has a 96x96 multiply-add array with in-place accumulation, which allows us to do almost 10,000 multiply-adds per cycle. There's dedicated ReLU hardware, dedicated pooling hardware, and each one delivers 36,000,000,000,000 operations per second, and they operate at 2 gigahertz. The 2 of them together on a die deliver 72 trillion operations a second. So we exceeded our goal of 50 teraops by a fair bit. There's also a video encoder. We encode video and use it in a variety of places in the car, including the backup camera display. There's optionally a user feature for dashcam, and also for logging clips of data to the cloud, which Stuart and Andre will talk about more later. There's a GPU on the chip. It's modest performance. It has support for both 32- and 16-bit floating point. And then we have 12 A72 64-bit CPUs for general purpose processing. They operate at 2.2 gigahertz, and this represents about 2.5 times the performance available in the current solution. There's a safety system that contains 2 CPUs that operate in lockstep. This system is the final arbiter of whether it's safe to actually drive the actuators in the car. So this is where the two plans come together and we decide whether it's safe or not to move forward. And lastly, there's a security system, and basically the job of the security system is to make sure the chip only runs software that has been cryptographically signed by Tesla. If the software isn't signed by Tesla, then the chip does not operate.
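A quick back-of-the-envelope check of those throughput figures, assuming the usual two-operations-per-multiply-add counting (the talk rounds the results to 36, 72 and 144 teraops):

mac_units = 96 * 96          # 9,216 multiply-accumulate units per accelerator
ops_per_mac = 2              # one multiply plus one add
clock_hz = 2.0e9             # 2 GHz

per_accelerator = mac_units * ops_per_mac * clock_hz
per_chip = 2 * per_accelerator          # two accelerators per chip
per_computer = 2 * per_chip             # two chips on the FSD computer

print(f"per accelerator: {per_accelerator / 1e12:.1f} TOPS")   # ~36.9
print(f"per chip:        {per_chip / 1e12:.1f} TOPS")          # ~73.7, quoted as 72
print(f"per computer:    {per_computer / 1e12:.1f} TOPS")      # ~147, quoted as 144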
Now I've told you a lot of different performance numbers, and I thought it'd be helpful to put them into perspective a little bit. So throughout this talk, I'm going to talk about a neural network from our narrow camera. It uses 35,000,000,000 operations, 35 giga-ops. And if we used all 12 CPUs to process that network, we could do 1.5 frames per second, which is super slow, not nearly adequate to drive the car. If we used the 600 gigaflop GPU on the same network, we'd get 17 frames per second, which is still not good enough to drive the car with 8 cameras. The neural network accelerators on the chip can deliver 2,100 frames per second. And you can see from the scaling as we move along that the amount of computing in the CPU and GPU is basically insignificant compared to what's available in the neural network accelerator. It really is night and day. So moving on to talk about the neural network accelerator. We're just going to stop for some water. On the left, there is a cartoon of a neural network, just to give you an idea of what's going on. The data comes in at the top and visits each of the boxes, and the data flows along the arrows to the different boxes. The boxes are typically convolutions or deconvolutions with ReLUs. The green boxes are pooling layers. And the important thing about this is that the data produced by one box is then consumed by the next box, and then you don't need it anymore. You can throw it away. So all of that temporary data that gets created and destroyed as you flow through the network, there's no need to store that off chip in DRAM. So we keep all that data in SRAM, and I'll explain why that's super important in a few minutes. If you look over on the right side of this, you can see that in this network, of the 35,000,000,000 operations, almost all of them are convolution, which is based on dot products. The rest are deconvolution, also based on dot products, and then ReLU and pooling, which are relatively simple operations. So if you were designing some hardware, you'd clearly target dot products, which are based on multiply-add, and really go after that. But imagine that you sped those up by a factor of 10,000. What was essentially 100% of the work all of a sudden becomes a tiny fraction of it, and suddenly the ReLU and pooling operations are going to be quite significant. So our hardware design includes dedicated resources for processing ReLU and pooling as well. Now this chip is operating in a thermally constrained environment, so we had to be very careful about how we burn that power. We want to maximize the amount of arithmetic we can do. So we picked integer add. It's 9 times less energy than a corresponding floating point add. And we picked 8 bit by 8 bit integer multiply, which is significantly less power than other multiply operations and is probably enough accuracy to get good results. In terms of memory, we chose to use SRAM as much as possible, and you can see there that going off chip to DRAM is approximately 100 times more expensive in terms of energy consumption than using local SRAM. So clearly, we want to use local SRAM as much as possible. In terms of control, this is data that was published in a paper by Mark Horowitz at ISSCC, where he tallied up how much power it takes to execute a single instruction on a regular CPU. And you can see that the add operation is only 0.15 percent of the total power. All the rest of the power is control overhead and bookkeeping.
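The frame rates quoted above follow directly from dividing each unit's throughput by the 35 giga-op cost of one frame; a rough check, assuming near-full utilization:

ops_per_frame = 35e9            # narrow-camera network: 35 billion operations per frame

gpu_throughput = 600e9          # the ~600 gigaflop on-chip GPU
nna_throughput = 72e12          # both neural network accelerators on one chip

print(f"GPU fps: ~{gpu_throughput / ops_per_frame:.0f}")    # ~17
print(f"NNA fps: ~{nna_throughput / ops_per_frame:.0f}")    # ~2,057, quoted as 2,100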
So in our design, we basically set out to get rid of as much of that control overhead as possible, because what we're really interested in is the arithmetic. So here's the design that we finished. You can see that it's dominated by the 32 megabytes of SRAM. There are big banks on the left and right and in the center bottom, and then all the computing is done in the upper middle. Every single clock, we read 256 bytes of activation data out of the SRAM array and 128 bytes of weight data out of the SRAM array, and we combine them in a 96x96 multiply-add array, which performs 9,216 multiply-adds per clock. At 2 gigahertz, that's a total of 36.8 teraops. Now when we're done with the dot product, we unload the engine: we shift the data out across the dedicated ReLU unit, optionally across the pooling unit, and then finally into a write buffer where all the results get aggregated, and then we write out 128 bytes per cycle back into the SRAM. And this whole thing cycles along continuously all the time. So we're doing dot products while we're unloading previous results, doing pooling and writing back into the memory. If you add it all up, at 2 gigahertz you need 1 terabyte per second of SRAM bandwidth to support all that work, and so the hardware supplies that. So 1 terabyte per second of bandwidth per engine; there are 2 on the chip, so 2 terabytes per second. The accelerator has a relatively small instruction set. We have a DMA read operation to bring data in from memory. We have a DMA write operation to push results back out to memory. We have 3 dot product based instructions: convolution, deconvolution and inner product. And then there are 2 relatively simple ones: scale, which is a 1 input, 1 output operation, and elementwise, which takes 2 inputs and produces 1 output. And then, of course, stop when you're done. We had to develop a neural network compiler for this. So we take the neural network that's been trained by our vision team, as it would be deployed in the older cars, and we compile it for use on the new accelerator. The compiler does layer fusion, which allows us to maximize the computing each time we read data out of the SRAM and put it back. It also does some smoothing, so that the demands on the memory system aren't too lumpy. And then we also do channel padding to reduce bank conflicts, and we do bank-aware SRAM allocation. And this is a case where we could have put more hardware in the design to handle bank conflicts, but by pushing it into software, we save hardware and power at the cost of some software complexity. We also automatically insert DMAs into the graph so that data arrives just in time for computing without having to stall the machine. And then at the end, we generate all the code, we generate all the weight data, we compress it, and we add a CRC checksum for reliability. To run a program, all the neural network descriptions, or programs, are loaded into SRAM at the start, and then they sit there ready to go all the time. To run a network, you program the address of the input buffer, which presumably is a new image that just arrived from a camera. You set the output buffer address, you set the pointer to the network weights, and then you set go. And then the machine goes off and will sequence through the entire neural network all by itself, usually running for 1,000,000 or 2,000,000 cycles. And then when it's done, you get an interrupt and can post process the results. So moving on to results. We had a goal to stay under 100 watts. This is measured data from cars driving around running the full Autopilot stack.
We're dissipating 72 watts, which is a little bit more power than the previous design, but with the dramatic improvement in performance, it's still a pretty good answer. Of that 72 watts, about 15 watts is being consumed running the neural networks. In terms of cost, the silicon cost of this solution is about 80% of what we were paying before. So we are saving money by switching to this solution. And in terms of performance, we took the narrow camera neural network, which I've been talking about, that has 35,000,000,000 operations in it. We ran it on the old hardware in a loop as quickly as possible, and we delivered 110 frames per second. We took the same data, the same network, compiled it for the new FSD computer, and using all 4 accelerators, we can get 2,300 frames per second processed. So a factor of 21. I think this is perhaps the most significant slide. It's night and day. I've never worked on a project where the performance increase was more than a factor of 3, so this was pretty fun. If you compare it to, say, NVIDIA's Drive Xavier solution, a single chip delivers 21 teraops. Our full self driving computer with 2 chips is 144 teraops. So to conclude, I think we've created a design that delivers outstanding performance, 144 teraops for neural network processing. It has outstanding power performance. We managed to jam all of that performance into the thermal budget that we had. It enables a fully redundant computing solution. It has a modest cost. And really, the important thing is that this FSD computer will enable a new level of safety and autonomy in Tesla's vehicles without impacting their cost or range, something that I think we're all looking forward to. Yes. I think why don't we do Q and A after each segment. So if people have questions about the hardware, they can ask right now. The reason I asked Pete to do a detailed, far more detailed than perhaps most people would appreciate, dive into the Tesla full self driving computer is because at first it seems improbable. How could it be that Tesla, who has never designed a chip before, would design the best chip in the world? But that is objectively what has occurred, not best by a small margin, best by a huge margin. It's in the cars right now. All Teslas being produced right now have this computer. We switched over from the NVIDIA solution for S and X about a month ago, and we switched over Model 3 about 10 days ago. All cars being produced have all the hardware necessary, compute and otherwise, for full self driving. I'll say that again. All Tesla cars being produced right now have everything necessary for full self driving. All you need to do is improve the software. And later today, you will drive the cars with the development version of the improved software, and you will see for yourselves. Questions for Pete? Yes. Tripp Chaudhry, Global Equities Research, very, very impressive in every shape and form. I was wondering, I took some notes. You are using the activation function ReLU, the rectified linear unit. But if we think about the deep neural network, it has multiple layers, and some algorithms may use different activation functions for different hidden layers, like softmax or tanh. Do you have flexibility for incorporating different activation functions rather than ReLU in your platform? And I have a follow-up. Yes, we do. We have implementations of tanh and sigmoid, for example. Beautiful. One last question.
On the nanometers, you mentioned 14 nanometers, and I was wondering, would it make sense to come a little lower, maybe 10 nanometers 2 years down the road, or maybe 7? At the time we started the design, not all the IP that we wanted to purchase was available in 10 nanometer, so we finished the design in 14. It's maybe worth pointing out that we finished this design maybe 1.5 or 2 years ago and began design of the next generation. We're not talking about the next generation today, but we're about halfway through it. All the things that are obvious for a next generation chip, we're doing. You talked about the software as the remaining piece now. You did a great job. I was blown away, understood 10% of what you said, but I trust that it's in good hands. Thanks. So it feels like you got the hardware piece done, and that was really hard to do, and now you have to do the software piece. Now maybe that's outside of your expertise, but how should we think about that software piece? Well, I couldn't ask for a better introduction to Andre and Stuart. But are there any questions on the chip before the next part of the presentation, which is neural nets and software? So maybe on the chip side, the last slide was 144,000,000,000,000 operations per second versus, was it, NVIDIA at 21,000,000,000,000? That's right. And maybe can you just contextualize that for a finance person, why that's so significant, that gap? Thank you. Well, I mean, it's a factor of 7 in performance delta. So that means you can do 7 times as many frames. You can run neural networks that are 7 times larger and more sophisticated. So it's a very big currency that you can spend on lots of interesting things to make the car better. I think that Xavier power usage is higher than ours, or comparable? I don't know that. To the best of my knowledge, the power requirements would increase at least to the same degree, a factor of 7, and costs would also increase by a factor of 7. Power is a real problem because it also reduces range. So the penalty for power is very high. And then the thermal problem becomes really significant, because you've got to get rid of all that power. Thank you very much. I think we have quite a bit of time, so just ask your questions. If you guys don't mind the day running a bit long, we're going to do the drive demos afterwards. So if anybody needs to pop out and do drive demos a little sooner, you're welcome to do that. But we want to make sure we answer your questions. Yes. Pradeep Ramani from UBS. Intel and AMD, to some extent, have started moving towards a chiplet based architecture. I did not notice a chiplet based design here. Do you think that, looking forward, that would be something that might be of interest to you guys from an architecture standpoint? A chiplet based architecture? Yes. We're not currently considering anything like that. I think that's mostly useful when you need to use different styles of technology. So if you want to integrate silicon germanium or DRAM technology on the same silicon substrate, that gets pretty interesting. But until the die size gets obnoxious, I wouldn't go there. Yes. To be clear, the strategy here, and it started basically a little over 3 years ago, was to design a computer that is fully optimized for full self driving, then write software that is designed to work specifically on that computer and get the most out of that computer.
So you have tailored hardware that is a master of one trade, self driving. NVIDIA is a great company, but they have many customers, and so as they apply their resources, they need to do a generalized solution. We care about one thing, self driving. So it was designed to do that incredibly well. The software is also designed to run on that hardware incredibly well. And the combination of the software and the hardware, I think, is very hard to beat. The chip is designed to process video input. In case you use, let's say, LiDAR, would it be able to process that as well? Or is that primarily for video? What we're going to explain to you today is that LiDAR is a fool's errand, and anyone relying on LiDAR is doomed. Doomed. Expensive sensors that are unnecessary. It's like having a whole bunch of expensive appendices. Like, one appendix is bad; well, now you want to put in a whole bunch of them? That's ridiculous. You'll see. Hi. So just two questions. Just on the power consumption, is there a way to maybe give us a rule of thumb on how every watt reduces range by a certain percent or a certain amount, just so we can get a sense of how much? On Model 3, the target consumption is 250 watt-hours per mile. It depends on the nature of the driving as to how many miles that affects. In the city, it would have a much bigger effect than on the highway. So if you're driving for an hour in a city and you had a solution that hypothetically was a kilowatt, you'd lose 4 miles of range on a Model 3. So if you're only going, say, 12 miles an hour, then that would be something like a 25% impact on range in the city. Basically, the power of the system has a massive impact on city range, which is where we think most of the robotaxi market will be. So the power is extremely important. Okay. Tasha? I'm sorry, I didn't hear you. Thank you. What's the primary design objective of the next generation chip? We don't want to talk too much about the next generation chip, but it'll be at least, let's say, 3 times better than the current system. To develop this chip, you don't manufacture the chip yourselves, you contract that out? And how much cost reduction does that provide in the overall vehicle cost? The 20% cost reduction I cited was the piece cost per vehicle reduction; that wasn't a development cost, that was just the actual part. No, I'm saying, like, if I'm manufacturing these in mass, is this saving money by doing it yourself? Yes, a little bit. I mean, most people don't make chips with their own fab, it's pretty unusual. So you don't see any supply issues with getting the chip mass produced? No. The cost saving pays for the development. I mean, the basic pitch going to Elon was, we're going to build this chip, and it's going to reduce the cost. And Elon said, times a million cars a year, deal. That's correct. Yes. Sorry. If there are really chip specific questions, we can answer them. Otherwise, there will be a Q and A opportunity after Andre talks and after Stuart talks. So there will be 2 other Q and A opportunities. If this is very chip specific, then go ahead. Also, I'll be here all afternoon. Yes. And secondly, Pete will be here at the end as well. So go ahead. Thanks. That die photo you had, the neural processor takes up quite a bit of the die. I'm curious, is that your own design? Or is there some external IP there? Yes. That was a custom design by Tesla. Okay. And then I guess the follow-on would be, there's probably a fair amount of opportunity to reduce that footprint as you tweak the design?
It's actually quite dense, so in terms of reducing it, I don't think so. We will greatly enhance the functional capabilities in the next generation. Okay. And then last question, can you share where you're fabbing this part? What? Where are we fabbing it? Oh, it's Samsung. Samsung? Yes, Austin, Texas. Thank you. There's one at the back. Hi, Graham Tanaka, Tanaka Capital. Just curious how defensible your chip technology and design is from an IP point of view, and hoping that you won't be offering a lot of the IP to the outside for free? Thanks. We have filed on the order of a dozen patents on this technology. Fundamentally, it's linear algebra, which I don't think you can patent. I'm not sure, but I think if somebody started today and they're really good, they might have something like what we have right now in 3 years, but in 2 years, we'll have something 3 times better. Talking about intellectual property protection, you have the best intellectual property, and some people just steal it for the fun of it. I was wondering, if we look at the interaction with Aurora, a company the industry believes stole your intellectual property, I think the key ingredient that you need to protect is the weights that you associate with various parameters. Do you think your chip can do something to prevent anybody, maybe encrypt all the weights so that even you don't know what the weights are at the chip level, so that your intellectual property remains inside it and nobody knows about it and nobody can just steal it? Man, I'd like to meet the person that could do that, because I would hire them in a heartbeat. Yes, so it's a real hard problem. Yes, do you want to, I mean, we do encrypt it. It's a hard chip to crack. So if they can crack it, it's very good. If they can crack it and then also figure out the software and the neural net system and everything else, they might as well design it from scratch. That's all part of it. It's our intention to prevent people from stealing all that stuff. I mean, if they do, we hope it at least takes a long time. It will definitely take them a long time, yes. I mean, it's not like, if it was our goal to do that, how would we do it? It would be very difficult. But the thing that's, I think, a very powerful sustainable advantage for us is the fleet. Nobody has the fleet. Those weights are constantly being updated and improved based on billions of miles driven. Tesla has 100 times more cars with the full self driving hardware than everyone else combined. By the end of this quarter, we'll have 500,000 cars with the full 8 camera setup and 12 ultrasonics. Some of them will still be on hardware 2, but we still have the data gathering ability. And then a year from now, we'll have over 1,000,000 cars with the full self driving computer, hardware, everything. Yes. It's just a massive data advantage. It's similar to how the Google search engine has a massive advantage because people use it, and people effectively program Google with their queries and the results. May I just press you on that, and please reframe the question if it's appropriate, because I'm paraphrasing. But when we talk to Waymo or NVIDIA, they do speak with equivalent conviction about their leadership because of their competence in simulating miles driven. Can you talk about the advantage of having real world miles versus simulated miles?
Because I think they expressed that by the time you get 1,000,000 miles, they can simulate 1,000,000,000, and no Formula 1 race car driver, for example, could ever successfully complete a real world track without driving it in a simulator first. Can you talk about the advantages you perceive to be associated with having data ingestion coming from real world miles versus simulated miles? Absolutely. We have quite a good simulation too, but it just does not capture the long tail of weird things that happen in the real world. If the simulation fully captured the real world, well, I mean, that would be proof that we're living in a simulation, I think. Yes, it doesn't. I wish. But simulations do not capture the real world. The real world is really weird and messy. You need the cars on the road. We're actually going to get into that in Andre and Stuart's presentation. So okay, why don't we move on to Andre? The last question was actually a very good segue, because one thing to remember about our FSD computer is that it can run much more complex neural nets for much more precise image recognition. And to talk to you about how we actually get that image data and how we analyze it, we have our Senior Director of AI, Andrej Karpathy, who's going to explain all of that to you. Andre has a PhD from Stanford University, where he studied computer science focusing on pattern recognition and deep learning. Andre, why don't you just do your own intro? There's a lot of PhDs from Stanford. That's not important. Yes. Okay. We don't care. Come on. Thank you. Andre started the computer vision class at Stanford. That's much more significant. That's what matters. So can you please talk about your background in a way that is not bashful? Just tell them about the stuff you've done, yes? Sure. So yes, I think I've been training neural networks basically for what is now a decade, and these neural networks were not actually really used in the industry until maybe 5 or 6 years ago. So it's been some time that I've been training these neural networks. And that included work at Stanford, at OpenAI, at Google, and really just training a lot of neural networks, not just for images, but also for natural language, and designing architectures that couple those two modalities for my PhD. In the computer science class? Yes. And at Stanford, I actually taught the convolutional neural networks class. I was the primary instructor for that class. I actually started the course and designed the entire curriculum. So in the beginning it was about 150 students, and then it grew to 700 students over the next 2 or 3 years. So it's a very popular class. It's one of the largest classes at Stanford right now. So that was also really successful. I mean, Andre is really one of the best computer vision people in the world, arguably the best. Okay. Thank you. Yes. So hello, everyone. So Pete told you all about the chip that we've designed that runs the neural networks in the car. My team is responsible for training these neural networks, and that includes all of the data collection from the fleet, neural network training and then some of the deployment onto that chip. So what do the neural networks do exactly in the car? What we are seeing here is a stream of videos from across the vehicle, across the car. These are 8 cameras that send us videos.
And then these neural networks are looking at those videos, processing them and making predictions about what they're seeing. And so some of the things that we're interested in, and some of the things you're seeing on this visualization here, are lane line markings, other objects, the distances to those objects, what we call drivable space, shown in blue, which is where the car is allowed to go, and a lot of other predictions like traffic lights, traffic signs and so on. Now for my talk, I will talk roughly in 3 stages. So first, I'm going to give you a short primer on neural networks and how they work and how they're trained. And I need to do this because I need to explain in the second part why it is such a big deal that we have the fleet, why it's so important and why it's a key enabling factor to really train these neural networks and make them work effectively on the roads. And in the 3rd stage, I'll talk about vision and LiDAR and how we can estimate depth from vision alone. So the core problem that these networks are solving in the car is that of visual recognition. For you and I, this is a very simple problem. You can look at all four of these images and you can see that they contain a cello, a boat, an iguana or scissors. This is very simple and effortless for us. This is not the case for computers. And the reason for that is that these images are, to a computer, really just a massive grid of pixels. And at each pixel you have the brightness value at that point. And so instead of just seeing an image, a computer really gets a million numbers in a grid that tell you the brightness values at all the positions. The matrix, if you will. It really is the matrix. Yes. And so we have to go from that grid of pixels and brightness values to high level concepts like iguana and so on. And as you might imagine, this iguana has a certain pattern of brightness values, but iguanas can actually take on many appearances. So they can appear in many different poses, in different brightness conditions, against different backgrounds. You can have different crops of that iguana. And so we have to be robust across all those conditions, and we have to understand that all those different brightness patterns actually correspond to iguanas. Now the reason you and I are very good at this is because we have a massive neural network inside our heads that is processing those images. So light hits the retina, travels to the back of your brain to the visual cortex. And the visual cortex consists of many neurons that are wired together and that are doing all the pattern recognition on top of those images. And really over the last, I would say, about 5 years, the state of the art approaches to processing images using computers have also started to use neural networks, but in this case artificial neural networks. These artificial neural networks, and this is just a cartoon diagram of one, are a very rough mathematical approximation to your visual cortex. We really do have neurons and they are connected together. And here I'm only showing 3 or 4 neurons in 3 or 4 layers, but a typical neural network will have tens to hundreds of millions of neurons, and each neuron will have about 1,000 connections. So these are really large pieces of almost simulated tissue. And then what we can do is we can take those neural networks and we can show them images. So for example, I can feed my iguana into this neural network and the network will make predictions about what it's seeing.
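As a toy illustration of what showing an image to a network means in code, here is a minimal, self-contained sketch: a tiny, randomly initialized classifier (plain numpy, with made-up class names) that takes a flattened grid of brightness values and produces a probability for each class. It is not Tesla's stack, just a way to make the next point about random initialization concrete.

import numpy as np

rng = np.random.default_rng(0)
classes = ["cello", "boat", "iguana", "scissors"]

# A fake 64x64 grayscale image: just a grid of brightness values.
image = rng.random((64, 64))
x = image.flatten()                      # 4,096 numbers, as the computer sees it

# One tiny fully connected layer, randomly initialized.
W = rng.normal(scale=0.01, size=(len(classes), x.size))
b = np.zeros(len(classes))

logits = W @ x + b
probs = np.exp(logits) / np.exp(logits).sum()   # softmax over the 4 classes

for name, p in zip(classes, probs):
    print(f"{name:8s} {p:.2f}")          # near-chance: the untrained net is just guessing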
Now in the beginning these neural networks are initialized completely randomly, so the connection strengths between all those different neurons are completely random. And therefore the predictions of that network are also going to be completely random. So it might think that you're actually looking at a boat right now, and that it's very unlikely that this is actually an iguana. And during the training process, really what we're doing is we know that that's actually an iguana. We have a label. So what we're doing is we're basically saying we'd like the probability of iguana to be larger for this image and the probability of all the other things to go down. And then there's a mathematical process called backpropagation, stochastic gradient descent, that allows us to backpropagate that signal through those connections and update every one of those connections just a little amount. And once the update is complete, the probability of iguana for this image will go up a little bit. So it might become 14%, and the probability of the other things will go down. And of course, we don't just do this for this single image. We actually have entire large data sets that are labeled. So we have lots of images. Typically, you might have millions of images, thousands of labels or something like that, and you are doing forward backward passes over and over again. So you're showing the computer, here's an image, it has an opinion, and then you're saying this is the correct answer, and it tunes itself a little bit. You repeat this millions of times, and sometimes you show the same image to the computer hundreds of times as well. So the network training will typically take on the order of a few hours or a few days, depending on how big of a network you're training. And that's the process of training a neural network. Now there's something very unintuitive about the way neural networks work that I have to really get into, and that is that they really do require a lot of these examples, and they really do start from scratch. They know nothing, and it's really hard to wrap your head around this. So as an example, here's a cute dog. And you probably may not know the breed of this dog, but the correct answer is that this is a Japanese spaniel. Now all of us are looking at this and we're seeing a Japanese spaniel and we're like, okay, I got it. I understand kind of what this Japanese spaniel looks like. And if I show you a few more images of other dogs, you can probably pick out the other Japanese spaniels here. So in particular, those 3 look like a Japanese spaniel and the other ones do not. You can do this very quickly, and you only need one example. But computers do not work like this. They actually need a ton of data of Japanese spaniels. So this is a grid of Japanese spaniels, showing them you need thousands of examples, showing them in different poses, different brightness conditions, different backgrounds, different crops. You really need to teach the computer, from all the different angles, what this Japanese spaniel looks like, and it really requires all that data to get it to work. Otherwise, the computer can't pick up on that pattern automatically. So what does all this imply about the setting of self driving? Of course, we don't care about dog breeds too much, maybe we will at some point, but for now we really care about lane markings, objects, where they are, where we can drive, and so on.
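Continuing that toy sketch, here is what a single training update looks like: for a softmax classifier with a cross-entropy loss, the gradient nudges the weights so that the probability of the labeled class, iguana here, goes up slightly. The learning rate and layer shape are arbitrary illustrative choices, not anything from Tesla's training setup.

import numpy as np

rng = np.random.default_rng(0)
classes = ["cello", "boat", "iguana", "scissors"]
x = rng.random(64 * 64)                          # flattened image
W = rng.normal(scale=0.01, size=(len(classes), x.size))
b = np.zeros(len(classes))
label = classes.index("iguana")                  # the human-provided annotation

def forward(W, b, x):
    logits = W @ x + b
    return np.exp(logits) / np.exp(logits).sum()

probs = forward(W, b, x)
print(f"p(iguana) before: {probs[label]:.3f}")

# Backpropagation for softmax + cross-entropy: the gradient of the loss with
# respect to the logits is (probs - one_hot), pushed back through the layer.
target = np.eye(len(classes))[label]
grad_logits = probs - target
lr = 1e-4                                        # tiny step keeps the change gentle
W -= lr * np.outer(grad_logits, x)               # one stochastic gradient descent step
b -= lr * grad_logits

print(f"p(iguana) after:  {forward(W, b, x)[label]:.3f}")   # a little higher than before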
So the way we do this is, we don't have labels like iguana for images, but we do have images from the fleet like this, and we're interested in, for example, lane line markings. So a human typically goes into an image and, using a mouse, annotates the lane line markings. So here's an example of an annotation that a human could create, a label for this image. And it's saying that that's what you should be seeing in this image. These are the lane line markings. And then what we can do is we can go to the fleet and we can ask for more images from the fleet. And if you just do a naive job of this and you just ask for images at random, the fleet might respond with images like this, typically going forward on some highway. This is what you might get, just a random collection like this. And we would annotate all that data. Now if you're not careful and you only annotate a random distribution of this data, your network will kind of pick up on this random distribution of data and work only in that regime. So if you show it a slightly different example, for example, here is an image where the road is actually curving and it is a bit more of a residential neighborhood, then the neural network might make a prediction that is incorrect. It might say that, okay, well, I've seen lots of times on highways that lanes just go forward, so here's a possible prediction. And of course, this is very incorrect. But the neural network really can't be blamed. It does not know whether the tree on the left matters or not. It does not know if the car on the right matters or not towards the lane line. It does not know whether the buildings in the background matter or not. It really starts completely from scratch. And you and I know that the truth is that none of those things matter. What actually matters is that there are a few white lane markings over there near the vanishing point, and the fact that they curl a little bit should pull the prediction that way. Except there's no mechanism by which we can just tell the neural network, hey, those lane line markings actually matter. The only tool in the toolbox that we have is labeled data. So what we do is we need to take images like this, where the network fails, and we need to label them correctly. So in this case, we would label the lane as turning to the right. And then we need to feed lots of images like this to the neural net, and the neural net over time will basically pick up on the pattern that those things there don't matter, but those lane line markings do, and it will learn to predict the correct lane. So what's really critical is not just the scale of the data set. We don't just want millions of images. We actually need to do a really good job of covering the possible space of things that the car might encounter on the roads. So we need to teach the computer how to handle scenarios where it's night and wet, and you have all these different specular reflections. And as you might imagine, the brightness patterns in these images will look very different. We have to teach the computer how to deal with shadows, how to deal with forks in the road, how to deal with large objects that might be taking up most of the image, how to deal with tunnels or how to deal with construction sites. And in all these cases, there's, again, no explicit mechanism to tell the network what to do. We only have massive amounts of data.
We want to source all those images, and we want to annotate the correct lines, and the network will pick up on the patterns in them. Now, large and varied data sets basically make these networks work very well. This is not just a finding for us here at Tesla. This is a ubiquitous finding across the entire industry. So experiments and research from Google, from Facebook, from Baidu, from Alphabet's DeepMind all show similar plots where neural networks really love data and love scale and variety. As you add more data, these neural networks start to work better and get higher accuracies for free. So more data just makes them work better. Now a number of people have kind of pointed out that potentially we could use simulation to achieve the scale of these data sets. In a simulator, we're in charge of a lot of the conditions, and maybe we can achieve some variety there. Now at Tesla, and this was also kind of brought up in the questions just before this, this is actually a screenshot of our own simulator. We use simulation extensively. We use it to develop and evaluate the software. We've also even used it for training, quite successfully so. But really, when it comes to training data for neural networks, there really is no substitute for real data. Simulations have a lot of trouble with modeling appearance, physics and the behavior of all the agents around you. So here are some examples to really drive that point across. The real world really throws a lot of crazy stuff at you. So in this case, for example, we have very complicated environments with snow, with trees, with wind. We have various visual artifacts that are hard to simulate, potentially. We have complicated construction sites, bushes, and plastic bags that can kind of blow around with the wind. A complicated construction site might feature lots of people, kids, animals, all mixed in, and simulating how those things interact and flow through this construction zone might actually be completely intractable. It's not about the movement of any one pedestrian in there. It's about how they respond to each other, and how those cars respond to each other, and how they respond to you driving in that setting. And all of those are actually really tricky to simulate. It's almost like you have to solve the self driving problem to just simulate other cars in your simulation. So it's really complicated. So we have dogs, exotic animals. And in some cases, it's not even that you can't simulate it, it's that you can't even come up with it. So for example, I didn't know that you could have truck on truck on truck like that. But in the real world, you find this, and you find lots of other things that are very hard to really even come up with. So really, the variety that I'm seeing in the data coming from the fleet is just crazy with respect to what we have in the simulator. We have a really good simulator. Yes. I mean, I think with simulation, you're fundamentally grading your own homework. So if you know that you're going to simulate it, okay, you can definitely solve for it. But as Andre is saying, you don't know what you don't know. The world is very weird and has millions of corner cases. And if somebody can produce a self driving simulation that accurately matches reality, that in itself would be a monumental achievement of human capability. They can't. There's no way. Yes.
So I think the three points that I really tried to drive home until now are that to get neural networks to work well, you require these 3 essentials: a large data set, a varied data set, and a real data set. And if you have those capabilities, you can actually train your networks and make them work very well. And so why is Tesla in such a unique and interesting position to really get all three of these essentials right? The answer, of course, is the fleet. We can really source data from it and make our neural network systems work extremely well. So let me take you through a concrete example of, for example, making the object detector work better, to give you a sense of how we develop these neural networks, how we iterate on them and how we actually get them to work over time. So object detection is something we care a lot about. We'd like to put bounding boxes around, say, the cars and the objects here, because we need to track them and we need to understand how they might move around. So again, we might ask human annotators to give us some annotations for these, and humans might go in and might tell you that, okay, those patterns over there are cars and bicycles and so on. And you can train your neural network on this. But if you're not careful, the neural network will make mispredictions in some cases. So as an example, if we stumble on a car like this that has a bike on the back of it, then the neural network, actually when I joined, would create 2 detections. It would create a car detection and a bicycle detection. And that's actually kind of correct, because I guess both of those objects actually exist. But for the purposes of the controller and the planner downstream, you really don't want to deal with the fact that this bicycle can go with the car. The truth is that that bike is attached to that car. So in terms of just objects on the road, there's a single object, a single car. And so what you'd like to do now is you'd like to potentially annotate lots of those images as just a single car. So the process that we go through internally in the team is that we take this image, or a few images that show this pattern, and we have a mechanism, a machine learning mechanism, by which we can ask the fleet to source us examples that look like that. And the fleet might respond with images that contain those patterns. So as an example, these 6 images might come from the fleet. They all contain bikes on the backs of cars. And we would go in and we would annotate all of those as just a single car. And then the performance of that detector actually improves, and the network internally understands that, hey, when the bike is just attached to the car, that's actually just a single car, and it can learn that, given enough examples. And that's how we've sort of fixed that problem. I will mention that I talk quite a bit about sourcing data from the fleet. I just want to make a quick point that we've designed this from the beginning with privacy in mind, and all the data that we use for training is anonymized. Now the fleet doesn't just respond with bicycles on backs of cars. We look for lots of things all the time. So for example, we look for boats, and the fleet can respond with boats. We look for construction sites, and the fleet can send us lots of construction sites from across the world. We look for even slightly more rare cases. So for example, finding debris on the road is pretty important to us.
So these are examples of images that have streamed to us from the fleet that show tires, cones, plastic bags and things like that. If we can source these at scale, we can annotate them correctly, and the neural network will learn how to deal with them in the world. Here's another example. Animals, of course, are also a very rare occurrence and event, but we want the neural network to really understand what's going on here, that these are animals, and we want to deal with that correctly. So to summarize, the process by which we iterate on neural network predictions looks something like this. We start with a seed data set that was potentially sourced at random. We annotate that data set, and then we train neural networks on that data set and put that in the car. And then we have mechanisms by which we notice inaccuracies in the car, when the detector may be misbehaving. So for example, if we detect that the neural network might be uncertain, or if there's a driver intervention in any of those settings, we can create this trigger infrastructure that sends us data on those inaccuracies. And so for example, if we don't perform very well on lane line detection in tunnels, then we can notice that there's a problem in tunnels. That image would enter our unit tests, so we can verify that we're actually fixing the problem over time. But now what you do to fix this inaccuracy is you need to source many more examples that look like that. So we ask the fleet to please send us many more tunnels, and then we label all those tunnels correctly, we incorporate that into the training set, and we retrain the network, redeploy it, and iterate the cycle over and over again. And so we refer to this iterative process by which we improve these predictions as the data engine: iteratively deploying something, potentially in shadow mode, sourcing inaccuracies and incorporating them into the training set over and over again. And we do this basically for all the predictions of these neural networks. Now, so far I've talked about a lot of explicit labeling. So like I mentioned, we ask people to annotate data. This is an expensive process in time and, yes, it's just an expensive process. So these annotations can be very expensive to achieve. So what I want to talk about also is how to really utilize the power of the fleet. You don't want to go through this human annotation bottleneck. You want to just stream in data and annotate it automatically. And we have multiple mechanisms by which we can do this. So as one example of a project that we recently worked on: the detection of cut-ins. So you're driving down the highway, someone is on the left or on the right, and they cut in front of you into your lane. So here's a video showing the Autopilot detecting that this car is intruding into our lane. Now of course we'd like to detect a cut-in as fast as possible. So the way we approach this problem is we don't write explicit code for: is the left blinker on, is the right blinker on, track the cuboid over time and see if it's moving horizontally. We actually use a fleet learning approach. So the way this works is we ask the fleet to please send us data whenever they see a car transition from a right lane to the center lane, or from left to center. And then what we do is we rewind time backwards, and we can automatically annotate that, hey, that car will, in 1.3 seconds, cut in front of you. And then we can use that for training the neural net.
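A minimal sketch of that rewind-and-label idea: given a logged per-frame lane assignment for a tracked vehicle, find the moment it enters the ego lane and mark the frames within some horizon before it as positive will-cut-in examples. The field names, frame rate and the way the 1.3-second horizon is applied are illustrative assumptions, not Tesla's pipeline.

from typing import List, Tuple

FPS = 36            # assumed camera frame rate, for illustration only
HORIZON_S = 1.3     # label frames this far before the observed cut-in as positives

def auto_label_cut_in(lane_by_frame: List[str]) -> List[Tuple[int, int]]:
    """lane_by_frame: logged lane of a tracked vehicle ('left', 'ego', 'right') per frame.
    Returns (frame_index, label) pairs; label 1 means "will cut in within the horizon"."""
    horizon = int(HORIZON_S * FPS)
    # Find the first frame where the vehicle has transitioned into the ego lane.
    cut_in_frame = next((i for i, lane in enumerate(lane_by_frame)
                         if lane == "ego" and i > 0 and lane_by_frame[i - 1] != "ego"), None)
    labels = []
    for i, lane in enumerate(lane_by_frame):
        if lane == "ego":
            break                       # only label frames before the cut-in happens
        positive = cut_in_frame is not None and cut_in_frame - i <= horizon
        labels.append((i, int(positive)))
    return labels

# A car drives alongside for 60 frames, then merges into our lane.
log = ["right"] * 60 + ["ego"] * 30
print(auto_label_cut_in(log)[-5:])      # the last few pre-cut-in frames are labeled 1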
And so the neural net will automatically pick up on a lot of these patterns. So for example, the cars are typically yawed, they're moving this way, maybe the blinker is on; all that stuff happens internally inside the neural net, just from these examples. So we asked the fleet to automatically send us all this data. We can get half a million or so images, and all of these would be annotated for cut ins, and then we train the network. And then we took this cut in network and we deployed it to the fleet, but we don't turn it on yet; we run it in shadow mode. And in shadow mode, the network is always making predictions. Hey, I think this vehicle is going to cut in from the way it looks. This vehicle is going to cut in. And then we look for mispredictions. So as an example, this is a clip that we had from shadow mode of the cut in network. And it's kind of hard to see, but the network thought that the vehicle right ahead of us on the right was going to cut in. You can sort of see that it's slightly flirting with the lane line. It's sort of encroaching a little bit, and the network got excited and thought that that was going to be a cut in, that that vehicle would actually end up in our center lane. That turns out to be incorrect, and the vehicle did not actually do that. So what we do now is we just churn the data engine. While it runs in shadow mode, it's making predictions. It makes some false positive and some false negative detections. So it got overexcited sometimes, and sometimes it missed a cut in when it actually happened. All of those create a trigger that streams data to us, and that gets incorporated, now for free (no humans harmed in the process of labeling this data), into our training set. We retrain the network and redeploy it in shadow mode. And so we can spin this a few times, and we always look at the false positives and negatives coming from the fleet. And once we're happy with the false positive, false negative ratio, we actually flip the bit and actually let that network control the car. So you may have noticed, we actually shipped one of our first versions of a cut in detector approximately, I think, 3 months ago. So if you've noticed that the car is much better at detecting cut ins, that's fleet learning operating at scale. Yes, it actually works quite nicely. So that's fleet learning. No humans were harmed in the process, just a lot of neural network training based on data and a lot of shadow mode and looking at those results. Essentially, everyone's training the network all the time is what it amounts to. Whether the Autopilot is on or off, the network is being trained. Every mile that's driven by a car that's hardware 2 or above is training the network. Yes. Another interesting way that we use this in the scheme of fleet learning, and the other project that I will talk about, is path prediction. So while you are driving the car, what you're actually doing is you are annotating the data, because you are steering the wheel; you're telling us how to traverse different environments. So what we're looking at here is some person in the fleet who took a left through an intersection. And what we do here is we have the full video of all the cameras, and we know the path that this person took because of the GPS, the inertial measurement unit, the wheel angle, the wheel ticks. So we put all that together and we understand the path that this person took through this environment.
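As a simplified illustration of how the driven path can be recovered from those logged signals, here is a bare-bones dead-reckoning sketch; the real system would presumably fuse GPS, IMU, wheel angle and wheel ticks, so treat this as conceptual only:

```python
import numpy as np

def reconstruct_path(speeds_mps, yaw_rates_rps, dt=0.05):
    """Dead-reckon the driven path from logged speed and yaw rate samples.
    Returns the trajectory the human drove, usable directly as a training label."""
    x, y, heading = 0.0, 0.0, 0.0
    path = [(x, y)]
    for v, w in zip(speeds_mps, yaw_rates_rps):
        heading += w * dt                 # integrate yaw rate into heading
        x += v * np.cos(heading) * dt     # advance along the current heading
        y += v * np.sin(heading) * dt
        path.append((x, y))
    return np.array(path)
```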
And then of course we can use this for supervision for the network. So we just source a lot of this from the fleet. We train a neural network on those trajectories, and then the neural network predicts paths just from that data. So what this is typically referred to as is imitation learning. We're taking human trajectories from the real world and we're just trying to imitate how people drive in the real world. And we can also apply the same data engine crank to all of this and make this work over time. So here's an example of path prediction going through a kind of a complicated environment. So what you're seeing here is a video, and we are overlaying the predictions of the network. So this is the path that the network would follow, in green. And so, yes? I mean, the crazy thing is the network is predicting paths it can't even see, with incredibly high accuracy. It can't see around the corner, but it's saying the probability of that curve is extremely high. So that's the path, and it nails it. You will see that in the cars today. We're going to turn on augmented vision so you can see the lane lines and the path predictions of the cars overlaid on the video. Yes. There's actually more going on under the hood than you can even tell. It's kind of scary, to be honest. And of course, there's a lot of details I'm skipping over. You might not want to imitate all the drivers; you might want to just imitate the better drivers, and there are many technical ways that we actually slice and dice that data. But the interesting thing here is that this prediction is actually a 3D prediction that we project back to the image here. So the path forward here is a three-dimensional thing that we're just rendering in 2D, but we know about the slope of the ground from all this, and that's actually extremely valuable for driving. So path prediction is actually live in the fleet today, by the way. So if you're in a cloverleaf on a highway: until maybe 5 months ago or so, your car would not be able to handle the cloverleaf. Now it can. That's path prediction running live on your cars. We shipped this a while ago. And today you are going to get to experience this for traversing intersections; a large component of how we go through intersections in your drives today is all sourced from path prediction, from automatic labels. So what I talked about so far is really the 3 key components of how we iterate on the predictions of the network and how we make it work over time. You require a large, varied and real data set. We can really achieve that here at Tesla. And we do that through the scale of the fleet, the data engine, shipping things in shadow mode, iterating that cycle, and potentially even using fleet learning, where no human annotators are harmed in the process and data is used automatically, and we can really do that at scale. So in the next section of my talk, I'm going to especially talk about depth perception using vision only. So you might be familiar that there are at least 2 sensors in the car. One is vision, cameras just getting pixels, and the other is LiDAR, which a lot of companies also use. And LiDAR gives you these point measurements of distance around you. Now one thing I'd like to point out, first of all, is you all came here, you drove here, many of you, and you used your neural net and vision. You were not shooting lasers out of your eyes and you still ended up here. We might have. I mean, that's good for everyone.
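To make the imitation-learning recipe described above concrete: at its core it is supervised regression from camera frames to the path the human actually drove, with the fleet-recovered trajectories serving directly as labels. A hypothetical PyTorch-style sketch with a toy architecture (none of these names or shapes come from Tesla):

```python
import torch.nn as nn

class PathPredictor(nn.Module):
    """Toy stand-in for the vision network: image in, future 3D path out."""
    def __init__(self, n_waypoints=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, n_waypoints * 3)  # (x, y, z) per waypoint

    def forward(self, image):
        return self.head(self.backbone(image))

def train_step(model, optimizer, images, human_paths):
    # human_paths: trajectories the drivers actually took, reconstructed from
    # GPS / IMU / wheel odometry; these are the "free" labels from the fleet.
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(images), human_paths.flatten(1))
    loss.backward()
    optimizer.step()
    return loss.item()
```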
So clearly the human neural net derives distance, and all the measurements and the three-dimensional understanding of the world, just from vision. It actually uses multiple cues to do so. I'll just briefly go over some of them, just to give you a sense of roughly what's going on inside. As an example, we have 2 eyes pointed forward, so you get 2 independent measurements at every single time step of the world ahead of you. And your brain stitches this information together to arrive at some depth estimation, because you can triangulate any points across those two viewpoints. A lot of animals instead have eyes that are positioned on the sides, so they have very little overlap in their visual fields. They will typically use structure from motion, and the idea is that they bob their heads, and because of the movement, they actually get multiple observations of the world and can again triangulate depth. And even with one eye closed and completely motionless, you can still have some sense of depth perception. If you did this, you would still notice me coming 2 meters towards you or going 100 meters back. And that's because there are a lot of very strong monocular cues that your brain also takes into account. This is an example of a pretty common visual illusion, where these two blue bars are identical, but your brain, the way it stitches up the scene, just expects one of them to be larger than the other because of the vanishing lines of this image. So your brain does a lot of this automatically, and artificial neural nets can as well. So let me give you three examples of how you can arrive at depth perception from vision alone: a classical approach and 2 that rely on neural networks. So here's a video going down, I think this is San Francisco, of a Tesla. So these are our cameras, our sensing. I'm only showing the main camera, but all the cameras are turned on, the 8 cameras of the Autopilot. And if you just have this 6 second clip, what you can do is you can stitch up this environment in 3D using multi-view stereo techniques. So this, this is supposed to be a video. Is it not a video? Ah, there we go. So this is the 3D reconstruction of those 6 seconds of that car driving through that path. And you can see that this information is very well recoverable from just videos, and roughly that's through a process of triangulation and, as I mentioned, multi-view stereo. And we've applied similar techniques, slightly more sparse and approximate, also in the car. So it's remarkable: all that information is really there in the sensor, and it's just a matter of extracting it. The other project that I want to briefly talk about is this: as I mentioned, neural networks are very powerful visual recognition engines. And if you want them to predict depth, then you need to, for example, give them labels of depth, and then they can actually do that extremely well. So there's nothing limiting networks from predicting this monocular depth except for labeled data. So one example project that we've actually looked at internally is we use the forward facing radar, which is shown in blue, and that radar is looking out and measuring depths of objects, and we use that radar to annotate what vision is seeing, the bounding boxes that come out of the neural networks. Instead of human annotators telling you, okay, this car in this bounding box is roughly 25 meters away, you can annotate that data much better using sensors.
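A schematic version of that sensor-annotation step (hypothetical data layout and names, not the actual pipeline): pair each vision bounding box with the radar return that projects inside it, and use the measured radar range as the depth label for that box when training the vision network.

```python
def build_depth_labels(vision_boxes, radar_returns):
    """vision_boxes: list of dicts with 'box' = (x1, y1, x2, y2) in image coords.
    radar_returns: list of dicts with 'pixel' = (u, v) projected into the image
    and 'range_m' = measured distance. Returns (box, depth_label) training pairs."""
    labels = []
    for det in vision_boxes:
        x1, y1, x2, y2 = det["box"]
        hits = [r["range_m"] for r in radar_returns
                if x1 <= r["pixel"][0] <= x2 and y1 <= r["pixel"][1] <= y2]
        if hits:
            labels.append((det["box"], min(hits)))  # nearest return labels the object
    return labels
```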
So you use sensor annotation. So as an example, radar is quite good at measuring that distance. You can annotate with it and then you can train a neural network on it. And if you just have enough of that data, this neural network becomes very good at predicting those patterns. So here's an example of predictions of that. In circles, I'm showing radar objects, and the cuboids that are coming out here are purely from vision. So the cuboids here are just coming out of vision, and the depth of those cuboids is learned through sensor annotation from the radar. So if this is working very well, then you would see that the circles in the top down view agree with the cuboids, and they do. And that's because neural networks are very competent at predicting depths. They can learn the different sizes of vehicles internally, they know how big those vehicles are, and you can actually derive depth from that quite accurately. The last mechanism I will talk about very briefly is slightly more fancy and gets a bit more technical, but it is a mechanism on which there have been a few papers, basically over the last year or 2. It's called self-supervision. So what you do in a lot of these papers is you only feed raw videos into neural networks, with no labels whatsoever, and you can still get neural networks to learn depth. And it's a little bit technical, so I can't go into the full details, but the idea is that the neural network predicts depth at every single frame of that video, and there are no explicit labeled targets that the neural network is supposed to regress to. Instead, the objective for the network is to be consistent over time. So whatever depth you predict should be consistent over the duration of that video, and the only way to be consistent is to be right, so the neural network automatically predicts the correct depths for all the pixels. And we've reproduced some of these results internally. So this also works quite well. So in summary, people drive with vision only, no lasers are involved, and this seems to work quite well. The point that I'd like to make is that visual recognition, very powerful visual recognition, is absolutely necessary for autonomy. It's not a nice-to-have. We must have neural networks that actually really understand the environment around you. And LiDAR points are much less information-rich. Vision really understands the full details; there's much less information in just a few points. So as an example, on the left here, is that a plastic bag or is that a tire? LiDAR might just give you a few points on that, but vision can tell you which one of those 2 is true, and that impacts your control. Is that person who is slightly looking backwards trying to merge into your lane on the bike, or are they just going forward? In the construction sites, what do those signs say? How should I behave in this world? The entire infrastructure that we have built up for roads is all designed for human visual consumption. So all the signs, all the traffic lights, everything is designed for vision. And so that's where all that information is. So you need that ability. Is that person distracted and on their phone? Are they going to walk into your lane? The answers to all these questions are only found in vision and are necessary for level 4, level 5 autonomy. And that is the capability that we are developing at Tesla.
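The self-supervised depth idea mentioned above can be sketched, very roughly, as a temporal-consistency objective: depth predicted at one frame, carried forward by the known ego motion, should match the depth predicted at the next frame. The reprojection step (the warp argument below) is exactly the geometric machinery the talk skips over, so it is left as a named placeholder; this is a conceptual sketch only, not a working training recipe.

```python
import torch

def temporal_consistency_loss(depth_t, depth_t1, ego_motion, warp):
    """Self-supervision sketch: no depth labels, only consistency over time.
    depth_t, depth_t1: per-pixel depth predictions for consecutive frames.
    ego_motion: how the camera moved between frames (known from odometry).
    warp: placeholder for the camera-geometry reprojection of frame t's
    prediction into frame t+1's viewpoint."""
    carried_forward = warp(depth_t, ego_motion)
    return torch.nn.functional.l1_loss(carried_forward, depth_t1)
```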
And this is done through a combination of large scale neural network training, the data engine, getting that to work over time, and using the power of the fleet. And so in this sense, LiDAR is really a shortcut. It sidesteps the fundamental problem, the important problem of visual recognition, that is necessary for autonomy. So it gives a false sense of progress and is ultimately a crutch. It does give like really fast demos. So if I was to summarize my entire talk in one slide, it would be this. All of this matters because you want level 4, level 5 systems that can handle all the possible situations in 99.99...% of the cases, and chasing some of those last few 9s is going to be very tricky and very difficult and is going to require a very powerful visual system. So I'm showing you some images of what you might encounter in any one slice of those 9s. So in the beginning, you just have very simple cars going forward, then those cars start to look a little bit funny, then maybe you have bikes on cars, then maybe you have cars on cars, then maybe you start to get into really rare events like cars turned over or even cars airborne. We see a lot of things coming from the fleet. And we see them at some rate, a really good rate compared to all of our competitors. And so the rate of progress at which you can actually address these problems, iterate on the software and really feed the neural networks with the right data, that rate of progress is really just proportional to how often you encounter these situations in the wild. And we encounter them significantly more frequently than anyone else, which is why we're going to do extremely well. Thank you. It's all super impressive. Thank you so much. How much data, how many pictures, are you collecting on average from each car per period of time? And then it sounds like the new hardware with the dual active computers gives you some really interesting opportunities to run in full simulation one copy of the neural net while you're running the other one driving the car, and compare the results to do quality assurance. And then I was also wondering if there are other opportunities to use the computers for training when they're parked in the garage for the 90% of the time that I'm not driving my Tesla around? Thank you very much. Yes. So for the first question, how much data do we get from the fleet? So it's really important to point out, it's not just the scale of the data set, it really is the variety of that data set that matters. If you just have lots of images of something going forward on the highway, at some point, the neural net just gets it. You don't need that data. So we are really strategic in how we pick and choose. And the trigger infrastructure that we've built is quite sophisticated and allows us to get just the data that we need right now. And so it's not a massive amount of data. It's just very well picked data. For the second question, with respect to redundancy, absolutely, you can run basically a copy of the network on both, and that is actually how it's designed, to achieve a level 4, level 5 system that is redundant. So that's absolutely the case. And your last question, I'm sorry, I did not... Training. The car is an inference-optimized computer. We do have a major program at Tesla, which we don't have enough time to talk about today, called Dojo. That's a super powerful training computer.
The goal of Dojo will be to be able to take in vast amounts of data and train at a video level and do unsupervised massive training of vast amounts of video with the Dojo program, the Dojo computer. But that's for another day. I'm like a test pilot in a way, because I drive the 405, the 5, the 10, and all these really tricky, really long tail things happen every day. But the one challenge that I'm curious to see how you're going to solve is changing lanes. Because whenever I try to get into a lane with traffic, everybody cuts you off. And so human behavior is very irrational when you're driving in LA, and the car just wants to do it safely, and you almost have to do it unsafely. So I was wondering how you're going to solve that problem. Yes. So one thing I will point out is I spoke about the data engine as iterating on neural networks, but we do the exact same thing on the level of software and all the hyperparameters that go into the choices of when we actually lane change, how aggressive we are. We're always changing those, potentially running them in shadow mode, and seeing how well they work. And so to tune our heuristics around when it's okay to lane change, we would also potentially utilize the data engine and the shadow mode and so on. Ultimately, actually designing all the different heuristics for when it's okay to lane change is actually a little bit intractable, I think, in the general case. And so ideally you actually want to use fleet learning to guide those decisions. So when do humans lane change? In what scenarios? And when do they feel it's not safe to lane change? And let's just look at a lot of the data and train machine learning classifiers for distinguishing when it is safe to do so. And those machine learning classifiers can write much better code than humans because they have the massive amount of data backing that. So they can really tune all the right thresholds and agree with humans and do something safe. Well, I think we'll probably have a mode that goes beyond Mad Max mode to LA traffic mode. Yes. Well, Mad Max would have a hard time in LA traffic, I think. Yes. So really it's a trade off. You don't want to create unsafe situations, but you want to be assertive. But that little dance of how you make that work as a human is actually very complicated, and it's very hard to write in code. But it really does seem like the machine learning approach is kind of the right way to go about it, where we just look at a lot of the ways that people do this and try to imitate that. We're just being more conservative right now. And then as we're getting higher confidence, we'll allow users to select a more aggressive mode. That will be up to the user. But in the more aggressive modes, in trying to merge in traffic, there is a slight, I mean, no matter how many 9s, there's a slight chance of like a fender bender, not a serious accident. But you basically will have a choice of: do you want to have a non zero chance of a fender bender in freeway traffic, which is unfortunately the only way to navigate LA traffic, yes. Yes. Yes. Yes. I mean, yes. And it was like in LA Story. That movie is a great movie. Yes. Yes. We'll have more aggressive options over time that will be user specified. Mad Max Plus. Yes, Mad Max Plus, exactly. Hello? Hi, Jed Dorsheimer from Canaccord Genuity. Thank you and congratulations on everything that you've developed.
When we look at the AlphaZero project, it was a very defined and limited set of variables in terms of the parameters on that, which allowed for the learning curve to be so quick. The risk, or what you're trying to do here, is almost to develop consciousness in the cars through the neural network. And so I guess the challenge is, how do you not create a circular reference in terms of pulling from the centralized model of the fleet to that handoff where the car has enough information? Where is that line, I guess, in terms of the point in the learning process of handing it off, where there's enough information in the car and it doesn't have to pull from the fleet? Well, the car can operate if it's completely disconnected from the fleet. It just uploads data, and the training gets better and better as the fleet gets better and better. So simply, if you disconnected it from the fleet, from that point onwards it would stop getting better, but it would still function fine. But I guess in the hardware portion of your presentation, in the previous version, you talked about a lot of the power benefits of not storing a lot of the images. And so in this portion, you're talking about the learning that's going on by pulling from the fleet. I guess I'm having a hard time reconciling how, if there was a situation where I'm driving up the hill as you showed and I'm predicting where the road is going to go, that's coming from all of the other fleet data that led to that intelligence. I'm not seeing how I'm getting the benefit of the low power using the cameras with the neural network. That's where I'm losing the two. Maybe it's just me. I mean, the compute power in the full self driving computer is incredible. And maybe we should mention that if it had never seen that road before, it would still have made those predictions, provided it was a road in the United States. On the case of LiDAR and the march of 9s: I want to just get to your slam on LiDAR, because it's pretty clear you don't like LiDAR. LiDAR is lame. LiDAR is lame. Isn't there a case where at some point, a few 9s down the road, LiDAR may actually be helpful, and why not have it as some sort of redundancy or backup? That's my first question. And the second: you can still have your focus on computer vision, but just have it as a redundancy. And my second question is, if that is true, what happens to the rest of the industry that's building their autonomy solutions on LiDAR? They're all going to dump LiDAR, that's my prediction. Mark my words. I should point out that I don't actually super hate LiDAR as much as it may sound. At SpaceX, the SpaceX Dragon uses LiDAR to navigate to the space station and dock. Not only that, SpaceX developed its own LiDAR from scratch to do that, and I spearheaded that effort personally, because in that scenario, LiDAR makes sense. And in cars, it's freaking stupid. It's expensive and unnecessary. And as Andrej was saying, once you solve vision, it's worthless. So you have expensive hardware that's worthless on the car. We do have a forward radar, which is low cost and is helpful, especially for occlusion situations. So if there's fog or dust or snow, the radar can see through that. If you're going to use active photon generation, don't use a visible wavelength, because with passive optical you've already taken care of all the visible wavelength stuff. You want to use a wavelength that is occlusion penetrating, like radar. LiDAR is just active photon generation in the visible spectrum.
If you can do active photon generation, do it outside the visible spectrum, in the radar spectrum. So at 3.8 millimeters, versus 400 to 700 nanometers, you're going to get much better occlusion penetration, and that's why we have a forward radar. And then we also have 12 ultrasonics for near field information, in addition to the 8 cameras and the forward radar. You only need the radar in the forward direction because that's the only direction you're going real fast. So, I mean, we've gone over this multiple times: are we sure we have the right sensor suite? Should we add anything more? No. Hi. So right here. So you had mentioned that you ask the fleet for the information that you're looking for, for some of the vision. I have two questions about that. It sounds like the cars are doing some computation to determine what kind of information to send back to you. Is that based on stored information? Yes. So they absolutely do the computation in real time on the car. And we have a way to basically specify a condition that we're interested in, and then those cars do that computation there. If they did not, then we'd have to send all the data and do that offline in our back end. We don't want to do that. So all that computation happens on the car. So building on that question, it sounds like you guys are in a really good position, having currently 500,000 cars and in the future potentially millions of cars that are essentially computers, representing almost free data centers for you to do computation. Is that a huge future opportunity for Tesla? It's current. It's a current opportunity, and that's not really factored in for anything yet. That's incredible. Thank you. We have 425,000 cars with hardware 2 and beyond, which means they've got all 8 cameras, the radar and ultrasonics. And they've got at least the NVIDIA computer, which is enough to essentially figure out what information is important and what is not, compress the information that is important to the most salient elements and upload it to the network for training. So it's a massive compression of real world data. You have this sort of network of millions of computers, which is like a massive data center, essentially distributed data centers of computational capacity. Do you see it being used for other things besides self driving in the future? I suppose it could possibly be used for something besides self driving. We've been super focused on self driving. So as we get that really nailed, maybe there's going to be some other use for millions and then tens of millions of computers with hardware 3, the full self driving computer. Yes, maybe there would be. Maybe there's like some sort of AWS angle here. It's possible. Hello. Hi, Elon. Matt Joyce, Loop Ventures. I own a Model 3 in Minnesota, where it snows a lot. Since camera and radar cannot see road markings through snow, what is your technical strategy to solve this challenge? Does it involve high precision GPS at all? Yes. So actually, today, Autopilot will do a decent job in snow, even when lane line markings are covered. Even when lane line markings are faded, covered or when there's lots of rain on them, we still seem to drive relatively well. We didn't specifically go after snow yet with our data engine, but I actually think this is completely tractable, because in a lot of those images, even when things are snowy, when you ask a human annotator where the lane lines are, they actually could tell you.
They actually are relatively consistent in creating those lane lines. As long as the annotators are consistent on your data, the neural network will pick up on those patterns and will do just fine. So it's really just about: is the signal there, even for the human annotator? If the answer to that is yes, then the neural network can do it just fine. Yes. There are actually a number of important signals, as Andrej was saying. So lane lines are one of those things, but one of the most important signals is drivable space. So what is drivable space and what is not drivable space? And what actually really matters the most is drivable space, more than lane lines. And the prediction of drivable space is extremely good. And I think especially after this upcoming winter, it'll be incredible. It will be like, how could it possibly be that good? That's crazy. The other thing to point out is, maybe it's not even only about human annotators. As long as you as a human can drive through that environment, through fleet learning we actually know the path you took. And you obviously used vision to guide you through that path. You did not just use the lane line markings. You used the entire geometry of the entire scene. So you see how the road is roughly curving, you see how the cars are positioned around you, and the network will pick up on all those patterns automatically if you just have enough of the data of people traversing those environments. Yes. It's actually extremely important that things not be rigidly tied to GPS, because GPS error can vary quite a bit. And the actual situation for a road can vary quite a bit. So there could be construction, there could be a detour. And if the car is using GPS as primary, this is a real bad situation. It's asking for trouble. It's fine to use GPS for tips and tricks. So it's like: you can drive your home neighborhood better than a neighborhood in some other country or some other part of the country. You know your own neighborhood well, and you use the knowledge of your neighborhood to drive with more confidence, to maybe take counterintuitive shortcuts and that kind of thing. But the GPS overlay data should only be helpful, never primary. If it's ever primary, it's a problem. So, question back here in the back corner. I just wanted to follow up partially on that, because several of your competitors in the space over the past few years have talked about how they are augmenting all of their perception and path planning capabilities that are on the car platform with high definition maps of the areas that they are driving. Does that play a role in your system? Do you see it adding any value? Are there areas where you would like to get more data that is not collected from the fleet, but is more kind of mapping style data? I think the high precision GPS maps and lanes are a really bad idea. The system becomes extremely brittle. Any change to the system means it can't adapt. So if it locks onto GPS and high precision lane lines and does not allow vision to override them, that's a problem. In fact, vision should be the thing that does everything, and then lane lines are a guideline, but they're not the main thing. We briefly barked up the tree of high precision lane lines and then realized that was a huge mistake and reversed it out. It's not good. So this is very helpful for understanding annotation, where the objects are and how the... is it more art than science?
It does pretty well actually; like with cut ins and stuff, it's doing really well. Yes. So like I mentioned, we're using a lot of machine learning right now in terms of creating an explicit representation of what the road looks like. And then there's an explicit planner and a controller on top of that representation. And there's a lot of heuristics for how to traverse and negotiate and so on. Just like there's a long tail in what visual environments look like, there's a long tail in those negotiations and the little game of chicken that you play with other people and so on. And so I think we have a lot of confidence that eventually there must be some kind of a fleet learning component to how you actually do that, because writing all those rules by hand is going to quickly plateau, I think. Yes. We've dealt with this issue with cut ins, and we'll allow gradually more aggressive behavior on the part of the user. They can just dial the setting up and say: be more aggressive, be less aggressive, drive easy, chill mode or aggressive mode. Incredible progress. Phenomenal. Two questions. First, in terms of platooning, do you think the system is geared for that? Because somebody asked about when there is snow on the road, but if you have a platooning feature, you can just follow the car in front. Is your system capable of doing that? And I have two follow ups. So you're asking about platooning. So I think we could absolutely build those features. But again, if you just train neural networks, for example, on imitating humans, humans already follow the car ahead. And so that neural network actually incorporates those patterns internally. It just figures out that there's a correlation between the way the car ahead of you faces and the path that you are going to take. Well, that's all done internally in the net. You're just concerned with getting enough data, and the tricky data. And the neural network training process actually is quite magical; it does all the other stuff automatically. So you turn all the different problems into just one problem: just collect your data set and run your neural network training. Yes. And there are 3 steps to self driving. There's being feature complete, then there's being feature complete to the degree where we think that the person in the car does not need to pay attention, and then there's being at a reliability level where we've also convinced regulators that that is true. So there's kind of like 3 levels. We expect to be feature complete in self driving this year, and we expect to be confident enough from our standpoint to say that we think people do not need to touch the wheel or look out of the window, sometime probably around, I don't know, Q2 of next year. And then we start to expect to get regulatory approval, at least in some jurisdictions, for that towards the end of next year. That's roughly the time line that I expect things to go on. And probably for trucks, platooning will be approved by regulators before anything else. And then you can have, maybe if you're doing long haul freight, one driver in the front and then 4 semis trailing behind in a platooning manner. And I think that probably the regulators will be quicker to approve that than other things. Okay.
Regarding LiDAR: of course, you don't have to convince us. LiDAR, in my opinion, is a technology that is an answer looking for a question. This is very impressive, what we saw today, and probably the demo could show something more. I was wondering, what is the maximum dimension of a matrix that you may be having in your training or in your deep learning pipeline, ballpark figure? The maximum dimension of the matrix. So we're doing a lot of matrix multiply operations inside the neural network. You're asking about... there are many different ways to answer that question, but I'm not 100% sure the answers are useful. These neural networks would typically have, like I mentioned, about tens to hundreds of millions of neurons. Each of them on average has about 1,000 connections to the neurons below. So those are the typical scales that are kind of used across the industry, and that we would use as well. Yes, I've been actually very impressed by the rate of improvement on Autopilot the past year on my Model 3. There are two scenarios I wanted your feedback on, from last week. The first scenario was, I was on the right hand most lane of the freeway and there was a highway on ramp. And my Model 3 actually was able to detect the 2 cars on the side, slow down and let one car go in front of me and one car go behind me. And I was like, oh my gosh, this is insane. I didn't think my Model 3 could do that. So that was super impressive. But the same week, another scenario: I was on the right hand lane again, but my right hand lane was merging with the left lane. It wasn't an on ramp. It's just a normal highway freeway lane. And my Model 3 wasn't really able to detect that situation, and it wasn't able to slow down or speed up, and I kind of had to intervene. So can you, from your perspective, share the background on how a neural net would, how Tesla might, adjust for that? And how that could be improved over time? Yes. So like I mentioned, we have a very sophisticated trigger infrastructure. If you have intervened, it's actually potentially likely that we received that clip and that we can actually analyze it and see what happened and tune the system. So it probably enters some statistics over, okay, at what rate are we correctly merging in traffic? And we look at those numbers and we look at the clips and we see what's wrong, and we try to fix those clips and make progress against those benchmarks. So yes. Yes. So we would potentially go through a phase of categorization, and then we look at some of the biggest kind of categories that actually seem to be semantically related to the same problem, and then we will look at some of those and try to develop software against that. Okay. We do have one more presentation, which is the software. So there's essentially the Autopilot hardware with Pete, the sort of neural net vision with Andrej, and then there's the software engineering at scale that's going to be presented by Stuart. So thanks. And there will be opportunity afterwards to ask questions. So yes, thanks. I just wanted to very briefly say, if you have an early flight and you want to do a test ride with our latest development software, if you could please speak to my colleague, Anne, or drop her an email, and we can take you out for a test ride. And Stuart, over to you. All right.
So that's actually from a clip of a longer than 30 minute uninterrupted drive, with no interventions, on Navigate on Autopilot on the highway system, which is in production today on hundreds of thousands of cars. So I'm Stuart, and I'm here to talk about how we build some of these systems at scale. Just a really short introduction on kind of where I'm coming from and what I do. So I've been at a couple of companies before this. I've been writing software professionally for about 12 years. The thing that excites me most and that I'm really passionate about is taking the cutting edge of machine learning and actually connecting that with customers through robustness and scale. So at Facebook, I worked initially inside of our ads infrastructure to build some of the machine learning with some really, really smart people. And we actually tried to build that into a single platform that we could then scale to all the other aspects of the business, from how we rank the news feed, to how we deliver search results, to how we make every recommendation across the platform. And that became the Applied Machine Learning Group. That's something I was incredibly proud of. And a lot of that wasn't just the core algorithms and the really important improvements that happened there, though those matter. A lot of it is actually the engineering practices to build these systems at scale. And the same thing was true at Snap, where I went next, where we were really, really excited to actually help monetize this product. But the hardest part: we were using Google at the time, and they were effectively running us at a very small scale. And we wanted to build that same infrastructure: take understanding of these users, connect that with cutting edge machine learning, build that at massive scale and handle billions and then trillions of both predictions and auctions every day, in a way that is really robust. And so when the opportunity came to come to Tesla, that's something I was just incredibly excited to do, which is specifically take the amazing things that are happening both on the hardware side and the computer vision and AI side and actually package that together with all the planning, the controls, the testing, the kernel patching of the operating system, all of our continuous integration, our simulation, and actually build that into a product we get onto people's cars in production today. And so I want to talk about the timeline for how we did that with Navigate on Autopilot and how we're going to do that as we take Navigate on Autopilot off the highway and onto city streets. So we're at 70,000,000 miles already for Navigate on Autopilot, which is really, really cool. And I think one thing that is worth calling out on this is that we're continuing to accelerate and keep learning from this data. Like Andrej talked about with the data engine, as this accelerates, we actually do make more and more assertive lane changes. We are learning from these cases where people intervene, either because we failed to detect a merge correctly or because they wanted the car to be a little more peppy in different environments, and we just want to keep making that progress. So to start all of this, we begin with trying to understand the world around us. And we talked about the different sensors in the vehicle, but I want to dig in a little bit more here. We have 8 cameras, but then we additionally have 12 ultrasonic sensors, a radar, an inertial measurement unit and GPS.
And then one thing we forget about is we also have the pedal and steering actions. So not only can we look at what's happening around the vehicle, we can look at how humans chose to interact with that environment. And so I'll talk to this clip right now. This basically is showing what's happening today in the car, and we're continuing to push this forward. So we start with a single neural network. We see the detections around it. We then build all that together with multiple neural networks and multiple detections. We bring in the other sensors and we convert that into what Elon calls a vector space, an understanding of the world around us. And this is something where we continue to get better and better: we're moving more and more of this logic into the neural networks themselves. And the obvious end game here is that the neural network looks across all the cameras, brings all the information together and just ultimately outputs a source of truth for the world around us. And this is actually not an artist's rendering in many senses. This is actually the output of one of the debugging tools that we use on the team every day to understand what the world looks like around us. So another thing that I think is really, really exciting to me: when I do hear about sensors like LiDAR, a common question is around just having extra sensor modalities, like why not have some redundancy on the vehicle? And I want to dig in on one thing that is not always obvious with neural networks themselves. So we have a neural network running on our, say, wide fisheye camera. That neural network is not making one prediction about the world. It's making many separate predictions, some of which actually audit each other. So as a real example, we have the ability to detect a pedestrian. That's something we train very, very carefully on and put a lot of work into. We also have the ability to detect obstacles in the roadway, and a pedestrian is an obstacle. It's shown differently to the neural network. It says, oh, there's a thing I can't drive through. And these together combine to give us an increased sense of what we can and can't do in front of the vehicle and how to plan for that. We then do this across multiple cameras, because we have overlapping fields of view in many places around the vehicle. In front, we have a particularly large number of overlapping fields of view. Lastly, we can combine that with things like the radar and ultrasonics to build these extremely precise understandings of what's happening in front of the car. We can use that both to learn behaviors that are very accurate and to build very accurate predictions of how things will continue to unfold in front of us. So one example I think is really exciting is we can actually look at bicyclists and people and not just ask where are you now, but where are you going? And this is actually the heart of what we're doing for our next generation automatic emergency braking system, which will not just stop for people in your path, but will stop for people who are going to be in your path. And that's running in shadow mode right now. We'll go out to the fleet this quarter, and I'll talk about shadow mode in a second. So when you want to start a feature like this, Navigate on Autopilot on the highway system, you can start by learning from data. And you can just look at how humans do things today. What is their assertiveness profile? How do they change lanes? What causes them to either abort or change their maneuvers?
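On the point above about separate predictions auditing each other: a highly simplified illustration (made-up thresholds and names, not the actual logic) of fusing a pedestrian head and a generic-obstacle head from the same camera, where agreement between the two raises confidence that the space ahead is not drivable.

```python
def fuse_forward_hazard(pedestrian_score, obstacle_score, strong=0.8, weak=0.5):
    """Two independent predictions from the same camera audit each other:
    a pedestrian is also an obstacle, so the two heads should agree."""
    if pedestrian_score > strong or obstacle_score > strong:
        return "stop"        # either head is confident on its own
    if pedestrian_score > weak and obstacle_score > weak:
        return "stop"        # weak individually, but they corroborate each other
    return "proceed"
```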
And you can see things that are not immediately obvious, like, oh yes, simultaneous merging is rare, but very complicated and very important. And you can start to build opinions about different scenarios, such as a fast overtaking vehicle. So this is what we do when we initially have some algorithms we want to try out. We can put them on the fleet and we can see what they would have done in a real world scenario, such as this car that's overtaking us very quickly. This is taken from our actual simulation environment, showing different paths that we have considered taking and how those overlay on the real world behavior of a user. When you get those algorithms tuned up and you feel good about them (and this is really taking the output of the neural network, putting it in that vector space, and building and tuning these parameters on top of it, which ultimately I think we can do through more and more machine learning), you go into a controlled deployment, which for us is our early access program. And so you get this out to a couple of thousand people who are really excited to give you highly vigilant but useful feedback about how it behaves, not in an open loop, but in a closed loop way in the real world, and you watch their interventions. We talked about this: when somebody takes over, we can actually get that clip and try to understand what happened. And one thing we can really do is we can actually play this back again in an open loop way and ask, as we build our software, are we getting closer to or further from how humans behave in the real world? And one thing which is super cool with the full self driving computers is we're actually building our own racks and infrastructure. So basically, you can fit 4 of our full self driving computers fully racked up, build these into our own cluster and actually run this very sophisticated data infrastructure to actually understand, over time, as we tune and fix these algorithms, are we getting closer and closer to how humans behave? And ultimately, can we exceed their capabilities? And so once we had this and we felt really good about it, we wanted to do a wide rollout. But to start, we actually asked everybody to confirm the car's behavior via a stalk confirm. And so we started making lots and lots of predictions about how we should be navigating the highway. We asked people to tell us, is this right or is this wrong? And this is again a chance to churn that data engine. And we did spot some really tricky and interesting long tails. In this case, I think a really fun example is these very interesting cases of simultaneous merging, where you start going and then somebody moves either behind or in front of you, not noticing you. And what is the appropriate behavior here? And what are the tunings of the neural network we need to do to be super precise about the appropriate behaviors here? We worked, we tuned these in the background, we made them better. And over the course of time, we got 9,000,000 successfully accepted lane changes. And we use these, again, with our continuous integration infrastructure, to actually understand, do we think we're ready? And this is one thing where full self driving is also really exciting to me. Since we own the entire software stack, straight from the kernel patching all the way to the tuning on the image signal processor, we can start to collect even more data that is even more accurate. And this allows us to do better and better tuning and faster iteration cycles.
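The open-loop playback idea can be pictured as a simple metric computed over logged intervention clips: run the candidate software on the recorded sensor data and measure how far its planned path deviates from the path the human actually drove. A hypothetical sketch (function names assumed, not the real tooling):

```python
import numpy as np

def open_loop_score(clips, plan_fn):
    """clips: list of (sensor_frames, human_path) pairs from logged interventions.
    plan_fn: candidate planner; given the frames, returns an (N, 2) path in meters.
    Returns the mean deviation from the human path: lower means closer to human."""
    errors = []
    for frames, human_path in clips:
        planned = plan_fn(frames)
        errors.append(np.mean(np.linalg.norm(planned - human_path, axis=1)))
    return float(np.mean(errors))

# Comparing two software builds on the same clips: the one with the lower score
# tracks human behavior more closely and is a candidate for wider rollout.
```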
And so earlier this month, we thought we were ready to deploy an even more seamless version of Navigate on Autopilot on the highway system. And that seamless version does not require a stalk confirm. So you can sit there, relax, put your hand on the wheel and just oversee what the car is doing. And in this case, we're actually seeing over 100,000 automated lane changes every single day on the highway system. And this is something that's just super cool to us to deploy at scale. And the thing that I'm kind of most excited about from all this is the actual life cycle of this and how we actually are able to turn that data engine crank faster and faster and faster with time. And I think one thing that's really becoming very clear is, with the combination of the infrastructure we have built, the tooling we built on top of that and the combined power of the full self driving computer, I believe we can do this even faster as we move Navigate on Autopilot from the highway system onto city streets. And so yes, with that, I'll hand off to Elon. Yes. To the best of my knowledge, all those lane changes have occurred with 0 accidents. That is correct. Yes. I watch every single accident. So it's conservative, obviously, but to have hundreds of thousands, going to millions, of lane changes and 0 accidents is, I think, a great achievement by the Tesla team. Thank you. So let's see. A few other things that are worth mentioning. In order to have a self driving car or a robotaxi, you really need redundancy throughout the vehicle at the hardware level. So starting in, I think it was October 2016, all cars made by Tesla have redundant power steering. So we have redundant motors in the power steering. So with any one failure, if a motor fails, the car can still steer. All of the power and data lines have redundancy. So you can sever any given power line or any data line and the car will keep driving. The auxiliary power system: even if you lose complete power in the main pack, the car is capable of steering and braking using the auxiliary power system. So you can completely lose the main pack, and the car is safe. The whole system, from a hardware standpoint, has been designed to be a robotaxi since basically October 2016, when we rolled out Autopilot hardware version 2. But we do not expect to upgrade cars made before that. We think it would actually cost more to upgrade those cars than to make a new car, just to give you a sense of how hard it is to do this. Unless it's designed in, it's not worth it. So we've gone through the future of self driving, where it's hardware, it's vision, and then there's a lot of software, and the software problem here should not be minimized. It's a massive software problem: managing vast amounts of data, training against the data, how do you control the car based on the vision. It's a very difficult software problem. So going over the Tesla Master Plan: obviously, we've made a bunch of forward looking statements, as I call it. But let's go through some of our other forward looking statements that we've made. Way back when we created the company, we said we'd build the Tesla Roadster. They said it was impossible and that even if we did build it, nobody would buy it. The universal opinion was that building an electric car was extremely dumb and would fail. I agreed with them that the probability of failure was high, but I thought this was important.
So we built the Tesla Roadster, got into production in 2008 and shipped that car. It's now a collector's item. Then we said we'd build a more affordable car with the Model S. We did that. Again, we were told that's impossible. I was called a fraud and a liar, it was not going to happen; this is all untrue. Okay, famous last words. Now, we went into production with the Model S in 2012. It exceeded all expectations. There is still, in 2019, no car that can compete with the Model S of 2012. It's 7 years later. Still waiting. Then we said we'd build an affordable car, well, maybe not highly affordable, but more affordable, with the Model 3. We built the Model 3. We're in production. I said we'd get over 5,000 cars a week for Model 3. At this point, 5,000 cars a week is a walk in the park for us. It's not even hard. We said we'd do large scale solar, which we did through the SolarCity acquisition, and that we'd develop and deploy the solar roof, which is going really well. We're now on version 3 of the solar tile roof, and we expect to ramp up production of the solar tile roof significantly later this year. I have it on my house, and it's great. And I said we'd make the Powerwall and the Powerpack, and we made the Powerwall and Powerpack. In fact, the Powerpack is now deployed in massive grid scale utility systems around the world, including the largest operating battery projects in the world, those above 100 megawatts. And probably by next year, two years at the most, we expect to have a gigawatt scale battery project completed. So all these things: I said we would do them, and we did it. We said we'd do it, we did it. We're going to do the robotaxi thing too. The only criticism, and it's a fair one, is that sometimes I'm not on time. But I get it done, and the Tesla team gets it done. So what we're going to do this year is we're going to reach combined production of 10,000 a week between S, X and 3. We feel very confident about that. And we feel very confident about being feature complete with self driving. Next year, we'll expand the product line with Model Y and Semi, and we expect to have the first operating robotaxis next year, with no one in them. It's always difficult: when things are improving at an exponential rate, it's very difficult to wrap one's mind around it, because we're used to extrapolating on a linear basis. But when you've got massive amounts of hardware on the road, the cumulative data is increasing exponentially, and the software is getting better at an exponential rate. I feel very confident predicting autonomous robotaxis for Tesla next year. Not in all jurisdictions, because we won't have regulatory approval everywhere, but I'm confident we'll have at least regulatory approval somewhere literally next year. So any customer will be able to add or remove their car to the Tesla network. We expect this to operate sort of like a combination of maybe the Uber and Airbnb model. So if you own the car, you can add or subtract it to the Tesla network, and Tesla would take 25% or 30% of the revenue. And then in places where there aren't enough people sharing their cars, we would just have dedicated Tesla vehicles. When you use the car, we'll show you our ride sharing app, so you're able to summon the car from the parking lot, get in and go for a drive. It's really simple. You just take the same Tesla app that you currently have; we'll just update the app and add "summon a Tesla" or "commit your car to the fleet."
So you'll see that: summon your car, summon a Tesla, or add or subtract your car to the fleet. You'll be able to do that from your phone. So we see potential for smoothing out the demand distribution curve and having the car operate at a much higher utility than a normal car would operate. So typically, the use of a car is about 10 to 12 hours a week. Most people will drive 1.5 to 2 hours a day, typically 10 to 12 hours a week of total driving. But if you have a car that can operate autonomously, then most likely you'd have that car operate for a third of the week or longer. There are 168 hours in a week. So probably you've got something on the order of 55 to 60 hours a week of operation, maybe a bit longer. So the fundamental utility of a vehicle increases by a factor of 5. So you look at this from a macroeconomic standpoint and say: if we were operating some big simulation, and you could upgrade your simulation to increase the utility of cars by a factor of 5, that would be a massive increase in the economic efficiency of the simulation, just gigantic. So we'll do Model 3, S and X as robotaxis, but we made an important change to our leases. So if you lease a Model 3, you don't have the option of buying it at the end of the lease. We want them back. If you buy the car, you can keep it; if you lease it, you have to give it back. And as I said, in any location where there's not enough supply for sharing, Tesla will just make its own cars and add them to the network in that place. So the current cost of a Model 3 robotaxi is less than $38,000. We expect that number to improve over time. And the cars currently being built are all designed for 1,000,000 miles of operation. The drive units are designed and tested and validated for 1,000,000 miles of operation. The current battery pack is about maybe 300,000 to 500,000 miles. The new battery pack that's probably going to production next year is designed explicitly for 1,000,000 miles of operation. The entire vehicle, battery pack inclusive, is designed to operate for 1,000,000 miles with minimal maintenance. So we'll actually be adjusting tire design and really optimizing the car for a hyper efficient robotaxi. And at some point, you won't need steering wheels or pedals, and we'll just delete those. So as these things become less and less important, we'll just delete parts; they just won't be there. I'd say probably 2 years from now, we'll make a car that has no steering wheel or pedals. And if we need to accelerate that time, we can always just delete parts, easy. Probably, say, long term, 3 years, robotaxis with eliminated parts, maybe it ends up being $25,000 or less. And we want a super efficient car, so the electricity consumption is very low. We're currently at 4.5 miles per kilowatt hour, but we'll improve that to 5 and beyond. And there's just really no company that has the full stack integration. We've got the vehicle design and manufacturing, the computer hardware in house, the in house software development and AI, and we've got by far the biggest fleet. It's extremely difficult, not impossible perhaps, but extremely difficult, to catch up when Tesla has 100 times more miles per day than everyone else combined. This is the cost of running a gasoline car, or the average cost of running a car in the U.S., taken from AAA.
This is the cost of running a gasoline car, or the average cost of running a car in the U.S., taken from AAA. It's currently about $0.62 a mile; at 13,500 miles a year across 350,000,000 vehicles, it adds up to about $2 trillion a year. These numbers are literally just taken from the AAA website. The cost of ride sharing, according to Uber and Lyft, is $2 to $3 a mile. The cost to run a robotaxi, we think, is less than $0.18 a mile, and dropping. That's the current cost; future cost will be lower. You ask what the probable gross profit from a single robotaxi would be. We think probably something on the order of $30,000 per year. And we're literally designing the cars the same way that commercial semi trucks are designed. Commercial semi trucks are all designed for a 1,000,000 mile life, and we're designing the cars for a 1,000,000 mile life as well. So in nominal dollars, that would be a little over $300,000 over the course of 11 years, maybe higher. I think this assumption is actually relatively conservative; it assumes that 50% of the miles driven are not useful, so this is only at 50% utility. By the middle of next year, we'll have over 1,000,000 Tesla cars on the road with full self driving hardware, feature complete, at a reliability level where we would consider that no one needs to pay attention, meaning you could go to sleep. From our standpoint, if you fast forward a year, maybe a year and 3 months, but next year for sure, we will have over 1,000,000 robotaxis on the road. The fleet wakes up with an over the air update. That's all it takes. You ask what the net present value of a robotaxi is: probably on the order of a couple hundred thousand dollars. So buying a Model 3 is a good deal. Any questions? In our own fleet, I don't know, I guess long term we'd have probably on the order of 10,000,000 vehicles. I mean, if you look at our compound annual production rate since 2012, our first full year of Model S production, we went from 23,000 vehicles produced in 2013 to around 250,000 vehicles produced last year. So in the course of 5 years, we increased output by a factor of 10. I would expect that something similar occurs over the next 5 or 6 years. And the nice thing is that essentially customers are fronting us the money for the car. It's great.
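The per-car economics and the production growth quoted here reduce to simple arithmetic; the inputs below are just the figures stated in the talk, so treat the snippet as an illustrative sanity check rather than independent data:

```python
# Illustrative check of the per-car robotaxi economics and production growth quoted above.
gross_profit_per_year = 30_000        # dollars per robotaxi per year, as stated
vehicle_life_years = 11               # million-mile life at the quoted utilization
print(f"${gross_profit_per_year * vehicle_life_years:,}")   # $330,000, i.e. "a little over $300,000"

produced_2013 = 23_000                # first full year of Model S production
produced_2018 = 250_000               # "around 250,000 vehicles produced last year"
growth = produced_2018 / produced_2013
print(f"{growth:.1f}x over 5 years, ~{growth ** (1 / 5) - 1:.0%} per year")  # ~10.9x, ~61%/year
```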
One thing is the snake charger; I'm curious about that. And also, how did you determine the pricing? It looks like you're undercutting the average Lyft or Uber ride by about 50%, so I'm curious if you could talk a little bit about the pricing strategy. Sure. Obviously, solving for the snake charger is pretty straightforward from a vision standpoint. It's a known situation, and any kind of known situation with vision, like a charge port, is trivial. So yes, the cars would just automatically park and automatically plug in. There would be no human supervision required. And sorry, the other part was pricing, yes. We just threw some numbers on there. Plug in whatever pricing you think makes sense; we just kind of arbitrarily said, okay, maybe $1 a mile. And the thing is, there are on the order of 2,000,000,000 cars and trucks in the world, so robotaxis will be in extremely high demand for a very long time. And my observation thus far is that the auto industry is very slow to adapt. I mean, like I said, there's still not a car on the road that you can buy today that is as good as the Model S was in 2012. So that suggests a pretty slow rate of adaptation for the car industry, and probably $1 a mile is conservative for the next 10 years, because there's actually not enough appreciation for the difficulty of manufacturing. Manufacturing is insanely difficult. A lot of people I talk to think that if you just have the right design, you can instantly make as much of that thing as the world wants. This is not true. It's extremely hard to design a new manufacturing system for a new technology. Audi is having major problems manufacturing the e-tron, and they are extremely good at manufacturing. If they're having problems, what about others? So there are on the order of 2,000,000,000 cars and trucks in the world and on the order of 100,000,000 units per year of vehicle production capacity, but only of the old design. It will take a very long time to convert all of that to full self driving cars. And they really need to be electric, because the cost of operation of a gasoline or diesel car is much higher than that of an electric car. Any robotaxi that isn't electric will absolutely not be competitive. Elon, it's Colin Rusch from Oppenheimer over here. Obviously, we appreciate that the customers are fronting some of the cash for this fleet getting built up, but it sounds like a massive balance sheet commitment from the organization over time. Can you talk a little bit about what that looks like, and what your expectations are in terms of financing over the next, call it, 3 to 4 years for building up this fleet and monetizing it with your customer base? Well, we're aiming to be approximately cash flow neutral during the fleet buildup phase, and then I would expect to be extremely cash flow positive once the robotaxis are enabled. But I don't want to talk about financing rounds; it would be difficult to talk about financing rounds in this venue. I think we'll make the right moves. I think we'll make the moves you think we should make. I have a question. If I'm Uber, why wouldn't I just buy all your cars? Why would I let you put me out of business? There's a clause that we put into our cars, I think about 3 or 4 years ago, that they can only be used in the Tesla Network. So even for a private person, like if I go out and buy 10 Model 3s, I can run them on the network, and that's a business now, right? You're only allowed to use the Tesla Network. Right, but if I use the Tesla Network, in theory I could run a car sharing robotaxi business with my 10 Model 3s. Yes, but it's like the App Store. You can only add or remove them through the Tesla Network, and then Tesla gets a revenue share. But it's similar to Airbnb, though, in that I have this home, my car, and now I can just rent it out. So I can make extra income from owning multiple cars and just renting them out. Like, I have a Model 3, I aspire to get this Roadster here next when you build it, and I'm going to just rent my Model 3 out. Why would I give it back to you? I guess you could operate a rental car fleet, but I think this is very unwieldy. I don't know, it seems easy. Okay. Try it. In order to operate a robotaxi network, it sounds like you have to solve certain problems. For example, Autopilot today, if you oversteer, lets you take over. But if it's a ride sharing product where someone else is getting in as the passenger, moving the steering wheel can't let that person take over the car, because they might not even be in the driver's seat.
So is the hardware already there for it to be a robotaxi? And it might get into situations, such as a cop pulling it over, where some human might need to intervene, like using a central fleet of operators that remotely interact with humans. Is all of that type of infrastructure already built into each of the cars? Does that make sense? I think there will be sort of a phone home thing where, if the car gets stuck, it will just phone home to Tesla and ask for a solution. Things like being pulled over by a police officer are easy for us to program in. That's not a problem. But it will be possible for somebody to take over using the steering wheel, at least for some period of time. And then probably down the road, we'll just remove steering control. We'll take the steering wheel off and put a cap on where the steering wheel currently is; in the long run, give it a couple of years. Is there a hardware modification to the car in order to enable that? Literally just unbolt the steering wheel and put a cap on where the steering wheel currently is. But that is a future car that you would put out. What about today's cars, where the steering wheel is a mechanism to take over from Autopilot? If it's in robotaxi mode, would someone be able to take it over by simply moving the steering wheel? Yes. I think there will be a transition period where people will be able to take over, and should be able to take over, from the robotaxi. And then once regulators are comfortable with us not having a steering wheel, we'll just delete that. And for cars that are in the fleet, obviously with the permission of the owner if it's owned by somebody else, we would just take the steering wheel off and put a cap where the steering wheel currently attaches. So there might be two phases to robotaxi: one where the service is provided and you come in as the driver but could potentially take over, and then in the future there might not be a driver option. Is that how you see it as well? In the future, the probability of the steering wheel being taken away is 100%. Consumers will demand it. I want to be clear: this is not me prescribing a point of view about the world. This is me predicting what consumers will demand. Consumers will demand in the future that people are not allowed to drive these 2-tonne death machines. I totally agree with that. But in order for a Model 3 today to be part of the robotaxi network, as you call it, you would get into the driver's seat, essentially just to be on the safe side? Okay. That makes sense. Thank you. Exactly. It's sort of like there were amphibians, but then things pretty much just became land creatures. There will be a little bit of an amphibian phase. Hi. The strategy we've heard from other players in the robotaxi space is to select a certain municipal area to create geofenced self driving. That way you're using an HD map to have a more confined area with a bit more safety. First, we didn't hear much today about the importance of HD maps, so to what extent is an HD map necessary for you? And second, we also didn't hear much about deploying this into specific municipalities, where you're working with the municipality to get buy-in from them and you're also getting a more defined area. So what's the importance of HD maps, and to what extent are you looking at specific municipalities for rollout?
I think HD maps are a mistake. We actually had HD maps for a while, and we actually canned that. Because either you need HD maps, in which case if anything changes about the environment the car will break down, or you don't need HD maps, in which case why are you wasting your time doing HD maps? The two main crutches that should not be used, and will in retrospect be obviously false and foolish, are LiDAR and HD maps. Mark my words. If you need a geofenced area, you don't have real self driving. Elon, it sounds like maybe battery supply could be the only bottleneck left toward this vision. And also, could you clarify how you get the battery packs to last 1,000,000 miles? I think cells will be a constraint. That's a whole separate subject. And I think we're actually going to want to push our standard range plus battery more than our long range battery, because the energy content of the long range pack is 50% higher in kilowatt hours. So essentially, for the same cells, you can make roughly 50% more cars if they're all standard range plus instead of the long range pack; one is around 50 kilowatt hours, the other is around 75 kilowatt hours. So we're probably going to intentionally bias our sales toward the small battery pack in order to have higher volume. Basically, the obvious thing to do is maximize the number of autonomous units, maximize the output that will subsequently result in the biggest autonomous fleet down the road. We're doing a number of things in that regard, but it's just not for today's meeting. The million mile life is basically just about getting the cycle life of the pack up. The basic math: if you've got a 250 mile range pack, you're going to need 4,000 cycles. Very achievable. We already do that with some of our stationary storage solutions; we already deploy Powerpack with 4,000 cycle life capability.
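The pack arithmetic above is simple enough to write down; the kilowatt-hour figures are the approximate pack sizes quoted, and the fixed cell budget is an arbitrary illustrative number rather than anything Tesla stated:

```python
# Illustrative arithmetic for the million-mile pack and the cell allocation point.
miles_target = 1_000_000
range_per_cycle = 250                       # miles per full charge for the pack discussed
print(miles_target / range_per_cycle)       # 4,000 full cycles, as stated

cell_budget_kwh = 1_000_000                 # any fixed supply of cells (arbitrary figure)
standard_range_plus_kwh = 50                # ~50 kWh pack
long_range_kwh = 75                         # ~75 kWh pack, 50% more energy content
sr_cars = cell_budget_kwh / standard_range_plus_kwh
lr_cars = cell_budget_kwh / long_range_kwh
print(sr_cars / lr_cars)                    # 1.5, i.e. roughly 50% more cars as standard range plus
```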
May I ask? Sorry, yes. It's like ventriloquism, right? There are obviously significant, very constructive margin implications to the extent you can drive attach rates much higher on the full self driving option. I'd just be curious if you can level set where you are in terms of those attach rates, and how you expect to educate consumers about the robotaxi scenario so that attach rates materially improve over time. Sorry, it's a bit hard to hear your question. Yes, just curious where we are today in terms of full self driving attach rates and the financial implications. I think it's hugely beneficial if those attach rates materially increase, because of the higher gross margin dollars that flow through to the extent people do sign up for full FSD. Just curious how you see that ramping, what the attach rates are today, and how you expect to educate consumers and get them aware that they should be attaching FSD to their vehicle purchases. We're going to ramp that up massively after today. I mean, the really fundamental message that consumers should be taking away today is that it's financially insane to buy anything other than a Tesla. It will be like owning a horse in 3 years. I mean, fine if you want to own a horse, but you should go into it with that expectation. If you buy a car that does not have the hardware necessary for full self driving, it's like buying a horse. And the only car that has the hardware necessary for full self driving is a Tesla. People should really think about that with any other vehicle purchase. It's basically crazy to buy any car other than a Tesla. We need to convey that argument clearly, and hopefully we have today. Perfect. Thanks for bringing the future to the present; very informative time today. I was wondering why you did not talk much about the Tesla pickup, and let me give some context for that. I could be wrong, but the way I'm looking at the Tesla Network, as an early adopter and something of a test bed, I think the Tesla pickup may be the first phase of putting vehicles in the network, because the utility of a Tesla pickup would be pretty much for people who are either loading a lot of stuff, are in the construction profession, or are picking up odd items here and there, like stuff from Home Depot. I would say that maybe it needs to be a two stage process: pickup trucks exclusively for the Tesla Network as a starting point, then people like me can buy them later. What are your thoughts on that? Well, today was really just about autonomy. There's a lot we could talk about, such as cell production, the pickup truck and future vehicles, but today was just focused on autonomy. But I agree it's a major thing. I'm very excited for the Tesla pickup truck unveil later this year. It's going to be great. Colin Langan, UBS. Just so we understand the definitions: when you refer to feature complete self driving, it sounds like you're talking Level 5, no geofence. Is that what's expected by the end of the year, just so we're all on the same page? And then the regulatory process. Have you talked to regulators about this? This seems quite an aggressive timeline from what other people have put out there. What are the hurdles, and what is the time to get approval? And do you need things like, in California, I know they're tracking miles with an operator behind the wheel, do you need those things? What is that process going to look like? Yes. I mean, we talk to regulators around the world all the time. As we introduce additional features like Navigate on Autopilot, this requires regulatory approval on a per jurisdiction basis. But I think, fundamentally, regulators in my experience are convinced by data. If you have a massive amount of data that shows that autonomy is safe, they listen to it. They may take time to digest the information, and their process may take a bit of time, but they have always come to the right conclusion from what I've seen. I have a question over here. Yes. Okay. Just from some of the work we've done trying to better understand the ride hail market, it looks like it's very concentrated in major, dense urban centers. So is the way to think about this that the robotaxis would probably deploy more into those areas, and that full self driving for personally owned vehicles would be more in the suburban areas? I think probably yes. Tesla owned robotaxis would be in dense urban areas, along with customer vehicles. And then as you get to medium and low density areas, it would tend to be more that people own the car and occasionally lend it out. Yes.
There are a lot of edge cases in Manhattan and, say, downtown San Francisco, and there are various cities around the world that have challenging urban environments, but we do not expect this to be a significant issue. And when I say feature complete, I mean it will work in downtown San Francisco and downtown Manhattan this year. I have a neural net architecture question. Do you use different models for, say, path planning and perception, or different types of AI? And how do you split up that problem across the different pieces of autonomy? Well, essentially, right now, AI and neural nets are used really for object recognition. We're still basically working with still frames, so identifying objects in still frames and tying them together in a perception and path planning layer thereafter. But what's happening steadily is that the neural net is eating into the software base more and more, and over time we expect the neural net to do more and more. Now, from a computational cost standpoint, there are some things that are very simple for a heuristic and very difficult for a neural net. So it probably makes sense to maintain some level of heuristics in the system, because they're computationally a thousand times cheaper than a neural net. A neural net is like a cruise missile; if you're trying to swat a fly, just use a flyswatter, not a cruise missile. But over time, I would expect that it moves to just training against video, and then video in, car steering and pedals out. Or basically video in, lateral and longitudinal acceleration out, almost entirely. That's what we're going to use the Dojo system for. There's no system that can currently do that.
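To make the video in, lateral and longitudinal acceleration out idea concrete, here is a minimal sketch of what such an end-to-end mapping could look like. This is not Tesla's architecture; it is an illustrative PyTorch toy model, and every name and dimension in it is an assumption chosen for the example:

```python
# Toy sketch only: a tiny "video in, controls out" network of the kind described above,
# mapping a stack of camera frames to lateral and longitudinal acceleration commands.
import torch
import torch.nn as nn

class VideoToControl(nn.Module):
    def __init__(self, frames: int = 8):
        super().__init__()
        # Fold the time dimension into the input channels (frames x 3 RGB planes).
        self.encoder = nn.Sequential(
            nn.Conv2d(frames * 3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 2)  # two outputs: lateral and longitudinal acceleration

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip shape: (batch, frames * 3, height, width)
        return self.head(self.encoder(clip))

model = VideoToControl()
dummy_clip = torch.randn(1, 8 * 3, 96, 160)   # one synthetic 8-frame clip
print(model(dummy_clip).shape)                # torch.Size([1, 2])
```

In a real system the interesting part would be the training signal (human driving logs, fleet data) rather than the network shape; the sketch only shows the input-to-output contract described in the answer above.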
Maybe over here, just going back to the sensor suite discussion, Elon. The one area I'd like to talk about is the lack of side radars. Take a situation where you have an intersection with a stop sign and maybe 35 to 40 mile per hour cross traffic. Are you comfortable with the sensor suite, the side cameras, being able to handle that? Maybe talk a little bit about that. Yes, no problem. Essentially, the car is going to do kind of what a human would do. You can think of a human as basically a camera on a slow gimbal, and it's quite remarkable that people are able to drive the way they do, because you can't look in all directions at once. The car can literally look in all directions at once with multiple cameras. Humans are able to drive just by looking this way or that way; they're obviously stuck in the driver's seat and can't really get out of it. So it's kind of one camera on a gimbal, and yet a conscientious driver can drive with very high safety. The cameras in the cars have a better vantage point than the person. They're up in the B pillar or in front of the rearview mirror, so they've really got a great vantage point. So if you're turning onto a road that's got a lot of high speed traffic, you can just do what a person does: turn a little bit, then go further into the road, let the cameras see what's going on, and if things look good and the rear cameras don't show any oncoming traffic, off you go. And if it looks sketchy, you can just pull back a little bit, just like a person. The behavior starts to become remarkably lifelike. It's quite eerie, actually. The car just starts behaving like a person. Over here. Here you go. Ben Trimblequist, right here. Okay. Given all the value you're creating in your auto business by wrapping all of this technology around your cells, I'm curious why you would still be taking some of your cell capacity and putting it into Powerwall and Powerpack. Wouldn't it make sense to put every single unit you can make into this part of your business? We've already stolen almost all the cell lines that were meant to go to Powerwall and Powerpack and used them for Model 3. Last year, in order to make our Model 3 production and not be cell starved, we had to convert all of the 2170 lines at the Gigafactory to car cells. So our actual output in total gigawatt hours of stationary storage compared to vehicles is an order of magnitude different. And for stationary storage, we can basically use a whole bunch of miscellaneous cells out there. We can gather cells from multiple suppliers all around the world, and you don't have the homologation or safety issues you have with cars. So our stationary storage business has basically been feeding off scraps for quite a while. Really, think of production as having many, many constraints; it's a massive production system. The degree to which manufacturing and supply chain are underappreciated is amazing. There is a whole series of constraints, and what is the constraint in one week may not be the constraint in another week. It's insanely difficult to make a car, especially one which is rapidly evolving. So I'll just take a few more questions, and then I think we should break so you can try out the cars. Hi, Elon, Adam Jonas. Questions on safety. What data can you share with us today on how safe this technology is, which would obviously be important in a regulatory or insurance discussion? Well, we publish the accidents per mile every quarter, and what we see right now is that Autopilot is about twice as safe as a normal driver on average, and we expect that to increase quite a bit over time. Like I said, in the future, consumers will want to outlaw people driving their own cars because it is unsafe. I'm not saying they will succeed, nor am I saying I agree with this position. If you think of elevators: elevators used to be operated with a big lever, you'd go up and down between floors, it was like a big relay, and you had elevator operators. But periodically they would get tired or drunk or something, and then they'd move the lever at the wrong time and sever somebody in half. So now you do not have elevator operators, and it would be quite alarming if you went into an elevator that had a big lever that could just move between floors arbitrarily. There are just buttons. And in the long term, again, not a value judgment, I'm not saying I want the world to be this way, I'm saying consumers will most likely demand that people are not allowed to drive cars. And Elon, a follow-up. Can you share with us how much Tesla is spending on Autopilot or autonomous technology, by order of magnitude, on an annual basis? Thank you. It's basically our entire expense structure. A question on the economics of the Tesla Network, just so I understand. You get a Model 3 off lease, $25,000 goes on the balance sheet as an asset, and then it generates cash flow of roughly $30,000 a year? Is that the way to think about it? Yes, I mean, something like that, yes.
And then just in terms of the financing of it, there was a question earlier and you mentioned being cash flow neutral. Is it cash flow neutral for the robotaxi program or cash flow neutral for Tesla as a whole? Sorry, the cash flow neutral part? He asked a question earlier about financing the robotaxis. It looks to me like they're self financing, but you mentioned they would be basically cash flow neutral. Is that what you're referring to? No, I just think that between now and when the robotaxis are fully deployed throughout the world, the sensible thing for us is to maximize the rate of production and drive the company to cash flow neutral. Once the robotaxi fleet is active, I would expect to be extremely cash flow positive. And so you were talking about production? Yes. To produce them all? Yes. Okay, thanks. Maximize the number of autonomous units made. Thank you. Okay, just maybe one last question. If I add my Tesla to the robotaxi network, who is liable for an accident? Is it Tesla or is it me, if the vehicle has an accident and harms someone? It's probably Tesla. It's probably Tesla. I think the right thing to do is to make sure there are very, very few accidents.