Will pure vision-based autonomous driving definitely beat LiDAR?

Recently, Tesla’s autonomous electric vehicle, the Cybercab, officially entered production in North America, becoming the world’s first mass-produced Level 5 autonomous vehicle designed to run without human intervention. According to media reports, Tesla uses a self-certification compliance pathway and is therefore not subject to the annual exemption quota of 2,500 autonomous vehicles set by the U.S. National Highway Traffic Safety Administration (NHTSA), which leaves its production volume uncapped. As recently as last December, a Tesla owner completed a 10,000-mile cross-country drive using the FSD V14.2 system, crossing 24 U.S. states with zero human intervention throughout the journey.

Meanwhile, in December 2025, a large-scale power outage hit San Francisco and knocked out all traffic lights. Overnight, 300 Waymo autonomous taxis were paralyzed at intersections, unable to move. Within 6 hours there were more than 20 related complaints, and fire trucks were unable to get through. Later that night, Waymo announced the suspension of its entire fleet in the city. A single power outage crippled an entire city’s autonomous driving system.

One system was completely paralyzed by an infrastructure failure, while the other could cross an entire country without human hands touching the steering wheel. Behind this lies the biggest technological route dispute in the history of autonomous driving.

2026: Pure Vision Autonomous Driving Has Won

In March 2024, Tesla released FSD V12. This version did something no previous release had done: it discarded approximately 300,000 lines of handwritten C++ driving-rule code and replaced them with an end-to-end neural network. Previous autonomous driving systems followed rule-based logic: brake when encountering a red light, yield when detecting pedestrians, and slow down, observe, yield, and then merge when entering a roundabout. Engineers had to write a rule for each scenario, and those 300,000 lines covered thousands of them. But the real world contains endless possibilities, and engineers can never enumerate every rule in code.
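To make the limitation concrete, here is a minimal sketch of what a rule-based policy looks like. It is purely illustrative: the `Scene` fields, the action names, and the rule order are assumptions made up for this example, not Tesla's or anyone else's actual code.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """Hypothetical, heavily simplified perception summary for one moment."""
    red_light: bool = False
    pedestrian_ahead: bool = False
    entering_roundabout: bool = False
    note: str = ""  # free-text description the rules cannot interpret


def rule_based_action(scene: Scene) -> str:
    """Pick an action from hand-written rules, checked in priority order."""
    if scene.red_light:
        return "brake"
    if scene.pedestrian_ahead:
        return "yield"
    if scene.entering_roundabout:
        return "slow_observe_yield_merge"
    # No rule matched: the scene is treated as ordinary driving,
    # even when it is something the engineers never anticipated.
    return "proceed"


print(rule_based_action(Scene(red_light=True)))  # brake
print(rule_based_action(Scene(note="officer waving traffic through a dark intersection")))  # proceed
```

The second call shows the problem: a scenario nobody wrote a rule for silently falls through to the default.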

FSD V12’s approach is completely different. It does not tell the vehicle how to drive; instead, it gives the model millions of video clips of human driving to learn from on its own. The input is only camera images, and the output is steering angle, throttle, and brake commands, with no human-prescribed rules in between. As Elon Musk put it: “Not a single line of code says ‘this is a roundabout.’”

For this reason, the industry has called V12 the “ChatGPT moment” in the field of autonomous driving. ChatGPT does not rely on human-written grammatical rules to form sentences, but rather learns the essence of language from trillions of words. Similarly, V12 does not rely on human driving rules to control vehicles, but comprehends the essence of driving from billions of miles of driving data.

Looking at the version history, V12.3 had a critical disengagement (one requiring human takeover) approximately every 180 to 228 miles; V13.2 improved this to 371 to 493 miles; V14.2 exceeded 1,400 miles. Tesla describes the newly released V14.3 as the final piece of the puzzle for unsupervised FSD, with upgraded reinforcement learning training and a 20% reduction in inference latency. From V12.3 to V14.2, the distance between critical disengagements has grown roughly six- to eightfold in less than two years.
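The improvement factor can be checked directly from the ranges quoted above. The sketch below uses only those figures; since V14.2 is quoted only as "exceeded 1,400 miles", 1,400 is treated as a floor, so the result is a conservative estimate.

```python
# Miles between critical disengagements, as (low, high) ranges from the text.
# V14.2 is only quoted as "exceeded 1,400", so 1,400 is used as a floor.
MILES_PER_DISENGAGEMENT = {
    "V12.3": (180, 228),
    "V13.2": (371, 493),
    "V14.2": (1400, 1400),
}


def improvement_range(old: str, new: str) -> tuple:
    """Return the (smallest, largest) improvement factor between two versions."""
    old_lo, old_hi = MILES_PER_DISENGAGEMENT[old]
    new_lo, new_hi = MILES_PER_DISENGAGEMENT[new]
    return new_lo / old_hi, new_hi / old_lo


lo, hi = improvement_range("V12.3", "V14.2")
print(f"V12.3 -> V14.2: {lo:.1f}x to {hi:.1f}x")  # V12.3 -> V14.2: 6.1x to 7.8x
```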

During the 2026 Q1 earnings call, Elon Musk made a weighty statement: V14 is significantly safer than human driving. This means that autonomous driving has crossed a threshold for the first time—it is no longer approaching humans, but surpassing them.

At the same time, three industry signals support this judgment.

First, the weak are eliminated. Apple abandoned Project Titan; Argo AI collapsed completely. It is not that the technology is unfeasible, but that funding is unsustainable. R&D costs are squeezing out players without a data flywheel.

Second, the survivors are accelerating. Waymo completes 500,000 rides per week, and Baidu has provided more than 20 million rides cumulatively. But Tesla already has 1.28 million paid FSD users, with a cumulative driving distance exceeding 10 billion miles, nearly 50 times that of Waymo. The data advantage does not grow linearly; it compounds: more data trains a better model, a better model attracts more users, and more users generate more data.

Third, the wind is shifting. XPeng Motors removed LiDAR from multiple models and launched a pure vision solution; NIO shifted to an end-to-end technical route; China’s largest ADAS supplier developed an end-to-end system that does not rely on LiDAR. Even Waymo itself reduced the number of sensors from 40 to 23 in its sixth-generation system.

The direction is clear: more AI, fewer sensors. That is why 2026 counts as year one of autonomous driving: not because LiDAR has become cheaper, nor because a certain city has put hundreds more robotaxis on the road, but because a fundamental technological paradigm shift has occurred. Software has proven for the first time that it can replace hardware, and AI driving has surpassed human driving in a statistical sense for the first time.

The Structural Pitfalls of LiDAR

Waymo is currently the only company operating large-scale fully autonomous taxis. With 500,000 rides per week, operating 3,000 vehicles in 10 U.S. cities, it has raised a total of 16 billion US dollars and has a valuation of 126 billion US dollars. This is the most commercially successful autonomous driving enterprise to date. However, Waymo relies on three core elements: high-precision maps, LiDAR, and remote operators. These three work well together under ideal conditions, but the real world is not always ideal.

During the San Francisco power outage, all traffic lights failed, and 300 Waymos were collectively paralyzed. Waymo’s logic is to treat unmarked intersections as four-way stops, but when the entire city loses power, every intersection becomes a four-way stop, and the remote control system is instantly overwhelmed with requests. A single infrastructure failure took the entire fleet offline.

Other incidents follow the same pattern. In December 2025, a Waymo carrying passengers in Los Angeles drove straight into an armed police arrest scene. Police officers were holding guns, and the suspect was lying on the ground. After slowing down, the Waymo stopped next to the suspect, because the system could not understand what the scene meant. Austin recorded 19 incidents where Waymos illegally passed school buses with extended stop arms, including one incident where students were still on the road. Even more absurdly, two Waymos collided with each other in a dead end in San Francisco, and a third was stuck behind them and unable to move. In the end, Waymo employees had to manually drive the vehicles away.

These incidents all occurred in scenarios that deviated from Waymo’s pre-set maps and pre-programmed rules. Waymo performs excellently in expected environments, but when reality deviates from expectations, the system is powerless.

High-precision maps are essentially a static world model. Every street needs to be pre-scanned by a dedicated mapping vehicle and marked to centimeter-level accuracy, including lane width, bike lanes, speed limits, crosswalks, traffic lights, etc. However, once road construction, temporary road closures, or emergencies change the reality, the vehicle either detours, stops, or calls for remote control. Philip Koopman, a professor at Carnegie Mellon University, once said: “When Waymo encounters changes not on the map, it often shows false confidence and then makes up a solution on its own.”

Remote control also has bottlenecks. Waymo’s approximately 3,000 vehicles are supported by 70 remote operators, roughly one for every 43 vehicles. Control centers are located in Arizona, Michigan, and the Philippines. U.S. Senator Ed Markey stated at a hearing that he was shocked that overseas personnel were controlling vehicles on U.S. roads.
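A back-of-the-envelope calculation shows why the staffing ratio matters during an event like the San Francisco outage. The sketch below uses only the approximate figures quoted in this article and makes two loud simplifying assumptions: every stalled vehicle raises one request at the same time, and all 70 operators are available to handle them.

```python
FLEET_SIZE = 3000        # approximate fleet size quoted in the text
REMOTE_OPERATORS = 70    # remote operators quoted in the text
STALLED_VEHICLES = 300   # vehicles stalled in the San Francisco outage


def per_operator(count: int, operators: int) -> float:
    """Average number of vehicles (or requests) each operator must cover."""
    if operators <= 0:
        raise ValueError("operators must be positive")
    return count / operators


print(f"Normal load: {per_operator(FLEET_SIZE, REMOTE_OPERATORS):.1f} vehicles per operator")        # 42.9
print(f"Outage load: {per_operator(STALLED_VEHICLES, REMOTE_OPERATORS):.1f} simultaneous requests")  # 4.3
```

Even under these generous assumptions, each operator would face several simultaneous requests while still covering the rest of the fleet.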

In addition, Waymo has not yet verified its ability to operate in snowy conditions. Its product director publicly admitted at a Boston City Council meeting: “We have not yet verified fully autonomous driving in snowy weather and snow-covered roads.” Waymo’s expansion targets include Miami, Dallas, Houston, Orlando, and San Antonio—all cities in warm climates.

Waymo is indeed the strongest in the areas it covers, but its coverage is still very limited. The United States has about 4.2 million miles of public roads, and Waymo has only completed partial mapping in about 25 cities so far. At this rate, it will take decades to cover the entire United States.

Tesla: No Maps, No LiDAR

Andrej Karpathy, former Director of AI at Tesla, once said: “Roads are designed for visual creatures.” Traffic signs, traffic lights, lane lines, road textures—all these pieces of information are prepared for human eyes. Humans can drive with just two cameras (eyes) plus a neural network (brain). Therefore, in theory, a synthetic neural network plus cameras should also be sufficient. In Karpathy’s view, LiDAR is like a crutch: it can quickly produce demonstrations but bypasses the most fundamental visual recognition problem. And visual recognition is precisely the core challenge that autonomous driving must overcome.

There is a rarely discussed issue known as the sensor fusion paradox. It is tempting to think that one more sensor means one more layer of redundancy. In practice, each added sensor also brings more complexity, an additional set of calibration requirements, and a new set of failure modes, and when the outputs of LiDAR and cameras disagree, the system has to decide which one to trust. A 2025 paper in Nature Scientific Reports specifically studied this problem: when the output semantics of different sensors are inconsistent, unconditional fusion reduces the overall detection and classification accuracy. In short, adding LiDAR can actually make performance worse in certain scenarios.
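A toy example illustrates the effect. This is a deliberately simplified sketch, not the method from the paper mentioned above: the class scores, the averaging rule, and the detection threshold are all made-up assumptions chosen to show how unconditional fusion can drag down a confident, correct reading.

```python
DETECTION_THRESHOLD = 0.6  # hypothetical minimum score to report a detection


def fuse_unconditionally(cam: dict, lidar: dict) -> dict:
    """Average per-class scores from both sensors, regardless of agreement."""
    return {label: (cam[label] + lidar[label]) / 2 for label in cam}


def detected(scores: dict, label: str) -> bool:
    """A class counts as detected only if its score clears the threshold."""
    return scores[label] >= DETECTION_THRESHOLD


# Hypothetical scores for one object: the camera sees a pedestrian clearly,
# while the LiDAR return is sparse and leans toward "background".
camera = {"pedestrian": 0.90, "background": 0.10}
lidar = {"pedestrian": 0.20, "background": 0.80}

fused = fuse_unconditionally(camera, lidar)

print(detected(camera, "pedestrian"))  # True
print(detected(fused, "pedestrian"))   # False
```

In this made-up case the camera alone reports the pedestrian, but the fused score falls below the threshold because the disagreeing sensor is given equal weight.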

The gap in data density is also significant. An 8-megapixel camera generates 8 million data points per frame; while the most advanced LiDAR can only provide a few hundred thousand points. At distances above 200 meters and under good lighting conditions, the amount of information obtained by cameras is far greater than that of LiDAR.

LiDAR also has shortcomings in harsh weather. Academic research shows that in fog and snow, LiDAR’s detection distance decreases by 25%; when rainfall exceeds 50 millimeters per hour, target detection basically fails; dense fog can compress its effective range to 25 meters. Measured data from China shows that in heavy rain, the effective distance of LiDAR drops to less than 30 meters, and near-field noise increases fivefold. In tests during the rainy season in Guangdong, the recognition accuracy of pure vision models under 50-meter visibility was 12% higher than that of models fused with LiDAR. Cameras certainly degrade in harsh weather too, but their limitations can be compensated for by software improvements. LiDAR, however, faces the physical limits of light scattering, which software cannot solve.

The most convincing signal comes from Waymo itself. Waymo’s sixth-generation system reduced the total number of sensors from 40 to 23, cameras from 29 to 13, and LiDAR units from 5 to 4. The direction is clear: better AI can replace more sensors. Waymo is slowly moving closer to Tesla’s philosophy: less hardware, more intelligence.

Numbers Don’t Lie: The Data Flywheel Crushes Competitors

In Tesla’s 2026 Q1 financial report, several numbers are particularly crucial:

1.28 million FSD paid users, a year-on-year increase of 51%. Approximately 180,000 new paid users were added in the first quarter, and the subscription churn rate continued to decline. This indicates that users are increasingly relying on the system and are less willing to cancel their subscriptions.

Tesla has begun to shift FSD from a one-time purchase model to a pure subscription model, at $99 per month. Management’s statement is telling: “FSD is the product; the vehicle is just the delivery method for FSD.” Tesla no longer positions itself as a car-selling company, a positioning shift that may be more important than any technological breakthrough.

Cumulative FSD driving distance exceeded 10 billion miles. Waymo’s total is approximately 200 million miles. Tesla’s data volume is nearly 50 times that of Waymo, and the gap is widening every day.
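The size of that gap is easy to verify from the figures in this article. The sketch below compares raw mileage only; the daily collection rate of 20 million miles is the figure quoted in the conclusion, and the function names are just illustrative.

```python
TESLA_FSD_MILES = 10_000_000_000  # "exceeded 10 billion miles"
WAYMO_MILES = 200_000_000         # "approximately 200 million miles"
TESLA_MILES_PER_DAY = 20_000_000  # daily collection rate quoted in the conclusion


def mileage_ratio(a: int, b: int) -> float:
    """How many times larger mileage a is than mileage b."""
    return a / b


def days_to_accumulate(target_miles: int, miles_per_day: int) -> float:
    """Days needed to log target_miles at a constant daily rate."""
    return target_miles / miles_per_day


print(mileage_ratio(TESLA_FSD_MILES, WAYMO_MILES))           # 50.0
print(days_to_accumulate(WAYMO_MILES, TESLA_MILES_PER_DAY))  # 10.0
```

At the quoted rate, Tesla's fleet would log the equivalent of Waymo's entire cumulative mileage in about 10 days.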

V14.3 is defined by Tesla itself as the final piece of the puzzle to achieve unsupervised FSD. This version has completed a number of key improvements. Starting from V13, a spatiotemporal voxel transformer was introduced, using a 10-second cyclic video buffer to build a persistent 3D environment model. V14 added emergency vehicle recognition, human gesture understanding, real-time detour processing, and full autonomous capability from parking space to parking space.

Jim Fan, Director of Robotics at NVIDIA, commented: V14 has passed the physical Turing test, and its driving behavior is almost indistinguishable from that of humans. David Moss, a car owner, drove a Tesla across the United States with 10,000 miles of zero intervention. Some vivid details are worth noting: V14 can take the initiative to make way for emergency vehicles, while the surrounding human drivers did not; it can detect motorcycles in blind spots during heavy rain, which human drivers cannot see; it can distinguish between chickens crossing the road and geese just wandering by the side of the road—yielding to the former and detouring around the latter. This judgment ability cannot be written with rule-based code.

In Q1 2026, paid robotaxi mileage nearly tripled quarter-on-quarter, jumping from 610,000 miles in Q4 to 1.7 million miles. In April, unsupervised robotaxis were officially launched in Dallas and Houston, open to the public. Together with the previously launched Austin and San Francisco Bay Area services, Tesla currently operates robotaxis in four cities. Elon Musk confirmed during the earnings call that, to date, there have been no injury accidents in the unsupervised program. Next, it will expand to Phoenix, Miami, Orlando, Tampa, Las Vegas, and other cities. The goal is to operate unsupervised FSD or robotaxis in more than a dozen states by the end of 2026.

Global expansion is also progressing in an orderly manner. The Netherlands has approved the deployment of the supervised version of FSD. Application materials are planned to be submitted to Brussels in May, seeking EU-level approval in the second half of the second quarter. China has also granted approval, but broader promotion still depends on the regulatory process, and management hopes to obtain a wider opening in the third quarter.

Tesla currently has more than 6 million FSD hardware-equipped vehicles on the road, with 1.28 million paid users. Waymo has approximately 3,000 vehicles. The fleet size ratio is 2000:1. In terms of hardware costs, a complete Waymo taxi costs about $75,000 to $80,000. Tesla’s FSD computer costs $2,300, plus 8 cameras costing a few hundred dollars in total. The hardware cost gap between the two is therefore 15 to 30 times.

In terms of geographic coverage, Tesla FSD can work on almost all public roads in the United States, Canada, Mexico, and Puerto Rico without pre-built maps, and is expanding to Europe and China. Waymo covers about 10 cities, and each city takes months of manual mapping, testing, and regulatory approval.

In terms of OTA updates, a Tesla FSD update can reach millions of vehicles within a few days, while a Waymo update reaches about 3,000 vehicles, a gap in reach of more than 500 times.

Tesla also has a trump card called “Shadow Mode”: even when FSD is not engaged in a given vehicle, the neural network still runs silently in the background, comparing its judgments with those of the human driver and automatically flagging and uploading any inconsistencies. The entire fleet is looking for weaknesses in the AI 24/7. No competitor can replicate this capability.
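The idea behind Shadow Mode can be sketched in a few lines. Tesla has not published how it is actually implemented, so everything here is a hypothetical illustration: the `DrivingSample` record, the steering-only comparison, and the 5-degree threshold are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrivingSample:
    """One hypothetical log entry: what the human did vs. what the model proposed."""
    timestamp_s: float
    human_steering_deg: float
    model_steering_deg: float


def flag_disagreements(samples: List[DrivingSample],
                       threshold_deg: float = 5.0) -> List[DrivingSample]:
    """Return samples where the model's steering differs from the human's by more than the threshold."""
    return [
        s for s in samples
        if abs(s.human_steering_deg - s.model_steering_deg) > threshold_deg
    ]


log = [
    DrivingSample(0.0, 1.0, 1.5),    # close agreement
    DrivingSample(0.1, 2.0, 9.0),    # model would have steered much harder
    DrivingSample(0.2, -3.0, -2.0),  # close agreement
]

for sample in flag_disagreements(log):
    print(f"upload candidate at t={sample.timestamp_s:.1f}s")  # upload candidate at t=0.1s
```

Only the flagged moments would need to be uploaded, which is what makes this kind of comparison cheap to run across a large fleet.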

Tesla’s own safety data from November 2025 shows that FSD users have a major collision approximately every 2.9 million miles, compared with the U.S. national average of one every 505,000 miles. By this measure, FSD is about 5.7 times safer than human driving.
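The 5.7 figure follows directly from the two mileage numbers. A minimal check, using only the values quoted above (the function names are illustrative):

```python
FSD_MILES_PER_MAJOR_COLLISION = 2_900_000   # Tesla's reported figure
US_AVG_MILES_PER_MAJOR_COLLISION = 505_000  # U.S. national average quoted in the text


def relative_safety(miles_a: float, miles_b: float) -> float:
    """Ratio of miles driven per collision: how many times farther a goes than b."""
    return miles_a / miles_b


def collisions_per_million_miles(miles_per_collision: float) -> float:
    """Convert miles-per-collision into a collision rate per million miles."""
    return 1_000_000 / miles_per_collision


ratio = relative_safety(FSD_MILES_PER_MAJOR_COLLISION, US_AVG_MILES_PER_MAJOR_COLLISION)
print(f"{ratio:.1f}x")  # 5.7x
```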

On February 18, 2026, the first mass-produced Cybercab rolled off the assembly line at Tesla’s Texas factory. It has a two-seater design with no steering wheel and no pedals, is equipped with butterfly doors and wireless charging, has a range of 200 miles, and has a target price of $25,000 to $30,000. Tesla plans to adopt a new Unboxed manufacturing process, which theoretically allows a single production line to reach an annual capacity of 2 to 3 million vehicles. In the future, the Cybercab will gradually replace the Model Y and become the main model in the robotaxi fleet.

The biggest bottleneck is federal regulation: FMVSS regulations require vehicles to be equipped with a steering wheel and pedals, but the Cybercab has neither. A bill submitted to Congress in February would raise the exemption limit from 2,500 to 90,000 vehicles, but two previous similar bills failed to pass.

The per-vehicle economics look very competitive: a Cybercab costs less than $30,000, operates 60,000 miles a year, and is priced at $0.30 to $0.50 per mile. After deducting operating costs, the annual net profit is about $8,000 to $20,000, which puts the investment payback period at about 20 months toward the upper end of that range.
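The per-vehicle numbers can be worked through explicitly. The sketch below uses only the figures quoted above, takes $30,000 as the vehicle cost (the text says "less than"), and derives the implied operating cost as revenue minus net profit; that derived figure is an inference from the quoted numbers, not something Tesla has stated.

```python
VEHICLE_COST = 30_000               # upper bound quoted in the text
MILES_PER_YEAR = 60_000
PRICE_CENTS_PER_MILE = (30, 50)     # $0.30 to $0.50 per mile
NET_PROFIT_PER_YEAR = (8_000, 20_000)


def annual_revenue(miles: int, cents_per_mile: int) -> float:
    """Yearly fare revenue in dollars."""
    return miles * cents_per_mile / 100


def payback_months(cost: float, annual_profit: float) -> float:
    """Months of profit needed to recover the vehicle cost."""
    return 12 * cost / annual_profit


for cents, profit in zip(PRICE_CENTS_PER_MILE, NET_PROFIT_PER_YEAR):
    revenue = annual_revenue(MILES_PER_YEAR, cents)
    implied_operating_cost = revenue - profit
    months = payback_months(VEHICLE_COST, profit)
    print(f"${cents / 100:.2f}/mile: revenue ${revenue:,.0f}, "
          f"implied operating cost ${implied_operating_cost:,.0f}, "
          f"payback {months:.0f} months")
```

On these figures the payback period ranges from 18 months at the high end of the profit range to 45 months at the low end, and a 20-month payback corresponds to a net profit of $18,000 per year.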

Conclusion: Pure Vision Autonomous Driving Is Destined to Win in the Long Run

The overall direction is irreversible: more data, stronger AI, fewer sensors.

Rich Sutton, one of the founders of reinforcement learning, published “The Bitter Lesson” in March 2019. It argues that in tasks such as chess, Go, speech recognition, and computer vision, the approaches that ultimately won were those that leveraged computing power and data rather than those built on human expert knowledge. Autonomous driving is likely to follow the same rule.

Tesla has a data flywheel made up of 6 million vehicles, has exceeded 10 billion miles of training data, and collects data at a rate of 20 million miles per day. From V12 to V14, FSD’s distance between critical disengagements has grown roughly six- to eightfold in less than two years. Industry players such as XPeng, NIO, and Momenta are all moving toward pure vision and end-to-end approaches. Even Waymo itself is reducing sensors.
