Artificial intelligence has already changed how people search, communicate, and create, but the real test comes when AI moves off the screen and into the physical world, where decisions play out instantly and in public view. Autonomous vehicles are among the first large-scale deployments of embodied artificial intelligence, raising important questions about safety, trust, and AI's role in the spaces we share.
Few people are better positioned to weigh in than Chinmay Jain, Senior Technical Product Manager at Waymo and editorial board member for IRJEMS and IJACT. Jain has guided Waymo’s Driving Behavior team through its evolution from pilot projects to the large-scale deployments it’s known for today. His perspective highlights how autonomous vehicles reveal the limits and possibilities of embodied AI, and what it means for AI to go from answering prompts to carrying out tasks in the physical world.
Agency on the Road
“At Waymo, the agent isn’t a line of code. It’s always a vehicle navigating a complex, unpredictable urban environment,” Jain explains. He points out that autonomous vehicles exhibit agency that goes far beyond simple stimulus-response behavior.
One example is long-horizon planning. An autonomous vehicle is rarely just deciding whether to turn left in ten feet. “It would hardly understand how to plan a route, let alone account for traffic, if that were the case,” he says. “It’s constantly creating and updating a multi-step plan to get from point A to point B, while predicting the behavior of every other road user and adjusting accordingly.” This capacity to plan and replan in real time represents a form of reasoning rarely seen in AI applications outside of mobility.
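As a rough illustration of that plan-and-replan loop, the sketch below shows a toy receding-horizon planner: each tick it builds a short multi-step plan, commits only to the first step, and then replans against fresh observations. Everything here, from the one-dimensional road to the function name, is hypothetical and far simpler than a production planning stack.

```python
def replan_loop(start, goal, obstacles, horizon=5, max_ticks=100):
    """Toy receding-horizon planner on a 1-D road (purely illustrative)."""
    pos = start
    for _ in range(max_ticks):
        if pos == goal:
            return pos
        # Build a short plan toward the goal, a few steps ahead.
        plan, p = [], pos
        for _ in range(horizon):
            if p == goal:
                break
            nxt = p + (1 if goal > p else -1)
            if nxt in obstacles:        # predicted conflict: cut the plan short
                break
            plan.append(nxt)
            p = nxt
        if plan:
            pos = plan[0]               # commit to the first step only
        obstacles = {o + 1 for o in obstacles}  # other road users keep moving
    return pos

print(replan_loop(start=0, goal=10, obstacles={3}))  # reaches 10
```

The structural point is that no plan survives contact with the world for long: the agent acts on the nearest step, then recomputes everything with the latest predictions.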
Another example is resilience to unexpected, long-tail events: the everyday surprises of city life that can’t be fully captured in training data. From a construction zone that appears overnight to an aggressive driver cutting across lanes, the system needs to generalize from prior experience to navigate safely. “The real world always throws something new at you,” Jain says. “The AI has to be able to adapt in ways that aren’t hard-coded.”
Advanced models like Waymo’s are also capable of more nuanced reasoning, particularly when they integrate additional signals such as audio. “When the AI can see a police car and also understand the meaning of its siren—Is it something I should respond to? Should I slow down or pull over?—that’s a leap toward genuine agency,” he says.
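A toy fusion rule makes the idea concrete. The function below is entirely hypothetical (real systems use learned multimodal models, not a hand-set threshold), but it captures the shape of the reasoning: sight alone warrants caution, while sight plus sound warrants yielding.

```python
def respond_to_emergency(sees_police_car: bool, siren_prob: float,
                         siren_threshold: float = 0.7) -> str:
    """Toy audio-visual fusion rule; all names and values are hypothetical."""
    if sees_police_car and siren_prob >= siren_threshold:
        return "pull_over"           # active emergency: slow down and yield
    if sees_police_car:
        return "proceed_cautiously"  # likely parked or passive: stay alert
    return "proceed"
```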
Safety in a World of Physical Consequences
Mistakes in text-based AI may confuse or mislead, but they rarely endanger lives. “In embodied AI, every decision carries a physical consequence,” Jain says. For autonomous vehicles, that could mean a fender bender, or worse. “Physical agents demand a much more rigorous and careful approach to training and testing.”
Simulation plays a central role: Waymo vehicles drive billions of simulated miles for every million miles driven in paid trips on public roads. To create new scenarios, engineers use techniques like fuzzing, which spins one real-world scenario into thousands of variations. “It lets us test the limits of the system in ways that would be impossible or unsafe in live traffic,” Jain explains.
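In spirit, fuzzing looks something like the sketch below: take one logged scenario and jitter its parameters to generate many stress-test variants. The scenario schema and ranges here are invented for illustration; a production fuzzer perturbs far richer scene descriptions.

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Scenario:
    """Hypothetical minimal record of one logged driving scenario."""
    ego_speed_mps: float     # the autonomous vehicle's speed
    cut_in_gap_m: float      # gap at which another car cuts in
    cut_in_speed_mps: float  # the cutting-in car's speed
    friction: float          # road-surface friction coefficient

def fuzz(seed: Scenario, n: int, jitter: float = 0.2) -> list[Scenario]:
    """Spin one real-world scenario into n variants by jittering each
    numeric parameter within +/- jitter of its logged value."""
    rng = random.Random(0)   # fixed seed keeps the test suite reproducible
    scale = lambda v: v * rng.uniform(1 - jitter, 1 + jitter)
    return [
        replace(seed,
                ego_speed_mps=scale(seed.ego_speed_mps),
                cut_in_gap_m=scale(seed.cut_in_gap_m),
                cut_in_speed_mps=scale(seed.cut_in_speed_mps),
                friction=min(1.0, scale(seed.friction)))
        for _ in range(n)
    ]

# One logged cut-in becomes thousands of simulated stress tests.
variants = fuzz(Scenario(13.4, 18.0, 15.0, 0.9), n=2000)
```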
The lesson, he argues, extends beyond mobility. “Any embodied AI system, whether in healthcare, manufacturing, or even the home, needs a safety framework. You must understand how failure modes arise and demonstrate mitigations before moving a product into production.”
Where Innovators Can Still Make Their Mark
Despite the dominance of well-funded players, Jain believes smaller companies have plenty of room to innovate. One promising area is simulation and validation tooling. “High-fidelity simulators are difficult to build, but they’re one of the most important tools for safe development,” he says. A startup that delivers them for industries such as warehouse logistics or agriculture could unlock enormous value.
Data presents another bottleneck. Unlike language models, which benefit from the abundance of online text, embodied AI requires specialized datasets that capture 3D environments, time-series interactions, and cause-and-effect outcomes. Collecting and labeling this data is also costly. “High-quality sensor-to-action data is scarce because the technology is so nascent,” Jain points out. “A solution that can streamline or automate labeling, for example, can lay the groundwork for industry-wide progress.” Here, too, simulation can help.
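What “sensor-to-action data” means in practice is easier to see with a concrete, and again entirely hypothetical, schema: each labeled example ties a 3D sensor snapshot to the action the agent took and the outcome that followed.

```python
from dataclasses import dataclass

@dataclass
class SensorToActionRecord:
    """Hypothetical schema for one labeled moment of embodied-AI data."""
    timestamp_s: float                              # position in the time series
    lidar_points: list[tuple[float, float, float]]  # sampled 3D environment
    camera_frame_id: str                            # pointer to the image data
    action: str                                     # e.g. "brake", "nudge_left"
    outcome: str                                    # cause-and-effect label,
                                                    # e.g. "gap_maintained"
```

Labeling records like these by hand is exactly the costly step Jain describes, which is why automated labeling, including labels taken directly from simulation ground truth, is such an attractive target.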
Finally, Jain highlights niche applications where environments are more contained and risks lower than public roads: inspecting solar farms, monitoring construction sites, or automating commercial kitchens. “These are areas where a small, focused team can quickly establish expertise,” he notes.
Looking Ahead
When asked about the most promising applications still in development, Jain points to humanoid robotics and in-home assistive devices. “For a long time they felt like science fiction, but many companies are making real progress,” he says. “The idea of a robot that can walk into a factory, use the same tools, and navigate the same spaces as a person is incredibly powerful.” Others are exploring assistive agents in the home, while defense and public safety groups test robots for bomb disposal or search and rescue.
Looking further out, Jain is energized by the role embodied AI could play in scientific discovery: autonomous underwater vehicles probing the ocean floor, robotic explorers analyzing Martian soil, or lab assistants running experiments around the clock. “These are agents that participate in expanding our understanding of the world,” he says. “That’s truly a future worth building toward.”