FrontierNews.ai

The Missing Piece: Why AI Needs a Body to Truly Think

Physical AI represents a fundamental shift in how artificial intelligence operates: instead of processing text and images on screens, AI systems are now learning to perceive and act in physical environments through robots and embodied machines. This convergence of robotics and AI is reshaping industries from manufacturing to hospitality, with researchers and entrepreneurs racing to build systems that can understand the world not just intellectually, but physically (Source 1, 2).

What Exactly Is Physical AI, and How Does It Differ from ChatGPT?

Physical AI, also called embodied AI, combines software intelligence with robotic hardware to create systems that can interact with the real world. Unlike traditional AI chatbots that predict the next word in a sentence, physical AI systems must understand geometry, physics, spatial relationships, and how to manipulate objects (Source 1, 2).

The distinction matters because language models like ChatGPT learn from humanity's written knowledge, but they cannot pick up a coffee mug or navigate a cluttered room. "There's all the geometry of the world, the dynamic of how I move my hand, the physical interaction of the contact with the cup," explained Martial Hebert, dean of the School of Computer Science at Carnegie Mellon University. "This is much more complex than just predicting the next word in a sentence."

The breakthrough enabling physical AI has been the emergence of new AI model types. Vision-language models (VLMs) and vision-language-action (VLA) models allow robots to see their surroundings, understand instructions in natural language, and decide what actions to take. These models learn from videos of humans performing real-world tasks, helping robots develop a broader understanding of how the world works.
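To make the see-understand-act idea concrete, here is a minimal Python sketch of the interface a VLA-style system exposes: an observation and a natural-language instruction go in, a sequence of actions comes out. Everything in it is a hypothetical illustration (the `Observation` and `Action` types, the `toy_vla_policy` function, the object positions); a real VLA model is a learned neural network working on camera images, not a keyword lookup.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Observation:
    # Hypothetical stand-in for a camera frame:
    # object name -> (x, y) position in metres.
    objects: dict[str, tuple[float, float]]


@dataclass
class Action:
    kind: str  # "move_to", "grasp", or "noop" in this toy example
    target: tuple[float, float] | None = None


def toy_vla_policy(obs: Observation, instruction: str) -> list[Action]:
    """Map (what the robot sees, what it was told) to a list of actions.

    Assumption: a keyword match replaces the learned model, purely to
    show the shape of the inputs and outputs.
    """
    words = instruction.lower().split()
    for name, position in obs.objects.items():
        if name in words:
            # Reach the named object, then close the gripper on it.
            return [Action("move_to", position), Action("grasp", position)]
    # Nothing in view matches the instruction, so do nothing.
    return [Action("noop")]


scene = Observation(objects={"mug": (0.4, 0.1), "book": (0.0, 0.3)})
plan = toy_vla_policy(scene, "Pick up the mug")
```

The point of the sketch is the signature, not the logic: vision and language are inputs to a single policy, and the output is something a robot controller can execute.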

Why Are Researchers Abandoning Pure Language Model Research?

A growing number of AI scientists are pivoting away from language model research toward physical AI and "world models," which teach AI systems how to predict and react in physical environments. Louis Castricato, a computer scientist who spent eight years studying large language models at Brown University, decided the field had hit a plateau. "We basically have passed the point of doing real fundamental LLM research," he said. "Now it's just applications." He quit his doctoral studies and founded Overworld, a startup building AI systems that understand and navigate physical worlds.

This shift reflects a broader realization among leading researchers. Fei-Fei Li, known as the "Godmother of AI," founded World Labs to develop world models. Yann LeCun, who previously served as Meta's chief AI scientist, left to start Advanced Machine Intelligence Labs in Paris, also focused on world models. Both scientists argue that true machine intelligence requires understanding not just language but the physical structure of space and time.

"Where language models learn the statistical structure of text, world models learn the statistical structure of space and time: how light falls on a surface, how a garden looks from an angle no camera has captured, how objects respond to force and follow the laws of physics," explained Fei-Fei Li, founder of World Labs.
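As a rough illustration of what "learning how objects respond to force" can mean, the Python sketch below watches a simulated falling object, estimates its acceleration from the observed motion alone, and then uses that estimate to predict the next state. The function names and the numbers are assumptions made for this example, and it is far simpler than the systems built by the companies mentioned here: real world models learn from video with neural networks, not from a one-parameter fit.

```python
def simulate_fall(height, velocity, dt, steps, gravity=-9.81):
    """Generate the 'footage': a list of (height, velocity) states over time."""
    states = [(height, velocity)]
    for _ in range(steps):
        velocity = velocity + gravity * dt
        height = height + velocity * dt
        states.append((height, velocity))
    return states


def fit_acceleration(states, dt):
    """Estimate a constant acceleration from consecutive observed velocities."""
    changes = [(b[1] - a[1]) / dt for a, b in zip(states, states[1:])]
    return sum(changes) / len(changes)


def predict_next(state, acceleration, dt):
    """Roll the learned dynamics forward by one time step."""
    height, velocity = state
    velocity = velocity + acceleration * dt
    return (height + velocity * dt, velocity)


DT = 0.01  # seconds between observations (assumed)
observed = simulate_fall(height=2.0, velocity=0.0, dt=DT, steps=50)
learned = fit_acceleration(observed, DT)          # close to -9.81, never told explicitly
predicted = predict_next(observed[-1], learned, DT)
```

The model is never given the value of gravity; it recovers it from the data and can then anticipate what happens next, which is the core idea behind predicting in physical environments.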


How Are Companies Building Physical AI Systems Today?

Major robotics companies and startups are now deploying physical AI systems in real-world environments. Boston Dynamics develops advanced mobile robots, Tesla is building humanoid robots, Agility Robotics specializes in warehouse automation robots, and Unitree Robotics creates quadruped and humanoid systems. In India, companies like Addverb Technologies and iHub Robotics are developing autonomous warehouse and humanoid systems using VLA-like AI.

One striking example comes from Sony AI. The company's autonomous table tennis robot, called Ace, recently defeated seven ranked professional players under official competition rules between February and April 2026. Most notably, Ace beat Miyuu Kihara, currently ranked World No. 26 in women's singles, and two-time Olympic silver medalist Miu Hirano. After her match, Hirano remarked: "It's really strong. Is there really anyone who can beat this?" The improvements came primarily through retraining rather than hardware redesign, with Ace now using a single reinforcement learning policy that performs nine distinct skills and reacts faster than before.


What Are the Major Obstacles Slowing Physical AI Adoption?

Despite rapid progress, physical AI faces significant hurdles that differ fundamentally from software AI challenges. The obstacles include technical, practical, and ethical dimensions:

  • Data Collection and Transformation: Physical AI requires massive amounts of unstructured data, including videos of humans performing tasks, sensor data from robots, and environmental information. The challenge isn't collecting the data but transforming it into formats robots can learn from and apply across multiple tasks.
  • Privacy and Security Concerns: Training robots requires continuous real-world sensing through cameras, audio, and sensors, raising significant privacy concerns. Companies are reluctant to share internal recordings that could be used to train commercial AI systems, limiting the data available for development.
  • Lack of General-Purpose Models: Most physical AI systems remain specialized for specific tasks or robot types. The industry lacks scalable, general-purpose models that can operate across different robots and task categories. Such models would need a high degree of situational awareness and contextual understanding.
  • Cost and Infrastructure Requirements: Building physical AI demands expensive custom hardware, simulation environments, and real-world testing infrastructure. These costs slow deployment compared to digital AI, which can be distributed globally via the internet.
  • Regulatory and Trust Barriers: Physical AI adoption is slower than digital AI because robots operating in homes, hospitals, and shared spaces face regulatory scrutiny, safety requirements, and user trust concerns.

How Are Investors and Entrepreneurs Approaching World Models?

Despite the challenges, venture capital is flowing into physical AI and world model companies. Steve Jang, co-founder and managing partner at Kindred Ventures, is investing in multiple world model startups including Overworld, Causal Labs (building AI for weather prediction), and Extropic (developing specialized computer chips for world models). Jang believes the future will include many different types of models with different architectures rather than one dominant approach.

The applications extend beyond robotics. Castricato's Overworld is building interactive video game worlds where scenes adapt as virtual characters move through them and interact with objects. "There's no other world model where you can just walk through doors or where you can interact with a detailed environment like this," he noted. "We optimize for interaction above anything else."

What Does the Shift From Experimentation to Real-World Utility Mean?

A critical transition is underway in how the physical AI industry measures success. For nearly two decades, robotics companies remained in research and prototyping phases because the technology for widespread commercial deployment simply didn't exist. Now, the focus has shifted from experimentation to practical utility.

End users care less about the sophistication of underlying AI models and more about whether systems solve real-world problems efficiently and affordably. While affordability continues to improve, the ability to deliver practical value has expanded significantly, accelerating adoption across industries including agriculture, healthcare, cooking automation, hospitality, logistics, and delivery.

"Some people may have different definitions, but physical and embodied AI are kind of the evolution of what we used to call robotics," explained Martial Hebert, dean of the School of Computer Science at Carnegie Mellon University.


The emergence of adaptable intelligence through foundation models has made AI the missing layer that transforms robotics from rigid, task-specific automation into scalable, intelligent systems. As these systems move from laboratories into factories, homes, and hospitals, the race to build robots that can plan, perceive, and act autonomously is accelerating (Source 1, 2, 3).