FrontierNews.ai

OpenAI's Sora Shutdown Signals a Shift: Why Startups Are Now Leading the World Model Race

OpenAI's decision to discontinue Sora in March 2026 has reshaped the video generation landscape, revealing that raw computational power and massive budgets may not guarantee success in building artificial intelligence systems that simulate reality. The company's abrupt shutdown of its flagship video tool, which had accumulated nearly 10 million downloads since launch and was backed by a $1 billion Disney partnership, exposed a fundamental problem: inference costs (the expense of running trained AI models) had become unsustainable. According to reports, OpenAI was spending approximately $15 million daily to generate millions of 10-second videos, with each clip costing the company roughly $1.30 to produce. This financial reality has opened the door for leaner startups to challenge the tech giants' dominance in video generation and the emerging world model space.
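The reported figures can be cross-checked with simple arithmetic. The sketch below is illustrative only: it takes the article's approximate numbers at face value and derives the daily clip volume they imply. The variable and function names are made up for this example, and the annualized figure assumes the daily rate held constant, which the reports do not state.

```python
# Figures as quoted in the article (reported, not independently verified).
DAILY_SPEND_USD = 15_000_000   # approximate daily inference spend
COST_PER_CLIP_USD = 1.30       # approximate cost of one 10-second clip
CLIP_SECONDS = 10


def implied_daily_clips(daily_spend: float, cost_per_clip: float) -> float:
    """Return the number of clips per day implied by total spend and unit cost."""
    return daily_spend / cost_per_clip


clips = implied_daily_clips(DAILY_SPEND_USD, COST_PER_CLIP_USD)
cost_per_second = COST_PER_CLIP_USD / CLIP_SECONDS
# Assumption: the daily rate stays flat for a full year.
annualized = DAILY_SPEND_USD * 365

print(f"Implied clips per day: {clips:,.0f}")
print(f"Cost per second of video: ${cost_per_second:.2f}")
print(f"Annualized spend at this rate: ${annualized:,.0f}")
```

At these figures the two numbers are consistent with each other: $15 million a day at $1.30 per clip works out to roughly 11.5 million clips daily, which matches the description of "millions of 10-second videos."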

What Are World Models and Why Do They Matter?

World models represent the next frontier in artificial intelligence, moving beyond simple content creation to systems that can understand and simulate physical reality. These models are designed to predict what will happen next in a scene by understanding cause and effect, gravity, object collisions, and lighting, much as human intuition does. The technology has applications far beyond entertainment, including autonomous driving, robotics, gaming, and industrial simulation. Three major paradigms currently dominate the field: physics-based approaches that embed real-world laws directly into code, self-supervised learning systems that construct their own rules without explicit physics knowledge, and three-dimensional modeling approaches that learn from visual data.

How Are Startups Outmaneuvering Tech Giants on Cost?

Video Rebirth, a Singapore-based startup founded by Tencent's former AI head, has emerged as a surprising competitor despite having only $80 million in funding and a team of 30 people. The company's flagship model, Bach, debuted at number 6 on an Artificial Analysis text-to-video leaderboard in May 2026, ranking higher than any other startup model and offering the cheapest price per minute of video generated among the top 10 competitors. The key to Video Rebirth's efficiency lies in proprietary technology called multi-step sampling loss, a mathematical technique that trains the model to anticipate and correct errors during generation, requiring fewer computational steps to create final videos. This approach can speed up video generation by up to 10 times compared to traditional models.
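To see why fewer sampling steps translate into faster generation, consider a rough sketch. It assumes that generation time is dominated by the sampling loop and that every step costs about the same, so time scales linearly with step count. The step counts below are hypothetical values chosen for illustration; the article does not say how many steps Bach or its competitors use, and this is not Video Rebirth's method.

```python
def relative_speedup(baseline_steps: int, reduced_steps: int) -> float:
    """Speedup from cutting sampling steps, assuming equal cost per step.

    This is a simplified linear model: it ignores fixed overheads such as
    text encoding and video decoding, which do not shrink with step count.
    """
    if baseline_steps <= 0 or reduced_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_steps / reduced_steps


# Hypothetical step counts, not taken from the article.
BASELINE_STEPS = 50
FEW_STEP_COUNT = 5

speedup = relative_speedup(BASELINE_STEPS, FEW_STEP_COUNT)
print(f"{BASELINE_STEPS} -> {FEW_STEP_COUNT} steps: about {speedup:.0f}x faster")
```

Under this simplified model, a tenfold speedup corresponds to a tenfold cut in steps. Real systems have fixed overheads, so the measured gain would usually be somewhat smaller than the step ratio alone suggests.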

Beyond inference optimization, Video Rebirth reduced training costs by focusing on fewer, higher-quality videos rather than massive datasets. The company trained Bach on licensed movies, music videos, and in-house filmed clips, most at 720p resolution, rather than pursuing the highest possible resolution. Additionally, the model splits the tasks of prompt adherence and visual generation between separate components, unlike competitors that rely on a single "brain" to handle both functions simultaneously.

What Technical Advantages Are Startups Developing?

  • Physics-Based Simulation: Video Rebirth's Bach generates videos that follow real-world physical laws, including gravity, object collisions, and realistic lighting, addressing a major industry bottleneck in which AI-generated objects often morph unexpectedly or look uncanny.
  • Product Consistency: The model excels at maintaining product consistency across frames, a critical requirement for e-commerce advertisers who need reliable visual representations of merchandise.
  • Multi-Shot Capability: Bach can generate multi-shot videos up to 45 seconds long based on reference images and text prompts, compared to ByteDance's Seedance 2.0, which is capped at 15 seconds.
  • Enterprise-Grade Controllability: Video Rebirth's technology provides fine-grained control over cause-and-effect relationships and object movement across space and time, features essential for professional filmmakers and advertisers.

Meanwhile, Shanghai-based Fysics AI has proposed an entirely different paradigm with its Fysiverse model, which embeds the laws of physics directly into its code rather than relying on data-driven approaches favored by American tech giants. The startup, founded by former Nvidia senior manager Zhang Lihua, describes Fysiverse as a "new-generation physics-based world model that adheres to real-world physical laws" and claims it addresses common issues in existing world models, including physical illusions, reasoning failures, and breakdowns in non-standard scenarios.

Why Is the World Model Race Becoming a Startup Opportunity?

The economics of AI video generation have fundamentally shifted the competitive landscape. When OpenAI's inference costs became prohibitive, the company's research team refocused on world simulation research for robotics and real-world physical tasks rather than continuing consumer-facing video generation. This pivot suggests that the path to profitable, scalable video generation may require architectural innovations that startups can develop more nimbly than entrenched organizations managing legacy systems and massive computational infrastructure.

Video Rebirth is already planning its next phase of development. The company is working on a world model that can create interactive 3D environments on the fly based on text prompts, a capability that would eliminate the need to write traditional code to build 3D simulations. Liu Wei, cofounder and CEO of Video Rebirth, set out an ambitious timeline for this technology: "In three years, we'll prove that the physical world can be simulated in real time."

"We do video generation in order to build a world model," said Liu Wei, cofounder and CEO of Video Rebirth.

Investor confidence in this startup-led approach is growing. Video Rebirth closed an $80 million seed round in March 2026 with backing from AMD Ventures, Hyundai Motor Group's venture capital arm, and other major investors. Fang Wei, senior investment manager of Hyundai Cradle, explained the investment thesis: "Video Rebirth shares this exact vision from day one, positioning its technology to unlock critical future applications in physical AI." The company is raising an additional round in July 2026, signaling continued momentum in the space.

"Our rationale rests on the belief that video generation is far more than a tool for content creation; it represents one of the clearest and most viable pathways toward world models," stated Fang Wei, senior investment manager of Hyundai Cradle.

What Does This Mean for the Future of AI Video?

The startup challenge to OpenAI's dominance suggests that the future of AI video generation and world models may belong not to the company with the largest budget but to teams that can optimize efficiency and focus on physics-based accuracy. According to venture capital analysts, Video Rebirth's positioning as a potential "standard tool for professional content creation across film, advertising, gaming and e-commerce" within five years mirrors Adobe's historical role in creative software. As these startups mature and refine their approaches, the competitive pressure on established players to reduce costs and improve physical realism will likely intensify, ultimately benefiting creators and enterprises that depend on AI video technology.
