Meituan's LongCat-2.0 Just Proved China Can Train Frontier AI Without Nvidia
Meituan, the Chinese delivery app giant, just released LongCat-2.0, a massive AI coding model trained entirely on homegrown Chinese computer chips rather than Nvidia GPUs, signaling a major shift in how frontier AI models can be built outside the United States. The 1.6 trillion-parameter model was trained on over 50,000 Chinese application-specific integrated circuits (ASICs) and has already been quietly dominating developer charts under an anonymous name for the past two months.
The release matters for more than its benchmark scores. The successful training run demonstrates that near-frontier AI systems can be scaled without relying on Nvidia's general-purpose GPUs, which have powered nearly all major AI development globally. This technological independence arrives at a critical moment, as the U.S. government has pressured American AI labs to restrict access to their latest models. OpenAI limited access to its GPT-5.6 models, while Anthropic took its Claude Fable 5 and Mythos 5 models entirely offline following government requests.
How Has LongCat-2.0 Performed in Real-World Testing?
Before the public release, LongCat-2.0 ran anonymously on OpenRouter, a platform where developers access multiple AI models, under the codename "Owl Alpha." During that two-month run, the model processed approximately 10.1 trillion tokens monthly, averaging 559 billion tokens per day. That represented a 242 percent month-over-month increase in usage, enough to push it into the platform's global top three.
On standardized coding benchmarks, LongCat-2.0 demonstrates competitive performance with leading Western models. The model scored 59.5 on SWE-bench Pro, surpassing OpenAI's GPT-5.5 score of 58.6. It also achieved 70.8 on Terminal-Bench 2.1, 77.3 on SWE-bench Multilingual, and 73.2 on FORTE, a general corporate workflow simulator.
The model's strength lies in agentic tasks, meaning it excels at multi-step engineering work, tool integration, and automated code repository manipulation. By the time Meituan publicly claimed the architecture, LongCat-2.0 had already secured the top ranking on the Hermes Agent workspace, second place on Claude Code deployments, and third place across international OpenClaw environments.
What Technical Innovations Make LongCat-2.0 Efficient?
The model's architecture relies on several engineering innovations designed to handle massive scale without proportional increases in computing cost. At its core lies Mixture-of-Experts (MoE) sparsity, a technique that activates only the most relevant parts of the model for each token. While LongCat-2.0 contains 1.6 trillion total parameters, only about 48 billion parameters activate per token on average, ranging from 33 billion to 56 billion depending on query complexity.
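To make the sparsity idea concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not Meituan's implementation: the toy experts, the router scores, and the function names (`route_top_k`, `moe_forward`) are all hypothetical. The final line only restates the article's figures as a ratio.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalise over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k):
    """Run only the k selected experts and mix their outputs by router weight."""
    out = 0.0
    for idx, weight in route_top_k(router_logits, k):
        out += weight * experts[idx](x)
    return out

# Hypothetical toy setup: 8 scalar "experts", 2 active per token.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
router_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

selected = route_top_k(router_logits, k=2)
y = moe_forward(3.0, experts, router_logits, k=2)

# Share of parameters active per token, using the article's figures.
active_fraction = 48e9 / 1.6e12  # about 3 percent
```

Only two of the eight toy experts run for this input. The same principle lets a 1.6 trillion-parameter model spend compute on roughly 3 percent of its weights per token.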
Meituan introduced three key technical components to optimize performance:
- Streaming-aware Indexing: Restructures how tokens are selected by converting fragmented memory access into predictable, sequential blocks, improving data throughput and reducing computational waste.
- Cross-Layer Indexing: Allows a single indexing pass to guide multiple consecutive layers during inference, reducing redundant calculations by leveraging the fact that attention patterns remain stable across adjacent layers.
- Hierarchical Indexing: Uses a two-stage scoring approach that first performs rapid block-level filtering before running fine-grained token selection on remaining candidates.
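The two-stage idea behind Hierarchical Indexing can be sketched as follows. This is an illustrative toy, not Meituan's kernel. It assumes dot-product scoring and uses the mean key of each block as the cheap block-level summary. Both choices, and the function name `hierarchical_select`, are assumptions made for the example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hierarchical_select(query, keys, block_size, top_blocks, top_tokens):
    """Two-stage selection: coarse block filter, then fine token scoring."""
    # Stage 1: split positions into contiguous blocks and score each block
    # with a cheap summary (assumed here: the mean of its key vectors).
    blocks = [range(s, min(s + block_size, len(keys)))
              for s in range(0, len(keys), block_size)]

    def block_score(block):
        dim = len(query)
        mean_key = [sum(keys[i][d] for i in block) / len(block)
                    for d in range(dim)]
        return dot(query, mean_key)

    kept = sorted(blocks, key=block_score, reverse=True)[:top_blocks]

    # Stage 2: score individual tokens, but only inside surviving blocks.
    candidates = [i for block in kept for i in block]
    ranked = sorted(candidates, key=lambda i: dot(query, keys[i]), reverse=True)
    return sorted(ranked[:top_tokens])

# Hypothetical data: 8 token keys in 2 dimensions, blocks of 4.
query = [1.0, 0.0]
keys = [[0.1, 0.0], [0.2, 0.0], [0.0, 1.0], [0.1, 1.0],
        [0.9, 0.0], [0.8, 0.0], [0.1, 0.0], [0.7, 0.0]]

chosen = hierarchical_select(query, keys, block_size=4, top_blocks=1, top_tokens=2)
```

With these inputs the first block is discarded after a single summary score, so fine-grained scoring touches four tokens instead of eight. The saving grows with sequence length.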
Additionally, the model incorporates 135 billion parameters dedicated to N-gram Embedding, a technique that captures relationships between groups of tokens. This expands the embedding space roughly 100-fold, allowing the model to process large batches more efficiently by reducing memory input-output bottlenecks.
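The article does not describe how the N-gram Embedding is built. One common way to embed groups of tokens is to hash each n-gram into a large lookup table and add the resulting vector to the ordinary token embedding. The sketch below illustrates that general approach only. The hashing scheme, table sizes, and function names are assumptions, not details from Meituan.

```python
def ngram_bucket(token_ids, table_size):
    """Map an n-gram of token IDs to a table row with a polynomial hash."""
    h = 0
    for t in token_ids:
        h = (h * 1_000_003 + t) % table_size
    return h

def embed_with_ngrams(token_ids, token_table, ngram_table, n=2):
    """Add a hashed n-gram vector to each token's ordinary embedding."""
    out = []
    for pos, tok in enumerate(token_ids):
        vec = list(token_table[tok])
        if pos >= n - 1:  # a full n-gram ends at this position
            gram = token_ids[pos - n + 1 : pos + 1]
            row = ngram_table[ngram_bucket(gram, len(ngram_table))]
            vec = [a + b for a, b in zip(vec, row)]
        out.append(vec)
    return out

# Hypothetical toy tables: 4 tokens, 16 n-gram buckets, dimension 2.
token_table = [[float(i), 0.0] for i in range(4)]
ngram_table = [[0.0, float(j)] for j in range(16)]

ids = [2, 1, 3, 1]
vectors = embed_with_ngrams(ids, token_table, ngram_table, n=2)
```

Because the n-gram table is only read at a few rows per token, it can hold many parameters while adding little compute. That is consistent with the article's description of a much larger embedding space, though the real design may differ.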
How Does Pricing Compare to Other Leading Models?
LongCat-2.0 arrives with aggressive pricing designed to undercut Western alternatives. During a limited-time promotional period, the model costs just $0.30 per million input tokens and $1.20 per million output tokens, or $1.50 for one million tokens of each. Standard pricing is $0.75 per million input tokens and $2.95 per million output tokens, or $3.70 combined.
For context, on the same basis Moonshot AI's Kimi-K2.6 costs $0.95 for input and $4.00 for output, or $4.95 combined. OpenAI's GPT-5.5 runs $5.00 for input and $30.00 for output, or $35.00 combined. Anthropic's Claude Opus 4.8 costs $5.00 for input and $25.00 for output, or $30.00 combined.
The model also offers a unique pricing mechanism: context-cache hits, meaning input the model has already seen in an earlier request, are completely free of charge. This can dramatically reduce costs for workflows that repeatedly reference the same documents or codebases.
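Real workloads rarely split evenly between input and output, so a small calculator gives a fairer comparison than the combined per-million figures. The sketch below uses the list prices quoted above. The workload sizes and the 80 percent cache-hit rate are hypothetical, and free cache hits are applied only to LongCat-2.0 because the article gives no cache pricing for the other models.

```python
# USD per million tokens (input, output), as quoted in the article.
PRICES = {
    "LongCat-2.0 (promo)": (0.30, 1.20),
    "LongCat-2.0 (standard)": (0.75, 2.95),
    "Kimi-K2.6": (0.95, 4.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.8": (5.00, 25.00),
}

def job_cost(model, input_tokens, output_tokens, cache_hit_rate=0.0):
    """Cost in USD; cached input tokens are billed at zero (LongCat only)."""
    if cache_hit_rate and not model.startswith("LongCat"):
        raise ValueError("the article gives cache pricing only for LongCat-2.0")
    inp, out = PRICES[model]
    billable_input = input_tokens * (1.0 - cache_hit_rate)
    return (billable_input * inp + output_tokens * out) / 1_000_000

# Hypothetical agentic workload: 8M input tokens, 2M output tokens.
for name in PRICES:
    print(f"{name:24s} ${job_cost(name, 8_000_000, 2_000_000):8.2f}")

# Same workload on LongCat-2.0 standard pricing with 80% of input cached.
cached = job_cost("LongCat-2.0 (standard)", 8_000_000, 2_000_000, cache_hit_rate=0.8)
print(f"{'LongCat-2.0 (80% cached)':24s} ${cached:8.2f}")
```

For input-heavy coding agents that reread the same repository, the cache term dominates: most of the input bill disappears and output tokens become the main cost.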
Why Does This Challenge Nvidia's Market Position?
The successful training of a near-frontier AI model on more than 50,000 Chinese ASICs rather than Nvidia GPUs represents a fundamental shift in AI infrastructure. For years, Nvidia's dominance in AI training has been nearly absolute, with virtually every major frontier model relying on its GPUs. If Chinese technology companies can consistently iterate on trillion-parameter architectures using homegrown chips, Nvidia's stranglehold on the market is under threat.
Meituan's technical blog notes that the software ecosystem around non-Nvidia ASICs remains less developed than the Nvidia stack. However, the company invested in custom software solutions, deterministic operators, and fault recovery systems to overcome these limitations. The training run processed over 35 trillion tokens with no rollbacks or irrecoverable loss spikes, demonstrating that alternative hardware can sustain frontier-scale operations with proper engineering.
This development arrives precisely as U.S. export controls fragment access to advanced Western models. Anthropic's Claude Fable 5 remains offline 18 days after the government restriction took effect. International developers seeking unrestricted access to frontier-grade coding models have alternatives such as Kimi K2.7-Code from Moonshot AI, GLM-5.2, and now LongCat-2.0.
What Does This Mean for Developers and Enterprises?
LongCat-2.0 integrates directly with popular developer tools including Claude Code, OpenClaw, and Hermes, meaning developers can immediately use it for real work rather than treating it as an experimental API. The model is designed for repository-level code edits, automated task execution, and long-horizon agent workflows, making it practical for large-scale software engineering projects.
The open-source release under an MIT license means enterprises can download the model weights and run it on their own infrastructure, avoiding API costs and data privacy concerns associated with cloud-based services. However, self-hosting comes with practical constraints. At 1.6 trillion parameters, even with aggressive compression techniques, the model requires roughly 400 gigabytes of storage before accounting for the temporary memory needed during inference. This places it out of reach for most consumer-grade hardware.
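A quick back-of-the-envelope calculation shows what the 400 gigabyte figure implies. The sketch assumes every parameter is stored at the same bit width and ignores quantization metadata, activations, and the key-value cache. The helper name is hypothetical.

```python
def weight_storage_gb(num_params, bits_per_param):
    """Storage for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

TOTAL_PARAMS = 1.6e12  # parameter count from the article

for bits in (16, 8, 4, 2):
    print(f"{bits:2d}-bit weights: {weight_storage_gb(TOTAL_PARAMS, bits):7.0f} GB")
```

Under these assumptions, 400 GB corresponds to about 2 bits per parameter, while 16-bit weights would need roughly 3,200 GB. That gap is why the figure depends on aggressive compression.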
The timing of this release underscores a broader pattern. As Western governments restrict access to cutting-edge AI models in the name of national security, they inadvertently create market opportunities for open-source alternatives and non-Western models. Developers and enterprises locked out of the latest proprietary systems now have viable options that cost significantly less and come with fewer regulatory restrictions.