Why Anthropic Is Betting on a British Chip Startup That Won't Ship Until 2027
Anthropic is exploring a partnership with London-based chip startup Fractile to diversify its AI inference hardware beyond Nvidia, Google, and Amazon. The talks signal a broader industry shift toward specialized chips designed specifically for running AI models rather than training them, as companies seek to cut the soaring costs of serving AI applications at scale.
Fractile's chips won't be ready for commercial use until around 2027, but the company's approach to solving a fundamental bottleneck in AI computing has caught the attention of major AI developers. Founded in 2022 by Oxford PhD Walter Goodwin, Fractile is developing inference accelerators that store data directly next to the transistors performing calculations, using fast on-chip SRAM instead of slower off-chip DRAM. This eliminates the constant shuttling of data between memory and processor that slows AI inference today.
"Fractile's design stores data needed for computations directly next to the transistors that perform the arithmetic, rather than relying on off-chip DRAM," Goodwin told Fortune in July 2024, adding that based on simulations at the time, Fractile could run a large language model 100 times faster and 10 times cheaper than Nvidia's GPUs.
Anthropic's interest in Fractile reflects a deliberate strategy to avoid dependence on any single chip vendor. The company already runs Claude, its flagship AI assistant, on Nvidia GPUs, Amazon's Trainium processors through Project Rainier, and Google's TPUs under a deal that provides 3.5 gigawatts of computing capacity from 2027 through 2031. This diversification gives Anthropic negotiating leverage and protects it from supply chain disruptions.
Why Is Inference Hardware Becoming a Competitive Battleground?
The shift from training to inference represents a fundamental change in AI economics. Training a large language model is expensive but happens once per model; inference, running the trained model to generate responses, happens millions of times per day for as long as the model is in service. As AI adoption accelerates, inference costs have become a major drag on profitability. Anthropic's annualized revenue run rate passed $30 billion in March 2026, up from around $9 billion at the end of 2025, yet inference costs continue to pressure gross margins.
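A toy calculation makes the asymmetry concrete. Every figure below is a hypothetical assumption chosen for round numbers, not a real cost for Anthropic or anyone else:

```python
# Toy illustration of training-vs-inference economics. All figures are
# hypothetical assumptions, not actual costs for any company or model.

training_cost = 100e6           # one-time: $100M to train the model
cost_per_million_tokens = 1.0   # serving cost: $1 per million output tokens
tokens_per_day = 500e9          # hypothetical fleet-wide demand: 500B tokens/day

daily_inference_cost = tokens_per_day / 1e6 * cost_per_million_tokens
days_to_exceed_training = training_cost / daily_inference_cost

print(f"Daily inference spend: ${daily_inference_cost:,.0f}")
print(f"Cumulative inference exceeds the training bill "
      f"after {days_to_exceed_training:.0f} days")
```

With these numbers, serving costs overtake the entire training bill in under a year, which is why shaving cost-per-token matters more than shaving training cost once a model is deployed at scale.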
Unlike OpenAI and xAI, which are building massive proprietary data centers, Anthropic has chosen to rent capacity from multiple providers. This strategy depends on having access to diverse, cost-effective hardware options. Fractile is one of several inference-focused startups pursuing specialized architectures. Groq, another SRAM-focused chip maker, was acquired by Nvidia for $20 billion in December 2025, signaling how valuable inference optimization has become.
The industry is increasingly adopting a "disaggregated" approach to inference, where different types of chips handle different parts of the AI inference pipeline. Nvidia pairs its GPUs with Groq's LPUs (Language Processing Units) for different inference stages. AWS uses its Trainium accelerators for one phase and Cerebras Systems' wafer-scale accelerators for another. Even Intel has announced reference designs combining GPUs with SambaNova's RDUs (Reconfigurable Dataflow Units).
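In a disaggregated setup, a router typically splits each request into a compute-heavy prefill stage (processing the prompt) and a bandwidth-heavy decode stage (generating tokens), then dispatches each stage to the hardware pool that suits it. A minimal sketch of that routing logic, with placeholder classes standing in for real hardware and hand-off mechanisms:

```python
# Minimal sketch of disaggregated inference routing. The backend classes and
# the cached-state hand-off are hypothetical placeholders, not a vendor API.

from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    max_new_tokens: int

class PrefillPool:
    """Compute-optimized hardware (e.g., GPUs): processes the whole prompt
    in parallel and returns the model's cached attention state."""
    def run(self, req: Request) -> dict:
        return {"kv_cache": f"<state for {len(req.prompt)}-char prompt>"}

class DecodePool:
    """Bandwidth-optimized hardware (e.g., SRAM-based accelerators):
    generates output one token at a time from the cached state."""
    def run(self, state: dict, max_new_tokens: int) -> str:
        return f"<{max_new_tokens} tokens decoded from {state['kv_cache']}>"

def serve(req: Request, prefill: PrefillPool, decode: DecodePool) -> str:
    state = prefill.run(req)                # stage 1: compute-heavy prefill
    return decode.run(state, req.max_new_tokens)  # stage 2: bandwidth-heavy decode

print(serve(Request("Why is decode bandwidth-bound?", 128),
            PrefillPool(), DecodePool()))
```

The cost of this split, as critics like Tenstorrent's Jim Keller argue below, is that the hand-off of cached state between pools adds complexity that a single general-purpose chip avoids.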
How Are Chip Startups Carving Out a Niche in AI Hardware?
Inference workloads are far more diverse than training workloads, creating opportunities for specialized hardware. Large batch inference requires different combinations of compute power, memory, and data bandwidth than real-time AI assistants or code agents. This heterogeneity means no single chip design dominates inference the way Nvidia's GPUs have dominated training.
Most AI chip startups have found success focusing on the "decode" phase of inference, where models generate responses token by token. This phase is memory-bandwidth-bound rather than compute-bound, so it benefits more from fast on-chip memory like SRAM than from raw arithmetic throughput; the roofline-style sketch after the list below makes the distinction concrete. Fractile, Groq, and Cerebras all leverage this insight. However, startups are exploring other approaches too. Lumai, a UK-based company, is developing optical inference accelerators that use light instead of electrons to perform matrix multiplication, targeting exascale performance within a 10-kilowatt power budget by 2029.
- SRAM-Based Architectures: Fractile, Groq, and others co-locate memory and compute to eliminate slow data movement between chips, enabling faster token generation for inference workloads.
- Wafer-Scale Designs: Cerebras builds single massive chips that integrate enormous amounts of compute and memory on one die, reducing latency and power consumption for certain inference tasks.
- Optical Computing: Lumai and similar startups are exploring hybrid electro-optical architectures that use photons for matrix multiplication, potentially delivering extreme efficiency for compute-bound inference.
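The compute-bound versus bandwidth-bound distinction above follows from arithmetic intensity: decode performs roughly two floating-point operations per weight byte it reads, far too few to keep modern arithmetic units busy, while prefill amortizes each weight load across every token in the prompt. A rough roofline-style comparison, using illustrative hardware figures rather than real specs:

```python
# Roofline-style sketch: a stage is bandwidth-bound when its arithmetic
# intensity (FLOPs per byte moved) falls below the hardware's balance point.
# The hardware numbers below are illustrative assumptions, not real specs.

peak_flops = 1000e12     # assumed 1,000 TFLOP/s of compute
peak_bandwidth = 3e12    # assumed 3 TB/s of off-chip memory bandwidth
balance_point = peak_flops / peak_bandwidth  # FLOPs/byte to stay compute-bound

# Decode: ~2 FLOPs (multiply + add) per parameter, reading each weight
# byte once per generated token -> intensity of ~2 FLOPs/byte.
decode_intensity = 2.0

# Prefill: a 1,000-token prompt reuses each loaded weight across all
# prompt tokens -> roughly 1,000x higher intensity.
prefill_intensity = 2.0 * 1000

for name, ai in [("decode", decode_intensity), ("prefill", prefill_intensity)]:
    bound = "bandwidth-bound" if ai < balance_point else "compute-bound"
    print(f"{name}: {ai:.0f} FLOPs/byte vs balance point "
          f"{balance_point:.0f} -> {bound}")
```

Under these assumptions decode sits two orders of magnitude below the balance point, which is why SRAM-heavy designs target it while GPUs retain the advantage on prefill.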
Not all startups agree on the disaggregated approach. Tenstorrent, which unveiled its RISC-V-based Galaxy Blackhole compute platforms this week, is building a more general-purpose solution. CEO Jim Keller argued that pairing different chips for different inference stages creates unnecessary complexity and fragility as AI models evolve.
What Does Fractile's Funding and Timeline Mean for the Market?
Fractile raised $15 million in seed funding from Kindred Capital, the NATO Innovation Fund, and Oxford Science Enterprises. The company is now in talks to raise $200 million at a valuation exceeding $1 billion, with venture firms including Founders Fund, 8VC, and Accel reportedly interested. The team includes engineers from Graphcore, Nvidia, and Imagination Technologies, giving the startup deep expertise in chip design and AI workloads.
The 2027 timeline for commercial availability aligns with Anthropic's broader infrastructure strategy. The company's Google TPU deal doesn't deliver full capacity until 2027, suggesting Anthropic is planning its chip portfolio years in advance. By the time Fractile's chips are ready, the inference market will likely be far larger and more competitive than it is today, but the company's technology addresses a real bottleneck that Nvidia itself has acknowledged by acquiring Groq.
Fractile's approach of building its own software stack alongside the hardware is also significant. Specialized chips only succeed if developers can easily program them. By developing software tools in parallel, Fractile is trying to avoid the adoption barriers that have limited some previous AI chip startups.
The inference chip market is no longer a niche. As AI models become commoditized and competition intensifies on cost-per-token, companies like Anthropic are hedging their bets across multiple hardware vendors. Fractile's talks with Anthropic suggest that even unproven startups with compelling technology can attract serious interest from AI leaders, provided they solve a genuine problem in the inference pipeline.