Miami Startup Claims It Finally Cracked the Code on Faster AI Models
A Miami-based startup called Subquadratic claims to have solved a mathematical bottleneck that has constrained large language models (LLMs) for nearly a decade. The company emerged from stealth mode last month with bold assertions about its new model, called SubQ, which it says is faster, cheaper, and more energy-efficient than existing systems from Google DeepMind, OpenAI, and Anthropic. After initial skepticism, Subquadratic has begun releasing independent test results that appear to validate its core claims.
What Problem Is Subquadratic Actually Solving?
To understand why Subquadratic's breakthrough matters, it helps to know how most modern AI language models work. The key mechanism inside an LLM is called a transformer, which uses a process known as dense attention. When a transformer processes text, it encodes each word (or token) as a list of numbers, then compares every token with every other token to capture meaning across the entire text. For a 10,000-word document, that means nearly 50 million distinct word pairs.
The problem is that as text gets longer, the number of computations skyrockets. Double the words, and you roughly quadruple the computations. This quadratic expansion is why LLMs consume enormous amounts of energy and are expensive to run. Subquadratic's solution ditches dense attention in favor of sparse attention, which selects only some word relationships to multiply instead of all of them. The idea is straightforward: not every word pair matters equally when understanding a document.
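The scaling described above can be checked with a few lines of arithmetic. This is a minimal sketch that counts each unordered pair of words once; the function name `pair_count` is illustrative, and real transformers do far more work per pair than a single operation.

```python
def pair_count(n_words: int) -> int:
    """Number of distinct word pairs in a text of n_words words."""
    return n_words * (n_words - 1) // 2


base = pair_count(10_000)      # pairs in a 10,000-word document
doubled = pair_count(20_000)   # pairs after doubling the length
growth = doubled / base        # how much the work grows

print(f"10,000 words: {base:,} pairs")
print(f"20,000 words: {doubled:,} pairs")
print(f"growth factor: {growth:.4f}")
```

The first figure comes to 49,995,000, which is the "nearly 50 million" cited for a 10,000-word document, and doubling the length multiplies the pair count by almost exactly four.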
"Sparse attention says not all of those relationships are important, because they're not. If you're reading a book, you're not going to look at the first and second words, first and third, that's insane," said Alex Whedon, cofounder and chief technology officer at Subquadratic.
Sparse attention is not a new idea. Researchers have attempted it for years, but previous approaches used fixed patterns that couldn't match the performance of dense attention models. Subquadratic claims it has cracked the problem by dynamically selecting which word relationships matter for each piece of text, rather than using rigid rules.
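The difference between a fixed pattern and a dynamic one can be illustrated with a toy example. Subquadratic has not published its selection method, so the sketch below is not the company's approach. It is a generic illustration, and the function names, the relevance scores, and the "keep the top k partners" rule are all assumptions made for the example. Note also that this toy scores every pair before choosing, so it shows which pairs are kept, not how to save the work of scoring them.

```python
def fixed_window_pairs(n_tokens, window):
    """Fixed pattern: each token looks only at neighbours within `window` positions."""
    return {
        (i, j)
        for i in range(n_tokens)
        for j in range(n_tokens)
        if i != j and abs(i - j) <= window
    }


def top_k_pairs(scores, k):
    """Dynamic pattern (hypothetical rule): each token keeps its k highest-scoring partners."""
    kept = set()
    for i, row in enumerate(scores):
        candidates = [j for j in range(len(row)) if j != i]
        best = sorted(candidates, key=lambda j: row[j], reverse=True)[:k]
        kept.update((i, j) for j in best)
    return kept


# Made-up relevance scores for a five-token text. Tokens 0 and 4 are far
# apart but strongly related (score 9).
scores = [
    [0, 1, 0, 0, 9],
    [1, 0, 2, 0, 0],
    [0, 2, 0, 3, 0],
    [0, 0, 3, 0, 1],
    [9, 0, 0, 1, 0],
]

fixed = fixed_window_pairs(n_tokens=5, window=1)
dynamic = top_k_pairs(scores, k=1)

print("fixed window keeps (0, 4):", (0, 4) in fixed)
print("top-k keeps (0, 4):", (0, 4) in dynamic)
print("pairs kept:", len(fixed), "fixed vs", len(dynamic), "dynamic, out of 20 possible")
```

In this toy case the fixed window misses the long-range link between the first and last tokens, while the content-based rule keeps it and still evaluates fewer pairs overall.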
How Do the Independent Test Results Look?
Subquadratic's initial claims were met with heavy skepticism because the company provided few independent benchmarks. The turning point came when the startup asked Appen, a third-party AI evaluation firm, to test SubQ. The results appear to support Subquadratic's assertions across multiple dimensions.
- Speed Performance: In a baseline speed test, Appen found that SubQ was 56 times faster than models using FlashAttention, a widely used technique that speeds up dense attention without changing its results.
- Coding Ability: On LiveCodeBench, a test measuring performance on competitive coding problems from real contests, SubQ scored 89.7%, placing it in the same range as other top coding models.
- Cost Efficiency: According to Subquadratic CEO Justin Dangel, running Anthropic's Opus 4.6 model through a standard test cost $2,600, while SubQ completed the same test for $8.
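To put the cost figures in proportion, the short calculation below works out the ratio implied by the two numbers Dangel cited. It uses only those two figures; the variable names are illustrative, and the result says nothing about costs on other workloads.

```python
opus_cost_usd = 2600   # cost cited for Anthropic's Opus 4.6 on the test
subq_cost_usd = 8      # cost cited for SubQ on the same test

cost_ratio = opus_cost_usd / subq_cost_usd
savings_pct = (1 - subq_cost_usd / opus_cost_usd) * 100

print(f"SubQ was {cost_ratio:.0f} times cheaper on this test")
print(f"That is a {savings_pct:.1f}% reduction in cost")
```

On the company's own figures, that is a 325-fold difference, or a cost reduction of about 99.7%.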
"That was really exciting to me, it validated their architecture. I was like, 'Wow, this could be a game changer,' because models struggle with speed and inefficiency," said Jeanine Sinanan-Singh, director of generative AI research at Appen.
Sinanan-Singh added that independent verification was crucial for credibility. "When you have kind of shocking results, it's really not as credible when you say it yourself," she noted.
What Makes SubQ Different From Other Models?
SubQ's most striking capability is handling massive amounts of text at once. The model has a context window, roughly equivalent to working memory, of up to 12 million tokens. Most leading models today have context windows of around one million tokens. At a common rule of thumb of about three-quarters of an English word per token, that works out to roughly 9 million words at once for SubQ, compared with roughly 750,000 words for competitors.
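The context-window figures also show why such a jump is hard for dense attention. As a rough sketch, assuming the work grows with the number of token pairs (the helper name `token_pairs` is illustrative), a window 12 times larger implies far more than 12 times the work:

```python
def token_pairs(n_tokens: int) -> int:
    """Distinct token pairs that dense attention would have to consider."""
    return n_tokens * (n_tokens - 1) // 2


typical_window = 1_000_000    # context window cited for most leading models
subq_window = 12_000_000      # context window cited for SubQ

work_ratio = token_pairs(subq_window) / token_pairs(typical_window)

print(f"window is {subq_window // typical_window} times larger")
print(f"dense attention would need about {work_ratio:.0f} times as many pairs")
```

Under that assumption, a 12-fold increase in context length means roughly 144 times as many pairs, which is the kind of growth sparse attention is meant to avoid.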
In a demonstration, Subquadratic's Whedon asked SubQ to reason about information contained in 400 documents and received a response in seconds. This capability opens possibilities for data-heavy tasks like analyzing entire code bases or summarizing hundreds of documents at once.
The company has not disclosed exactly how SubQ selects which word relationships to focus on, describing this as proprietary. However, Whedon explained that the selection is calculated dynamically for each piece of text, unlike previous sparse-attention methods that relied on fixed patterns.
How to Evaluate Claims About New AI Breakthroughs
- Seek Independent Verification: Look for third-party testing from established evaluation firms rather than relying solely on a company's self-published benchmarks, which can be cherry-picked to show favorable results.
- Compare Across Multiple Metrics: A model might excel at speed but underperform on accuracy, or vice versa. Evaluate whether improvements across different dimensions are meaningful for your use case.
- Check Real-World Availability: Assess whether the technology is actually available for testing and deployment, not just announced in theory, since many AI breakthroughs take years to reach practical use.
What's Next for Subquadratic and the Broader AI Industry?
Subquadratic insists that its breakthrough could fundamentally reshape how LLMs are built going forward. The company's leadership believes that dense-attention transformers, which have dominated AI development since Google researchers published the foundational "Attention Is All You Need" paper in 2017, may become obsolete.
"We hope we're kicking off a new age of efficiency. We don't think anybody will be building on transformers in a few years," said Justin Dangel, cofounder and CEO at Subquadratic.
However, SubQ is not yet widely available for public testing, which limits independent verification. Subquadratic acknowledges that releasing third-party benchmarks alongside its initial announcement would have preempted skepticism. The company says it is taking time to ensure future results are fully verified before publication.
For now, SubQ appears positioned as a specialized tool for specific use cases rather than a universal replacement for existing models. Its advantages in speed and cost efficiency could make it valuable for organizations processing large documents or code, but it may not displace general-purpose models from OpenAI, Google, or Anthropic across all applications. The independent test results suggest Subquadratic's claims deserve serious attention, though broader adoption will depend on wider availability and continued validation from the research community.