CoreWeave Deploys Nvidia's Most Advanced AI Chip, Delivering 10x Better Efficiency for AI Teams
CoreWeave has become the first AI cloud provider to bring Nvidia's newest and most powerful AI platform, the Vera Rubin NVL72, into production at full scale. The milestone means that AI companies relying on CoreWeave's infrastructure can now run their models far more efficiently, processing requests faster at significantly lower cost. This matters because, as AI models grow larger and more complex, the cost and speed of running them in production have become the biggest bottleneck keeping AI companies from scaling.
What Makes Vera Rubin Different From Previous AI Chips?
Nvidia's Vera Rubin NVL72 represents a significant leap forward in AI infrastructure. Each rack contains 72 Nvidia Rubin GPUs (graphics processing units, the specialized chips that power AI) paired with 36 Nvidia Vera CPUs (central processing units), all connected through ultra-fast networking that moves data at 260 terabytes per second. Compared with Nvidia's previous-generation Blackwell systems, the rack delivers up to 10 times better inference performance per watt of electricity, needs as few as one-fourth as many GPUs to accomplish the same work, and costs as little as one-tenth as much per million tokens processed.
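To make the headline ratios concrete, the short Python sketch below applies the article's "up to 10 times" per-watt figure and "one-tenth" cost-per-million-tokens figure to a made-up workload. The baseline numbers (monthly token volume, price, and energy per million tokens) are purely illustrative assumptions, not published Blackwell or Vera Rubin figures, and the output is a best-case bound rather than a forecast.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical baseline workload on previous-generation hardware."""
    tokens_per_month: float        # total tokens served per month (assumed)
    cost_per_million_usd: float    # baseline cost per million tokens (assumed)
    energy_kwh_per_million: float  # baseline energy per million tokens (assumed)


def best_case_projection(w: Workload) -> dict:
    """Apply the article's 'up to' ratios: 10x per watt, one-tenth the token cost."""
    millions = w.tokens_per_month / 1_000_000
    baseline_cost = millions * w.cost_per_million_usd
    baseline_energy = millions * w.energy_kwh_per_million
    return {
        "baseline_cost_usd": baseline_cost,
        "projected_cost_usd": baseline_cost / 10,      # one-tenth cost per million tokens
        "baseline_energy_kwh": baseline_energy,
        "projected_energy_kwh": baseline_energy / 10,  # 10x inference per watt
    }


# Illustrative inputs only: 5 billion tokens a month at $2.00 and 0.5 kWh per million.
example = Workload(
    tokens_per_month=5_000_000_000,
    cost_per_million_usd=2.0,
    energy_kwh_per_million=0.5,
)
for name, value in best_case_projection(example).items():
    print(f"{name}: {value:,.2f}")
```

Because both ratios are quoted as upper bounds, real savings would depend on the model, batch sizes, and utilization, none of which this sketch attempts to capture.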
Inference is the technical term for running an AI model after it has been trained. Think of it like the difference between teaching someone a skill versus watching them use that skill repeatedly. Training happens once; inference happens millions of times. As AI models become more sophisticated and companies deploy them to handle real-world tasks continuously, inference performance has emerged as the defining constraint on how quickly AI companies can operate and grow.
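The "train once, run millions of times" point can be shown with a few lines of arithmetic. The Python sketch below computes what share of a model's lifetime compute bill goes to inference as request volume grows. The training cost and per-request cost are invented for illustration and do not come from CoreWeave or Nvidia.

```python
def inference_share(training_cost_usd: float,
                    cost_per_request_usd: float,
                    requests: int) -> float:
    """Fraction of lifetime cost spent on inference rather than training."""
    inference_cost = cost_per_request_usd * requests
    return inference_cost / (training_cost_usd + inference_cost)


# Assumed figures: a one-time $1M training run and $0.002 per served request.
TRAINING_COST_USD = 1_000_000.0
COST_PER_REQUEST_USD = 0.002

for requests in (1_000_000, 100_000_000, 1_000_000_000):
    share = inference_share(TRAINING_COST_USD, COST_PER_REQUEST_USD, requests)
    print(f"{requests:>13,} requests -> inference is {share:.1%} of total cost")
```

Under these assumptions, inference is a rounding error at a million requests but accounts for roughly two-thirds of the total at a billion, which is why per-token efficiency matters more as usage grows.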
How Did CoreWeave Engineer Vera Rubin for Real-World Production?
Bringing a cutting-edge processor to production at scale requires far more than just plugging in new hardware. CoreWeave developed several purpose-built innovations specifically designed to make Vera Rubin perform reliably in production environments where downtime is costly:
- Software-Defined Liquid Cooling: CoreWeave created a system called Valvey, a programmable valve assembly that transforms cooling from a passive mechanical system into a software-controlled surface. It monitors flow rate, temperature, and pressure and detects leaks in real time, enabling automated isolation and emergency shutdown without disrupting neighboring equipment on shared cooling loops.
- Unified Rack Control: A new appliance called Racky aggregates power, cooling, and environmental sensors into a standardized management interface, allowing each Vera Rubin rack to be managed as a cloud resource rather than a custom one-off build.
- Advanced Networking Architecture: CoreWeave supports both Nvidia Quantum-X800 InfiniBand and Nvidia Spectrum-X Ethernet with RDMA over Converged Ethernet (RoCE), delivering 1.6 terabits per second of backend bandwidth per GPU and scaling to configurations of hundreds of thousands of GPUs.
- Enhanced Security and Isolation: CoreWeave is advancing secure, multi-tenant AI cloud operations using Nvidia BlueField-4 DPUs (data processing units), enabling faster data access, lower latency, and stronger tenant isolation at scale.
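CoreWeave has not published how Valvey's control logic works, so the Python sketch below is only a guess at the general shape of software-defined cooling: each rack's readings are evaluated independently, so a fault in one rack triggers an action for that rack alone. Every name, threshold, and rule here (including mapping a leak to emergency shutdown and an out-of-range reading to isolation) is a hypothetical assumption, not CoreWeave's implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    ISOLATE_RACK = "isolate_rack"
    EMERGENCY_SHUTDOWN = "emergency_shutdown"


@dataclass(frozen=True)
class CoolingReading:
    """One snapshot of the signals the article lists (field names are assumed)."""
    rack_id: str
    flow_lpm: float        # coolant flow, litres per minute
    supply_temp_c: float   # coolant supply temperature, Celsius
    pressure_kpa: float    # loop pressure, kilopascals
    leak_detected: bool


@dataclass(frozen=True)
class Limits:
    """Hypothetical safe operating limits; real values would be site-specific."""
    min_flow_lpm: float = 40.0
    max_supply_temp_c: float = 45.0
    max_pressure_kpa: float = 400.0


def decide(reading: CoolingReading, limits: Limits = Limits()) -> Action:
    """Pick an action for a single rack based only on that rack's reading."""
    if reading.leak_detected:
        return Action.EMERGENCY_SHUTDOWN  # assumed: leaks are treated as most severe
    out_of_range = (
        reading.flow_lpm < limits.min_flow_lpm
        or reading.supply_temp_c > limits.max_supply_temp_c
        or reading.pressure_kpa > limits.max_pressure_kpa
    )
    return Action.ISOLATE_RACK if out_of_range else Action.NONE


def plan(readings: list[CoolingReading], limits: Limits = Limits()) -> dict[str, Action]:
    """Evaluate racks on a shared loop one by one; healthy neighbors get NONE."""
    return {r.rack_id: decide(r, limits) for r in readings}


loop = [
    CoolingReading("rack-a", flow_lpm=55.0, supply_temp_c=32.0, pressure_kpa=250.0, leak_detected=False),
    CoolingReading("rack-b", flow_lpm=20.0, supply_temp_c=33.0, pressure_kpa=250.0, leak_detected=False),
    CoolingReading("rack-c", flow_lpm=55.0, supply_temp_c=32.0, pressure_kpa=250.0, leak_detected=True),
]
for rack_id, action in plan(loop).items():
    print(rack_id, action.value)
```

The per-rack decision function is the design point worth noting: because no rack's action depends on another rack's state, a low-flow or leaking rack can be handled without touching healthy equipment on the same loop.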
"The agentic era demands a fundamentally different approach to infrastructure, one that keeps pace with workloads that reason continuously, scale unpredictably, and operate in production around the clock," said Chen Goldberg, EVP of product and engineering at CoreWeave. "What separates infrastructure that performs in a lab from infrastructure that performs in production is the depth of engineering underneath it. With patent-pending innovations like Valvey and Racky, CoreWeave has done the full-stack orchestration work to enable Vera Rubin to perform the way it was designed to, not just in a lab, but at production scale for the world's most demanding AI teams."
Who Is Already Using This Technology?
Jane Street, a major quantitative trading firm, is among the early adopters of CoreWeave's Vera Rubin infrastructure. The company has already scaled across Nvidia's Hopper and Blackwell generations and is now partnering with CoreWeave on Vera Rubin deployment. It expects the efficiency gains at rack scale to translate directly into faster training runs and shorter iteration cycles, letting its researchers experiment with and refine AI models more quickly.
"Our research depends on infrastructure that's both powerful and reliable, and CoreWeave has delivered on this as we've scaled across Nvidia Hopper and Blackwell," said Craig Falls, head of quantitative research at Jane Street. "Their ability to deliver highly performant clusters with full cluster observability and a support team that engages deeply on hard problems gives us the confidence to partner with them on Vera Rubin. We are excited about the efficiency gains at rack scale translating into faster training runs and shorter iteration cycles for our researchers."
CoreWeave has also posted strong benchmark results. The company achieved the top Platinum ranking in both the SemiAnalysis ClusterMAX 1.0 and 2.0 evaluations, and it ranked first for inference speed and price-performance on Moonshot AI's Kimi K2.6 model in independent inference benchmarking conducted by Artificial Analysis.
What Role Did Hardware Partners Play?
Bringing Vera Rubin to production required collaboration across the entire infrastructure stack. Dell Technologies provided the architectural backbone through its PowerEdge XE9812 servers, which were engineered specifically for the density and precision required by this deployment. Micron contributed its 7600 SSDs (solid-state drives), delivering improved energy efficiency through one of the first liquid-cooled NVMe storage solutions deployed at rack scale.
"Dell Technologies and CoreWeave share a commitment to delivering innovation that performs at the frontier of what AI demands," said Michael Dell, chairman and CEO of Dell Technologies. "The PowerEdge XE9812 was engineered for exactly this kind of density and precision. Working with CoreWeave to bring up the first Nvidia Vera Rubin NVL72 rack is a direct validation of what enterprise-grade hardware can do when it's paired with the right operational expertise."
This deployment represents a significant milestone in AI infrastructure. As agentic AI systems become more prevalent, requiring continuous reasoning and persistent sessions, the demand for efficient inference infrastructure will only grow. CoreWeave's successful bring-up of Vera Rubin at production scale demonstrates that the next generation of AI infrastructure is ready to support the most demanding workloads the industry can throw at it.