NVIDIA's Jensen Huang Declares the Age of Agents Has Arrived. Here's What That Means for Your Work
NVIDIA is shifting its entire product strategy around autonomous AI agents that can observe, reason, plan and act with minimal human input, marking what CEO Jensen Huang calls the beginning of a new era in computing. At the company's GTC Taipei conference on June 1, Huang unveiled a comprehensive hardware and software ecosystem designed specifically for agents rather than traditional human-operated systems, signaling a fundamental change in how enterprises will deploy artificial intelligence.
What Is NVIDIA Betting on With This "Age of Agents"?
Huang framed nearly every product announcement around software agents that operate continuously with little human oversight. He pointed to a concrete metric to back the claim: developer commits on coding platforms have nearly tripled in the first months of 2026, suggesting that agents are already performing useful work in real-world environments. This shift reflects a broader belief that compute itself has become a direct revenue source for businesses, not just an operational cost.
The centerpiece of NVIDIA's announcement is the Vera Rubin platform, a five-rack data center system that the company treats as one unified computer optimized for agentic workloads. According to NVIDIA, Vera Rubin delivers up to 10 times higher inference performance per watt than its predecessor, at as little as one-tenth the cost per token. When paired with Groq 3 LPX inference trays, the system achieves up to 35 times higher throughput per watt for trillion-parameter models, meaning it can serve very large AI models with far less energy per token than before.
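To make those ratios concrete, the short Python sketch below applies them to a baseline. The baseline figures (2.00 dollars per million tokens, 50 tokens per joule) are hypothetical placeholders chosen for illustration, not NVIDIA numbers, and the function names are invented for this example. Only the "up to 10 times" and "up to 35 times" multipliers come from the announcement, and they are upper bounds claimed by the vendor.

```python
def cost_after_reduction(baseline_cost: float, reduction_factor: float) -> float:
    """Cost per million tokens after an N-times cost reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return baseline_cost / reduction_factor


def efficiency_after_gain(baseline_tokens_per_joule: float, gain: float) -> float:
    """Tokens per joule after an N-times performance-per-watt gain.

    Performance per watt is (tokens / second) / watt, which is the same
    quantity as tokens per joule.
    """
    if gain <= 0:
        raise ValueError("gain must be positive")
    return baseline_tokens_per_joule * gain


# Hypothetical baseline, for illustration only.
BASELINE_COST_PER_MILLION = 2.00   # dollars per million tokens (assumed)
BASELINE_TOKENS_PER_JOULE = 50.0   # assumed

# Multipliers quoted in the announcement ("up to" figures).
new_cost = cost_after_reduction(BASELINE_COST_PER_MILLION, 10)
new_efficiency = efficiency_after_gain(BASELINE_TOKENS_PER_JOULE, 10)
lpx_efficiency = efficiency_after_gain(BASELINE_TOKENS_PER_JOULE, 35)

print(f"cost per million tokens: {new_cost:.2f}")
print(f"tokens per joule at 10x: {new_efficiency:.0f}")
print(f"tokens per joule at 35x: {lpx_efficiency:.0f}")
```

Because the claims are stated as "up to" figures, a buyer would substitute measured values from their own workloads for both the baseline and the multipliers.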
The Vera CPU, NVIDIA's first standalone data center processor, contains 88 cores and was specifically designed for agents rather than human operators. Huang argued that billions of agents running continuously demand far lower latency than people do, creating an entirely new processor market that did not previously exist.
How Is NVIDIA Bringing Agents to Personal Computers and Devices?
NVIDIA is not limiting agents to data centers. The company introduced RTX Spark, a chip built with MediaTek that brings 1 petaflop of AI performance to Windows laptops and compact desktops. RTX Spark combines a 20-core Grace CPU with a Blackwell RTX graphics processor that has 6,144 CUDA cores (specialized computing units designed for parallel processing), positioning it as the foundation for personal computers that run agents locally without relying on cloud servers.
To support this vision, NVIDIA unveiled an entire Windows lineup including a laptop, an always-on desktop agent box, and a deskside DGX Station capable of running frontier models of up to 1 trillion parameters directly on a desk. Partners including Asus, Dell, Gigabyte, HP, MSI and Supermicro began shipping DGX Station systems in June 2026. Adobe is rebuilding Photoshop and Premiere for RTX Spark, with versions that NVIDIA says run twice as fast and integrate with agents.
What Software and Tools Support These Agents?
NVIDIA introduced a comprehensive software stack to make agents practical for enterprises. The company released an Agent Toolkit that bundles models, an agent harness and an enterprise runtime. A separate secure runtime called OpenShell isolates each agent and enforces policy, addressing governance and security concerns that arise when autonomous systems operate at scale.
On the model side, NVIDIA released Nemotron 3 Ultra, a 550-billion-parameter mixture-of-experts model (a type of AI architecture that activates different specialized sub-models depending on the input). According to NVIDIA, Nemotron 3 Ultra runs inference five times faster and costs about 30 percent less than leading open alternatives. Verified NVIDIA agent skills are now available inside the Claude Code plug-in marketplace and the Hermes Skills Hub, making it easier for developers to build on existing tools.
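For readers unfamiliar with the mixture-of-experts idea, the Python sketch below shows one common form of it, top-k gating: a small gate scores every expert for a given input, only the k best-scoring experts run, and their outputs are blended by normalized gate weights. This is a generic illustration, not a description of how Nemotron 3 Ultra routes tokens, which the announcement does not detail. Every name here (`route_top_k`, `moe_forward`, the toy experts) is hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[Sequence[float]], List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their weights."""
    if not 1 <= k <= len(gate_scores):
        raise ValueError("k must be between 1 and the number of experts")
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x: Sequence[float],
                gate_weights: Sequence[Sequence[float]],
                experts: Sequence[Expert],
                k: int = 2) -> List[float]:
    """Run only the selected experts and blend their outputs."""
    # One gate score per expert: dot product of its gate vector with the input.
    gate_scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    output = [0.0] * len(x)
    for index, weight in route_top_k(gate_scores, k):
        expert_out = experts[index](x)  # experts not selected never execute
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output


# Toy experts that simply scale the input (illustrative only).
toy_experts: List[Expert] = [
    lambda v: [1.0 * a for a in v],
    lambda v: [2.0 * a for a in v],
    lambda v: [3.0 * a for a in v],
]
toy_gate = [[1.0, 0.0], [5.0, 0.0], [-1.0, 0.0]]

print(moe_forward([1.0, 0.0], toy_gate, toy_experts, k=1))
```

The efficiency argument follows from the routing step: the total parameter count can be very large while the compute spent on each input is limited to the few experts the gate selects.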
The Four Layers of NVIDIA's Agent Infrastructure Strategy
- Data Center Layer: Vera Rubin combines multiple specialized components including Vera CPUs, Rubin GPUs, Groq 3 LPX inference trays, Spectrum-6 Ethernet racks and BlueField-4 storage into a single unified system optimized for continuous agent operation.
- Personal Computing Layer: RTX Spark chips bring AI inference capabilities directly to laptops and desktops, allowing agents to run locally without cloud connectivity, reducing latency and improving privacy.
- Software and Runtime Layer: Agent Toolkit, OpenShell, and Nemotron 3 Ultra provide the models, isolation mechanisms and enterprise controls needed to deploy agents safely at scale across organizations.
- Integration Layer: Verified agent skills in Claude Code and Hermes Skills Hub allow developers to build on NVIDIA's infrastructure without starting from scratch.
NVIDIA also extended its agent vision into robotics and autonomous vehicles. The company launched Cosmos 3, an open world foundation model built on a mixture-of-transformers design that learns from teleoperation, simulation and re-projected video so robots can reason about their surroundings. The Drive Hyperion vehicle platform now reaches services representing about 97 percent of the world's mobility market, and NVIDIA introduced Alpamayo 2 Super, an open reasoning model for self-driving research paired with a reinforcement learning trainer and scenario generator.
At GTC Taipei, Huang said the company now sells AI infrastructure rather than chips alone, and he argued that compute has become a direct source of revenue for the businesses that buy it.
The scale of NVIDIA's supply chain expansion underscores the company's confidence in this direction. Huang stated that the Vera Rubin supply chain is twice the size of the prior Grace Blackwell effort, spanning 150 partners in Taiwan and more than 350 factories across 30 countries. Production shipments are scheduled to begin in fall 2026.
However, significant questions remain unanswered. Most of the performance numbers come from NVIDIA and have not been independently tested. Vera Rubin will not ship in volume until fall 2026, so buyers cannot yet validate the cost-per-token claims in their own workloads. The new Windows machines and RTX Spark systems also arrive later in 2026, leaving their software ecosystem and agent tooling unproven outside controlled demonstrations. Enterprise agent runtimes raise governance and security questions that products such as OpenShell address in principle but have not faced at production scale.
Competition is intensifying as well. AMD is pushing its Instinct accelerators, and cloud providers are expanding custom silicon such as AWS Trainium, Google Ironwood and Microsoft Maia. For technology decision makers, the keynote sharpened a choice that will define AI budgets over the coming years. Huang argued that performance per watt and the runtime that surrounds the model now matter as much as the chip itself, which means architecture decisions made over the next year will shape both capability and cost long after the hardware lands.