Nvidia overhauls the data center for OpenClaw era

Credit: Network World

Recalling the classic data center during his GTC keynote, Nvidia CEO Jensen Huang said it “used to be … for files. It’s now a factory to generate tokens.”

Those old buildings are gone, replaced by mega installations powering AI. At GTC, Nvidia laid out a forward-looking data-center architecture with new chips, storage and networking technologies.

Nvidia is also looking outward, to the edge and to space, to make AI faster and more efficient. The goal of the AI-driven data-center architecture is to reduce the cost of generating tokens, the currency of AI: the units of data that AI models process during training and inference.
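To make the token idea concrete, here is a deliberately naive sketch. Production models use learned subword tokenizers (such as byte-pair encoding), not whitespace splitting; this toy version only illustrates that text is converted into discrete token IDs before a model ever processes it, and that token counts are what a "token factory" is metering.

```python
# Illustrative sketch only: real tokenizers are learned subword models,
# not whitespace splitters. This shows the idea that text becomes a
# sequence of integer token IDs, which is what AI models actually consume.

def build_vocab(corpus: list[str]) -> dict[str, int]:
    """Assign an integer ID to every distinct whitespace-separated word."""
    vocab: dict[str, int] = {}
    for text in corpus:
        for word in text.split():
            vocab.setdefault(word, len(vocab))
    return vocab

def tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Map text to token IDs; each ID is one 'token' the model processes."""
    return [vocab[word] for word in text.split() if word in vocab]

corpus = ["the data center is now a factory", "a factory to generate tokens"]
vocab = build_vocab(corpus)
ids = tokenize("a factory to generate tokens", vocab)
print(ids)  # five tokens, one integer ID per word
```

Every prompt, agent step, and model response is billed and scheduled in these units, which is why data-center economics are now framed as cost per token.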

“The greatest infrastructure buildout in history is underway. The world is racing to build chip systems and AI factories, and every month of delay costs billions in lost revenues,” Huang said.

The integrated blueprint has five layers: physical infrastructure, followed by silicon, the software and systems, AI models, and applications.

“Nvidia’s making a big push into helping build out AI data centers, and that’s critically important as the cost and degree of difficulty is going up dramatically,” said Jack Gold, principal analyst at J. Gold Associates.

Nvidia’s products for data centers now encompass a full stack with all the pieces, said Sandip Gupta, executive managing director and head of global strategic alliances at NTT Data. “From a customer perspective, if they believe in an integrated stack, it makes things simple,” Gupta said.

The integrated data center cuts complexity and improves efficiency across cooling, networking and storage. “It is driven by the sentiment of an enterprise on how dependent they want to be on one provider versus mix and match,” Gupta said.

AI complexity has multiplied with multi-agent systems and technologies like OpenClaw, which Huang said is as big a deal as HTML and Linux. Those technologies will generate tokens at an unprecedented pace, straining network, memory and storage simultaneously.

AI data also has context, and moving it inefficiently wastes both power and money. A new networking and storage layer is needed to move data intelligently and efficiently. A technology called the KV cache (key-value cache) holds the contextual memory needed to process agentic AI workloads.

“It’s going to pound on memory really hard… It’s going to be pounding on the storage system really really hard, which is the reason why we reinvented the storage system,” Huang said.
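The KV cache Huang refers to can be sketched in simplified form: during generation, each token's attention keys and values are stored so earlier context is never recomputed, and the cache grows with every token of context. This is a generic illustration of the technique, not Nvidia's implementation; the class and method names are invented for the example.

```python
import math

class KVCache:
    """Simplified sketch of a KV cache: per-token attention keys and
    values are stored so past context never has to be re-encoded.
    The cache grows linearly with context length, which is why long
    agentic sessions 'pound on' memory and storage."""

    def __init__(self):
        self.keys: list[list[float]] = []
        self.values: list[list[float]] = []

    def append(self, k: list[float], v: list[float]) -> None:
        """Store one generated token's key/value vectors."""
        self.keys.append(k)
        self.values.append(v)

    def attend(self, query: list[float]) -> list[float]:
        """Softmax-weighted sum over all cached values: attention over
        the full context with no recomputation of past keys/values."""
        d = len(query)
        scores = [sum(qi * ki for qi, ki in zip(query, k)) / math.sqrt(d)
                  for k in self.keys]
        m = max(scores)
        ws = [math.exp(s - m) for s in scores]
        total = sum(ws)
        return [sum(w * v[i] for w, v in zip(ws, self.values)) / total
                for i in range(d)]

cache = KVCache()
cache.append([1.0, 0.0], [0.0, 1.0])  # token 1's key/value
cache.append([0.0, 1.0], [1.0, 0.0])  # token 2's key/value
out = cache.attend([1.0, 0.0])        # new query attends over both
```

In a real deployment these vectors are large tensors per layer and per attention head, so a long-context agent can hold gigabytes of KV state per session, which is what pushes the cache out of GPU memory and onto the storage tier.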

Nvidia’s blueprint turns data centers into one giant AI GPU. It is spearheaded by the Rubin GPU and Vera CPU, both announced at GTC. Nvidia also slipped in a new inference chip: the Groq LPU, which has significantly more memory bandwidth than GPUs and is designed for low-latency token generation.

The new Vera Rubin NVL72 server combines Rubin’s extreme speed and the Groq LPU’s memory bandwidth, said Ian Buck, vice president and general manager at Nvidia, during a press briefing.

AI demands real-time access to data and contextual memory, and traditional data centers lack the responsiveness needed by AI agents, Buck said.

The GPU maker has doubled the speed of its NVLink interconnect to 260 terabytes per second. Nvidia also introduced the BlueField-4 STX rack platform for AI-native storage, which extends GPU memory across the system to extract key contextual AI data.

“We used to have humans using the storage systems. We used to have humans using SQL. Now we’re going to have AIs using these storage systems,” Huang said.

Nvidia introduced software called Dynamo that orchestrates the GPU, LPU, CPU, memory and storage layers as an integrated system.
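The orchestration idea can be sketched as a scheduler that routes each inference phase to the hardware best suited to it: compute-heavy prefill to GPUs, bandwidth-heavy token-by-token decode to an LPU-class chip. This is a hypothetical illustration of the concept only; none of the class or method names below come from Nvidia's actual Dynamo software.

```python
# Hypothetical sketch of phase-aware routing, the concept behind
# orchestrating heterogeneous inference hardware as one system.
# All names here are invented for illustration, not Nvidia APIs.

from dataclasses import dataclass

@dataclass
class Worker:
    name: str
    kind: str   # "gpu" (compute-heavy prefill) or "lpu" (bandwidth-heavy decode)
    load: int = 0

class Orchestrator:
    def __init__(self, workers: list[Worker]):
        self.workers = workers

    def route(self, phase: str) -> Worker:
        """Send prefill to GPUs and decode to LPUs, choosing the
        least-loaded worker of the appropriate kind."""
        kind = "gpu" if phase == "prefill" else "lpu"
        worker = min((w for w in self.workers if w.kind == kind),
                     key=lambda w: w.load)
        worker.load += 1
        return worker

pool = Orchestrator([Worker("gpu-0", "gpu"), Worker("gpu-1", "gpu"),
                     Worker("lpu-0", "lpu")])
print(pool.route("prefill").name)  # prints "gpu-0"
print(pool.route("decode").name)   # prints "lpu-0"
```

The design point is that prefill and decode have opposite hardware profiles, so treating the whole rack as one schedulable system, rather than identical independent servers, raises utilization.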

Huang also said the world’s first Spectrum-X switch with co-packaged optics is in production. “We invented the process technology with TSMC,” Huang said, adding that “we’re the only one in production today.”