Storage constraints add to AI data center bottleneck

Credit: Network World

After GPUs, storage capacity has emerged as the next major constraint for AI data centers. Hard-drive lead times are ballooning to more than a year, and enterprise flash storage is also expected to see shortages and price increases, experts say. The crunch is being driven by an explosion in AI inference as trained models are put to use.

“AI inference isn’t just a GPU story, it’s a data story,” says Constellation analyst Chirag Mehta. “Expect tight supply into 2026, higher pricing, and a faster move to dense [flash storage] footprints, especially where power, space, and latency are constrained.”

Dell’Oro Group projects the storage drive market, encompassing both HDDs and SSDs, to grow at a CAGR of over 20% over the next five years. “Both technologies will continue to play distinct roles across different tiers of AI infrastructure storage,” says Dell’Oro Group analyst Baron Fung.

According to a TrendForce report released earlier this month, AI inference is creating huge demand for real-time data access, prompting both hard disk drive (HDD) and solid-state drive (SSD) suppliers to expand their high-capacity offerings.

For example, HDD manufacturers are moving to next-generation heat-assisted magnetic recording (HAMR), which requires substantial investment, and production lines aren’t yet at full speed. As a result, the average price per gigabyte of HDDs has risen, diminishing their cost advantage over SSDs. Meanwhile, flash storage vendors are pushing SSD capacities to 122 terabytes and higher, driving down per-gigabyte prices and improving power efficiency.

All of this is important because of the explosive growth in AI inference.

AI inference driving storage needs

AI inference is the computing that takes place when a query is sent to an AI model, and the AI model sends back an answer. On the enterprise side, this requires access to vector databases and other data sources used to enrich prompts for better results, known as retrieval-augmented generation (RAG). On the AI model side — whether with a third-party provider or a company’s own on-prem model — these longer prompts increase the storage requirements during inference.
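
To make that RAG step concrete, here is a minimal, self-contained Python sketch. The document set, the bag-of-words “embedding,” and the similarity scoring are toy stand-ins rather than any vendor’s API; a real deployment would use a learned embedding model and a vector database. The flow is the same, though: retrieve relevant documents, then prepend them to the prompt.

```python
import math
from collections import Counter

# Toy in-memory "vector store". In production this would be a vector
# database holding embeddings of enterprise documents.
DOCUMENTS = [
    "Q3 revenue grew 12% driven by cloud subscriptions.",
    "The storage cluster uses QLC SSDs for warm data tiers.",
    "HDD lead times have stretched past twelve months.",
]

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words vector. Real systems use a
    learned embedding model instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Enrich the user query with retrieved context -- the RAG step
    that drives extra reads against storage at inference time."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Why are HDD lead times so long?"))
```

Every call to retrieve() is a read against the storage layer; multiply that by reasoning models and agents that retrieve many times per task, and the storage pressure the analysts describe follows.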

And with new reasoning models and agentic AI systems, the number of interactions between data and AI models is only going to increase, putting even greater demands on storage systems.

Another driver? The falling price of each interaction. According to Stanford University’s AI Index report, inference costs are dropping by anywhere from nine-fold to 900-fold per year, depending on the task. And according to Air Street Capital’s State of AI report, released in October, the intelligence-to-cost ratio of Google’s flagship models is doubling every 3.4 months, while OpenAI’s is doubling every 5.8 months.
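
Those doubling times compound quickly. A short calculation, using only the doubling periods reported above, converts them into annual factors:

```python
# Converting "doubles every N months" into an annual growth factor:
# factor_per_year = 2 ** (12 / N). Doubling periods are from the
# reports cited above; the arithmetic is the only addition here.
for vendor, months in [("Google", 3.4), ("OpenAI", 5.8)]:
    factor = 2 ** (12 / months)
    print(f"{vendor}: intelligence-to-cost ratio grows ~{factor:.1f}x per year")
```

By that math, Google’s ratio improves roughly 11.5x per year and OpenAI’s roughly 4.2x.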

And the less something costs, the more people use it. For example, Google is now processing more than 1.3 quadrillion tokens per month, up from 10 trillion a year ago. OpenAI doesn’t release numbers on how many tokens it processes, but its revenues hit $4.3 billion in the first half of 2025, up from $3.7 billion for all of 2024, according to news reports.

In fact, there will be severe shortages in high-capacity hard disk drives next year, TrendForce predicts, with lead times surging from weeks to more than a year.

HDDs offer low costs and are typically used for cold storage — data that doesn’t need to be accessed with extremely low latency. SSDs offer better performance for warm and hot storage but come with a higher price tag. But because of HDD shortages, some data centers are shifting some of their cold storage to SSDs, according to TrendForce, and this might happen even more in the future as SSD prices come down and HDDs run into constraints.

“HDD bit output is difficult to increase,” says TrendForce analyst Bryan Ao. “AI will generate more data than the growth in HDD output. It is necessary to prepare for this with SSD storage.”

And 256-terabyte QLC SSDs are coming in 2028, he adds. QLC stands for “quad-level cell”: each cell stores four bits, one more than the triple-level cell (TLC) flash used in the previous generation of SSD storage. “Such large-capacity QLC solutions will be more cost-effective to compete with HDDs from a total cost of ownership and performance perspective,” Ao says.
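
The capacity gain from that extra bit is straightforward arithmetic: the same number of cells yields a third more capacity, as a one-liner shows.

```python
# QLC stores 4 bits per cell where TLC stores 3, so the same silicon
# yields 4/3 the capacity.
tlc_bits, qlc_bits = 3, 4
print(f"QLC capacity vs TLC, same cell count: {qlc_bits / tlc_bits:.2f}x")
```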

QLC is optimized for network-attached storage and can handle petabyte and exabyte-scale AI pipelines, according to Roger Corell, senior director for AI and leadership marketing at Solidigm, an SSD manufacturer. “QLC has tremendous savings in terms of space and power,” he says. “And when data center operators are focused on maximizing the power and space envelopes that they have for the AI data center build, they’re looking to get as efficient storage as they can.”

According to the TrendForce report, SSD manufacturers are increasing QLC SSD production, but AI workloads will also expand, “leading to tight supply conditions for enterprise SSDs by 2026.”

Corell says his company is seeing “very, very strong demand.”

But that doesn’t mean that SSDs are going to completely take over, he adds. “I think looking into 2026 and beyond it’s going to take a mix of SSDs and HDDs,” Corell says. “We do believe that there is a place for HDDs, but some of the demands for AI are clearly pointing to QLC as being the optimal storage for AI workloads.”

AI deployment uses multiple storage layers, and each one has different requirements, says Dell’Oro’s Fung. For storing massive amounts of unstructured, raw data, cold storage on HDDs makes more sense, he says. SSDs make sense for warm storage, such as for pre-processing data and for post-training and inference. “There’s a place for each type of storage,” he says.
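
A hypothetical tier map, following the roles Fung describes, might look like the sketch below. The labels and assignments are illustrative, not a formal standard.

```python
# Illustrative storage-tier map for an AI data pipeline, following the
# roles described above; labels and assignments are assumptions.
TIERS = {
    "cold": ("HDD", "massive unstructured raw data archives"),
    "warm": ("SSD", "pre-processing, post-training, and inference data"),
}

def plan(tier: str) -> str:
    medium, role = TIERS[tier]
    return f"{tier} tier -> {medium} ({role})"

for t in TIERS:
    print(plan(t))
```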

Planning ahead

According to Constellation’s Mehta, data center managers and other storage buyers should prepare by treating SSD procurement like they do GPUs. “Multi-source, lock in lanes early, and engineer to standards so vendor swaps don’t break your data path.” He recommends qualifying at least two vendors for both QLC and TLC and starting early.

TrendForce’s Ao agrees. “It is better to build inventory now,” he says. “It is difficult to lock in long-term deals with suppliers now due to tight supply in 2026.”

Based on supplier availability, Kioxia, SanDisk, and Micron are in the best position to support 128-terabyte QLC enterprise SSD solutions, Ao says. “But in the longer term, some module houses may be able to provide similar solutions at a lower cost,” Ao adds. “We are seeing more module houses, such as Phison and Pure Storage, supporting these solutions.”

And it’s not just SSD for fast storage and HDD for slow storage. Memory solutions are becoming more complex in the AI era, says Ao. “For enterprise players with smaller-scale business models, it is important to keep an eye on Z-NAND and XL-Flash for AI inference demand,” he says.

These are flash technologies that sit between conventional SSDs and DRAM working memory in the storage hierarchy. “These solutions will be more cost-effective compared to HBM or even HBF [high-bandwidth flash],” he says.

On the positive side, SSDs use standard protocols such as NVMe, says Constellation’s Mehta. “So, interface lock-in is limited,” he says. “The risk is roadmap and supply, not protocol.” He recommends that companies plan ahead for price and lead-time volatility, and for power.

“US data center energy constraints are tightening,” he says. “Storage total-cost-of-ownership conversations now start with watts per terabyte. In 2026, your bottleneck may be lead time or power, not just price. Architect for either constraint, and you’ll make better storage decisions.”
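
As a back-of-the-envelope illustration of that watts-per-terabyte framing, the sketch below compares two drive profiles. The capacity and power figures are hypothetical placeholders, not vendor specifications; the point is the metric, and real planning should use datasheet values.

```python
# Illustrative watts-per-terabyte comparison. Both drive profiles below
# are hypothetical placeholders, not vendor specs.
drives = {
    "HDD (24 TB, ~8 W average)":      (24, 8.0),
    "QLC SSD (122 TB, ~20 W active)": (122, 20.0),
}
for name, (capacity_tb, watts) in drives.items():
    print(f"{name}: {watts / capacity_tb:.3f} W/TB")
```

Even with generous assumptions for the HDD, the dense flash profile wins on watts per terabyte, which is the efficiency trade Corell and Mehta describe.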
