What if building more grid capacity isn’t the answer? Solving ‘phantom compute’ could address data centre efficiency

Good data governance could improve compute power efficiency by up to 60%, says Hitachi Vantara’s Simon Ninan. Credit: Sashkin/Shutterstock.com.

The well-worn mantra ‘build it and they will come’ seems applicable to today’s data centre boom. But conversations around grid capacity are largely based on supply-side planning and on assumptions about future demand that are rarely questioned publicly.

While future demand for AI compute power is assumed, public discourse is focused on the constraints. Power, land, and hardware supply chain constraints, as well as regulation and nimbyism, are hitting the headlines almost daily. Simon Ninan, senior vice president of business strategy for Hitachi Vantara, questions why data management is not more heavily scrutinised within the data centre environment, arguing that it has the potential to address some of these grid capacity constraints.

Ninan contends that mechanisms for storing, moving, and processing data more efficiently are being overlooked. Closer workload management and planning would mean more interrogation of when and how data is processed, for example how demand can be offloaded by dispersing it to the edge. “There are so many mechanisms that can mitigate or optimise demand, and we are not having conversations about that,” says Ninan.

According to GlobalData Thematic Intelligence’s Artificial Intelligence Executive Briefing, hyperscalers are pouring tens of billions of dollars into expanding their AI capabilities. In 2026, Alphabet, Amazon, Meta, Microsoft, and Oracle are expected to spend nearly $685bn in combined capital expenditure, a 70% increase from 2025. Much of this capex will go towards AI infrastructure such as data centres, AI chips, and energy assets.

As companies plan for exponential future demand, can better data management really move the needle on compute power, or is it just tinkering around the edges?

Ninan says not. “It’s just that most of the industry is unaware of the extent of the ‘phantom compute’ problem,” he argues.

Simon Ninan is Hitachi Vantara’s SVP of business strategy

What is phantom compute?

Phantom compute, as Ninan sees it, is allocated GPU capacity that is either sitting idle or not actually doing useful work. This is compute power that exists only in theory, on infrastructure plans, but is effectively ‘phantom’ capacity in the real world.

AI data centre build-outs are full of inefficiencies, and this is what is creating the real bottleneck for AI development: not how many GPUs can be deployed, but how effectively data is stored, moved and processed. Optimised processes, says Ninan, could allow GPUs to be used up to 30% more efficiently and potentially cut compute demand by 30–40% without slowing down AI processing.

According to Ninan, the ‘phantom compute’ problem has been a long time in the making. Over the last two decades, large organisations have collected massive amounts of data and dumped it into unstructured data lakes. Ninan attributes this to the advent of agile computing and DevOps within self-contained teams and a belief that “every single transaction generates a ton of data, and all data is valuable.” The problem has grown steadily worse, and Ninan diagnoses it as “digital exhaust”, a mindset derived from the dot-com era and amplified by the onset of IoT.

Junk data is the AI development bottleneck

The AI era has created some urgency around this morass of junk data, which has become unmanageable at precisely the time when AI requires clean, reliable and valuable training data. Ninan questions how much of the training data being used on universal models is useful information. He references the news in April about OpenAI’s move to implement explicit controls in its models against mentioning goblins in outputs as an example of how junk data has infiltrated AI models.

An often-quoted MIT study, which found that 95% of AI projects fail, demonstrates the extent to which a clean data pipeline is necessary for scaling AI. Many of the failures hinged on a lack of trust in data quality and an unclear path to ROI.

These AI challenges could be somewhat addressed with a clear data management strategy, contends Ninan, who says the average enterprise is sitting on massive amounts of junk data.

Security and a push to demonstrate short-term ROI have meant the thorny and labour-intensive task of sorting through data has been viewed as a non-urgent issue in boardrooms. Security and ROI have always taken precedence over data governance, and both are extremely important issues, Ninan notes, “but you will never get to AI ROI unless you solve the data fundamentals issue first.”

He describes data governance as the engine for a “chain reaction” that can take effective GPU utilisation from 30% to 80%. The bottleneck is the inability to feed clean data fast enough because the processing pipeline is clogged with junk data.
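To put rough numbers on that claim, the short Python sketch below is an illustration rather than anything Ninan has published. It assumes a fixed amount of useful GPU work and treats effective utilisation as the share of allocated GPU hours that does that work. The function name and the 1,000 GPU-hour workload are hypothetical; only the 30% and 80% figures come from the article.

```python
def allocated_gpu_hours(useful_gpu_hours: float, utilisation: float) -> float:
    """GPU hours that must be allocated to deliver a given amount of useful work."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return useful_gpu_hours / utilisation


useful = 1_000.0  # hypothetical workload, in useful GPU hours

before = allocated_gpu_hours(useful, 0.30)  # utilisation figure quoted in the article
after = allocated_gpu_hours(useful, 0.80)   # utilisation figure quoted in the article

# Share of allocated capacity no longer needed for the same useful work
reduction = 1 - after / before

print(f"allocated at 30%: {before:.0f} GPU hours")
print(f"allocated at 80%: {after:.0f} GPU hours")
print(f"reduction in allocated capacity: {reduction:.1%}")
```

Under this simplified model, the same useful work needs about 3,333 allocated GPU hours at 30% utilisation and 1,250 at 80%, a reduction of 62.5% in allocated capacity. Real workloads will not scale this cleanly.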

Where this data governance comes into its own is in the realm of data centre infrastructure management (DCIM). Before widespread AI adoption, DCIM software solutions managed power, cooling, operations, physical security, and the IT rack in silos. “Technically, it was possible to integrate them using APIs and software integration, but they weren’t powerful on their own,” explains Ninan.

An AI-driven DCIM solution brings all of this together and, with good data governance, could help enterprises improve their compute capacity from around 40%-60%, and sometimes more in terms of power consumption. This is “really powerful” in orchestrating all data centre operations where AI helps drive recommendations and optimisation, says Ninan.

Power consumption in a data centre can be broken down into several components: CPUs and GPUs account for around 50-60%, cooling for 30-40%, and data storage and miscellaneous loads for the remainder. This is why Ninan thinks most people can be forgiven for thinking storage is only a fraction of the total cost in the data centre.

He contends that storage should not be viewed in isolation but in terms of how it interacts with compute and cooling. “If you increase the storage efficiency by 40%-60%, guess what? It’s not just impacting that 10%-20% component, it is having a cascading effect on cooling and a cascading effect on compute. For hyperscalers and others, these are real dollar savings,” he says. And for businesses this means savings on compute requirements, something that is top of mind for all enterprises as they transition towards becoming AI-first businesses.
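The cascading effect Ninan describes can be sketched with a toy calculation. The Python below is a hedged illustration, not a model from Hitachi Vantara. The power shares sit inside the ranges quoted in the article, but the 50% cut in storage power and the cascade factors (how much of that saving carries over to compute and cooling) are made-up assumptions chosen only to show the mechanism.

```python
def total_power_saving(shares, storage_saving, cascade):
    """Fraction of total facility power saved.

    shares: share of total power per component (must sum to 1).
    storage_saving: assumed fractional reduction in storage power.
    cascade: hypothetical fraction of the storage saving that carries
             over to each other component.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    saving = shares["storage"] * storage_saving  # direct saving on storage itself
    for component, factor in cascade.items():
        saving += shares[component] * storage_saving * factor  # knock-on saving
    return saving


# Shares chosen within the article's ranges; "storage" stands in for
# storage plus miscellaneous loads.
shares = {"compute": 0.55, "cooling": 0.30, "storage": 0.15}

# Hypothetical: storage power falls by 50%, with and without knock-on effects.
direct_only = total_power_saving(shares, 0.50, {})
with_cascade = total_power_saving(shares, 0.50, {"compute": 0.2, "cooling": 0.2})

print(f"direct saving only: {direct_only:.1%} of total power")
print(f"with assumed cascade: {with_cascade:.1%} of total power")
```

With these illustrative inputs, the direct saving is 7.5% of total power, rising to 16% once the assumed knock-on effects are included. The actual cascade factors would have to be measured for a given facility.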
