This article mirrors my presentation at the recent Peak Nano WATTSNEXT webinar. Download the slides. Watch the video.
We are at the beginning of what can best be described as an AI-driven energy supercycle. To understand why, we need to rethink both AI and data centers from the ground up. Let's start with an overview of artificial intelligence, which reveals the underlying AI dynamics driving an unprecedented surge in power demand. Then let's examine how the industry is responding.
A New AI Universe and a New Power Equation
The presentation frames modern AI as a "Generative AI Solar System," a layered ecosystem of users, applications, agents, models, and data centers. At the center of this system are billions of users and, increasingly, tens of billions of AI agents expected by 2030.
These agents are not passive tools. They are AI workers, autonomous systems capable of planning, executing, and coordinating tasks across applications. In a typical enterprise, hundreds or even thousands of these agents will operate simultaneously, interacting with models and each other.
All of this activity ultimately runs inside data centers, massive facilities filled with specialized AI chips, and every layer of this system is powered by electricity.

Training vs. Inference: Where the Power Really Goes
One of the most important insights for energy professionals is the distinction between AI training and inference.
- Training is the process of building models using vast datasets. It is expensive, compute-intensive, and episodic.
- Inference is what happens when users and systems interact with those models in real time.
Over the lifetime of a model (in a mature AI market with billions of agents), only 10 to 20% of power is used for training, while 80 to 90% is consumed by inference. This has profound implications. While much of the public conversation focuses on the energy cost of training large models, the real long-term driver of power demand is continuous usage: billions of interactions, queries, and automated workflows occurring every day. Inference is not a batch process. It is a persistent, always-on load on the grid.
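To make the split concrete, here is a minimal sketch of the arithmetic. The energy figures and the three-year service life are hypothetical placeholders chosen for illustration; they are not numbers from the presentation.

```python
def lifetime_energy_split(training_mwh: float,
                          inference_mwh_per_year: float,
                          years_in_service: float) -> tuple[float, float]:
    """Return (training_share, inference_share) of a model's lifetime energy."""
    inference_mwh = inference_mwh_per_year * years_in_service
    total_mwh = training_mwh + inference_mwh
    return training_mwh / total_mwh, inference_mwh / total_mwh


# Hypothetical inputs: a one-time 50,000 MWh training run, followed by
# 100,000 MWh per year of inference over three years of service.
train_share, infer_share = lifetime_energy_split(50_000, 100_000, 3)
print(f"training {train_share:.0%}, inference {infer_share:.0%}")
```

With these placeholder inputs, training comes to about 14% of lifetime energy, inside the 10 to 20% range cited above. The longer a model stays in service, the further the balance tilts toward inference.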

The GenAI Big Bang
The release of ChatGPT in November 2022 marked what the presentation calls the "GenAI Big Bang." In a remarkably short period, AI adoption moved from gradual growth to explosive scale. ChatGPT reached 1 million users in five days and is projected to surpass 1 billion users in 2026, followed by Google Gemini, which is on track to reach 1 billion users in 2027.
This rapid adoption is driving a corresponding surge in infrastructure demand. In the United States alone, data center critical IT power is expected to grow nearly 4x between 2023 and 2028. At the same time, the scale of individual data centers is expanding dramatically. Facilities that once consumed 100 megawatts are being replaced by next-generation campuses consuming 7,000 megawatts, equivalent to powering millions of homes.
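Two quick calculations help put these figures in perspective: the annual growth rate implied by a 4x increase over five years, and the number of homes a 7,000 MW campus corresponds to. This is a rough sketch. The average household draw of about 1.2 kW is my own assumption (roughly 10,500 kWh per year for a U.S. home), not a figure from the presentation.

```python
def implied_cagr(growth_multiple: float, years: int) -> float:
    """Compound annual growth rate that produces growth_multiple over years."""
    return growth_multiple ** (1 / years) - 1


def homes_equivalent(campus_mw: float, avg_home_kw: float) -> float:
    """Number of homes whose average draw adds up to the campus load."""
    return campus_mw * 1_000 / avg_home_kw


# "Nearly 4x between 2023 and 2028" treated as exactly 4x over 5 years.
growth = implied_cagr(4.0, 2028 - 2023)

# ASSUMPTION: an average home draws about 1.2 kW on a continuous basis.
AVG_HOME_KW = 1.2
homes = homes_equivalent(7_000, AVG_HOME_KW)

print(f"implied CAGR: {growth:.1%}")
print(f"homes equivalent: {homes:,.0f}")
```

Under these assumptions, 4x in five years works out to roughly 32% growth per year, and a 7,000 MW campus matches the average draw of just under 6 million homes.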
Inference Becomes the Dominant Load
Looking forward, the presentation highlights a key trend: inference is becoming the fastest-growing segment of data center power demand. Forecasts show AI inference growing at 35% CAGR and AI training growing at 22% CAGR.
While inference represents a smaller share today, it is expected to overtake both training and non-AI workloads by the end of the decade and dominate power consumption in the years that follow. This shift reflects the transition from AI as a technology to AI as infrastructure embedded in everyday life.
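The crossover point follows directly from the two growth rates once a starting ratio is known. The sketch below uses the 35% and 22% CAGR figures above. The starting loads (training at twice the inference load) are hypothetical, since the text here does not give current levels.

```python
import math


def years_to_overtake(smaller_now: float, larger_now: float,
                      smaller_cagr: float, larger_cagr: float) -> float:
    """Years until the smaller but faster-growing load matches the larger one.

    Solves smaller_now * (1 + smaller_cagr)**t == larger_now * (1 + larger_cagr)**t.
    """
    if smaller_cagr <= larger_cagr:
        raise ValueError("the smaller load must grow faster to ever catch up")
    ratio_gap = math.log(larger_now / smaller_now)
    annual_closing = math.log((1 + smaller_cagr) / (1 + larger_cagr))
    return ratio_gap / annual_closing


# HYPOTHETICAL starting point: training load is 2x inference load today.
years = years_to_overtake(smaller_now=1.0, larger_now=2.0,
                          smaller_cagr=0.35, larger_cagr=0.22)
print(f"inference overtakes training in about {years:.1f} years")
```

With that assumed 2:1 starting ratio, the crossover arrives in a little under seven years. A smaller starting gap shortens the timeline proportionally to the logarithm of the ratio.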

Power Constraints and Industry Response
The speed of this growth is creating a structural challenge: power supply cannot keep up with demand. Governments and utilities often operate on multi-year planning cycles, while AI infrastructure is scaling in months. As a result, data center operators are taking a more proactive role in securing energy, including building on-site generation (e.g., gas plants), partnering directly with nuclear facilities, sourcing power across state boundaries, and relocating to regions with excess capacity and favorable incentives. This marks a fundamental shift: data center companies are evolving into energy strategists, not just infrastructure operators.

The Efficiency Imperative
Simply building more data centers is not enough. Without major efficiency gains, projections suggest the industry could eventually exceed global energy capacity.

The response is a multi-pronged push toward efficiency. The six leading approaches, listed below, are converging toward a single goal, maximizing tokens per watt, the true productivity metric of AI infrastructure:
- Liquid cooling: reducing cooling energy by up to 65%
- More efficient chips: improving FLOPS per watt at ~40% per year
- Eliminating power conversions: reducing losses in UPS and distribution
- AI-driven optimization: dynamically managing workloads and energy use
- Renewable and alternative energy integration: including SMRs
- System-level redesign: optimizing architecture, location, and heat reuse
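As a rough illustration of the tokens-per-watt metric and of how a ~40% annual chip-efficiency gain compounds, consider the sketch below. The throughput and power figures are hypothetical, and the calculation assumes tokens per watt improves in step with FLOPS per watt, which is a simplification.

```python
def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Sustained token throughput divided by sustained power draw."""
    return tokens_per_second / power_watts


def compounded_gain(annual_improvement: float, years: int) -> float:
    """Cumulative efficiency multiple after compounding an annual improvement."""
    return (1 + annual_improvement) ** years


# HYPOTHETICAL server: 10,000 tokens per second at a 5,000 W draw.
baseline = tokens_per_watt(10_000, 5_000)

# ~40% per year (the chip figure above), compounded over five years.
gain = compounded_gain(0.40, 5)

print(f"baseline: {baseline:.1f} tokens/s per watt")
print(f"after 5 years: {baseline * gain:.1f} tokens/s per watt ({gain:.2f}x)")
```

Five years of 40% annual improvement compounds to roughly a 5.4x gain, so the same power budget would serve more than five times the token throughput under these assumptions.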
The Jevons Paradox: Efficiency Drives Demand
Even these gains come with a paradox. As systems become more efficient and the cost per AI interaction decreases, usage increases. This is a classic example of the Jevons Paradox: greater efficiency leads to greater total consumption. In AI, this effect is amplified by the rise of autonomous agents, expansion into new domains (e.g., robotics), and continuous, real-time interaction models. The result is clear: efficiency will not reduce power demand; it will accelerate it.
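One simple way to see when efficiency raises rather than lowers total consumption is a constant-elasticity demand model. This model is my own illustrative assumption, not something taken from the presentation: cost per token is taken to be proportional to energy per token, and usage is taken to scale as cost raised to the power of minus the elasticity.

```python
def relative_energy(efficiency_gain: float, elasticity: float) -> float:
    """Total energy relative to baseline after an efficiency improvement.

    efficiency_gain: 2.0 means twice as many tokens per unit of energy.
    elasticity: price elasticity of demand for tokens (ASSUMED constant).
    """
    energy_per_token = 1 / efficiency_gain          # relative to baseline
    usage = energy_per_token ** (-elasticity)       # cheaper tokens, more usage
    return usage * energy_per_token


# Doubling efficiency under three hypothetical elasticities.
for elasticity in (0.5, 1.0, 1.5):
    change = relative_energy(2.0, elasticity)
    print(f"elasticity {elasticity}: total energy x{change:.2f}")
```

In this model, total energy falls when elasticity is below 1, stays flat at exactly 1, and rises when elasticity is above 1. The Jevons outcome described above corresponds to the third case, where demand responds more than proportionally to falling cost.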

The Bottom Line
For power and energy professionals, the message is unmistakable: AI is not just another load on the grid. It is a new layer of the global economy, powered entirely by electricity. The industry is responding with innovation, investment, and urgency, but the scale of demand is growing even faster. Understanding these dynamics is no longer optional. It is essential to planning the future of energy.








