AI Factory Infrastructure: The 2026 Operator's Playbook

A fully-financed project died in a five-year grid queue. That's the problem AI factory infrastructure exists to solve — what it is, what kills most of them, and how we ship one in 6–12 months.

Chad Harris · May 12, 2026 · 15 min read

I can tell you the exact moment AI factory infrastructure stopped being a buzzword for me. I watched a fully-financed data center project die in a grid-interconnection queue — not for lack of money, not for lack of demand, but because the utility couldn’t connect it for five years. The capital was ready. The customers were ready. The grid said no. That is the problem this entire category exists to solve, and most of the industry is still pretending it doesn’t exist.

An AI factory is the physical stack that turns electrons into tokens at gigawatt scale. Wood Mackenzie has documented interconnection queues stretching for years, and conventional builds still take 24 to 48 months to commission on top of that wait. By contrast, we ship in 6 to 12 months. We do it by owning the three things everyone else rents: the power, the cooling loop, and the line that builds the compute. This is the operator’s playbook for 2026: what an AI factory actually is, what kills most of them, what the economics really look like, and how we ship one before the queue ever clears.

Want the receipts? Every figure in this brief is sourced, in full, on our AI factory research page.

What changed: the unit of the business is now a token

The phrase “AI factory” entered the operator’s vocabulary in 2024, but it became canonical at NVIDIA’s GTC in 2026, when Jensen Huang defined an AI factory as a facility whose primary economic output is a token, as Data Center Frontier reported from the event. Read that again, because it reorganizes everything. The product is not a virtual machine. It is not floor space. It is not a hosting tenancy billed by the rack. The product is a token, sold by the million, and bounded by the watts you can put behind it.

That single reframing rewrites the operator’s equation. At a given price per token, revenue scales with tokens per watt multiplied by available gigawatts. Therefore every decision you make, from the substation transformer to the cold plate sitting on the GPU, lives or dies by how it moves that number. The operator who can squeeze more tokens out of each watt, or light up more gigawatts faster than the next operator, does not win by a little. They win by an order of magnitude, because the demand curve is vertical and the supply is rationed by power.
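The operator's equation above can be sketched in a few lines. To get from tokens to dollars, the sketch adds a price per million tokens and a utilization factor, neither of which the sentence names. The function name and every input value are hypothetical placeholders, not SAVRN figures, and "tokens per watt" is read here as tokens per second per watt, that is, tokens per joule.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(tokens_per_joule, available_gw,
                         price_per_million_tokens, utilization=1.0):
    """Annual revenue in dollars for a campus (illustrative model only).

    tokens_per_joule: tokens produced per watt-second of facility power.
    available_gw: power the campus can actually deliver, in gigawatts.
    price_per_million_tokens: selling price in dollars (assumed input).
    utilization: fraction of the year the capacity is sold and running.
    """
    watts = available_gw * 1e9
    tokens_per_year = tokens_per_joule * watts * utilization * SECONDS_PER_YEAR
    return tokens_per_year / 1e6 * price_per_million_tokens


# Placeholder inputs chosen only to show how the levers interact.
base = annual_token_revenue(0.5, 1.0, 2.00)
better_efficiency = annual_token_revenue(1.0, 1.0, 2.00)  # 2x tokens per watt
more_power = annual_token_revenue(0.5, 2.0, 2.00)         # 2x gigawatts

print(better_efficiency / base, more_power / base)
```

In this simple model the two levers are interchangeable: doubling tokens per watt and doubling available gigawatts move revenue by the same factor, which is why both the cold plate and the substation show up in the same equation.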

Here is why that matters to anyone buying capacity. The conventional industry was not built for this, and it cannot retrofit its way in. Hyperscale colocation grew up around 10 to 15 kilowatts per rack, air cooling, a utility power-purchase agreement, and a 24-to-48-month build clock, per JLL’s Global Data Center Outlook. An AI factory rejects nearly every one of those assumptions at once. Consequently, what looks like a denser data center is, structurally, a different building category — and the buyers who treat it as “colo, but more” get burned on both timeline and cost.

Why a denser data center is not the answer

There are three thresholds where the old model breaks. It does not bend at these points and recover. It breaks, and the break is structural, which is exactly why retrofits keep losing to purpose-built campuses on both capex and time-to-revenue.

The 60-kilowatt wall

Air cooling fails above roughly 60 kilowatts per rack. Modern AI training racks pull 80 to 130 kilowatts under full load, and the frontier rack-scale systems pull more than that. Therefore air-cooled white space — the thing the entire colocation industry spent two decades perfecting — is technically obsolete for training work. Rear-door heat exchangers buy you a narrow band above air, and then they too run out of road.

This is not a future problem you can plan around; the market has already turned. Goldman Sachs projects liquid-cooled AI servers reaching about 76 percent of the market in 2026, up from 15 percent in 2024. In other words, within twenty-four months liquid goes from the exception to the default. Cold-plate cooling, single-phase immersion, and two-phase immersion are no longer competing options to evaluate at leisure. They are the only thermal architectures that physically work at training density, and a campus that is not designed around one of them from the foundation up is already behind.
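The physics behind that wall can be sketched with the steady-state heat balance, heat removed = mass flow × specific heat × temperature rise. The snippet below is a back-of-the-envelope estimate, not a design calculation: the temperature rises (15 K for air, 10 K for water) are assumed values, the fluid properties are approximate room-temperature figures, and the 100 kW rack is simply a point inside the 80 to 130 kW range quoted above.

```python
def mass_flow_kg_s(heat_w, cp_j_per_kg_k, delta_t_k):
    """Coolant mass flow needed to carry heat_w watts at a given temperature rise."""
    return heat_w / (cp_j_per_kg_k * delta_t_k)


# Approximate fluid properties near room temperature.
AIR_CP = 1005.0        # J/(kg*K)
AIR_DENSITY = 1.2      # kg/m^3
WATER_CP = 4186.0      # J/(kg*K)
WATER_DENSITY = 998.0  # kg/m^3

rack_w = 100_000.0     # one training rack at 100 kW (assumed operating point)

# Air: assumed 15 K rise from inlet to outlet.
air_m3_s = mass_flow_kg_s(rack_w, AIR_CP, 15.0) / AIR_DENSITY

# Water: assumed 10 K rise across the cold plates, converted to liters per minute.
water_l_min = mass_flow_kg_s(rack_w, WATER_CP, 10.0) / WATER_DENSITY * 1000 * 60

print(f"air:   {air_m3_s:.1f} m^3/s per rack")
print(f"water: {water_l_min:.0f} L/min per rack")
```

Under these assumptions a single rack needs about 5.5 cubic meters of air every second, against roughly 144 liters of water per minute. The gap in volume is the practical reason air stops being a workable carrier at training density.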

The density wall: what air can cool vs what AI needs
Air-cooled ceiling: ~35 kW/rack
AI training rack: 80–130 kW
Air cooling runs out near 35 kW per rack. Modern AI racks start at 80. That gap is the whole reason liquid is not optional.

The 100,000-home site

A single AI factory now draws power equivalent to roughly 100,000 homes, per the International Energy Agency. The publicly announced gigawatt-class projects make the scale concrete — sites measured not in megawatts but in whole gigawatts of continuous draw. And here is the part the spreadsheets miss: that load does not arrive gently. It lands on a local grid all at once.

A draw that size can move a town’s rates and swallow its interconnection queue for years. So the question that actually matters is not how big the load is. It is where the power comes from. We answered it by drawing none of it from the grid — the campus generates and islands its own power, so the town’s grid and the town’s rates never feel us arrive.
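To put the household comparison into utility terms, a rough conversion helps. The sketch below assumes an average household uses about 10,000 kWh per year, a round illustrative figure rather than a sourced one (real averages vary widely by country); the 100,000-home count is the one cited above.

```python
HOURS_PER_YEAR = 8760


def continuous_mw(homes, kwh_per_home_per_year):
    """Average continuous load, in megawatts, equivalent to a number of homes."""
    avg_kw_per_home = kwh_per_home_per_year / HOURS_PER_YEAR
    return homes * avg_kw_per_home / 1000.0


# 10,000 kWh per home per year is an assumed round number, not a sourced one.
site_mw = continuous_mw(100_000, 10_000)
print(f"{site_mw:.0f} MW of continuous draw")
```

With that assumption the equivalence works out to roughly 114 MW running around the clock. Households peak and dip through the day, while a training campus holds close to its full load, so the comparison understates how different the two load shapes are.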

The token layer

The last break is commercial. The conventional buyer asked for megawatts and an uptime tier and signed a floor-space lease. The AI factory buyer asks for token throughput, latency to first token, and cost per million tokens. As a result, the old sales motion fails in the first meeting. If a seller cannot speak power, cooling, and inference economics fluently, the deal is already lost, because the buyer is no longer purchasing real estate. They are purchasing a rate of intelligence production.

The four walls every AI factory project hits

Almost every project in this category runs into the same four constraints, in roughly the same order:

  • Grid interconnection. The queue, not the chips, is the binding constraint. Studies and upgrades push first power out by years, and no amount of capital shortens a utility’s calendar.
  • Long-lead equipment. Switchgear, transformers, and turbines now carry lead times measured in many months to years. The supply chain for the unglamorous steel is as much a gate as the GPUs.
  • Cooling obsolescence. An air-cooled shell cannot be cheaply converted to liquid density. The retrofit math almost never beats building the right envelope once.
  • Permitting and zoning. Power draw and water draw turn into public hearings, and public hearings turn into delay. This is where projects quietly die after everything technical is solved.

Notably, each of these is structural rather than incremental, and they compound. A project can clear three of them and still lose two years to the fourth. That compounding is the whole reason the category rewards owning the inputs instead of negotiating for them.

What an AI factory is actually made of

A working AI factory needs five systems working together, and the word that matters is together. You cannot buy them from five vendors, integrate them on site, and still hit a six-month clock. They have to be designed as one machine.

  1. On-site power generation — renewable baseload, solar, and storage, firmed and islanded from the public grid, so first power is our decision and not a utility’s queue position.
  2. Liquid cooling at rack density — cold-plate or immersion engineered for 80-plus kilowatts, not bolted onto a raised floor as an afterthought.
  3. High-density compute pods — factory-built, liquid-ready blocks that arrive as a unit, instead of a hall stick-built around the racks.
  4. A closed-loop water system — a sealed cooling loop that recirculates, so the campus draws no municipal water and the heat is captured rather than evaporated.
  5. An on-site control plane — the orchestration layer that ties power, cooling, and compute into a single operable system with one source of truth.

Two of those five are also the reason a SAVRN campus is a good neighbor by design, not by press release. Because the power is self-generated and islanded, the town’s grid and rates are untouched. Because the water loop is closed, the campus draws zero municipal gallons. That is not a footnote bolted on at the end. It is the same engineering that lets us ship fast — and it is why a town has no reason to push back on what we build.

The economics nobody quotes correctly

People ask me for a build cost per megawatt, and I understand why, but it is the wrong question, and quoting it precisely would misrepresent how this category actually works. Per-megawatt capex swings widely with the power source and the cooling architecture, and the cooling system is exactly where a purpose-built campus separates from an air-cooled retrofit. Moreover, lead with capex per megawatt and you will optimize for the cheapest building and lose on the only metric that pays you back.

The metric that matters is tokens per watt per dollar — how much sellable intelligence each watt and each dollar actually produce over the life of the campus. Therefore the operator who optimizes the entire stack to that number, rather than to floor space or to headline capex, wins the unit economics even at a higher sticker price. Furthermore, time-to-first-token belongs in the economics, not beside them. A campus that earns eighteen months sooner has a different cost of capital than one that does not, even if the two quote the same build cost. Speed is not a convenience here. It is a line item.
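Both ideas in that paragraph can be written down as a minimal sketch. The first function is one possible reading of "tokens per watt per dollar"; the second discounts a flat monthly revenue stream to show why an earlier first token changes the economics even at an identical build cost. The function names, the 10 percent discount rate, the ten-year horizon, and the start months are all hypothetical inputs chosen for illustration.

```python
def tokens_per_watt_per_dollar(lifetime_tokens, avg_watts, total_cost_usd):
    """Lifetime tokens normalized by average power drawn and total dollars spent."""
    return lifetime_tokens / avg_watts / total_cost_usd


def present_value(monthly_cash, start_month, horizon_months, annual_discount_rate):
    """Discounted value of a flat monthly cash flow that begins at start_month.

    Months are counted from financial close; nothing is earned before
    start_month, and the stream stops at horizon_months.
    """
    monthly_rate = (1 + annual_discount_rate) ** (1 / 12) - 1
    return sum(monthly_cash / (1 + monthly_rate) ** m
               for m in range(start_month, horizon_months))


# Same campus, same monthly revenue, same 10-year horizon (all assumed).
# The only difference is first token at month 9 versus month 27.
fast = present_value(1.0, start_month=9, horizon_months=120, annual_discount_rate=0.10)
slow = present_value(1.0, start_month=27, horizon_months=120, annual_discount_rate=0.10)

print(f"fast campus is worth {fast / slow:.2f}x the slow one in present-value terms")
```

The eighteen months the slower campus loses are the least-discounted months in the stream, so the gap in present value is larger than the gap in calendar time would suggest.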

60 kW: the rack density above which air cooling fails. Training racks pull 80–130 kW. Source: industry analysis.
15% → 76%: liquid-cooled share of AI servers, 2024 to 2026. Liquid is now the default. Source: Goldman Sachs.
100k homes: the power one AI factory draws, landing on a local grid all at once. Source: IEA.

How we ship in 6 to 12 months

We compress the timeline by owning the parts that usually block it, and by running them in parallel instead of in sequence.

Pre-identified sites with power already contracted. We do not start the clock by entering a queue. We start it on land where the generation and the offtake are already lined up, so first power is a build problem, not a waiting problem.

Behind-the-meter power. Because we generate on site, interconnection is not the gate. This single decision removes the largest source of delay in the entire category, and it is the one most operators cannot make because they do not own their generation.

Integrated manufacturing in Fort Worth. The liquid-cooled pods are built to order on a line, not improvised on a job site. Manufacturing the compute envelope the same way you manufacture anything else — repeatably, indoors, in parallel with site work — is how you collapse the schedule.

Modular pods that arrive ready to assemble. The blocks land as units and are set, connected, and commissioned, rather than stick-built on a raised floor. Because site preparation, manufacturing, and power all advance at once, the campus reaches token-bearing operation while a conventional project is still waiting on its interconnection study.

Time to first power: conventional grid build vs SAVRN
Conventional grid build: 24–48 months
SAVRN campus: 6–12 months
Owning the power, the cooling, and the manufacturing line shortens the schedule by a factor of three to four.
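The parallel-tracks argument above reduces to simple schedule arithmetic: work done in sequence adds up, while work done in parallel is bounded by its longest track. The sketch below uses made-up durations purely to show the shape of that calculation; none of the month counts are SAVRN's actual figures.

```python
def sequential_months(tracks, final_steps):
    """Total duration when every track waits for the previous one to finish."""
    return sum(tracks.values()) + sum(final_steps.values())


def parallel_months(tracks, final_steps):
    """Total duration when the tracks run at once and only the final steps wait."""
    return max(tracks.values()) + sum(final_steps.values())


# Hypothetical durations in months, for illustration only.
tracks = {
    "site_preparation": 5,
    "pod_manufacturing": 6,
    "on_site_power": 7,
}
# Assembly and commissioning cannot start until all three tracks are done.
final_steps = {"assembly_and_commissioning": 3}

print("sequential:", sequential_months(tracks, final_steps), "months")
print("parallel:  ", parallel_months(tracks, final_steps), "months")
```

With these placeholder durations the sequential path takes 21 months and the parallel path 10. The parallel figure is set entirely by the slowest track, which is why removing the interconnection wait from the power track matters more than speeding up any other step.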

The deployment sequence, start to first token

  1. Site qualification and power contracting — confirm the land, the generation, and the offtake before anything else moves.
  2. Manufacturing — build the pods and the power-and-cooling skids on the line while the site is prepared.
  3. On-site assembly and commissioning — set the modular blocks, connect power and cooling, and test under load.
  4. Token-bearing operation — the factory begins converting watts into billable tokens, which is the only milestone that was ever the point.

How to tell a real AI factory from a brochure

When you evaluate a partner, five questions separate the operators from the slideware. First, do they own their power, or are they sitting in the same grid queue as everyone else? Second, is the cooling engineered for 80-plus kilowatts from the foundation, or is it retrofit air with a brave face? Third, can they show you tokens per watt, or do they still only talk megawatts and tiers? Fourth — and this one tells you who they are — what does the campus take from the host community, and what does it give back? Finally, can they prove a timeline with a real sequence and a real manufacturing line behind it, or is the six-month number a marketing figure with nothing under it?

Why we build it this way

I will be plain about the part that is personal, because it is the part that actually drives the engineering. My grandparents taught me that in a community you give more than you take, or you do not earn the right to stay. I believe that rule does not stop at the edge of a new technology. So we built an AI factory that takes nothing the community needs — not its grid power, not its water — and gives back the thing a town cannot build on its own: skilled work, and a campus people are proud to host. The fast timeline and the good-neighbor design are not two strategies. They are the same decision, made on purpose.

If you want the wider picture, read our other field notes or explore the rest of the SAVRN model.

Frequently asked questions

Can I convert my existing air-cooled data center into an AI factory?

Usually not economically. Because air-cooled shells top out near 60 kilowatts per rack and training racks need far more, the retrofit becomes a gut renovation of power and cooling at once. In most cases, building the right envelope from the foundation costs less and ships sooner than forcing liquid density into a building that was never meant for it.

What actually happens to the heat the campus produces?

It is captured by the closed cooling loop rather than evaporated into the air. Consequently, the heat becomes an asset to be reused on site instead of a plume of lost water and energy. That is the opposite of the open-loop evaporative cooling that drives the industry’s enormous water draw.

Is on-site, behind-the-meter power as reliable as the grid?

When it is engineered with firmed baseload plus storage, yes — and often more predictable, because you control it. Moreover, islanding from the grid removes your exposure to a utility’s queue, its outages, and its rate changes. Reliability becomes an engineering decision you own rather than a dependency you hope holds.

Who operates the factory once it is built — you or me?

That is a deal structure, not a fixed answer. We can operate the campus for a fee, transfer operations to your team, or run a hybrid where we operate and train your people in parallel. Importantly, the training path is the one that builds local capability, which is usually what the host community wants.

How do you handle GPU obsolescence across a ten-year campus?

By designing the power and cooling envelope to outlive several generations of compute. The pods are the refreshable layer; the substation, the cooling plant, and the building are the durable layer. Therefore a hardware refresh is a pod swap, not a teardown, which protects the capital that is hardest to replace.

Can the campus provide services back to the grid?

A self-generating campus can be designed to support the local grid rather than strain it, depending on the interconnection agreement and the regulatory environment. As a result, a project that the grid feared as a giant new load can, in some configurations, become a stabilizing neighbor instead.

What is the smallest AI factory that makes sense?

Smaller than people assume, because the model is modular. A single block is a viable unit, and the campus grows by adding blocks rather than by rebuilding. This means a community or a customer can start at a scale that fits the offtake and expand as demand proves out.

If this model is better, why isn't everyone building it?

Because it requires owning generation, owning a manufacturing line, and integrating five systems as one — and most operators own none of those. They rent power, buy cooling, and lease space, so they are structurally unable to make the decisions that compress the timeline. The barrier is not insight. It is vertical integration that took years to build.

What does the host community actually receive?

Concretely: jobs that do not require leaving town, a training institute that reskills local residents into those jobs, no new load on the public grid, no draw on the municipal water supply, and a campus designed to be walked into rather than fenced off. The intent is for the town to end up measurably better than it was before we arrived.

How do I verify a six-month timeline is real and not a pitch?

Ask to see the manufacturing line, the contracted power, and a prior site’s actual sequence with dates. A real timeline has a factory and a power contract behind it; a marketing timeline has a slide. The presence of owned generation is the single best tell, because without it the grid queue makes six months impossible no matter what the deck says.