vergnyx.dev

Google Can't Build Compute Fast Enough. So It Rents.

Google committed $180B in capex this year and still had to rent xAI's GPUs for Gemini. The AI compute crunch is real, even for the cloud.

infra · 4 min read · 974 words · 2026-06-07

There's a line in Google's announcement of its new compute deal that keeps pulling my attention back. They described it as "a short-term, timely agreement to ensure we have bridge capacity to meet surging customer demand for our agent platform, Gemini Enterprise, which has been even higher than we expected."

"Higher than we expected." That's Google. Its parent, Alphabet, has committed over $180 billion in capital expenditures for 2026 alone. This is a company whose entire existence depends on managing compute at global scale, with TPUs and data centers and a whole vocabulary of infrastructure terms I don't fully understand yet. They still ran short.

What happened

In February, SpaceX completed its merger with xAI — SpaceX at $1 trillion, xAI at $250 billion, combined entity around $1.25 trillion — and inherited xAI's Colossus 1 data center near Memphis. Then SpaceX quietly started renting it out.

In May, Anthropic signed a deal for $1.25 billion a month for over 220,000 Nvidia GPUs at Colossus 1. Last week, Google followed with a similar arrangement: $920 million a month from October 2026 through June 2029, for roughly 110,000 Nvidia GPUs — about half the capacity Anthropic is renting from the same facility.

At that rate, the full 33-month term of Google's deal comes to roughly $30 billion.
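
For anyone who wants to check the totals, here is a minimal sketch of the arithmetic in Python. It assumes a flat monthly rate with no ramp-up or escalators, and it counts both October 2026 and June 2029 as billed months. Neither assumption comes from the contracts themselves, and `months_inclusive` is just a helper name made up for illustration, so treat the result as a rough check rather than the contract value.

```python
from datetime import date


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start's month through end's month, inclusive."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# Monthly rates as quoted in this post (assumed flat for the whole term).
GOOGLE_MONTHLY_USD = 920_000_000
ANTHROPIC_MONTHLY_USD = 1_250_000_000

# Google's term: October 2026 through June 2029.
term_months = months_inclusive(date(2026, 10, 1), date(2029, 6, 30))
google_total = GOOGLE_MONTHLY_USD * term_months

# Anthropic's deal, annualized.
anthropic_annual = ANTHROPIC_MONTHLY_USD * 12

print(term_months)                          # 33
print(f"${google_total / 1e9:.2f}B")        # $30.36B
print(f"${anthropic_annual / 1e9:.0f}B")    # $15B
```

Because either side can walk away with 90 days' notice after the end of 2026, the full-term figure is a ceiling under these assumptions, not a commitment.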

Both deals have an exit clause: either side can terminate with 90 days' notice after December 31, 2026. Which makes them read less like long-term infrastructure bets and more like "we need GPUs right now, and we'd rather have an exit clause than not."

Google is also a longtime SpaceX investor, with an expected $100+ billion stake post-IPO. SpaceX announced this deal one week before its expected Nasdaq listing. So there's a lot happening in one press release.

Why Google can't just build more

The frustrating thing about "bridge capacity" as a phrase is that it sounds like a temporary patch for a temporary problem. But the reason Google needs a bridge at all reveals something real about how AI infrastructure works right now.

Building a hyperscale data center from scratch takes years. You need land, permits, power — and the power contracts alone can take longer than the hardware. Then cooling. Then network infrastructure. Then the hardware itself, which requires getting in line for Nvidia GPUs that are already allocated many months out. Then you bring it online incrementally, run it through capacity tests, hand it to operations.

Google has teams that do exactly this, full-time, at massive scale, and they're excellent at it. They still can't do it fast enough when demand spikes in a way their forecasts didn't predict.

"Bridge capacity" isn't a patch — it's what you rent when your own construction schedule can't match your growth curve.

The Gemini Enterprise demand that "came in higher than expected" apparently arrived faster than any new data center could. So Google signed a rental agreement.

The part that's structurally different

Tech companies rent infrastructure from each other all the time. Startups on AWS, enterprises on Azure, Netflix on AWS's servers while Amazon Prime Video competes in the same streaming market. Complicated, but not new.

What's different here: xAI makes Grok. Grok competes with Gemini and Claude. SpaceX owns xAI. And now both Google and Anthropic are paying SpaceX to run their competing AI products.

I want to be careful not to overstate this. Google is running its own software on rented hardware. The GPU itself doesn't care what model is running on top of it. Infrastructure providers have strong business incentives to be genuinely neutral about what their tenants compute — mixing up training data or model weights would be a catastrophic liability, and there's no indication anything like that is happening or even plausible here.

But the structural fact remains: the line between "neutral infrastructure provider" and "AI competitor" has fully collapsed, and we've collectively decided to treat it as unremarkable. AWS competes with its customers on the edges. xAI competes with Google and Anthropic directly, in the exact market where all three are now betting the most.

Maybe that's fine. I genuinely don't know enough about how these contracts are governed to say it's not fine. It's just different from anything I'd have described as "normal" a year ago.

What I keep thinking about

I've been building with Gemini's API for a few months — document embeddings, prompt pipelines, basic tool-calling stuff that I'm still figuring out. When I make an API call, I think about latency and pricing. I don't think about which physical building my request lands in.

But starting in October, some share of Gemini workloads, possibly including API calls like mine, will be processed on Nvidia GPUs that live in a building SpaceX/xAI owns, under a contract that either party can terminate with 90 days' notice, at a facility that also hosts Anthropic's Claude workloads. The company that owns the building also makes a model that competes with both.

I'm not trying to make that sound ominous — it's just a genuinely strange picture, and I'm still sitting with it.

The thing I keep coming back to isn't really about Google or Musk or which AI wins. It's the underlying signal: Alphabet committed $180 billion in capex, and Anthropic is paying $15 billion a year ($1.25 billion a month) for Colossus 1 capacity alone. These are numbers that don't make sense unless the demand is genuinely outrunning the world's ability to build, not just for startups, but for the companies that have been planning this infrastructure for years.

The gap between "how fast these models are getting adopted" and "how fast data centers can be built" is apparently real enough that Google rented from whoever had hardware available.

Right now, that's xAI.

I don't think that's the permanent arrangement. But I suspect it says more about the pace of all this than anything else I've read this week.

© Vaibhav Verma · 2026