Noam Shazeer helped build the architecture nearly every major AI model runs on. Now he's building the next one.


In 2017, Noam Shazeer co-authored "Attention Is All You Need," the eight-author paper that introduced the Transformer architecture underlying nearly every major large language model in use today. GPT. Claude. Gemini. Llama. All of them run on what Shazeer helped build. He left Google in 2021 to co-found Character.AI. Google paid $2.7 billion to get him back in 2024. On June 18, 2026, he announced he is joining OpenAI as Lead for Architecture Research. The man who helped design the structural foundation of modern AI is now working on the next generation of it, at the company preparing for the largest AI IPO in history. That is not a minor personnel announcement.


THE STORY

Noam Shazeer is not famous the way Sam Altman is famous. He has never been on the cover of a magazine. He does not give keynotes. He writes papers. And the papers he writes become the operating system of an industry.

"Attention Is All You Need," published in June 2017 with seven co-authors while Shazeer was at Google Brain, introduced the Transformer, an architecture built entirely on self-attention that replaced recurrent neural networks as the dominant design for language models. The paper has been cited more than 100,000 times, making it one of the most cited papers in the history of computer science. Without the Transformer, there is no GPT-4, no Claude 3, no Gemini, no modern AI industry as it currently exists.

After leaving Google in 2021, Shazeer co-founded Character.AI, the AI companion platform. In 2024, Google made what analysts called one of the most expensive talent acquisitions in tech history, paying $2.7 billion to re-hire Shazeer and his team in a licensing deal that kept Character.AI independent while bringing Shazeer back to DeepMind. There he worked on Gemini's next-generation architecture.

Then on June 18, Shazeer announced he was leaving Google again — this time for OpenAI, where he joins as Lead for Architecture Research.

The implications are precise. Architecture research is not product development. It is not fine-tuning. It is the foundational work that determines what the next generation of models can do — how capable they are, how efficient, how scalable. Shazeer's job at OpenAI is to design the structural bones of GPT-6, GPT-7, and whatever comes after. His track record suggests he is capable of another inflection-point contribution.

The competitive context makes the timing significant. OpenAI is in its pre-IPO quiet period, targeting a Q4 2026 public offering at a valuation of $850 billion to $1 trillion. GPT-5.6 is days away from launch. Anthropic's Fable 5 remains offline, giving OpenAI a window to recapture benchmark leadership. And now the architect of the Transformer is building the next generation of GPT.

Google's loss is difficult to overstate. Shazeer was the most valuable technical employee at DeepMind, not because of his title but because of his track record of producing architecture breakthroughs that reshape the entire field. The $2.7 billion Google paid to bring him back has ended with a competitor gaining access to his next contribution. The talent acquisition strategies of both companies will be dissected in business school case studies for a decade.

For developers and investors, the practical signal is this: OpenAI has been criticized for shipping products faster than it advances fundamental architecture. GPT-5.5 is a capable model that trails Fable 5 on coding benchmarks. The addition of Shazeer suggests OpenAI is prioritizing architectural advancement over incremental product releases, which implies that GPT-6, whenever it arrives, may be a larger capability jump than the step from GPT-5 to GPT-5.5 was.

That matters enormously for every company currently pricing Anthropic's architectural lead into their AI vendor decisions.


THE MONEY ANGLE

1. Shazeer at OpenAI reprices every "Anthropic has the better architecture" investment thesis. A meaningful segment of enterprise AI buyers and investors chose Anthropic because of a belief that Claude's underlying architecture — particularly Fable 5's Constitutional AI approach and multi-agent capabilities — represents a structural advantage over OpenAI's models. That thesis now has a new variable. Shazeer's track record is not incremental improvement. It is category-defining contribution. The gap between GPT-5.5 and Fable 5 on coding benchmarks is real and measurable. The gap between GPT-5.5 and GPT-6 — with Shazeer's architectural input — is not yet known. Investors pricing a permanent Anthropic architectural lead into their Anthropic IPO thesis should update their models.

2. Google just lost its most important technical employee to a pre-IPO competitor. Google paid $2.7 billion to bring Shazeer back in 2024. He left anyway. That sequence tells you something about the quality of the retention environment at DeepMind — and about OpenAI's ability to attract talent even at a stage when its equity is pre-IPO and its government relationships are adversarial. For Alphabet investors, the Shazeer departure is a signal worth monitoring. Not because one engineer breaks a company, but because foundational architecture researchers are non-fungible. There is no obvious replacement.

3. The IPO narrative just got a technical credibility upgrade money can't easily buy. OpenAI's S-1 will be read by institutional investors who understand that AI company valuations are partially bets on the trajectory of model capability. The presence of the Transformer's co-inventor on the architecture team is a credibility signal that no benchmark score can fully replicate. When pension funds and sovereign wealth funds sit across the table from OpenAI's roadshow, Shazeer's name on the org chart is a tangible answer to the question: "What gives you confidence the next generation of models will maintain OpenAI's competitive position?" That answer was harder to give two weeks ago.


THE OPPORTUNITY

1. Watch for GPT-6 architecture previews at research venues in Q3 and Q4. Shazeer's previous major contributions (the Transformer, mixture-of-experts scaling, multi-query attention) all appeared first as research papers before becoming production systems. His arrival at OpenAI likely means a research publication cadence that signals GPT-6's architectural direction before the model ships. Follow OpenAI's research blog, arXiv, and the NeurIPS and ICML accepted-paper lists over the same period. Architecture papers from OpenAI co-authored by Shazeer are the earliest available signal of what GPT-6 will actually be. Investor move | Career move

2. If you're building on Claude's architecture today, know what you're betting on. Fable 5 holds a measurable coding benchmark lead over GPT-5.5. That lead is part of why enterprises are choosing Anthropic. But benchmark leads in AI have historically been temporary — the Transformer itself made entire classes of previous models obsolete in 18 months. If you're making multi-year infrastructure decisions based on Anthropic's current architectural advantage, factor in that the person most capable of closing that gap just started at the competitor. That's not a reason to switch today. It's a reason to build vendor flexibility into your architecture. Business owner move

3. The $2.7 billion talent acquisition that failed is a case study in retention economics. Google paid $2.7 billion, structured as a licensing deal, to secure Shazeer's contributions for DeepMind. He left anyway within two years. For anyone building or managing technical teams: the Shazeer case suggests that financial retention works for talent that is primarily motivated by compensation. It does not work for talent motivated by impact, autonomy, or mission. OpenAI offered something $2.7 billion couldn't. Understanding what that was, and replicating it in your own organization, is more valuable than the retention check. Career move | Business owner move
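For teams acting on item 2, "vendor flexibility" usually means keeping application code behind a thin interface so the model provider can be swapped by configuration. The following is a minimal sketch of that idea, not a description of any vendor's SDK. Every name in it (`ChatModel`, `EchoModel`, `REGISTRY`, `get_model`, `summarize`, and the `vendor_a`/`vendor_b` keys) is hypothetical, and the stand-in backend only echoes its input where a real adapter would call a provider's API.

```python
from typing import Callable, Dict, Protocol


class ChatModel(Protocol):
    """Hypothetical interface the application codes against."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in backend for this sketch; a real adapter would wrap a vendor SDK."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        # Echo the prompt, tagged with the backend name, instead of calling an API.
        return f"[{self.name}] {prompt}"


# Hypothetical registry: switching vendors means changing one configuration key.
REGISTRY: Dict[str, Callable[[], ChatModel]] = {
    "vendor_a": lambda: EchoModel("vendor_a"),
    "vendor_b": lambda: EchoModel("vendor_b"),
}


def get_model(vendor: str) -> ChatModel:
    """Look up a backend by name, failing loudly on unknown vendors."""
    try:
        return REGISTRY[vendor]()
    except KeyError:
        raise ValueError(f"unknown vendor: {vendor}") from None


def summarize(model: ChatModel, text: str) -> str:
    """Application logic depends only on the ChatModel interface."""
    return model.complete(f"Summarize: {text}")


print(summarize(get_model("vendor_a"), "Q2 report"))
```

The design choice is that `summarize` never names a vendor, so a change in benchmark leadership becomes a configuration change and not a rewrite.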


QUICK HITS

SpaceX Is Building a GitHub Competitor on Cursor's 50,000 Enterprise Clients
Reports from June 16 indicate SpaceX and Cursor are preparing to launch Origin, a new code repository platform positioned as a direct competitor to GitHub. If accurate, the SpaceX acquisition of Cursor extends beyond AI-assisted coding into the fundamental infrastructure layer of software development. GitHub, owned by Microsoft, hosts over 100 million developers and is the default repository for the majority of open-source and enterprise code. Cursor's 50,000+ enterprise clients and 7.5 million monthly active developers represent a distribution base no other repository startup has ever launched with.
◆ Money angle: An Origin launch puts SpaceX in direct competition with Microsoft across both AI coding tools and repository infrastructure, while Microsoft simultaneously holds a major stake in OpenAI. The most valuable real estate in software development is now a three-way contest among SpaceX, Microsoft, and the open-source community that lives on GitHub.

Anthropic Is Building 1 Gigawatt of Owned Compute
Anthropic announced plans to build or lease 1 gigawatt of dedicated AI compute capacity, roughly enough to power 750,000 homes, with the vast majority sited in the United States. This is a structural shift from purely cloud-rented compute to owned infrastructure, significantly improving long-term unit economics at the cost of massive upfront capital commitment. The move is consistent with Anthropic's November 2025 commitment to invest $50 billion in US computing infrastructure, and it reduces the company's exposure to AWS pricing ahead of its IPO.
◆ Money angle: The Apollo and Blackstone $36 billion debt deal funding Google TPU purchases for Anthropic is the most visible example of AI infrastructure creating investment-grade debt at scale. Morgan Stanley projects that by year-end 2026, AI infrastructure will have generated more investment-grade debt than any single sector in recent memory. The infrastructure layer remains the most durable trade in AI regardless of which model wins any given benchmark.

Fable 5 Moves to Paid Credits, Still Offline
As of today, June 23, Claude Fable 5 is no longer included in Pro, Max, Team, or Enterprise subscriptions. Access now requires paid usage credits at $10 per million input tokens and $50 per million output tokens, double the cost of Claude Opus 4.8. The model remains globally inaccessible due to the June 12 export control directive. Subscribers received 4–5 days of actual access out of the 13-day complimentary window. Anthropic has not announced credit extensions for the disrupted period beyond the June 20 refund deadline.
◆ Money angle: Anthropic is now charging for a product it cannot deliver. The Q2 revenue impact, already material from the lost access period, now compounds with credit sales for an offline model. The Q2 earnings report will be the most closely read financial document in AI this year.
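To put the credit pricing quoted above in per-request terms, here is a small arithmetic sketch. The two rates come from the item above. The function name `request_cost_usd` and the example workload of 20,000 input tokens and 4,000 output tokens are assumptions chosen only for illustration.

```python
# Rates quoted in the item above, in USD per million tokens.
INPUT_USD_PER_MILLION = 10.0
OUTPUT_USD_PER_MILLION = 50.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the credit cost in USD for one request at the quoted rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical workload: 20,000 input tokens and 4,000 output tokens per request.
cost = request_cost_usd(20_000, 4_000)
print(f"${cost:.2f} per request")  # prints "$0.40 per request"
```

At these rates the 4,000 output tokens cost as much as the 20,000 input tokens, so output-heavy workloads such as code generation carry most of the bill.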


The man who co-invented the architecture behind nearly every major AI model in use today just started his new job. The company he joined is preparing the largest AI IPO in history. What he builds next will determine whether the benchmark gaps that define today's competitive landscape survive into 2027.

See you tomorrow, The Future Geek Team