Most React applications treat AI as a separate service: a backend calls an LLM, returns JSON, and a client component renders it. This works, but it misses something fundamental about how React Server Components actually model computation.
RSCs do not just move rendering to the server. They make the server a first-class part of the React tree. And the server is exactly where AI inference belongs: close to data, streaming by design, and free from the browser's constraints.
In this talk I'll show a concrete architecture where the AI call is the render. A Server Component reaches out to a language model, receives structured output, and streams typed React elements directly to the client with no extra API route, no serialization layer, and no client-side state for the AI response.
We'll cover the happy path, the failure modes, and the real-world lessons from building this in production, including hallucination-resistant validation, Suspense boundaries for model latency, and the one architectural mistake that will destroy your Time-to-First-Byte.
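As a rough illustration of what hallucination-resistant validation might look like, here is a minimal sketch that treats the model's output as untrusted input. It assumes the model returns a JSON string and that the server already holds a set of real record IDs. The `ProductSummary` shape, the `validateModelOutput` function, and the `knownProductIds` parameter are hypothetical names invented for this example, not taken from the talk.

```typescript
// Hypothetical output shape, for illustration only; the talk does not specify a schema.
interface ProductSummary {
  productId: string;
  headline: string;
  bullets: string[];
}

type ValidationResult =
  | { ok: true; value: ProductSummary }
  | { ok: false; reason: string };

// Treat the model's raw text as untrusted: parse it, check every field's type,
// and reject any ID the server does not already know about.
function validateModelOutput(
  raw: string,
  knownProductIds: ReadonlySet<string>
): ValidationResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "not valid JSON" };
  }

  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, reason: "expected a JSON object" };
  }

  const { productId, headline, bullets } = parsed as Record<string, unknown>;

  // An ID that is well-formed but not in our data is the classic hallucination case.
  if (typeof productId !== "string" || !knownProductIds.has(productId)) {
    return { ok: false, reason: "unknown productId" };
  }
  if (typeof headline !== "string" || headline.trim() === "") {
    return { ok: false, reason: "missing headline" };
  }
  if (
    !Array.isArray(bullets) ||
    !bullets.every((b): b is string => typeof b === "string")
  ) {
    return { ok: false, reason: "bullets must be an array of strings" };
  }

  // Copy only the validated fields so extra keys from the model are dropped.
  return { ok: true, value: { productId, headline, bullets } };
}
```

Under these assumptions, a Server Component would render typed elements only from `value` when `ok` is true, and fall back to a non-AI view otherwise.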
This talk was presented at React Advanced 2026. Check out the latest edition of this React conference.


















