Don't Bolt RAG Onto Your CMS: Build It In

by Josh Oransky

Every vendor is now selling a chatbot you can drop onto your existing site. Most of them are small JavaScript snippets that open an iframe and run a retrieval pass against an index built from your sitemap. They work. They also feel exactly like what they are: a thing bolted on at the end, with no real understanding of how your content is organised.

You can do better. The CMS already has the answer.

Why bolt-on RAG feels off

Bolt-on retrieval has three structural problems. First, the index is built from rendered HTML rather than from the content model, so block boundaries, fragment includes, and structured metadata all get flattened to text. Second, the bot doesn't know which content is authoritative versus draft versus deprecated, because the sitemap doesn't carry that signal. Third, when authors edit, the index is stale until the next crawl, which is usually overnight.

Authors notice. They write a new page, ask the bot about it five minutes later, and get an answer from the page they wrote in 2022. That's the moment they stop trusting the AI.

What "built in" actually means

If RAG is part of the CMS, the index is built from the same source-of-truth content that the page is rendered from. The chunks are sections and blocks, not arbitrary text slices. Authors can mark content as authoritative, draft, or archival, and the index respects those signals because they're authored, not inferred. When a page is published or unpublished, the index updates at that moment, not on the next nightly crawl.
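
To make "authored, not inferred" concrete, here is a minimal sketch of retrieval that filters on an author-set status before ranking. Everything in it is an assumption for illustration: the Chunk fields, the retrieve function, the two-dimensional toy vectors, and the sample pages. A real system would use vectors from an embedding model and a proper vector store.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Chunk:
    page: str
    text: str
    status: str          # set by the author: "authoritative", "draft", or "archival"
    vector: list[float]  # toy embedding; a real one comes from a model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(index, query_vector, allowed=("authoritative",), k=3):
    """Drop chunks whose status is not allowed, then rank the rest by similarity."""
    candidates = [c for c in index if c.status in allowed]
    ranked = sorted(candidates, key=lambda c: cosine(c.vector, query_vector), reverse=True)
    return ranked[:k]


# Hypothetical index: the draft is the closest match to the query vector.
index = [
    Chunk("/pricing-2022", "Old plan details.", "archival", [0.9, 0.1]),
    Chunk("/pricing-draft", "Unreleased plan details.", "draft", [1.0, 0.0]),
    Chunk("/pricing", "Current plan details.", "authoritative", [0.8, 0.2]),
    Chunk("/about", "We are a small team.", "authoritative", [0.0, 1.0]),
]
query = [1.0, 0.0]

# Customer-facing bot: authoritative content only.
customer_results = retrieve(index, query, k=1)

# Internal author assistant: drafts are allowed too.
author_results = retrieve(index, query, allowed=("authoritative", "draft"), k=1)

print(customer_results[0].page)
print(author_results[0].page)
```

The design choice worth noticing is that the filter runs before ranking, so a draft or archived chunk can never outrank approved content just because it happens to sit closer to the query.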

For Edge Delivery specifically, the pattern looks like this:

Index at publish. The publish webhook fires an embedding job for the changed page. The authoring loop closes in seconds.
Chunk by block. A block is a semantically meaningful unit. Use it as the chunk boundary instead of a fixed token count.
Carry the metadata. Title, section path, category, and last-modified all ride with the embedding. Filters at retrieval time get sharper.
Serve from edge inference. Query embedding and re-ranking can run on Cloudflare Workers or a self-hosted endpoint inside your perimeter. Don't round-trip user queries to a third-party retriever.
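
The first three items above can be sketched together in a few lines. This is a rough illustration, not how any particular CMS does it: the Page and Block shapes, the BlockIndex class, the on_publish and on_unpublish handlers, and the section_path format are all hypothetical, and toy_embed is a stand-in for a real embedding model call.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str  # block type, e.g. "hero" or "faq"
    text: str


@dataclass
class Page:
    path: str
    title: str
    category: str
    last_modified: str
    blocks: list[Block]


def toy_embed(text: str) -> list[float]:
    """Placeholder embedding: swap in a real model call here."""
    return [float(len(text)), float(text.count(" "))]


class BlockIndex:
    """In-memory index keyed by page path; one chunk per non-empty block."""

    def __init__(self, embed):
        self.embed = embed
        self.entries: dict[str, list[dict]] = {}

    def on_publish(self, page: Page) -> None:
        # Replace every chunk for this page so edits never leave stale copies.
        self.entries[page.path] = [
            {
                "vector": self.embed(block.text),
                "text": block.text,
                "metadata": {
                    "title": page.title,
                    "section_path": f"{page.path}#{block.name}-{i}",
                    "category": page.category,
                    "last_modified": page.last_modified,
                },
            }
            for i, block in enumerate(page.blocks)
            if block.text.strip()  # skip blocks with no text to embed
        ]

    def on_unpublish(self, path: str) -> None:
        self.entries.pop(path, None)

    def chunks(self, **filters) -> list[dict]:
        """Return chunks whose metadata matches every given filter."""
        return [
            chunk
            for page_chunks in self.entries.values()
            for chunk in page_chunks
            if all(chunk["metadata"].get(k) == v for k, v in filters.items())
        ]


# Illustrative page; the content and dates are made up.
index = BlockIndex(embed=toy_embed)
page = Page(
    path="/help/returns",
    title="Returns policy",
    category="support",
    last_modified="2024-05-01",
    blocks=[
        Block("hero", "Returns policy"),
        Block("faq", "You can return items within 30 days."),
        Block("spacer", "  "),
    ],
)
index.on_publish(page)

# The author edits a block and republishes; the index follows immediately.
page.blocks[1] = Block("faq", "You can return items within 60 days.")
page.last_modified = "2024-06-01"
index.on_publish(page)
after_publish = index.chunks(category="support")

index.on_unpublish("/help/returns")
after_unpublish = index.chunks()
```

In a real deployment the two handlers would be called from the publish and unpublish webhooks, and the dictionary would be a vector store. The point of the sketch is the shape: whole-page replacement on publish, blocks as chunk boundaries, and metadata stored next to each vector so retrieval can filter on it.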

The author-experience win

The thing nobody talks about: authors are the primary beneficiary of well-built RAG. A reporter writing the next piece in a series uses on-page assistive search to find what the team already covered. A support writer building a new article gets a summary of what's already in the help centre. A campaign editor asks "what's our position on X" and gets the actual brand-approved answer, not the marketing team's best guess.

The chatbot facing customers is the visible deliverable. The internal assistant facing authors is the one that actually changes the workflow. Build it in, and you get both.