Add a chatbot to your app
Learn · guide · AI System
This guide is for you if you want a chat assistant that answers from your content, not the open web. Worked example: the live assistant at library.zajapps.com/chat.
A useful product chatbot doesn’t free-associate — it answers from your content and cites it. The pattern is Retrieval-Augmented Generation (RAG): the model searches your library first, then answers grounded in what it found. ZajLibrary’s /chat does exactly this, reusing the same search engine that powers its MCP server.
The architecture
A React island for the UI, one streaming API route, and the model calling a search tool, with each turn saved to Postgres.
graph TB
  User["Visitor on /chat"] --> Island["React island<br/>(assistant-ui)"]
  Island -->|"POST messages"| API["/api/chat<br/>(AI SDK streamText)"]
  API -->|"tool call"| Search["search_library<br/>(same engine as MCP)"]
  Search --> Index[("content index")]
  API -->|"stream answer + citations"| Island
  API -->|"save turn"| DB[("Postgres<br/>threads + messages")]
  Island -->|"list / resume"| DB

The stack: assistant-ui (chat primitives), the Vercel AI SDK (streamText + tools), OpenRouter for the model, and Neon Postgres (via Drizzle) for persistence, all inside the existing Astro site.
Build it
1. The streaming API route (RAG)
/api/chat runs streamText with a backend tool the model can call to search your content, then calls toUIMessageStreamResponse() to stream the result back. Ground the model with a system prompt that tells it to search first, then answer, and cite:
- Define a search_library tool whose execute calls your existing search; return title + url + snippet.
- stopWhen: stepCountIs(8) caps the tool loop.
- Cite sources as markdown links so answers are verifiable.
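The search side of the list above can be sketched without any SDK. This is a minimal sketch under stated assumptions, not ZajLibrary's actual code: the Doc shape, the in-memory docs array with its made-up URLs, the naive term-count scoring, and the helper names searchLibrary and formatCitations are all hypothetical. A function like searchLibrary is the kind of thing the search_library tool's execute would delegate to.

```typescript
// Hypothetical document shape; a real index would come from your content pipeline.
interface Doc {
  title: string;
  url: string;
  body: string;
}

// What the tool hands back to the model: title + url + snippet, nothing more.
interface SearchHit {
  title: string;
  url: string;
  snippet: string;
}

// Illustrative in-memory index (entries and URLs are made up).
const docs: Doc[] = [
  { title: "Build an MCP server", url: "/guides/mcp-server", body: "Expose your search engine to agents through an MCP server." },
  { title: "Add a chatbot to your app", url: "/guides/chatbot", body: "Ground a chatbot in your own content with retrieval before generation." },
  { title: "Styling with tokens", url: "/guides/tokens", body: "Use design tokens to style components consistently." },
];

// Naive scoring (an assumption): count how many query terms appear in title or body.
function searchLibrary(query: string, index: Doc[] = docs, limit = 3): SearchHit[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return index
    .map((doc) => {
      const haystack = `${doc.title} ${doc.body}`.toLowerCase();
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { doc, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(({ doc }) => ({ title: doc.title, url: doc.url, snippet: doc.body.slice(0, 120) }));
}

// Turn hits into markdown links so the sources behind an answer can be checked.
function formatCitations(hits: SearchHit[]): string {
  return hits.map((hit) => `- [${hit.title}](${hit.url})`).join("\n");
}
```

Returning only title, url, and snippet keeps the tool result small, which leaves more of the context window for the answer itself. In a real route the ranking would come from your existing search engine rather than this term count.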
2. The UI island
assistant-ui’s headless primitives (ThreadPrimitive, MessagePrimitive, ComposerPrimitive) need no Tailwind; style them with your own tokens. Wire the runtime with useChatRuntime({ transport: new AssistantChatTransport({ api: '/api/chat' }) }) inside <AssistantRuntimeProvider>. In Astro, mount it as a client:only="react" island.
3. Persistence
Save every turn in the route’s onFinish (best-effort: a DB hiccup must never break the chat). Key threads by an owner id (an anonymous localStorage id to start; a real user id once you add auth). Add list/load endpoints and a “recent chats” sidebar; resume a thread via ?thread=<id>.
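The best-effort rule is easiest to see in isolation. The sketch below assumes a hypothetical TurnStore interface and an in-memory MemoryStore standing in for Postgres; the Turn fields and the saveTurnBestEffort helper are illustrative names, not the schema or code of the real site. The point it shows is that a failed save is swallowed and reported, never thrown into the chat response.

```typescript
// One saved turn. Field names are illustrative, not a real schema.
interface Turn {
  threadId: string;
  ownerId: string; // anonymous id at first, a real user id once auth exists
  role: "user" | "assistant";
  text: string;
}

// Minimal store interface; a Postgres-backed version would expose the same methods.
interface TurnStore {
  save(turn: Turn): Promise<void>;
  listThreads(ownerId: string): Promise<string[]>;
}

// In-memory stand-in for the database so the sketch runs anywhere.
class MemoryStore implements TurnStore {
  private turns: Turn[] = [];

  async save(turn: Turn): Promise<void> {
    this.turns.push(turn);
  }

  async listThreads(ownerId: string): Promise<string[]> {
    const ids = this.turns.filter((t) => t.ownerId === ownerId).map((t) => t.threadId);
    return Array.from(new Set(ids)); // de-duplicate, keeping first-seen order
  }
}

// Best-effort save: report failure as a boolean instead of throwing.
async function saveTurnBestEffort(store: TurnStore, turn: Turn): Promise<boolean> {
  try {
    await store.save(turn);
    return true;
  } catch {
    // A real route would log here; the streamed answer must continue regardless.
    return false;
  }
}
```

Called from the route's onFinish, a helper like this lets the stream complete even when the database is unreachable. Because listThreads filters by owner id, the same store can back a "recent chats" sidebar for anonymous and signed-in visitors alike.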
4. Generative tool UI
Render tool calls as rich UI with makeAssistantToolUI, e.g. show search_library results as clickable cards while the assistant works, instead of raw JSON.
Gotchas worth knowing
Real lessons from building this one:
- Pick the model for the job. A reasoning model over-searched (12 tool calls) and blew the step cap before answering. A direct tool-calling model (gpt-4.1-mini) + a prompt that says “search once, then answer” gave a clean, cited response.
- Secrets at runtime, not build time. Read keys/DB URLs from process.env (read at runtime), not import.meta.env (can be inlined into the bundle). Locally, load .env.local into process.env for the dev server.
- One retrieval layer, many surfaces. The chatbot’s search_library tool is the same engine behind the MCP server and (conceptually) site search: build retrieval once, reuse everywhere.
- Search → chat hand-off is just a query param. /chat?q=<query> lets any search box escalate to the assistant.
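The hand-off in the last point comes down to two small functions, one on each side of the link. This is a minimal sketch; the helper names chatHandoffUrl and initialQuestion are hypothetical, and how the chat page submits the question once it has read it is left out.

```typescript
// Search-box side: build the link that escalates a query to the assistant.
function chatHandoffUrl(query: string): string {
  const params = new URLSearchParams({ q: query.trim() });
  return `/chat?${params.toString()}`;
}

// Chat-page side: read the initial question, if any, from location.search.
function initialQuestion(search: string): string | null {
  const q = new URLSearchParams(search).get("q");
  return q !== null && q.trim().length > 0 ? q.trim() : null;
}
```

Using URLSearchParams on both sides means encoding and decoding stay symmetric, so punctuation and spaces in the visitor's query survive the round trip.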
Open /chat and ask a question, or read Build an MCP server for the retrieval side.