Add a chatbot to your app

# ZajLibrary — guide

You are a **technical tutor**. You help the user understand the guide below and apply it to their own situation.

**Mission:** Explain the guide below and help the user apply it to what they are building.

## Metadata
- title: Add a chatbot to your app
- url: https://library.zajapps.com/ai-systems/integrations/ai-agent-integration/learn/guides/chatbot/add-a-chatbot-to-your-app/
- shelf: Learn & Understand
- doc_type: guide
- status: current
- kind: guide
- collection: ai-agent-integration
- category: ai-systems
- subcategory: integrations
- topic: ai-agent-integration
- description: How ZajLibrary's /chat assistant was built — assistant-ui + the Vercel AI SDK, grounded in the library via RAG, with chat persistence on Postgres.
- tags: chatbot, ai-system, assistant-ui, ai-sdk, rag, astro, neon

## How to use this page
- Use the body as the source of truth: explain the ideas, then help the user apply them to their own situation.
- Surface trade-offs, decisions, and prerequisites — not just definitions.
- Cite section headings (`##`, `###`) when quoting or referring to specific parts.

---

# Add a chatbot to your app

<p class="eyebrow">Learn · guide · AI System</p>

:::start
**This guide is for you if:** you want a chat assistant that answers from *your* content, not the open web.
**Worked example:** the live assistant at [library.zajapps.com/chat](https://library.zajapps.com/chat).
:::

A useful product chatbot doesn't free-associate — it **answers from your content and cites it**. The pattern is Retrieval-Augmented Generation (RAG): the model **searches your library first**, then answers grounded in what it found. ZajLibrary's `/chat` does exactly this, reusing the same search engine that powers its [MCP server](/ai-systems/integrations/ai-agent-integration/learn/guides/mcp/build-an-mcp-server/).

---

## The architecture

The system has three moving parts: a React island for the UI, one streaming API route, and a search tool that the model calls. Each turn is saved to Postgres.

```mermaid
graph TB
  User["Visitor on /chat"] --> Island["React island<br/>(assistant-ui)"]
  Island -->|"POST messages"| API["/api/chat<br/>(AI SDK streamText)"]
  API -->|"tool call"| Search["search_library<br/>(same engine as MCP)"]
  Search --> Index[("content index")]
  API -->|"stream answer + citations"| Island
  API -->|"save turn"| DB[("Postgres<br/>threads + messages")]
  Island -->|"list / resume"| DB
```

The stack: **assistant-ui** (chat primitives), the **Vercel AI SDK** (`streamText` + tools), **OpenRouter** for the model, and **Neon Postgres** (via Drizzle) for persistence — all inside the existing Astro site.

---

## Build it

### 1. The streaming API route (RAG)

`/api/chat` calls `streamText` with a **backend tool** the model can invoke to search your content, then returns the result's `toUIMessageStreamResponse()` so the answer streams back to the client. Ground the model with a system prompt that says *search first, then answer, and cite*, and set the route up as follows:

- Define a `search_library` tool whose `execute` calls your existing search; return title + url + snippet for each hit.
- Set `stopWhen: stepCountIs(8)` to cap the tool loop at eight steps, so a model that keeps searching cannot run indefinitely.
- Have the model cite sources as markdown links so answers are verifiable.
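
The retrieval half of this step can be sketched without any SDK. The snippet below is a minimal, hypothetical stand-in for what the `execute` of `search_library` might do: a naive keyword scorer over a tiny in-memory index that returns title + url + snippet, plus a helper that renders hits as markdown citations. The `Doc` shape, the sample pages and their URLs, `searchLibrary`, and `formatCitations` are all assumptions for illustration; ZajLibrary's actual search engine is not shown in this guide.

```typescript
// Hypothetical document shape; the real content index is not shown in the guide.
interface Doc {
  title: string;
  url: string;
  body: string;
}

// What the tool hands back to the model: title + url + snippet.
interface SearchHit {
  title: string;
  url: string;
  snippet: string;
}

// Tiny in-memory index standing in for your existing search engine.
// Titles and URLs are sample data, not real ZajLibrary paths.
const INDEX: Doc[] = [
  {
    title: "Add a chatbot to your app",
    url: "/guides/chatbot/",
    body: "Build a chat assistant that answers from your content using RAG.",
  },
  {
    title: "Build an MCP server",
    url: "/guides/mcp/",
    body: "Expose your docs search to AI agents through an MCP server.",
  },
];

// Naive keyword search: score = number of query terms found in title + body.
function searchLibrary(query: string, limit = 3): SearchHit[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return INDEX
    .map((doc) => {
      const haystack = `${doc.title} ${doc.body}`.toLowerCase();
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { doc, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(({ doc }) => ({
      title: doc.title,
      url: doc.url,
      snippet: doc.body.slice(0, 120),
    }));
}

// Render hits as markdown links so each claim can be traced to a page.
function formatCitations(hits: SearchHit[]): string {
  return hits.map((hit) => `- [${hit.title}](${hit.url})`).join("\n");
}
```

In a real route, a function like `searchLibrary` would be wrapped as the tool's `execute`, and the system prompt would ask the model to emit links in the style `formatCitations` produces. Returning only a short snippet per hit keeps the tool result small, which matters when the model may search more than once.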

### 2. The UI island

assistant-ui's headless primitives (`ThreadPrimitive`, `MessagePrimitive`, `ComposerPrimitive`) need no Tailwind — style them with your own tokens. Wire the runtime with `useChatRuntime({ transport: new AssistantChatTransport({ api: '/api/chat' }) })` inside `<AssistantRuntimeProvider>`. In Astro, mount it as a `client:only="react"` island.

### 3. Persistence

Save every turn in the route's `onFinish` (best-effort — a DB hiccup must never break the chat). Key threads by an owner id (an anonymous `localStorage` id to start; a real user id once you add auth). Add list/load endpoints and a "recent chats" sidebar; resume a thread via `?thread=<id>`.
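
The "best-effort" rule is easiest to see in code. The sketch below assumes a hypothetical `TurnStore` interface and an in-memory `MemoryStore` standing in for Postgres (the guide's Drizzle schema is not shown), and wraps the save in a `try`/`catch` so that a storage failure is logged rather than thrown. `saveTurnBestEffort` and `threadIdFromSearch` are illustrative names, not part of any library.

```typescript
// Hypothetical message and storage shapes; the guide's Drizzle schema is not shown.
interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

interface TurnStore {
  saveTurn(ownerId: string, threadId: string, messages: ChatMessage[]): Promise<void>;
}

// In-memory store standing in for Postgres; keys are "ownerId:threadId".
class MemoryStore implements TurnStore {
  readonly threads = new Map<string, ChatMessage[]>();

  async saveTurn(ownerId: string, threadId: string, messages: ChatMessage[]): Promise<void> {
    const key = `${ownerId}:${threadId}`;
    const existing = this.threads.get(key) ?? [];
    this.threads.set(key, [...existing, ...messages]);
  }
}

// Best-effort save: report success or failure, but never throw into the chat stream.
async function saveTurnBestEffort(
  store: TurnStore,
  ownerId: string,
  threadId: string,
  messages: ChatMessage[],
): Promise<boolean> {
  try {
    await store.saveTurn(ownerId, threadId, messages);
    return true;
  } catch (error) {
    console.error("chat persistence failed; continuing without saving", error);
    return false;
  }
}

// Read the thread to resume from a query string such as "?thread=abc123".
function threadIdFromSearch(search: string): string | null {
  const id = new URLSearchParams(search).get("thread");
  return id !== null && id.length > 0 ? id : null;
}
```

Calling `saveTurnBestEffort` from `onFinish` means the visitor still gets a complete answer when the database is unreachable; the only cost is a missing entry in the "recent chats" list. Keying by owner id as well as thread id keeps one anonymous visitor from loading another visitor's thread by guessing an id.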

### 4. Generative tool UI

Render tool calls as rich UI with `makeAssistantToolUI` — e.g. show `search_library` results as clickable cards while the assistant works, instead of raw JSON.

---

## Gotchas worth knowing

Real lessons from building this one:

- **Pick the model for the job.** A *reasoning* model over-searched (12 tool calls) and hit the step cap before it produced an answer. A direct tool-calling model (`gpt-4.1-mini`) plus a prompt that says "search once, then answer" gave a clean, cited response.
- **Secrets at runtime, not build time.** Read keys/DB URLs from `process.env` (read at runtime) — not `import.meta.env` (can be inlined into the bundle). Locally, load `.env.local` into `process.env` for the dev server.
- **One retrieval layer, many surfaces.** The chatbot's `search_library` tool is the *same* engine behind the MCP server and (conceptually) site search — build retrieval once, reuse everywhere.
- **Search → chat hand-off is just a query param.** `/chat?q=<query>` lets any search box escalate to the assistant.
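
Two of these gotchas fit in a few lines each. The sketch below shows one way to read a secret at call time and one way to build the hand-off link. `requireSecret`, `chatHandoffUrl`, and the `OPENROUTER_API_KEY` variable name are assumptions for illustration, not names taken from the ZajLibrary codebase; the environment is passed in as a parameter so the helper stays easy to test.

```typescript
// A string-keyed environment, for example Node's process.env.
type Env = Record<string, string | undefined>;

// Look up a secret at call time so it is never baked into a build artifact.
// Throws early with the variable name (never the value) when it is missing.
function requireSecret(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build the search-to-chat hand-off link; the query is URL-encoded.
function chatHandoffUrl(query: string): string {
  return `/chat?q=${encodeURIComponent(query.trim())}`;
}

// In the API route you might call (the variable name is hypothetical):
//   const apiKey = requireSecret(process.env, "OPENROUTER_API_KEY");
```

Failing fast on a missing variable turns a confusing authentication error from the model provider into a clear startup message. Encoding the query matters because search input routinely contains `?`, `&`, and spaces, which would otherwise corrupt the URL.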

---

## Next

<CardGrid>
  <LinkCard title="Build an MCP server for your docs site" href="/ai-systems/integrations/ai-agent-integration/learn/guides/mcp/build-an-mcp-server/" description="The retrieval layer this chatbot reuses." />
  <LinkCard title="Try the live assistant" href="/chat" description="Ask the ZajLibrary anything — answers cite their source pages." />
</CardGrid>

:::next[Next step]
Open [/chat](/chat) and ask a question, or read [Build an MCP server](/ai-systems/integrations/ai-agent-integration/learn/guides/mcp/build-an-mcp-server/) for the retrieval side.
:::
