RAG
Ground answers
RAG (retrieval-augmented generation) grounds an agent’s answers in your documents. The shape is always the same pipeline: split documents into chunks, turn each chunk into an embedding, store those vectors, and at query time retrieve the closest chunks to feed the model as context. Mastra ships building blocks for each stage so you don’t wire it from scratch.
The RAG pipeline
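To make the four stages concrete, here is a toy version in plain TypeScript. It is a sketch for intuition only: `chunkText`, `toyEmbed`, `buildIndex`, and `retrieve` are hypothetical helpers, not Mastra APIs, and the hashed bag-of-words vector merely stands in for a real embedding model. The sliding-window chunker is also a simplification, not Mastra's recursive strategy.

```typescript
// Toy RAG pipeline: chunk -> embed -> store -> retrieve.
// Every name here is an illustrative stand-in, not part of Mastra.

type StoredChunk = { text: string; vector: number[] };

// Sliding-window chunker: `size` characters per chunk, with `overlap`
// characters shared between neighbouring chunks.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size');
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // the last window reached the end
  }
  return chunks;
}

// Toy embedding: count words into hashed buckets. A real embedder is a learned model.
function toyEmbed(text: string, dims: number = 64): number[] {
  const vector: number[] = [];
  for (let i = 0; i < dims; i++) vector.push(0);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) % dims;
    }
    vector[hash] += 1;
  }
  return vector;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// "Store" stage: an in-memory array plays the role of the vector store.
function buildIndex(text: string, size: number, overlap: number): StoredChunk[] {
  return chunkText(text, size, overlap).map((chunk) => ({
    text: chunk,
    vector: toyEmbed(chunk),
  }));
}

// "Retrieve" stage: embed the query and return the topK most similar chunk texts.
function retrieve(index: StoredChunk[], query: string, topK: number): string[] {
  const queryVector = toyEmbed(query);
  return index
    .map((entry) => ({ text: entry.text, score: cosineSimilarity(entry.vector, queryVector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.text);
}
```

Calling `retrieve(buildIndex(longText, 512, 50), question, 3)` would return up to three chunk strings, most similar first. Those strings are what gets placed in the model's context.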
Build the index
Split a document, embed the chunks, and upsert them into a vector store. The embedder here is an AI SDK embedding model (the retrieval step below names the same model through the model router); the vector store is whichever backend you registered.
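import { MDocument } from '@mastra/rag';
import { openai } from '@ai-sdk/openai';
import { embedMany } from 'ai';

// Split the source text into overlapping chunks.
const doc = MDocument.fromText(longText);
const chunks = await doc.chunk({ strategy: 'recursive', size: 512, overlap: 50 });

// Embed every chunk in a single batch call.
const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: chunks.map((c) => c.text),
});

// pgVector is the vector store instance you registered; this assumes the 'docs' index already exists.
// Storing each chunk's text as metadata lets retrieval return the text, not just the vector.
await pgVector.upsert({
  indexName: 'docs',
  vectors: embeddings,
  metadata: chunks.map((c) => ({ text: c.text })),
});

Retrieve at query time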
The cleanest path is a vector query tool: hand it to an agent and the model retrieves grounding context on its own.
import { Agent } from '@mastra/core/agent';
import { ModelRouterEmbeddingModel } from '@mastra/core/llm';
import { createVectorQueryTool } from '@mastra/rag';

// vectorStoreName must match the name the store was registered under.
const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'pgVector',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
});

// Give it to an agent like any other tool:
export const docsAgent = new Agent({
  id: 'docs-agent',
  name: 'Docs Agent',
  instructions: 'Answer using retrieved context. Cite the source chunks.',
  model: 'openai/gpt-4o',
  tools: { vectorQueryTool },
});

Vector stores
The pipeline is backend-agnostic: swap the store and the chunking, embedding, and query code stays the same. Common targets:
| Store | Package | Notes |
|---|---|---|
| PgVector | @mastra/pg | Postgres + pgvector; pairs with PostgresStore for memory too. |
| LibSQL | @mastra/libsql | LibSQLVector — zero-infra local/dev option. |
| Pinecone | @mastra/pinecone | Managed, serverless vector DB. |
| Qdrant | @mastra/qdrant | Self-host or cloud. |
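One way to picture why the backend can be swapped is a small shared contract that application code depends on. The sketch below is hypothetical: `VectorStore`, `InMemoryVectorStore`, and `topSource` are illustrative names, the interface is far smaller than what the real stores expose, and it is synchronous only to keep the example short (real stores are asynchronous). Scoring uses a plain dot product, which assumes the vectors are already normalized.

```typescript
// Hypothetical minimal store contract, for illustration only.
type Match = { id: string; score: number; metadata: Record<string, unknown> };

interface VectorStore {
  upsert(indexName: string, vectors: number[][], metadata: Record<string, unknown>[]): void;
  query(indexName: string, queryVector: number[], topK: number): Match[];
}

type Row = { vector: number[]; metadata: Record<string, unknown> };

// Dot product as the similarity score (assumes normalized vectors).
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * (b[i] ?? 0), 0);
}

// One possible backend: everything lives in a plain object keyed by index name.
class InMemoryVectorStore implements VectorStore {
  private indexes: Record<string, Row[]> = {};

  upsert(indexName: string, vectors: number[][], metadata: Record<string, unknown>[]): void {
    const rows = this.indexes[indexName] ?? [];
    vectors.forEach((vector, i) => rows.push({ vector, metadata: metadata[i] ?? {} }));
    this.indexes[indexName] = rows;
  }

  query(indexName: string, queryVector: number[], topK: number): Match[] {
    const rows = this.indexes[indexName] ?? [];
    return rows
      .map((row, i) => ({ id: String(i), score: dot(row.vector, queryVector), metadata: row.metadata }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}

// Application code sees only the interface, so the backend is chosen where the store is constructed.
function topSource(store: VectorStore, queryVector: number[]): unknown {
  const [best] = store.query('docs', queryVector, 1);
  return best?.metadata.text;
}
```

Any other class that implements the same two methods could be passed to `topSource` without changing it, which is the property the table above relies on.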
Reference: RAG overview · Chunking & embedding · Retrieval
Next: Multi-agent systems — coordinate specialists under a supervisor.