Memory
Continuity
By default an LLM forgets everything between calls. Memory is what lets a Mastra agent remember a user — within a conversation and across them. It comes in three layers that stack on the same storage, and you turn on only what you need.
The three layers of agent memory
The three kinds
- Conversation history: the last N messages, included verbatim. Cheap, automatic, bounded by lastMessages.
- Semantic recall: older messages retrieved by similarity from a vector index, so the agent can surface a relevant exchange from days ago without stuffing the whole transcript into context.
- Working memory: a small, persistent profile the agent maintains (name, preferences, current goals) via a template it updates as it learns. This is what makes an agent feel like it knows you.
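To make the first two layers concrete, here is a minimal, self-contained sketch of the idea. It is not Mastra's implementation: the names `StoredMessage`, `lastMessages`, and `semanticRecall` are hypothetical, and toy two-dimensional vectors stand in for real embeddings. It assumes recall works by ranking stored messages by cosine similarity, keeping the top matches, and padding each match with a few neighbouring messages for context.

```typescript
// Illustrative sketch only: hypothetical names, not Mastra internals.
interface StoredMessage {
  id: number; // position in the thread, oldest first
  text: string;
  embedding: number[]; // toy vector standing in for a real embedding
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Conversation history: the last n messages, verbatim.
function lastMessages(thread: StoredMessage[], n: number): StoredMessage[] {
  return n <= 0 ? [] : thread.slice(-n);
}

// Semantic recall: the topK most similar messages, each padded with
// `before`/`after` neighbours, returned in thread order without duplicates.
function semanticRecall(
  thread: StoredMessage[],
  query: number[],
  topK: number,
  range: { before: number; after: number },
): StoredMessage[] {
  const hits = thread
    .map((m, index) => ({ index, score: cosine(m.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);

  const keep = new Set<number>();
  for (const { index } of hits) {
    const start = Math.max(0, index - range.before);
    const end = Math.min(thread.length - 1, index + range.after);
    for (let i = start; i <= end; i++) keep.add(i);
  }
  return Array.from(keep)
    .sort((a, b) => a - b)
    .map((i) => thread[i]);
}

// Usage: a six-message thread where message 2 is the best match for the query.
const demoThread: StoredMessage[] = [
  { id: 0, text: 'Hi there', embedding: [1, 0] },
  { id: 1, text: 'I need help with billing', embedding: [0.9, 0.1] },
  { id: 2, text: 'My timezone is CET', embedding: [0, 1] },
  { id: 3, text: 'Please schedule around that', embedding: [0.1, 0.9] },
  { id: 4, text: 'Back to billing', embedding: [1, 0] },
  { id: 5, text: 'Thanks', embedding: [0.8, 0.2] },
];

console.log(lastMessages(demoThread, 2).map((m) => m.id));
console.log(
  semanticRecall(demoThread, [0, 1], 1, { before: 1, after: 1 }).map((m) => m.id),
);
```

The two functions draw on the same stored thread, which is why the layers can stack on one storage backend: history is a cheap slice of the tail, while recall pays for a similarity search to reach further back.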
Configure it
A Memory instance needs storage (for messages) and, for semantic recall, also a vector store and an embedder. Everything else is tuning under options.
import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { LibSQLStore, LibSQLVector } from '@mastra/libsql';

const memory = new Memory({
  storage: new LibSQLStore({ id: 'mem-store', url: 'file:./memory.db' }),
  vector: new LibSQLVector({ id: 'mem-vector', url: 'file:./vector.db' }),
  embedder: 'openai/text-embedding-3-small',
  options: {
    lastMessages: 20, // conversation history
    semanticRecall: { topK: 3, messageRange: { before: 2, after: 1 } },
    workingMemory: { enabled: true }, // persistent profile
  },
});

export const memoryAgent = new Agent({
  id: 'memory-agent',
  name: 'Memory Agent',
  instructions: 'Remember what the user tells you about themselves and use it.',
  model: 'openai/gpt-4o',
  memory,
});

Working-memory template
Give working memory a template and the agent fills it in over time. A structured profile beats free-form notes.
workingMemory: {
  enabled: true,
  template: `# User Profile
- Name:
- Timezone:
- Preferences:
- Current goals:`,
}

Threads & resources
Memory is scoped per resource (usually a user) and thread (a conversation). Pass both when you call the agent so continuity stays isolated per user and per conversation:
await memoryAgent.generate('What did I ask you to remember?', {
  memory: { resource: 'user-123', thread: 'support-chat' },
});

Reference: Memory overview · Working memory · Semantic recall
Next: RAG — ground answers in your own documents.