# ZajLibrary — handbook

You are a **technical tutor**. You help the user understand the handbook below and apply it to their own situation.

**Mission:** Explain the handbook below and help the user apply it to what they are building.

## Metadata
- title: Memory
- url: https://library.zajapps.com/ai-systems/agent-frameworks/mastra/learn/handbooks/memory/
- shelf: Learn & Understand
- doc_type: handbook
- status: current
- kind: handbook
- collection: mastra
- category: ai-systems
- subcategory: agent-frameworks
- topic: mastra
- description: How agents keep continuity — conversation history, semantic recall, and working memory, all backed by storage and a vector index.
- tags: mastra, memory, storage

## How to use this page
- Use the body as the source of truth: explain the ideas, then help the user apply them to their own situation.
- Surface trade-offs, decisions, and prerequisites — not just definitions.
- Cite section headings (`##`, `###`) when quoting or referring to specific parts.

---

# Memory

<p class="eyebrow">Continuity</p>

By default, an LLM forgets everything between calls. **Memory** is what lets a Mastra agent remember a user, both within a conversation and across conversations. It comes in three layers that share the same storage, and you turn on only what you need.

<LayerStack
  title="The three layers of agent memory"
  caption="Each layer reads from the same storage (and a vector index for recall). Conversation history is automatic; semantic recall and working memory are opt-in."
  layers={[
    { label: 'Conversation history', sub: 'recent turns', items: ['lastMessages'] },
    { label: 'Semantic recall', sub: 'similar past messages', tone: 'core', items: ['topK', 'messageRange', 'vector'] },
    { label: 'Working memory', sub: 'persistent profile', items: ['template', 'enabled'] },
    { label: 'Storage + vector', sub: 'the durable backbone', items: ['LibSQL', 'Postgres'] },
  ]}
/>

## The three kinds

1. **Conversation history** — the last *N* messages, included verbatim. Cheap, automatic, bounded by `lastMessages`.
2. **Semantic recall** — older messages retrieved by **similarity** from a vector index, so the agent can surface a relevant exchange from days ago without stuffing the whole transcript into context.
3. **Working memory** — a small, persistent **profile** the agent maintains (name, preferences, current goals) via a template it updates as it learns. This is what makes an agent feel like it *knows* you.
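
To make the layering concrete, here is a toy sketch of how the three kinds could combine into one context for a call. It is not Mastra's implementation: the `Message` type, the helper names, the two-dimensional embeddings, and the sample transcript are all made up for illustration.

```ts
// Toy model of the three layers. Illustrative only, not Mastra's implementation.
type Message = { id: number; text: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}

// Layer 1: conversation history, the most recent N messages verbatim.
function recentHistory(all: Message[], lastMessages: number): Message[] {
  return lastMessages <= 0 ? [] : all.slice(-lastMessages);
}

// Layer 2: semantic recall, the topK older messages most similar to the query.
function recallSimilar(older: Message[], query: number[], topK: number): Message[] {
  return older
    .map((message) => ({ message, score: cosine(message.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.message);
}

// Layer 3: working memory, a small profile string included on every call.
function buildContext(
  all: Message[],
  query: number[],
  profile: string,
  lastMessages: number,
  topK: number,
): string[] {
  const recent = recentHistory(all, lastMessages);
  const older = all.slice(0, all.length - recent.length);
  const recalled = recallSimilar(older, query, topK);
  return [profile, ...recalled.map((m) => m.text), ...recent.map((m) => m.text)];
}

const transcript: Message[] = [
  { id: 1, text: 'My dog is called Biscuit', embedding: [1, 0] },
  { id: 2, text: 'What is the weather today?', embedding: [0, 1] },
  { id: 3, text: 'Book a flight to Oslo', embedding: [0.1, 0.9] },
  { id: 4, text: 'Thanks', embedding: [0.5, 0.5] },
];

// The query vector stands in for an embedded question about the user's pet.
const context = buildContext(transcript, [1, 0.1], '- Name: Sam', 2, 1);
console.log(context);
```

The last two messages arrive verbatim, the older pet message is pulled back by similarity, and the profile rides along regardless of what was said recently.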

## Configure it

A `Memory` instance needs **storage** for messages; semantic recall additionally needs a **vector** store and an **embedder**. Everything else is tuning under `options`.

```ts
import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { LibSQLStore, LibSQLVector } from '@mastra/libsql';

const memory = new Memory({
  storage: new LibSQLStore({ id: 'mem-store', url: 'file:./memory.db' }),
  vector: new LibSQLVector({ id: 'mem-vector', url: 'file:./vector.db' }),
  embedder: 'openai/text-embedding-3-small',
  options: {
    lastMessages: 20,                                  // conversation history
    // semantic recall
    semanticRecall: { topK: 3, messageRange: { before: 2, after: 1 } },
    workingMemory: { enabled: true },                  // persistent profile
  },
});

export const memoryAgent = new Agent({
  id: 'memory-agent',
  name: 'Memory Agent',
  instructions: 'Remember what the user tells you about themselves and use it.',
  model: 'openai/gpt-4o',
  memory,
});
```
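
The `semanticRecall` values are easier to reason about with a small model. The sketch below assumes `topK` is the number of similarity matches and `messageRange` is the number of neighboring messages kept before and after each match, which is what the option names suggest. `expandMatches` is a hypothetical helper written for this page, not a Mastra function.

```ts
// Hypothetical helper: widen each matched message index to include its neighbors.
type MessageRange = { before: number; after: number };

function expandMatches(
  matchIndexes: number[],
  totalMessages: number,
  range: MessageRange,
): number[] {
  const keep = new Set<number>();
  for (const hit of matchIndexes) {
    // Clamp the window so it never runs off either end of the transcript.
    const start = Math.max(0, hit - range.before);
    const end = Math.min(totalMessages - 1, hit + range.after);
    for (let i = start; i <= end; i++) keep.add(i);
  }
  // Overlapping windows are merged by the Set; return indexes in transcript order.
  return Array.from(keep).sort((a, b) => a - b);
}

// Two adjacent matches in a 10-message transcript, using the range from the config above.
const recalledIndexes = expandMatches([5, 6], 10, { before: 2, after: 1 });
console.log(recalledIndexes);
```

Under these assumptions, raising `topK` or widening `messageRange` both add tokens to every call, so it makes sense to start small and increase only if the agent misses context.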

## Working-memory template

Give working memory a **template** and the agent fills it in over time — a structured profile beats free-form notes.

```ts
workingMemory: {
  enabled: true,
  template: `
# User Profile
- Name:
- Timezone:
- Preferences:
- Current goals:
`,
}
```
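
For orientation, the fragment above belongs under `options`, next to `lastMessages`. The sketch below builds that object as plain data so it runs without any Mastra packages; `memoryOptions` would then be passed as `options` to `new Memory({ ... })` as in the Configure section. `templateFields` is a hypothetical helper, included only to show which fields the template asks the agent to track.

```ts
// The template is ordinary Markdown text; the agent fills in the blanks over time.
const userProfileTemplate = `
# User Profile
- Name:
- Timezone:
- Preferences:
- Current goals:
`;

// Plain object mirroring the `options` shape used in the Configure section.
const memoryOptions = {
  lastMessages: 20,
  workingMemory: { enabled: true, template: userProfileTemplate },
};

// Hypothetical helper: list the field names declared as "- Field:" bullets.
function templateFields(template: string): string[] {
  return template
    .split('\n')
    .filter((line) => line.startsWith('- '))
    .map((line) => line.slice(2).replace(/:\s*$/, ''));
}

console.log(templateFields(memoryOptions.workingMemory.template));
```

Keeping the template short helps, since the description above calls working memory a small profile rather than a transcript.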

## Threads & resources

Memory is scoped per **resource** (usually a user) and **thread** (a conversation). Pass them when you call the agent so continuity stays isolated per user/conversation:

```ts
await memoryAgent.generate('What did I ask you to remember?', {
  memory: { resource: 'user-123', thread: 'support-chat' },
});
```
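
The isolation this gives can be pictured with a toy store keyed by both identifiers. This is only a mental model of the scoping, not Mastra's storage layer; `ScopedHistory` and the sample IDs and messages are invented for the example.

```ts
// Toy in-memory store keyed by resource and thread. Not Mastra's storage layer.
class ScopedHistory {
  private readonly threads = new Map<string, string[]>();

  // Serialize the pair so that ('a', 'b:c') and ('a:b', 'c') cannot collide.
  private key(resource: string, thread: string): string {
    return JSON.stringify([resource, thread]);
  }

  append(resource: string, thread: string, message: string): void {
    const k = this.key(resource, thread);
    const existing = this.threads.get(k) ?? [];
    existing.push(message);
    this.threads.set(k, existing);
  }

  // Return a copy so callers cannot mutate stored history.
  read(resource: string, thread: string): string[] {
    return [...(this.threads.get(this.key(resource, thread)) ?? [])];
  }
}

const history = new ScopedHistory();
history.append('user-123', 'support-chat', 'Remember that my order number is 42.');
history.append('user-123', 'billing-chat', 'I need a copy of my invoice.');

console.log(history.read('user-123', 'support-chat')); // only the support message
console.log(history.read('user-456', 'support-chat')); // empty: different resource
```

In this model, reusing the same `resource` and `thread` values on a later call is what brings the earlier messages back, so the IDs should be stable, for example derived from the user record and conversation rather than generated per request.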

> [!NOTE]
> Semantic recall and durable memory **require a storage backend**. On serverless, point storage at a hosted DB — the ephemeral filesystem won't persist `file:./memory.db` across invocations. See [Deployment → production concerns](/ai-systems/agent-frameworks/mastra/learn/handbooks/deployment/#production-concerns).

---

**Reference:** [Memory overview](https://mastra.ai/docs/memory/overview) · [Working memory](https://mastra.ai/docs/memory/working-memory) · [Semantic recall](https://mastra.ai/docs/memory/semantic-recall)

Next: [**RAG**](/ai-systems/agent-frameworks/mastra/learn/handbooks/rag/) — ground answers in your own documents.
