
Concepts

The vocabulary

Mastra is a small set of primitives that compose. Learn the words here; the Architecture page shows how they fit together.

Everything below hangs off one object — the Mastra instance. This map shows how the primitives relate; read the definitions that follow with this shape in mind.

```mermaid
graph TD
  M["Mastra instance<br/>runtime + registry"]
  M --> A["Agents"]
  M --> W["Workflows"]
  M --> T["Tools"]
  M --> S["Storage"]
  M --> O["Observability<br/>OpenTelemetry"]
  A -->|reason with| MD["Model"]
  A -->|call| T
  A -->|remember via| MEM["Memory"]
  W -->|graph of| ST["Steps"]
  MEM -->|persisted in| S
  R["RAG"] -->|grounds answers of| A
  V["Voice"] -->|speak / listen for| A
  SC["Scorers / Evals"] -->|grade output of| A
  MCP["MCP"] -->|bridges tools to/from| T
  O -.->|traces| A
  O -.->|traces| W
  O -.->|traces| T
```

The whole vocabulary reduces to one idea: what the Mastra instance registers, and what each registered thing talks to. Agents, workflows, tools, storage, and telemetry (observability) are registered on the instance; memory, RAG, voice, and scorers attach to an agent, and MCP bridges tools to and from other MCP servers and apps.

The Mastra instance is the central runtime and registry. You construct it once (new Mastra({ ... })) and register your agents, workflows, tools, storage, and telemetry on it. Everything else in your app reaches capabilities through this instance — it owns configuration and execution.

An agent is an LLM bound to instructions (its system prompt/role), a model, a set of tools it may call, and optionally memory. The agent decides which tools to invoke and when, then produces a result. Agents are the non-deterministic, reasoning unit.

A tool is a typed function the model can call — id, description, an input schema and output schema (Zod), and an execute function. Schemas mean the model’s arguments are validated before your code runs, and outputs are predictable. Tools are how an agent does things (query a DB, hit an API, run a calculation).
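The validate-then-execute idea can be sketched in plain TypeScript. This is a minimal illustration, not Mastra's tool API: `SketchTool`, `callTool`, and `addTool` are hypothetical names, the hand-written `parseInput`/`parseOutput` functions stand in for Zod schemas, and `execute` is synchronous here only to keep the sketch short.

```typescript
// Hypothetical stand-in for a tool definition; not Mastra's actual API.
type Validator<T> = (raw: unknown) => T; // throws if the value is invalid

interface SketchTool<In, Out> {
  id: string;
  description: string;
  parseInput: Validator<In>; // plays the role of the input schema
  parseOutput: Validator<Out>; // plays the role of the output schema
  execute: (input: In) => Out;
}

// Validate the model's raw arguments, run the tool, then validate the result.
function callTool<In, Out>(tool: SketchTool<In, Out>, rawArgs: unknown): Out {
  const input = tool.parseInput(rawArgs); // bad arguments are rejected before execute runs
  return tool.parseOutput(tool.execute(input));
}

const addTool: SketchTool<{ a: number; b: number }, { sum: number }> = {
  id: "add",
  description: "Add two numbers",
  parseInput: (raw) => {
    if (typeof raw !== "object" || raw === null) {
      throw new Error("add: expected an object");
    }
    const { a, b } = raw as { a?: unknown; b?: unknown };
    if (typeof a !== "number" || typeof b !== "number") {
      throw new Error("add: expected { a: number, b: number }");
    }
    return { a, b };
  },
  parseOutput: (raw) => {
    if (typeof raw !== "object" || raw === null) {
      throw new Error("add: expected an object result");
    }
    const { sum } = raw as { sum?: unknown };
    if (typeof sum !== "number") {
      throw new Error("add: expected { sum: number }");
    }
    return { sum };
  },
  execute: ({ a, b }) => ({ sum: a + b }),
};

// Well-formed arguments reach execute; malformed ones throw before it is called.
const toolResult = callTool(addTool, { a: 2, b: 3 });
```

Passing `{ a: "2", b: 3 }` instead would throw inside `parseInput`, so `execute` never sees arguments it cannot handle.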

A workflow is a graph of steps with explicit control flow — sequencing (.then), parallelism (.parallel), branching (.branch), loops (.dowhile/.dountil), and suspend/resume for human-in-the-loop. Workflows wrap the non-deterministic model in deterministic, durable orchestration: they can pause, persist state, and resume later.
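To make "pause, persist state, and resume later" concrete, here is a small sketch of a sequential runner with a human-in-the-loop gate. It is an illustration under assumptions, not Mastra's workflow API: `Step`, `RunState`, and `runSteps` are hypothetical names, steps are synchronous, and persistence is simulated with a JSON round trip.

```typescript
// Hypothetical sketch of sequencing plus suspend/resume; not Mastra's workflow API.
type StepResult<T> = { status: "done"; value: T } | { status: "suspend" };
type Step<T> = (value: T, resumeData?: unknown) => StepResult<T>;

// Plain data describing where a run stands, so it can be persisted as JSON.
interface RunState<T> {
  status: "running" | "suspended" | "completed";
  stepIndex: number; // index of the next step to execute
  value: T;
}

function runSteps<T>(steps: Step<T>[], state: RunState<T>, resumeData?: unknown): RunState<T> {
  let { stepIndex, value } = state;
  let pending = resumeData;
  while (stepIndex < steps.length) {
    const result = steps[stepIndex](value, pending);
    pending = undefined; // resume data applies only to the step that suspended
    if (result.status === "suspend") {
      return { status: "suspended", stepIndex, value };
    }
    value = result.value;
    stepIndex += 1;
  }
  return { status: "completed", stepIndex, value };
}

const steps: Step<number>[] = [
  (n) => ({ status: "done", value: n * 2 }),
  // Approval gate: suspends until a caller resumes with "approved".
  (n, approval): StepResult<number> =>
    approval === "approved" ? { status: "done", value: n } : { status: "suspend" },
  (n) => ({ status: "done", value: n + 1 }),
];

// First pass stops at the approval gate.
const paused = runSteps(steps, { status: "running", stepIndex: 0, value: 5 });
// Simulate writing the state to storage and reading it back later.
const restored: RunState<number> = JSON.parse(JSON.stringify(paused));
// Second pass resumes at the suspended step with the human's decision.
const finished = runSteps(steps, restored, "approved");
```

The point of the design is that the run state is ordinary data: the process that resumes the run does not need to be the one that started it.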

Memory gives agents continuity. Working memory holds the current task’s context; recall (semantic) memory stores and retrieves past messages by similarity across threads. Memory is what lets an agent “remember” a user within and across conversations.
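Retrieval "by similarity across threads" can be illustrated with a toy store. This is a sketch under assumptions, not Mastra's memory API: `StoredMessage`, `cosine`, and `recall` are hypothetical names, and the three-dimensional embeddings are hand-written placeholders for what an embedding model would produce.

```typescript
// Hypothetical sketch of similarity-based recall; not Mastra's memory API.
interface StoredMessage {
  threadId: string;
  text: string;
  embedding: number[]; // hand-written here; normally produced by an embedding model
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the topK stored messages most similar to the query, regardless of thread.
function recall(store: StoredMessage[], queryEmbedding: number[], topK: number): StoredMessage[] {
  return store
    .map((message) => ({ message, score: cosine(message.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.message);
}

const history: StoredMessage[] = [
  { threadId: "t1", text: "I'm vegetarian", embedding: [1, 0, 0] },
  { threadId: "t2", text: "My flight is on Friday", embedding: [0, 1, 0] },
  { threadId: "t3", text: "I don't eat meat or fish", embedding: [0.8, 0.2, 0.1] },
];

// A query about food preferences pulls related messages from two different threads.
const recalled = recall(history, [1, 0, 0], 2);
```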

RAG (retrieval-augmented generation) is the chunk → embed → store → retrieve pipeline for grounding answers in your own documents. Mastra provides the building blocks (document processing, embedMany, vector store integrations such as PgVector, Pinecone, Qdrant, and Mongo, and retrieval) so an agent can cite your data rather than rely only on its training data.
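The four pipeline stages can be walked through end to end with toy components. This is a sketch under loud assumptions, not Mastra's RAG API: `chunk`, `embed`, `indexDocument`, and `retrieve` are hypothetical names, the "embedding" is a word count over a tiny fixed vocabulary rather than a model call, and the "vector store" is an in-memory array scored with a dot product.

```typescript
// Hypothetical sketch of chunk -> embed -> store -> retrieve; not Mastra's RAG API.
const VOCAB = ["refunds", "shipping", "days", "password", "reset"];

// Chunk: split a document into sentences.
function chunk(doc: string): string[] {
  const sentences: string[] = doc.match(/[^.]+\./g) ?? [];
  return sentences.map((s) => s.trim());
}

// Embed: count vocabulary terms. A real pipeline would call an embedding model.
function embed(text: string): number[] {
  const words: string[] = text.toLowerCase().match(/[a-z]+/g) ?? [];
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

interface Entry {
  text: string;
  vector: number[];
}

// Store: keep each chunk next to its vector (an array stands in for a vector store).
function indexDocument(doc: string): Entry[] {
  return chunk(doc).map((text) => ({ text, vector: embed(text) }));
}

// Retrieve: rank chunks by dot product with the query vector.
function retrieve(store: Entry[], query: string, topK: number): string[] {
  const q = embed(query);
  const score = (v: number[]) => v.reduce((sum, x, i) => sum + x * q[i], 0);
  return [...store]
    .sort((a, b) => score(b.vector) - score(a.vector))
    .slice(0, topK)
    .map((entry) => entry.text);
}

const store = indexDocument(
  "Refunds are issued within 5 days. Shipping takes 3 days. Reset your password from the login page.",
);
const context = retrieve(store, "How long do refunds take?", 1);
```

The retrieved text in `context` is what would be placed in the agent's prompt so the answer is grounded in the document.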

Voice is a unified TTS (text-to-speech), STT (speech-to-text), and real-time speech-to-speech surface across many providers (OpenAI, Azure, ElevenLabs, Google, Deepgram, …) via CompositeVoice — so an agent can speak and listen without per-provider glue.

Scorers (evals) are automated quality checks on agent output — relevance, correctness, safety, custom rubrics. They turn “looks fine” into a measurable signal you can track over time and gate releases on.
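As a rough illustration of turning a judgment into a number and gating on it, the sketch below scores outputs by keyword coverage and checks the mean against a threshold. `Scorer`, `keywordCoverage`, and `passesGate` are hypothetical names, not Mastra's scorer API, and real scorers are often model-graded rather than rule-based.

```typescript
// Hypothetical sketch of a scorer and a release gate; not Mastra's scorer API.
type Scorer = (input: string, output: string) => number; // score in the range 0..1

// Fraction of required phrases that appear in the output (case-insensitive).
const keywordCoverage = (required: string[]): Scorer => (_input, output) => {
  if (required.length === 0) return 1;
  const text = output.toLowerCase();
  const hits = required.filter((phrase) => text.includes(phrase.toLowerCase())).length;
  return hits / required.length;
};

// Gate: pass only if the mean score meets the threshold.
function passesGate(scores: number[], threshold: number): boolean {
  if (scores.length === 0) return false;
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return mean >= threshold;
}

const scoreRefundAnswer = keywordCoverage(["refund", "5 days"]);
const scores = [
  scoreRefundAnswer("When do I get my money back?", "Your refund arrives within 5 days."),
  scoreRefundAnswer("When do I get my money back?", "Please contact support."),
];
const releasable = passesGate(scores, 0.8);
```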

Observability is built on OpenTelemetry: every model call, tool call, and workflow step is traced. Integrations export those traces to tools like Langfuse or Braintrust so you can debug and monitor agents in production.
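The idea of tracing every call can be sketched as a wrapper that records a span around a function. This is a simplified stand-in, not the OpenTelemetry API or Mastra's telemetry configuration: `SketchSpan`, `spans`, and `traced` are hypothetical names, and spans are collected in an array instead of being exported.

```typescript
// Hypothetical sketch of span recording; not the OpenTelemetry or Mastra API.
interface SketchSpan {
  name: string;
  durationMs: number;
  ok: boolean;
}

const spans: SketchSpan[] = [];

// Wrap a function so each call records a span, whether it succeeds or throws.
function traced<A extends unknown[], R>(name: string, fn: (...args: A) => R): (...args: A) => R {
  return (...args: A): R => {
    const start = Date.now();
    try {
      const result = fn(...args);
      spans.push({ name, durationMs: Date.now() - start, ok: true });
      return result;
    } catch (err) {
      spans.push({ name, durationMs: Date.now() - start, ok: false });
      throw err; // tracing must not swallow the failure
    }
  };
}

const add = traced("tool.add", (a: number, b: number) => a + b);
const sum = add(2, 3);
```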

Mastra speaks MCP both ways: as a client it can consume tools exposed by external MCP servers; as a server it can expose your tools/agents to other MCP-aware apps. This is how Mastra interoperates with the wider agent ecosystem.

For problems too big for one agent, Mastra coordinates multiple agents. The current pattern is a supervisor agent: a lead agent that is given an agents prop and delegates to specialist sub-agents. Other patterns include handoff, workflow-driven, and council.
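The delegation shape can be sketched without a model. In this illustration `SubAgent`, `makeSupervisor`, and the `pick` router are hypothetical names, not Mastra's API; in a real supervisor the lead agent's model would choose the specialist, whereas here a keyword rule stands in for that decision.

```typescript
// Hypothetical sketch of supervisor-style delegation; not Mastra's agent API.
interface SubAgent {
  description: string;
  run: (task: string) => string;
}

// `pick` stands in for the lead agent's model, which would normally choose a specialist.
function makeSupervisor(
  agents: Record<string, SubAgent>,
  pick: (task: string, names: string[]) => string,
) {
  return (task: string): { delegatedTo: string; answer: string } => {
    const name = pick(task, Object.keys(agents));
    const agent = agents[name];
    if (!agent) throw new Error(`unknown sub-agent: ${name}`);
    return { delegatedTo: name, answer: agent.run(task) };
  };
}

const supervise = makeSupervisor(
  {
    billing: { description: "Handles invoices and refunds", run: (t) => `billing handled: ${t}` },
    tech: { description: "Handles bugs and outages", run: (t) => `tech handled: ${t}` },
  },
  (task) => (/refund|invoice/i.test(task) ? "billing" : "tech"),
);

const delegation = supervise("Customer wants a refund");
```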


Next: Architecture — how these pieces connect and how a request flows end to end.