Architecture
Multi-Vector Encoding
How MetaMemory encodes every memory into four specialized vector spaces — semantic, emotional, process, and context — for retrieval that captures what happened, how it felt, what steps were taken, and why it mattered.
Last updated: March 2026
Overview
Rather than representing each memory as a single embedding vector, MetaMemory encodes every memory into four specialized vector spaces. Each captures a different dimension of the interaction:
- Semantic — What was discussed. Generated by your embedding provider (OpenAI, Gemini, Cohere, etc.). Dimensions vary by model (768-4096).
- Emotional — How it felt. A 132-dimensional trajectory vector encoding the sequence of emotional states over time, not just a point-in-time snapshot.
- Process — What steps were taken. A 132-dimensional vector encoding the action sequence, outcomes, and process characteristics.
- Context — Why it mattered. A 64-dimensional vector encoding the task type, domain, complexity, and time pressure.
At retrieval time, all four similarities are computed and combined using learned weights. The system adapts which dimensions matter most for each context.
How Retrieval Works
When you search memories, MetaMemory computes a weighted combination of cosine similarities across all four vector spaces. The overall score is calculated as:
overallScore = (semantic × α₁ + emotional × α₂ + process × α₃ + context × α₄) / Σαᵢ
The default weights are:
- α₁ (semantic) = 0.45 — content similarity
- α₂ (emotional) = 0.25 — emotional trajectory similarity
- α₃ (process) = 0.15 — action sequence similarity
- α₄ (context) = 0.15 — situational similarity
Each score is a cosine similarity in its respective vector space, normalized to [0, 1]. Weights are normalized to sum to 1.0 before scoring.
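As a sketch, the scoring above looks like the following. The `Scores` shape and helper names are illustrative, not MetaMemory's actual API; only the weights and the formula come from this page.

```typescript
// Sketch of the weighted multi-vector score, assuming plain number arrays.
// Helper names and the Scores shape are illustrative, not MetaMemory's API.

type Scores = { semantic: number; emotional: number; process: number; context: number };

const DEFAULT_WEIGHTS: Scores = { semantic: 0.45, emotional: 0.25, process: 0.15, context: 0.15 };

// Per-space similarity: plain cosine similarity between two vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Combine the four per-space similarities (each already in [0, 1]),
// dividing by the weight sum so arbitrary weights still yield a [0, 1] score.
function overallScore(s: Scores, w: Scores = DEFAULT_WEIGHTS): number {
  const sum = w.semantic + w.emotional + w.process + w.context;
  return (
    (s.semantic * w.semantic +
      s.emotional * w.emotional +
      s.process * w.process +
      s.context * w.context) / sum
  );
}
```

With the default weights a memory that matches only semantically tops out at 0.45, which is why strong emotional or process matches can outrank a purely topical one.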
Semantic Embeddings
Generated asynchronously via your configured embedding provider. MetaMemory supports any provider through the BYOK architecture:
- OpenAI — text-embedding-3-small (1536d), text-embedding-3-large (3072d)
- Google Gemini — gemini-embedding-001
- Cohere — embed-v4.0
- Voyage AI, Mistral, Ollama, Qwen
Semantic embeddings are stored in pgvector for efficient approximate nearest neighbor search. They handle the "what was discussed" dimension — matching queries to memories based on content similarity.
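A minimal sketch of such a lookup, assuming a `memories` table with an `embedding` vector column (the schema is an assumption, not MetaMemory's actual one). `<=>` is pgvector's cosine-distance operator, served by an ivfflat or hnsw index for approximate nearest-neighbor search:

```typescript
// Illustrative pgvector query for the semantic space. Table and column names
// are assumptions; 1 - cosine distance gives the cosine similarity used above.
function semanticSearchSQL(limit: number): string {
  return `
    SELECT id, 1 - (embedding <=> $1) AS semantic_similarity
    FROM memories
    ORDER BY embedding <=> $1
    LIMIT ${limit};
  `;
}
```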
Emotional Trajectory Encoding
The emotional encoder captures the sequence of emotional states across an interaction, not just the final state. The encoding pipeline:
- Each emotional state in the sequence is mapped to a 128-dimensional base vector, scaled by intensity (0-1)
- Temporal weighting is applied — the most recent state is weighted at 1.0, decaying to 0.5 for the oldest point
- Weighted embeddings are averaged to produce a 128-dim base representation
- Four trajectory features are appended:
- Emotional range — variance in intensity values
- Trajectory length — normalized count of state changes
- Volatility — rate of change between adjacent states
- Trend — net valence shift from first to last emotion (-1 to +1)
- The final 132-dim vector is L2-normalized for cosine similarity
This means queries like "find memories where the user had a similar emotional experience" actually work — matching not just the emotion label but the shape of the emotional arc.
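The pipeline above can be sketched as follows. The base-vector mapping, the linear temporal decay, and the length-normalization cap are assumptions for illustration:

```typescript
// Sketch of the emotional-trajectory encoding. The embedding lookup, linear
// decay shape, and length cap are assumptions, not the actual encoder.

type EmotionalState = { embedding: number[]; intensity: number; valence: number };

function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0)) || 1;
  return v.map((x) => x / norm);
}

function encodeTrajectory(states: EmotionalState[]): number[] {
  const n = states.length;

  // Intensity-scaled base vectors with temporal weighting
  // (oldest point 0.5 up to 1.0 for the newest), averaged to 128 dims.
  const base = new Array(128).fill(0);
  states.forEach((s, i) => {
    const w = n === 1 ? 1 : 0.5 + 0.5 * (i / (n - 1));
    for (let d = 0; d < 128; d++) base[d] += s.embedding[d] * s.intensity * w;
  });
  const avg = base.map((x) => x / n);

  // Four trajectory features appended as dims 128-131.
  const intensities = states.map((s) => s.intensity);
  const mean = intensities.reduce((a, b) => a + b, 0) / n;
  const range = intensities.reduce((a, b) => a + (b - mean) ** 2, 0) / n; // variance
  const length = Math.min(1, (n - 1) / 10); // state changes, cap of 10 assumed
  let volatility = 0;
  for (let i = 1; i < n; i++) volatility += Math.abs(intensities[i] - intensities[i - 1]);
  volatility = n > 1 ? volatility / (n - 1) : 0; // mean change between adjacent states
  const trend = states[n - 1].valence - states[0].valence; // net valence shift in [-1, 1]

  return l2Normalize([...avg, range, length, volatility, trend]);
}
```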
Process Sequence Encoding
The process encoder captures the steps taken during an interaction — what actions were performed, in what order, and whether they succeeded or failed.
- Each action step is mapped to a 128-dim base vector from 9 action categories (search, analyze, create, update, delete, validate, transform, aggregate, unknown)
- Vectors are scaled by outcome: success (1.0), partial (0.75), failure (0.5)
- Positional encoding is applied using sinusoidal functions to preserve step order
- Four process features are appended: step count (normalized), success rate, process complexity, and process efficiency
- The final 132-dim vector is L2-normalized
This enables queries like "find memories where a similar debugging process was followed" — matching not just the topic but the methodology.
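A sketch of those steps, under stated assumptions: the complexity and efficiency proxies, the step-count cap, and how the positional terms are mixed into the base vector are all illustrative, not the actual encoder.

```typescript
// Sketch of process-sequence encoding. Outcome scaling and the 132-dim layout
// follow the text above; the feature proxies are assumptions.

const OUTCOME_SCALE = { success: 1.0, partial: 0.75, failure: 0.5 } as const;

type Step = { categoryVec: number[]; outcome: keyof typeof OUTCOME_SCALE };

function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0)) || 1;
  return v.map((x) => x / norm);
}

function encodeProcess(steps: Step[]): number[] {
  const base = new Array(128).fill(0);
  steps.forEach((step, pos) => {
    const scale = OUTCOME_SCALE[step.outcome];
    for (let d = 0; d < 128; d++) {
      // Transformer-style sinusoidal positional encoding preserves step order.
      const angle = pos / Math.pow(10000, (2 * Math.floor(d / 2)) / 128);
      const positional = d % 2 === 0 ? Math.sin(angle) : Math.cos(angle);
      base[d] += (step.categoryVec[d] * scale + positional) / steps.length;
    }
  });

  // Four process features appended as dims 128-131.
  const stepCount = Math.min(1, steps.length / 20); // normalization cap assumed
  const successRate = steps.filter((s) => s.outcome === "success").length / steps.length;
  const complexity = new Set(steps.map((s) => s.categoryVec.join())).size / steps.length; // distinct-action proxy
  const efficiency = successRate * (1 - stepCount); // proxy: success per step taken

  return l2Normalize([...base, stepCount, successRate, complexity, efficiency]);
}
```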
Context Encoding
The context encoder captures the situational metadata — task type, domain, complexity, and time pressure — in a 64-dimensional vector.
Dimensions are allocated as follows:
- Dimensions 0-15: Task type (debugging, writing, research, learning, analysis, planning, review)
- Dimensions 16-31: Domain (authentication, database, UI, API, security, performance, testing)
- Dimensions 32-39: Complexity (1-10 scale)
- Dimensions 40-47: Time pressure (low, medium, high)
- Dimensions 48-63: Additional features derived from complexity
If context metadata isn't provided explicitly, MetaMemory infers it from the query text — detecting task type, domain, and estimated complexity automatically.
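The dimension allocation above can be sketched like this. The one-hot encodings within each range and the formula for the derived features in dimensions 48-63 are assumptions:

```typescript
// Sketch of the 64-dim context vector layout. Ranges match the text above;
// the per-range encodings are illustrative assumptions.

const TASK_TYPES = ["debugging", "writing", "research", "learning", "analysis", "planning", "review"];
const DOMAINS = ["authentication", "database", "ui", "api", "security", "performance", "testing"];
const PRESSURES = ["low", "medium", "high"];

function encodeContext(taskType: string, domain: string, complexity: number, pressure: string): number[] {
  const v = new Array(64).fill(0);

  const t = TASK_TYPES.indexOf(taskType);
  if (t >= 0) v[t] = 1; // dims 0-15: task type

  const d = DOMAINS.indexOf(domain);
  if (d >= 0) v[16 + d] = 1; // dims 16-31: domain

  const c = Math.min(10, Math.max(1, complexity)) / 10;
  for (let i = 0; i < 8; i++) v[32 + i] = c; // dims 32-39: complexity (1-10, scaled)

  const p = PRESSURES.indexOf(pressure);
  if (p >= 0) v[40 + p] = 1; // dims 40-47: time pressure

  for (let i = 0; i < 16; i++) v[48 + i] = c * Math.sin(i + 1); // dims 48-63: derived from complexity (assumed)

  return v;
}
```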
Adaptive Weight Learning
The default weights work well for general use, but MetaMemory learns optimal weights per context using gradient descent. After each retrieval, the system updates weights based on effectiveness feedback: dimensions that contributed more to successful retrievals get higher weight, and vice versa.
Key parameters: the learning rate is 0.1, a minimum of 5 observations are required before learned weights activate, and weights are normalized to sum to 1.0 after each update.
Weights are bucketed by context hash — SHA256(taskType|emotion|complexity|domain) truncated to 16 characters. This means the system learns that frustrated users in debugging contexts benefit from higher emotional weight, while confident users doing research benefit from higher semantic weight.
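A minimal sketch of the bucketing and update, assuming a per-dimension contribution signal as the feedback; the exact feedback signal and update rule are assumptions, while the hash recipe, learning rate, and normalization come from the text above. A real implementation would also gate on the 5-observation minimum before activating learned weights.

```typescript
import { createHash } from "node:crypto";

// Sketch of context-bucketed weight learning. Only the constants and the
// hash recipe come from the text; the update rule is an assumption.

const LEARNING_RATE = 0.1;

type Weights = { semantic: number; emotional: number; process: number; context: number };

// Bucket key: SHA256(taskType|emotion|complexity|domain), truncated to 16 chars.
function contextKey(taskType: string, emotion: string, complexity: number, domain: string): string {
  return createHash("sha256")
    .update(`${taskType}|${emotion}|${complexity}|${domain}`)
    .digest("hex")
    .slice(0, 16);
}

// Move each weight toward its observed contribution to a successful
// retrieval, then re-normalize so the weights sum to 1.0.
function updateWeights(w: Weights, contribution: Weights): Weights {
  const step = (cur: number, obs: number) => cur + LEARNING_RATE * (obs - cur);
  const next = {
    semantic: step(w.semantic, contribution.semantic),
    emotional: step(w.emotional, contribution.emotional),
    process: step(w.process, contribution.process),
    context: step(w.context, contribution.context),
  };
  const sum = next.semantic + next.emotional + next.process + next.context;
  return {
    semantic: next.semantic / sum,
    emotional: next.emotional / sum,
    process: next.process / sum,
    context: next.context / sum,
  };
}
```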
Two-Phase Storage
When a memory is created, embeddings are generated in two phases for optimal latency:
- Phase 1 (synchronous): Emotional, process, and context embeddings are generated locally with no API calls. The memory is stored immediately and the API returns.
- Phase 2 (asynchronous): The semantic embedding is generated via your embedding provider (API call) and stored to pgvector in the background. Graph sync to Neo4j also happens in this phase.
This means memory creation latency is determined by database write time, not embedding API latency. Semantic search becomes available within seconds as the async phase completes.
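The two-phase flow can be sketched as below. Every name here is illustrative; the point is that the call returns after the local phase, and the semantic embedding arrives via a background promise.

```typescript
// Sketch of two-phase storage. Store, embed, and attachSemantic are
// hypothetical stand-ins, not MetaMemory's actual interfaces.

type Store = { save: (memory: object) => Promise<string> };

async function createMemory(
  content: string,
  store: Store,
  embed: (text: string) => Promise<number[]>, // provider API call
  attachSemantic: (id: string, vec: number[]) => Promise<void> // pgvector write
): Promise<string> {
  // Phase 1 (synchronous): local encoders only; the memory is persisted
  // and the caller gets its id immediately.
  const id = await store.save({ content /* + emotional, process, context vectors */ });

  // Phase 2 (asynchronous): the semantic embedding is generated and stored
  // in the background; fire-and-forget from the caller's perspective.
  void embed(content).then((vec) => attachSemantic(id, vec));

  return id;
}
```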
Search with Multi-Vector Context
To take advantage of multi-vector retrieval, include context metadata in your search requests. Providing fields like taskType, domain, complexity, and emotionalTags gives the retrieval system more signals to work with across all four vector spaces.
Setting the search strategy to hybrid activates multi-vector retrieval; without it, search falls back to semantic-only matching. If context metadata isn't provided explicitly, MetaMemory infers task type, domain, and complexity from the query text automatically.
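A hypothetical request shape showing these fields together. The field and strategy names follow this page's terminology, not a documented SDK schema:

```typescript
// Illustrative search request; field names are assumptions based on the
// context metadata described above, not a published API contract.
const searchRequest = {
  query: "how did we fix the login timeout bug",
  strategy: "hybrid", // activates multi-vector retrieval
  context: {
    taskType: "debugging",
    domain: "authentication",
    complexity: 6,
    emotionalTags: ["frustrated"],
  },
};
```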