Memvid is a single-file memory layer for AI agents with instant retrieval and long-term memory.
Persistent, versioned, and portable memory, without databases.
Website · Try Sandbox · Docs · Discussions
Benchmark Highlights
🚀 Higher accuracy than any other memory system: +35% over SOTA on LoCoMo, best-in-class long-horizon conversational recall & reasoning
🧠 Superior multi-hop & temporal reasoning: +76% multi-hop, +56% temporal vs. the industry average
⚡ Ultra-low latency at scale: 0.025ms P50 and 0.075ms P99, with 1,372× higher throughput than standard solutions
🔬 Fully reproducible benchmarks: LoCoMo (10 × ~26K-token conversations), open-source eval, LLM-as-Judge
What is Memvid?
Memvid is a portable AI memory system that packages your data, embeddings, search structure, and metadata into a single file.
Instead of running complex RAG pipelines or server-based vector databases, Memvid enables fast retrieval directly from the file.
The result is a model-agnostic, infrastructure-free memory layer that gives AI agents persistent, long-term memory they can carry anywhere.
What are Smart Frames?
Memvid draws inspiration from video encoding, not to store video, but to organize AI memory as an append-only, ultra-efficient sequence of Smart Frames.
A Smart Frame is an immutable unit that stores content along with timestamps, checksums, and basic metadata. Frames are grouped in a way that allows efficient compression, indexing, and parallel reads.
This frame-based design enables:
- Append-only writes without modifying or corrupting existing data
- Queries over past memory states
- Timeline-style inspection of how knowledge evolves
- Crash safety through committed, immutable frames
- Efficient compression using techniques adapted from video encoding
The result is a single file that behaves like a rewindable memory timeline for AI systems.
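In code, the append-only model looks like the sketch below (a minimal illustration using the Rust API from the Quick Start later in this README; `Memvid::open` is assumed here as the reopen counterpart of `Memvid::create`):

```rust
use memvid_core::{Memvid, PutOptions};

fn main() -> memvid_core::Result<()> {
    // Session 1: append a frame and commit it. Committed frames are
    // immutable, so later sessions can only add new frames.
    let mut mem = Memvid::create("agent.mv2")?;
    let opts = PutOptions::builder().title("obs-1").build();
    mem.put_bytes_with_options(b"first observation", opts)?;
    mem.commit()?;
    drop(mem);

    // Session 2: reopen the same file and append again; the earlier
    // frame is untouched and stays queryable in timeline order.
    let mut mem = Memvid::open("agent.mv2")?; // assumed reopen API
    let opts = PutOptions::builder().title("obs-2").build();
    mem.put_bytes_with_options(b"second observation", opts)?;
    mem.commit()?;
    Ok(())
}
```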
Core Concepts
- **Living Memory Engine**: Continuously append, branch, and evolve memory across sessions.
- **Capsule Context (`.mv2`)**: Self-contained, shareable memory capsules with rules and expiry.
- **Time-Travel Debugging**: Rewind, replay, or branch any memory state.
- **Smart Recall**: Sub-5ms local memory access with predictive caching.
- **Codec Intelligence**: Auto-selects and upgrades compression over time.
Use Cases
Memvid is a portable, serverless memory layer that gives AI agents persistent memory and fast recall. Because it's model-agnostic, multi-modal, and works fully offline, developers are using Memvid across a wide range of real-world applications.
- Long-Running AI Agents
- Enterprise Knowledge Bases
- Offline-First AI Systems
- Codebase Understanding
- Customer Support Agents
- Workflow Automation
- Sales and Marketing Copilots
- Personal Knowledge Assistants
- Medical, Legal, and Financial Agents
- Auditable and Debuggable AI Workflows
- Custom Applications
SDKs & CLI
Use Memvid in your preferred language:
| Package | Install |
|---|---|
| CLI | `npm install -g memvid-cli` |
| Node.js SDK | `npm install @memvid/sdk` |
| Python SDK | `pip install memvid-sdk` |
| Rust | `cargo add memvid-core` |
Installation (Rust)
Requirements
- Rust 1.85.0+ — Install from rustup.rs
Add to Your Project
```toml
[dependencies]
memvid-core = "2.0"
```
Feature Flags
| Feature | Description |
|---|---|
| `lex` | Full-text search with BM25 ranking (Tantivy) |
| `pdf_extract` | Pure Rust PDF text extraction |
| `vec` | Vector similarity search (HNSW + local text embeddings via ONNX) |
| `clip` | CLIP visual embeddings for image search |
| `whisper` | Audio transcription with Whisper |
| `api_embed` | Cloud API embeddings (OpenAI) |
| `temporal_track` | Natural language date parsing ("last Tuesday") |
| `parallel_segments` | Multi-threaded ingestion |
| `encryption` | Password-based encrypted capsules (`.mv2e`) |
| `symspell_cleanup` | Robust PDF text repair (fixes "emp lo yee" -> "employee") |
Enable features as needed:
```toml
[dependencies]
memvid-core = { version = "2.0", features = ["lex", "vec", "temporal_track"] }
```
Quick Start
```rust
use memvid_core::{Memvid, PutOptions, SearchRequest};

fn main() -> memvid_core::Result<()> {
    // Create a new memory file
    let mut mem = Memvid::create("knowledge.mv2")?;

    // Add documents with metadata
    let opts = PutOptions::builder()
        .title("Meeting Notes")
        .uri("mv2://meetings/2024-01-15")
        .tag("project", "alpha")
        .build();
    mem.put_bytes_with_options(b"Q4 planning discussion...", opts)?;
    mem.commit()?;

    // Search
    let response = mem.search(SearchRequest {
        query: "planning".into(),
        top_k: 10,
        snippet_chars: 200,
        ..Default::default()
    })?;

    for hit in response.hits {
        println!("{}: {}", hit.title.unwrap_or_default(), hit.text);
    }

    Ok(())
}
```
Build
Clone the repository:
```bash
git clone https://github.com/memvid/memvid.git
cd memvid
```
Build in debug mode:
```bash
cargo build
```
Build in release mode (optimized):
```bash
cargo build --release
```
Build with specific features:
```bash
cargo build --release --features "lex,vec,temporal_track"
```
Run Tests
Run all tests:
```bash
cargo test
```
Run tests with output:
```bash
cargo test -- --nocapture
```
Run a specific test:
```bash
cargo test test_name
```
Run integration tests only:
```bash
cargo test --test lifecycle
cargo test --test search
cargo test --test mutation
```
Examples
The `examples/` directory contains working examples:
Basic Usage
Demonstrates create, put, search, and timeline operations:
```bash
cargo run --example basic_usage
```
PDF Ingestion
Ingest and search PDF documents (uses the "Attention Is All You Need" paper):
```bash
cargo run --example pdf_ingestion
```
CLIP Visual Search
Image search using CLIP embeddings (requires the `clip` feature):
```bash
cargo run --example clip_visual_search --features clip
```
Whisper Transcription
Audio transcription (requires the `whisper` feature):
```bash
cargo run --example test_whisper --features whisper -- /path/to/audio.mp3
```
Available Models:
| Model | Size | Speed | Use Case |
|---|---|---|---|
| `whisper-small-en` | 244 MB | Slowest | Best accuracy (default) |
| `whisper-tiny-en` | 75 MB | Fast | Balanced |
| `whisper-tiny-en-q8k` | 19 MB | Fastest | Quick testing, resource-constrained |
Model Selection:
```bash
# Default (FP32 small, highest accuracy)
cargo run --example test_whisper --features whisper -- audio.mp3

# Quantized tiny (75% smaller, faster)
MEMVID_WHISPER_MODEL=whisper-tiny-en-q8k cargo run --example test_whisper --features whisper -- audio.mp3
```
Programmatic Configuration:
```rust
use memvid_core::{WhisperConfig, WhisperTranscriber};

// Default FP32 small model
let config = WhisperConfig::default();

// Quantized tiny model (faster, smaller)
let config = WhisperConfig::with_quantization();

// Specific model
let config = WhisperConfig::with_model("whisper-tiny-en-q8k");

let transcriber = WhisperTranscriber::new(&config)?;
let result = transcriber.transcribe_file("audio.mp3")?;
println!("{}", result.text);
```
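Putting the pieces together, the sketch below transcribes an audio file and persists the transcript as a searchable frame. It assumes the core `Memvid` put/commit API from the Quick Start is available alongside the `whisper` feature; file names are illustrative:

```rust
use memvid_core::{Memvid, PutOptions, WhisperConfig, WhisperTranscriber};

fn main() -> memvid_core::Result<()> {
    // Transcribe with the default (FP32 small) model.
    let transcriber = WhisperTranscriber::new(&WhisperConfig::default())?;
    let result = transcriber.transcribe_file("standup.mp3")?;

    // Store the transcript so it is findable via ordinary search.
    let mut mem = Memvid::create("meetings.mv2")?;
    let opts = PutOptions::builder()
        .title("Standup transcript")
        .tag("source", "audio")
        .build();
    mem.put_bytes_with_options(result.text.as_bytes(), opts)?;
    mem.commit()?;
    Ok(())
}
```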
Text Embedding Models
The `vec` feature includes local text embedding support using ONNX models. Before using local text embeddings, you need to download the model files manually.
Quick Start: BGE-small (Recommended)
Download the default BGE-small model (384 dimensions, fast and efficient):
```bash
mkdir -p ~/.cache/memvid/text-models

# Download ONNX model
curl -L 'https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx' \
  -o ~/.cache/memvid/text-models/bge-small-en-v1.5.onnx

# Download tokenizer
curl -L 'https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json' \
  -o ~/.cache/memvid/text-models/bge-small-en-v1.5_tokenizer.json
```
Available Models
| Model | Dimensions | Size | Best For |
|---|---|---|---|
| `bge-small-en-v1.5` | 384 | ~120 MB | Default, fast |
| `bge-base-en-v1.5` | 768 | ~420 MB | Better quality |
| `nomic-embed-text-v1.5` | 768 | ~530 MB | Versatile tasks |
| `gte-large` | 1024 | ~1.3 GB | Highest quality |
Other Models
BGE-base (768 dimensions):
```bash
curl -L 'https://huggingface.co/BAAI/bge-base-en-v1.5/resolve/main/onnx/model.onnx' \
  -o ~/.cache/memvid/text-models/bge-base-en-v1.5.onnx
curl -L 'https://huggingface.co/BAAI/bge-base-en-v1.5/resolve/main/tokenizer.json' \
  -o ~/.cache/memvid/text-models/bge-base-en-v1.5_tokenizer.json
```
Nomic (768 dimensions):
```bash
curl -L 'https://huggingface.co/nomic-ai/nomic-embed-text-v1.5/resolve/main/onnx/model.onnx' \
  -o ~/.cache/memvid/text-models/nomic-embed-text-v1.5.onnx
curl -L 'https://huggingface.co/nomic-ai/nomic-embed-text-v1.5/resolve/main/tokenizer.json' \
  -o ~/.cache/memvid/text-models/nomic-embed-text-v1.5_tokenizer.json
```
GTE-large (1024 dimensions):
```bash
curl -L 'https://huggingface.co/thenlper/gte-large/resolve/main/onnx/model.onnx' \
  -o ~/.cache/memvid/text-models/gte-large.onnx
curl -L 'https://huggingface.co/thenlper/gte-large/resolve/main/tokenizer.json' \
  -o ~/.cache/memvid/text-models/gte-large_tokenizer.json
```
Usage in Code
```rust
use memvid_core::text_embed::{LocalTextEmbedder, TextEmbedConfig};
use memvid_core::types::embedding::EmbeddingProvider;

// Use default model (BGE-small)
let config = TextEmbedConfig::default();
let embedder = LocalTextEmbedder::new(config)?;
let embedding = embedder.embed_text("hello world")?;
assert_eq!(embedding.len(), 384);

// Use different model
let config = TextEmbedConfig::bge_base();
let embedder = LocalTextEmbedder::new(config)?;
```
See `examples/text_embedding.rs` for a complete example with similarity computation and search ranking.
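For a flavor of what that example computes, the sketch below embeds two strings and measures their cosine similarity by hand (assuming `embed_text` returns `memvid_core::Result<Vec<f32>>`, as the snippet above suggests):

```rust
use memvid_core::text_embed::{LocalTextEmbedder, TextEmbedConfig};
use memvid_core::types::embedding::EmbeddingProvider;

fn main() -> memvid_core::Result<()> {
    let embedder = LocalTextEmbedder::new(TextEmbedConfig::default())?;
    let a = embedder.embed_text("the cat sat on the mat")?;
    let b = embedder.embed_text("a feline rested on the rug")?;

    // Cosine similarity: dot(a, b) / (|a| * |b|); closer to 1.0 means
    // the two texts are semantically closer.
    let dot: f32 = a.iter().zip(&b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    println!("similarity: {:.3}", dot / (norm(&a) * norm(&b)));
    Ok(())
}
```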
Model Consistency
To prevent accidental model mixing (e.g., querying a BGE-small index with OpenAI embeddings), you can explicitly bind your Memvid instance to a specific model name:
```rust
// Bind the index to a specific model.
// If the index was previously created with a different model, this will return an error.
mem.set_vec_model("bge-small-en-v1.5")?;
```
This binding is persistent. Once set, future attempts to use a different model name will fail fast with a `ModelMismatch` error.
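For illustration, a sketch of the guard in action (the error handling shape is illustrative):

```rust
// First bind succeeds on a fresh index and is persisted in the file.
mem.set_vec_model("bge-small-en-v1.5")?;

// A later attempt to bind a different model name fails fast instead of
// silently mixing vector spaces.
if let Err(e) = mem.set_vec_model("text-embedding-3-small") {
    eprintln!("rejected: {e}"); // a ModelMismatch error
}
```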
API Embeddings (OpenAI)
The api_embed feature enables cloud-based embedding generation using OpenAI's API.
Setup
Set your OpenAI API key:
```bash
export OPENAI_API_KEY="sk-..."
```
Usage
```rust
use memvid_core::api_embed::{OpenAIConfig, OpenAIEmbedder};
use memvid_core::types::embedding::EmbeddingProvider;

// Use default model (text-embedding-3-small)
let config = OpenAIConfig::default();
let embedder = OpenAIEmbedder::new(config)?;
let embedding = embedder.embed_text("hello world")?;
assert_eq!(embedding.len(), 1536);

// Use higher quality model
let config = OpenAIConfig::large(); // text-embedding-3-large (3072 dims)
let embedder = OpenAIEmbedder::new(config)?;
```
Available Models
| Model | Dimensions | Best For |
|---|---|---|
| `text-embedding-3-small` | 1536 | Default, fastest, cheapest |
| `text-embedding-3-large` | 3072 | Highest quality |
| `text-embedding-ada-002` | 1536 | Legacy model |
See `examples/openai_embedding.rs` for a complete example.
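Because `LocalTextEmbedder` and `OpenAIEmbedder` both implement the `EmbeddingProvider` trait (both snippets above import it), calling code can stay backend-agnostic. A minimal sketch, assuming `embed_text` takes `&str` and returns `memvid_core::Result<Vec<f32>>` as used above:

```rust
use memvid_core::types::embedding::EmbeddingProvider;

// Embed a batch of documents with any provider: local ONNX or OpenAI.
fn embed_corpus<E: EmbeddingProvider>(
    embedder: &E,
    docs: &[&str],
) -> memvid_core::Result<Vec<Vec<f32>>> {
    docs.iter().map(|doc| embedder.embed_text(doc)).collect()
}
```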
File Format
Everything lives in a single `.mv2` file:
```text
┌────────────────────────────┐
│ Header (4KB)               │ Magic, version, capacity
├────────────────────────────┤
│ Embedded WAL (1-64MB)      │ Crash recovery
├────────────────────────────┤
│ Data Segments              │ Compressed frames
├────────────────────────────┤
│ Lex Index                  │ Tantivy full-text
├────────────────────────────┤
│ Vec Index                  │ HNSW vectors
├────────────────────────────┤
│ Time Index                 │ Chronological ordering
├────────────────────────────┤
│ TOC (Footer)               │ Segment offsets
└────────────────────────────┘
```
No `.wal`, `.lock`, `.shm`, or sidecar files. Ever.
See MV2_SPEC.md for the complete file format specification.
Support
Have questions or feedback? Email: [email protected]
Drop a ⭐ to show support
Memvid v1 (QR-based memory) is deprecated
If you are referencing QR codes, you are using outdated information.
See: https://docs.memvid.com/memvid-v1-deprecation
License
Apache License 2.0 — see the LICENSE file for details.