io.github.tobs-code/cozo-memory

by tobs-code

A local-first memory system for AI agents, with hybrid search and graph reasoning.

README

CozoDB Memory MCP Server


Local-first memory for Claude & AI agents with hybrid search, Graph-RAG, and time-travel – all in a single binary, no cloud, no Docker.

Quick Start

Option 1: Install via npm (Recommended)

bash
# Install globally
npm install -g cozo-memory

# Or run directly with npx (no installation needed)
npx cozo-memory

Option 2: Build from Source

bash
git clone https://github.com/tobs-code/cozo-memory
cd cozo-memory
npm install && npm run build
npm run start

Now add the server to your MCP client (e.g. Claude Desktop) – see Integration below.

Key Features

🔍 Hybrid Search - Combines semantic (HNSW), full-text (FTS), and graph signals via Reciprocal Rank Fusion for intelligent retrieval

🧠 Agentic Retrieval - Auto-routing engine analyzes query intent via local LLM to select optimal search strategy (Vector, Graph, or Community)

⏱️ Time-Travel Queries - Version all changes via CozoDB Validity; query any point in history with full audit trails

🎯 GraphRAG-R1 Adaptive Retrieval - Intelligent system with Progressive Retrieval Attenuation (PRA) and Cost-Aware F1 (CAF) scoring that learns from usage

Temporal Conflict Resolution - Automatic detection and resolution of contradictory observations with semantic analysis and audit preservation

🏠 100% Local - Embeddings via ONNX/Transformers; no external services, no cloud, complete data ownership

🧠 Multi-Hop Reasoning - Logic-aware graph traversal with vector pivots for deep relational reasoning

🗂️ Hierarchical Memory - Multi-level architecture (L0-L3) with intelligent compression and LLM-backed summarization

→ See all features | Version History
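To illustrate the auto-routing idea behind Agentic Retrieval, here is a deliberately trivial heuristic stand-in. The real engine consults a local LLM to analyze query intent; the keyword rules below and the exact strategy labels are illustrative only:

```typescript
type Strategy = "vector" | "graph" | "community";

// Illustrative stand-in for the LLM-based intent analysis: route broad
// "overview" questions to community summaries, relationship questions to
// graph traversal, and everything else to vector search.
function routeQuery(query: string): Strategy {
  const q = query.toLowerCase();
  if (/\b(overview|summarize|themes?|overall)\b/.test(q)) return "community";
  if (/\b(related|connected|between|path|depends?)\b/.test(q)) return "graph";
  return "vector";
}

routeQuery("Give me an overview of the project");     // → "community"
routeQuery("How is ServiceA connected to ServiceB?"); // → "graph"
routeQuery("What did we decide about caching?");      // → "vector"
```

An LLM router generalizes far beyond such keyword rules, but the interface is the same: query in, strategy out.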

Positioning & Comparison

Most "Memory" MCP servers fall into two categories:

  1. Simple Knowledge Graphs: CRUD operations on triples, often only text search
  2. Pure Vector Stores: Semantic search (RAG), but little understanding of complex relationships

This server fills the gap between the two (the "sweet spot"): a local, database-backed memory engine that combines vector, graph, and keyword signals.

Comparison with other solutions

| Feature | CozoDB Memory (this project) | Official Reference (`@modelcontextprotocol/server-memory`) | mcp-memory-service (community) | Database Adapters (Qdrant/Neo4j) |
|---|---|---|---|---|
| Backend | CozoDB (graph + vector + relational) | JSON file (`memory.jsonl`) | SQLite / Cloudflare | Specialized DB (vector or graph only) |
| Search logic | Agentic (auto-route): hybrid + graph + summaries | Keyword only / exact graph match | Vector + keyword | Mostly one dimension only |
| Inference | Yes: built-in engine for implicit knowledge | No | No ("dreaming" is consolidation) | No (retrieval only) |
| Community | Yes: hierarchical community summaries | No | No | Clustering only (no summaries) |
| Time-travel | Yes: queries at any point in time (Validity) | No (current state only) | History available, no native DB feature | No |
| Maintenance | Janitor: LLM-backed cleanup | Manual | Automatic consolidation | Mostly manual |
| Deployment | Local (Node.js + embedded DB) | Local (Docker/npx) | Local or cloud | Often requires an external DB server |

The core advantage is Intelligence and Traceability: By combining an Agentic Retrieval Layer with Hierarchical GraphRAG, the system can answer both specific factual questions and broad thematic queries with much higher accuracy than pure vector stores.
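The Reciprocal Rank Fusion mentioned under Key Features is simple to state: each ranked list contributes 1/(k + rank) per document, and the summed scores are re-sorted. A minimal sketch (not the project's actual implementation; k = 60 is the conventional constant):

```typescript
// Reciprocal Rank Fusion: score(d) = Σ over rankings of 1 / (k + rank_d),
// where rank_d is the 1-based position of d in each list it appears in.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, idx) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + idx + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Fuse a semantic ranking, a full-text ranking, and a graph ranking.
const fused = rrfFuse([
  ["e1", "e2", "e3"], // HNSW vector ranking
  ["e2", "e1", "e4"], // full-text (FTS) ranking
  ["e2", "e3", "e1"], // graph-signal ranking
]);
// "e2" sits near the top of all three lists, so it ranks first.
```

RRF needs only ranks, not comparable scores, which is exactly why it suits fusing signals as different as cosine similarity, BM25, and graph centrality.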

Installation

Prerequisites

  • Node.js 20+ (recommended)
  • RAM: 1.7 GB minimum (for default bge-m3 model)
  • CozoDB native dependency is installed via cozo-node

Via npm (Easiest)

bash
# Install globally
npm install -g cozo-memory

# Or use npx without installation
npx cozo-memory

From Source

bash
git clone https://github.com/tobs-code/cozo-memory
cd cozo-memory
npm install
npm run build

Windows Quickstart

bash
npm install
npm run build
npm run start

Notes:

  • On first start, @xenova/transformers downloads the embedding model (may take time)
  • Embeddings are processed on the CPU

Embedding Model Options

CozoDB Memory supports multiple embedding models via the EMBEDDING_MODEL environment variable:

| Model | Size | RAM | Dimensions | Best For |
|---|---|---|---|---|
| Xenova/bge-m3 (default) | ~600 MB | ~1.7 GB | 1024 | High accuracy, production use |
| Xenova/all-MiniLM-L6-v2 | ~80 MB | ~400 MB | 384 | Low-spec machines, development |
| Xenova/bge-small-en-v1.5 | ~130 MB | ~600 MB | 384 | Balanced performance |

Configuration Options:

Option 1: Using .env file (Easiest for beginners)

bash
# Copy the example file
cp .env.example .env

# Edit .env and set your preferred model
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2

Option 2: MCP Server Config (For Claude Desktop / Kiro)

json
{
  "mcpServers": {
    "cozo-memory": {
      "command": "npx",
      "args": ["cozo-memory"],
      "env": {
        "EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2"
      }
    }
  }
}

Option 3: Command Line

bash
# Use lightweight model for development
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2 npm run start

Download Model First (Recommended):

bash
# Set model in .env or via command line, then:
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2 npm run download-model

Note: Changing models requires re-embedding existing data. The model is downloaded once on first use.

Integration

Claude Desktop

Using npx (Recommended)

json
{
  "mcpServers": {
    "cozo-memory": {
      "command": "npx",
      "args": ["cozo-memory"]
    }
  }
}

Using global installation

json
{
  "mcpServers": {
    "cozo-memory": {
      "command": "cozo-memory"
    }
  }
}

Using local build

json
{
  "mcpServers": {
    "cozo-memory": {
      "command": "node",
      "args": ["C:/Path/to/cozo-memory/dist/index.js"]
    }
  }
}

Framework Adapters

Official adapters for seamless integration with popular AI frameworks:

🦜 LangChain Adapter

bash
npm install @cozo-memory/langchain @cozo-memory/adapters-core

typescript
import { CozoMemoryChatHistory, CozoMemoryRetriever } from '@cozo-memory/langchain';

const chatHistory = new CozoMemoryChatHistory({ sessionName: 'user-123' });
const retriever = new CozoMemoryRetriever({ useGraphRAG: true, graphRAGDepth: 2 });

🦙 LlamaIndex Adapter

bash
npm install @cozo-memory/llamaindex @cozo-memory/adapters-core

typescript
import { CozoVectorStore } from '@cozo-memory/llamaindex';

const vectorStore = new CozoVectorStore({ useGraphRAG: true });

Documentation: See adapters/README.md for complete examples and API reference.

CLI & TUI

CLI Tool

Full-featured CLI for all operations:

bash
# System operations
cozo-memory system health
cozo-memory system metrics

# Entity operations
cozo-memory entity create -n "MyEntity" -t "person"
cozo-memory entity get -i <entity-id>

# Search
cozo-memory search query -q "search term" -l 10
cozo-memory search agentic -q "agentic query"

# Graph operations
cozo-memory graph pagerank
cozo-memory graph communities

# Export/Import
cozo-memory export json -o backup.json
cozo-memory import file -i data.json -f cozo

# All commands support -f json or -f pretty for output formatting

See CLI help for complete command reference: cozo-memory --help

TUI (Terminal User Interface)

Interactive TUI with mouse support powered by Python Textual:

bash
# Install Python dependencies (one-time)
pip install textual

# Launch TUI
npm run tui
# or directly:
cozo-memory-tui

TUI Features:

  • 🖱️ Full mouse support (click buttons, scroll, select inputs)
  • ⌨️ Keyboard shortcuts (q=quit, h=help, r=refresh)
  • 📊 Interactive menus for all operations
  • 🎨 Rich terminal UI with colors and animations

Architecture Overview

mermaid
graph TB
    Client[MCP Client<br/>Claude Desktop, etc.]
    Server[MCP Server<br/>FastMCP + Zod Schemas]
    Services[Memory Services]
    Embeddings[Embeddings<br/>ONNX Runtime]
    Search[Hybrid Search<br/>RRF Fusion]
    Cache[Semantic Cache<br/>L1 + L2]
    Inference[Inference Engine<br/>Multi-Strategy]
    DB[(CozoDB SQLite<br/>Relations + Validity<br/>HNSW Indices<br/>Datalog/Graph)]
    
    Client -->|stdio| Server
    Server --> Services
    Services --> Embeddings
    Services --> Search
    Services --> Cache
    Services --> Inference
    Services --> DB
    
    style Client fill:#e1f5ff,color:#000
    style Server fill:#fff4e1,color:#000
    style Services fill:#f0e1ff,color:#000
    style DB fill:#e1ffe1,color:#000

See docs/ARCHITECTURE.md for detailed architecture documentation
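The "Semantic Cache (L1 + L2)" box can be read as an exact-match tier in front of an embedding-similarity tier. The sketch below is a guess at the general shape, not the code in src/embedding-service.ts; the 0.95 threshold and the data layout are invented for illustration:

```typescript
// Illustrative two-tier cache: L1 is an exact-match map keyed on the raw
// query text, L2 matches by cosine similarity of query embeddings.
type Entry = { embedding: number[]; results: string[] };

const l1 = new Map<string, string[]>();
const l2: Entry[] = [];

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function lookup(query: string, embedding: number[], threshold = 0.95): string[] | null {
  const exact = l1.get(query);
  if (exact) return exact; // L1 hit: identical query text
  for (const e of l2) {
    if (cosine(embedding, e.embedding) >= threshold) return e.results; // L2 hit: near-duplicate query
  }
  return null; // miss: fall through to the full hybrid search
}

function store(query: string, embedding: number[], results: string[]): void {
  l1.set(query, results);
  l2.push({ embedding, results });
}
```

The point of the second tier is that paraphrased queries ("how do I deploy?" vs. "deployment steps?") can reuse cached results that an exact-match cache would miss.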

MCP Tools Overview

The interface is consolidated into five tools:

| Tool | Purpose | Key Actions |
|---|---|---|
| mutate_memory | Write operations | create_entity, update_entity, delete_entity, add_observation, create_relation, transactions, sessions, tasks |
| query_memory | Read operations | search, advancedSearch, context, graph_rag, graph_walking, agentic_search, adaptive_retrieval |
| analyze_graph | Graph analysis | explore, communities, pagerank, betweenness, hits, shortest_path, semantic_walk |
| manage_system | Maintenance | health, metrics, export, import, cleanup, defrag, reflect, snapshots |
| edit_user_profile | User preferences | Edit the global user profile with preferences and work style |

See docs/API.md for complete API reference with all parameters and examples
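Over MCP, each of these tools is invoked through a standard `tools/call` request. The sketch below is illustrative only: the tool name and the `search` action come from the table above, but the argument names (`action`, `query`, `limit`) are assumptions; docs/API.md has the real parameter schema.

json
{
  "method": "tools/call",
  "params": {
    "name": "query_memory",
    "arguments": {
      "action": "search",
      "query": "deployment decisions for the auth service",
      "limit": 5
    }
  }
}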

Troubleshooting

Common Issues

First Start Takes Long

  • The embedding model download takes 30-90 seconds on first start (Transformers loads ~500MB of artifacts)
  • This is normal and only happens once
  • Subsequent starts are fast (< 2 seconds)

Cleanup/Reflect Requires Ollama

  • If using cleanup or reflect actions, an Ollama service must be running locally
  • Install Ollama from https://ollama.ai
  • Pull the desired model: ollama pull demyagent-4b-i1:Q6_K (or your preferred model)

Windows-Specific

  • Embeddings are processed on CPU for maximum compatibility
  • RocksDB backend requires Visual C++ Redistributable if using that option

Performance Issues

  • First query after restart is slower (cold cache)
  • Use health action to check cache hit rates
  • Consider RocksDB backend for datasets > 100k entities

See docs/BENCHMARKS.md for performance optimization tips

Development

Structure

  • src/index.ts: MCP Server + Tool Registration
  • src/memory-service.ts: Core business logic
  • src/db-service.ts: Database operations
  • src/embedding-service.ts: Embedding Pipeline + Cache
  • src/hybrid-search.ts: Search Strategies + RRF
  • src/inference-engine.ts: Inference Strategies
  • src/api_bridge.ts: Express API Bridge (optional)

Scripts

bash
npm run build        # TypeScript build
npm run dev          # Start the MCP server via ts-node
npm run start        # Start dist/index.js (stdio)
npm run bridge       # Build and start the API bridge
npm run benchmark    # Run performance tests
npm run eval         # Run the evaluation suite

Roadmap

Near-Term (v1.x)

  • GPU Acceleration - CUDA support for embedding generation (10-50x faster)
  • Streaming Ingestion - Real-time data ingestion from logs, APIs, webhooks
  • Advanced Chunking - Semantic chunking for ingest_file (paragraph-aware splitting)
  • Query Optimization - Automatic query plan optimization for complex graph traversals
  • Additional Export Formats - Notion, Roam Research, Logseq compatibility

Mid-Term (v2.x)

  • Multi-Modal Embeddings - Support for images, audio, code
  • Distributed Memory - Sharding and replication for large-scale deployments
  • Advanced Inference - Neural-symbolic reasoning, causal inference
  • Real-Time Sync - WebSocket-based real-time updates
  • Web UI - Browser-based management interface

Long-Term (v3.x)

  • Federated Learning - Privacy-preserving collaborative learning
  • Quantum-Inspired Algorithms - Advanced graph algorithms
  • Multi-Agent Coordination - Shared memory across multiple agents

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

Apache 2.0 - See LICENSE for details.

Acknowledgments

Research foundations:

  • GraphRAG-R1 (Yu et al., WWW 2026)
  • HopRAG (ACL 2025)
  • T-GRAG (Li et al., 2025)
  • FEEG Framework (Samuel et al., 2026)
  • Allan-Poe (arXiv:2511.00855)
