Codebase Memory

by deusdata

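A persistent knowledge graph of your codebase that preserves context across sessions and remains usable after a session restart or context compaction.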

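Built to cure "session amnesia" in AI coding assistants: it distills the codebase into a persistent knowledge graph, so development state can be picked up seamlessly after a restart or context compaction.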

README

codebase-memory-mcp

The fastest and most efficient code intelligence engine for AI coding agents. Full-indexes an average repository in milliseconds, the Linux kernel (28M LOC, 75K files) in 3 minutes. Answers structural queries in under 1ms. Ships as a single static binary for macOS, Linux, and Windows — download, run install, done.

High-quality parsing through tree-sitter AST analysis across all 66 languages, enhanced with LSP-style hybrid type resolution for Go, C, and C++ (more languages coming soon) — producing a persistent knowledge graph of functions, classes, call chains, HTTP routes, and cross-service links. 14 MCP tools. Zero dependencies. Plug and play across 10 coding agents.

<p align="center"> <img src="docs/graph-ui-screenshot.png" alt="Graph visualization UI showing the codebase-memory-mcp knowledge graph" width="800"> <br> <em>Built-in 3D graph visualization (UI variant) — explore your knowledge graph at localhost:9749</em> </p>

Why codebase-memory-mcp

  • Extreme indexing speed — Linux kernel (28M LOC, 75K files) in 3 minutes. RAM-first pipeline: LZ4 compression, in-memory SQLite, fused Aho-Corasick pattern matching. Memory released after indexing.
  • Plug and play — single static binary for macOS (arm64/amd64), Linux (arm64/amd64), and Windows (amd64). No Docker, no runtime dependencies, no API keys. Download → install → restart agent → done.
  • 66 languages — vendored tree-sitter grammars compiled into the binary. Nothing to install, nothing that breaks.
  • 120x fewer tokens — 5 structural queries: ~3,400 tokens vs ~412,000 via file-by-file search. One graph query replaces dozens of grep/read cycles.
  • 10 agents, one command: install auto-detects Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode, VS Code, and OpenClaw, then configures MCP entries, instruction files, and pre-tool hooks for each.
  • Built-in graph visualization — 3D interactive UI at localhost:9749 (optional UI binary variant).
  • Infrastructure-as-code indexing — Dockerfiles, Kubernetes manifests, and Kustomize overlays indexed as graph nodes with cross-references. Resource nodes for K8s kinds, Module nodes for Kustomize overlays with IMPORTS edges to referenced resources.
  • 14 MCP tools — search, trace, architecture, impact analysis, Cypher queries, dead code detection, cross-service HTTP linking, ADR management, and more.

Quick Start

One-line install (macOS / Linux):

bash
curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash

With graph visualization UI:

bash
curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash -s -- --ui

Windows (PowerShell):

powershell
powershell -ExecutionPolicy ByPass -c "irm https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.ps1 | iex"

Options: --ui (graph visualization), --skip-config (binary only, no agent setup), --dir=<path> (custom location).

Restart your coding agent. Say "Index this project" — done.

<details> <summary>Manual install</summary>
  1. Download the archive for your platform from the latest release:

    • codebase-memory-mcp-<os>-<arch>.tar.gz (macOS/Linux) or .zip (Windows) — standard
    • codebase-memory-mcp-ui-<os>-<arch>.tar.gz / .zip — with graph visualization
  2. Extract and install (each archive includes install.sh or install.ps1):

    macOS / Linux:

    bash
    tar xzf codebase-memory-mcp-*.tar.gz
    ./install.sh
    

    Windows (PowerShell):

    powershell
    Expand-Archive codebase-memory-mcp-windows-amd64.zip -DestinationPath .
    .\install.ps1
    
  3. Restart your coding agent.

The install command automatically strips macOS quarantine attributes and ad-hoc signs the binary — no manual xattr/codesign needed.

</details>

The install command auto-detects all installed coding agents and configures MCP server entries, instruction files, skills, and pre-tool hooks for each.

Graph Visualization UI

If you downloaded the ui variant:

bash
codebase-memory-mcp --ui=true --port=9749

Open http://localhost:9749 in your browser. The UI runs as a background thread alongside the MCP server — it's available whenever your agent is connected.

Auto-Index

Enable automatic indexing on MCP session start:

bash
codebase-memory-mcp config set auto_index true

When enabled, new projects are indexed automatically on first connection. Previously-indexed projects are registered with the background watcher for ongoing git-based change detection. Configurable file limit: config set auto_index_limit 50000.

Keeping Up to Date

bash
codebase-memory-mcp update

The MCP server also checks for updates on startup and notifies on the first tool call if a newer release is available.

Uninstall

bash
codebase-memory-mcp uninstall

Removes all agent configs, skills, hooks, and instructions. Does not remove the binary or SQLite databases.

Features

  • Architecture overview: get_architecture returns languages, packages, entry points, routes, hotspots, boundaries, layers, and clusters in a single call
  • Architecture Decision Records: manage_adr persists architectural decisions across sessions
  • Louvain community detection: Discovers functional modules by clustering call edges
  • Git diff impact mapping: detect_changes maps uncommitted changes to affected symbols with risk classification
  • Call graph: Resolves function calls across files and packages (import-aware, type-inferred)
  • Cross-service HTTP linking: Discovers REST routes and matches them to HTTP call sites with confidence scoring
  • Auto-sync: Background watcher detects file changes and re-indexes automatically
  • Cypher-like queries: MATCH (f:Function)-[:CALLS]->(g) WHERE f.name = 'main' RETURN g.name
  • Dead code detection: Finds functions with zero callers, excluding entry points
  • Route nodes: REST endpoints are first-class graph entities
  • CLI mode: codebase-memory-mcp cli search_graph '{"name_pattern": ".*Handler.*"}'
  • Single binary, zero infrastructure: SQLite-backed, persists to ~/.cache/codebase-memory-mcp/
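
The dead code detection item in the list above (functions with zero callers, excluding entry points) can be illustrated with a short sketch. This is not the project's implementation, which is written in C on top of SQLite; the function names, the edge list, and the `entry_points` set are hypothetical, and ignoring self-calls is an assumption made here for illustration.

```python
from typing import Iterable

def find_dead_functions(
    functions: Iterable[str],
    call_edges: Iterable[tuple[str, str]],
    entry_points: set[str],
) -> list[str]:
    """Return functions that nobody calls, ignoring declared entry points."""
    # Collect every function with at least one inbound CALLS edge.
    # Assumption: a function calling itself does not count as having a caller.
    called = {callee for caller, callee in call_edges if caller != callee}
    return sorted(f for f in functions if f not in called and f not in entry_points)

# Hypothetical graph: main -> handle_request -> parse; legacy_export is orphaned.
functions = ["main", "handle_request", "parse", "legacy_export"]
call_edges = [("main", "handle_request"), ("handle_request", "parse")]

dead = find_dead_functions(functions, call_edges, entry_points={"main"})
print(dead)  # ['legacy_export']
```

Without the entry-point exclusion, `main` would be reported as dead as well, since nothing in the graph calls it.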

How It Works

codebase-memory-mcp is a structural analysis backend — it builds and queries the knowledge graph. It does not include an LLM. Instead, it relies on your MCP client (Claude Code, or any MCP-compatible agent) to be the intelligence layer.

code
You: "what calls ProcessOrder?"

Agent calls: trace_call_path(function_name="ProcessOrder", direction="inbound")

codebase-memory-mcp: executes graph query, returns structured results

Agent: presents the call chain in plain English

Why no built-in LLM? Other code graph tools embed an LLM for natural language → graph query translation. This means extra API keys, extra cost, and another model to configure. With MCP, the agent you're already talking to is the query translator.
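
To make the exchange above concrete, here is a rough sketch of the message an agent might send for the ProcessOrder question. It assumes the standard MCP `tools/call` method carried over JSON-RPC 2.0; the helper name `build_tool_call` and the request id are hypothetical, and the exact fields a given agent sends may differ.

```python
import json

def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)

# The tool name and argument names are taken from the example in this README.
request = build_tool_call(
    1, "trace_call_path", {"function_name": "ProcessOrder", "direction": "inbound"}
)
print(request)
```

The server only has to execute the named tool with these arguments; turning the user's sentence into this structured call is the agent's job.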

Performance

Benchmarked on Apple M3 Pro:

| Operation | Time | Notes |
|---|---|---|
| Linux kernel full index | 3 min | 28M LOC, 75K files → 2.1M nodes, 4.9M edges |
| Linux kernel fast index | 1m 12s | 1.88M nodes |
| Django full index | ~6s | 49K nodes, 196K edges |
| Cypher query | <1ms | Relationship traversal |
| Name search (regex) | <10ms | SQL LIKE pre-filtering |
| Dead code detection | ~150ms | Full graph scan with degree filtering |
| Trace call path (depth=5) | <10ms | BFS traversal |

RAM-first pipeline: All indexing runs in memory (LZ4 HC compressed read, in-memory SQLite, single dump at end). Memory is released back to the OS after indexing completes.

Token efficiency: Five structural queries consumed ~3,400 tokens via codebase-memory-mcp versus ~412,000 tokens via file-by-file grep exploration — a 99.2% reduction.
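
The two headline figures (a 99.2% reduction and roughly 120x fewer tokens) follow from the same pair of numbers. A quick check of the arithmetic, using only the token counts quoted above:

```python
graph_tokens = 3_400    # five structural queries via codebase-memory-mcp
grep_tokens = 412_000   # the same questions via file-by-file exploration

reduction = 1 - graph_tokens / grep_tokens
ratio = grep_tokens / graph_tokens

print(f"reduction: {reduction:.1%}")  # reduction: 99.2%
print(f"ratio: {ratio:.0f}x")         # ratio: 121x
```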

Installation

Pre-built Binaries

| Platform | Standard | With Graph UI |
|---|---|---|
| macOS (Apple Silicon) | codebase-memory-mcp-darwin-arm64.tar.gz | codebase-memory-mcp-ui-darwin-arm64.tar.gz |
| macOS (Intel) | codebase-memory-mcp-darwin-amd64.tar.gz | codebase-memory-mcp-ui-darwin-amd64.tar.gz |
| Linux (x86_64) | codebase-memory-mcp-linux-amd64.tar.gz | codebase-memory-mcp-ui-linux-amd64.tar.gz |
| Linux (ARM64) | codebase-memory-mcp-linux-arm64.tar.gz | codebase-memory-mcp-ui-linux-arm64.tar.gz |
| Windows (x86_64) | codebase-memory-mcp-windows-amd64.zip | codebase-memory-mcp-ui-windows-amd64.zip |

Every release includes checksums.txt with SHA-256 hashes. All binaries are statically linked — no shared library dependencies.

Windows note: SmartScreen may show a warning for unsigned software. Click "More info", then "Run anyway". Verify integrity with checksums.txt.
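
One way to check a downloaded archive against checksums.txt is sketched below. It assumes the file uses the common `sha256sum` layout (one `<hex digest>  <filename>` pair per line), which this README does not confirm; the helper names are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def parse_checksums(text: str) -> dict[str, str]:
    """Map filename -> expected digest, assuming '<hex>  <filename>' lines."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            # sha256sum prefixes the name with '*' in binary mode; drop it.
            expected[parts[1].lstrip("*")] = parts[0].lower()
    return expected

def verify(archive: Path, checksums_file: Path) -> bool:
    """Return True only if the archive is listed and its digest matches."""
    expected = parse_checksums(checksums_file.read_text())
    return expected.get(archive.name) == sha256_of(archive)
```

For example, `verify(Path("codebase-memory-mcp-linux-amd64.tar.gz"), Path("checksums.txt"))` returns False both when the digest differs and when the archive is not listed at all.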

Setup Scripts

<details> <summary>Automated download + install</summary>

macOS / Linux:

bash
curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/scripts/setup.sh | bash

Windows (PowerShell):

powershell
irm https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/scripts/setup-windows.ps1 | iex
</details>

Install via Claude Code

code
You: "Install this MCP server: https://github.com/DeusData/codebase-memory-mcp"

Build from Source

<details> <summary>Prerequisites: C compiler + zlib</summary>
| Requirement | Check | Install |
|---|---|---|
| C compiler (gcc or clang) | gcc --version or clang --version | macOS: xcode-select --install, Linux: apt install build-essential |
| C++ compiler | g++ --version or clang++ --version | Same as above |
| zlib | | macOS: included, Linux: apt install zlib1g-dev |
| Git | git --version | Pre-installed on most systems |
</details>
bash
git clone https://github.com/DeusData/codebase-memory-mcp.git
cd codebase-memory-mcp
scripts/build.sh                    # standard binary
scripts/build.sh --with-ui          # with graph visualization
# Binary at: build/c/codebase-memory-mcp

Manual MCP Configuration

<details> <summary>If you prefer not to use the install command</summary>

Add to ~/.claude/.mcp.json (global) or project .mcp.json:

json
{
  "mcpServers": {
    "codebase-memory-mcp": {
      "command": "/path/to/codebase-memory-mcp",
      "args": []
    }
  }
}

Restart your agent. Verify with /mcp — you should see codebase-memory-mcp with 14 tools.
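
If you would rather script the edit than write the JSON by hand, the sketch below merges the server entry into an existing .mcp.json document without touching other servers. The `other-server` entry and the binary location are hypothetical; the absolute-path check mirrors the advice in Troubleshooting.

```python
import json
import os

def add_server_entry(config: dict, binary_path: str) -> dict:
    """Return a copy of an .mcp.json document with the server entry added."""
    if not os.path.isabs(binary_path):
        raise ValueError("command must be an absolute path")
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["codebase-memory-mcp"] = {"command": binary_path, "args": []}
    merged["mcpServers"] = servers
    return merged

# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "/usr/bin/other", "args": []}}}
updated = add_server_entry(existing, "/usr/local/bin/codebase-memory-mcp")
print(json.dumps(updated, indent=2))
```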

</details>

Multi-Agent Support

install auto-detects and configures all installed agents:

| Agent | MCP Config | Instructions | Hooks |
|---|---|---|---|
| Claude Code | .claude/.mcp.json | 4 Skills | PreToolUse (Grep/Glob/Read reminder) |
| Codex CLI | .codex/config.toml | .codex/AGENTS.md | |
| Gemini CLI | .gemini/settings.json | .gemini/GEMINI.md | BeforeTool (grep/read reminder) |
| Zed | settings.json (JSONC) | | |
| OpenCode | opencode.json | AGENTS.md | |
| Antigravity | mcp_config.json | AGENTS.md | |
| Aider | | CONVENTIONS.md | |
| KiloCode | mcp_settings.json | ~/.kilocode/rules/ | |
| VS Code | Code/User/mcp.json | | |
| OpenClaw | openclaw.json | | |

Hooks are advisory (exit code 0) — they remind agents to prefer MCP graph tools when they reach for grep/glob/read, without blocking the tool call.

CLI Mode

Every MCP tool can be invoked from the command line:

bash
codebase-memory-mcp cli index_repository '{"repo_path": "/path/to/repo"}'
codebase-memory-mcp cli search_graph '{"name_pattern": ".*Handler.*", "label": "Function"}'
codebase-memory-mcp cli trace_call_path '{"function_name": "Search", "direction": "both"}'
codebase-memory-mcp cli query_graph '{"query": "MATCH (f:Function) RETURN f.name LIMIT 5"}'
codebase-memory-mcp cli list_projects
codebase-memory-mcp cli --raw search_graph '{"label": "Function"}' | jq '.results[].name'
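
When driving the CLI from a script, building the argument list programmatically avoids shell-quoting mistakes in the JSON payload. A minimal sketch, assuming only the invocation shapes shown above; `build_cli_command` is a hypothetical helper, not part of the tool.

```python
import json
import shlex
from typing import Optional

def build_cli_command(tool: str, params: Optional[dict] = None, raw: bool = False) -> list[str]:
    """Build the argv for: codebase-memory-mcp cli [--raw] <tool> ['<json>']."""
    argv = ["codebase-memory-mcp", "cli"]
    if raw:
        argv.append("--raw")
    argv.append(tool)
    if params is not None:
        argv.append(json.dumps(params))
    return argv

cmd = build_cli_command("search_graph", {"name_pattern": ".*Handler.*", "label": "Function"})
print(shlex.join(cmd))  # shell-safe rendering of the same command
```

The resulting list can be passed directly to `subprocess.run(cmd, capture_output=True, text=True)`, which bypasses the shell entirely.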

MCP Tools

Indexing

| Tool | Description |
|---|---|
| index_repository | Index a repository into the graph. Auto-sync keeps it fresh after that. |
| list_projects | List all indexed projects with node/edge counts. |
| delete_project | Remove a project and all its graph data. |
| index_status | Check indexing status of a project. |

Querying

| Tool | Description |
|---|---|
| search_graph | Structured search by label, name pattern, file pattern, degree filters. Pagination via limit/offset. |
| trace_call_path | BFS traversal: who calls a function and what it calls. Depth 1-5. |
| detect_changes | Map git diff to affected symbols + blast radius with risk classification. |
| query_graph | Execute Cypher-like graph queries (read-only). |
| get_graph_schema | Node/edge counts, relationship patterns. Run this first. |
| get_code_snippet | Read source code for a function by qualified name. |
| get_architecture | Codebase overview: languages, packages, routes, hotspots, clusters, ADR. |
| search_code | Grep-like text search within indexed project files. |
| manage_adr | CRUD for Architecture Decision Records. |
| ingest_traces | Ingest runtime traces to validate HTTP_CALLS edges. |
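
trace_call_path is described as a depth-limited BFS. The sketch below shows the general technique for the inbound direction on a toy call graph; it is not the project's C implementation, and the adjacency map and function names are hypothetical.

```python
from collections import deque

def trace_callers(callers_of: dict[str, list[str]], start: str, max_depth: int = 5) -> dict[str, int]:
    """Breadth-first walk over inbound CALLS edges, returning depth per caller."""
    depths: dict[str, int] = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for caller in callers_of.get(node, []):
            if caller not in seen:  # guards against cycles and duplicate paths
                seen.add(caller)
                depths[caller] = depth + 1
                queue.append((caller, depth + 1))
    return depths

# Hypothetical reverse adjacency: callee -> functions that call it.
callers_of = {
    "ProcessOrder": ["HandleCheckout", "RetryWorker"],
    "HandleCheckout": ["main"],
    "RetryWorker": ["main"],
}
result = trace_callers(callers_of, "ProcessOrder", max_depth=2)
print(result)  # {'HandleCheckout': 1, 'RetryWorker': 1, 'main': 2}
```

Because BFS visits nodes level by level, each caller is recorded at its shortest distance from the starting function.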

Graph Data Model

Node Labels

Project, Package, Folder, File, Module, Class, Function, Method, Interface, Enum, Type, Route, Resource

Edge Types

CONTAINS_PACKAGE, CONTAINS_FOLDER, CONTAINS_FILE, DEFINES, DEFINES_METHOD, IMPORTS, CALLS, HTTP_CALLS, ASYNC_CALLS, IMPLEMENTS, HANDLES, USAGE, CONFIGURES, WRITES, MEMBER_OF, TESTS, USES_TYPE, FILE_CHANGES_WITH

Qualified Names

get_code_snippet uses qualified names: <project>.<path_parts>.<name>. Use search_graph to discover them first.

Supported Cypher Subset

query_graph supports: MATCH with labels and relationship types, variable-length paths, WHERE with comparisons/regex/CONTAINS, RETURN with property access and COUNT/DISTINCT, ORDER BY, LIMIT. Not supported: WITH, COLLECT, OPTIONAL MATCH, mutations.

Ignoring Files

Layered: hardcoded patterns (.git, node_modules, etc.) → .gitignore hierarchy → .cbmignore (project-specific, gitignore syntax). Symlinks are always skipped.
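
The layering can be pictured as three pattern lists checked in order. The sketch below is a deliberate simplification: it uses shell-style globs from `fnmatch` rather than full gitignore semantics (no negation, no directory-only patterns, no nested .gitignore files), and the hardcoded list contains only the two entries named above.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

HARDCODED = [".git", "node_modules"]  # only the two patterns the README names

def is_ignored(rel_path: str, gitignore: list[str], cbmignore: list[str],
               is_symlink: bool = False) -> bool:
    """Check the three layers in order; symlinks are always skipped."""
    if is_symlink:
        return True
    parts = PurePosixPath(rel_path).parts
    for layer in (HARDCODED, gitignore, cbmignore):
        for pattern in layer:
            # Simplification: a pattern matches if it matches any path
            # component or the whole relative path.
            if any(fnmatchcase(p, pattern) for p in parts) or fnmatchcase(rel_path, pattern):
                return True
    return False

print(is_ignored("node_modules/lodash/index.js", [], []))      # True
print(is_ignored("src/generated/api.c", [], ["generated"]))    # True
print(is_ignored("src/main.c", ["*.o"], ["generated"]))        # False
```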

Configuration

bash
codebase-memory-mcp config list                          # show all settings
codebase-memory-mcp config set auto_index true           # auto-index on session start
codebase-memory-mcp config set auto_index_limit 50000    # max files for auto-index
codebase-memory-mcp config reset auto_index              # reset to default

Environment Variables

| Variable | Default | Description |
|---|---|---|
| CBM_CACHE_DIR | ~/.cache/codebase-memory-mcp | Override the database storage directory. All project indexes and config are stored here. |
| CBM_DIAGNOSTICS | false | Set to 1 or true to enable periodic diagnostics output to /tmp/cbm-diagnostics-<pid>.json. |
| CBM_DOWNLOAD_URL | (GitHub releases) | Override the download URL for updates. Used for testing or self-hosted deployments. |

bash
# Store indexes in a custom directory
export CBM_CACHE_DIR=~/my-projects/cbm-data

Custom File Extensions

Map additional file extensions to supported languages via JSON config files. Useful for framework-specific extensions like .blade.php (Laravel) or .mjs (ES modules).

Per-project (in your repo root):

json
// .codebase-memory.json
{"extra_extensions": {".blade.php": "php", ".mjs": "javascript"}}

Global (applies to all projects):

json
// ~/.config/codebase-memory-mcp/config.json  (or $XDG_CONFIG_HOME/...)
{"extra_extensions": {".twig": "html", ".phtml": "php"}}

Project config overrides global for conflicting extensions. Unknown language values are silently skipped. Missing config files are ignored.
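
The precedence rules above can be sketched as a dictionary merge followed by a filter. The `SUPPORTED` set is an illustrative subset, not the real list of 66 languages, and filtering after the merge (so that an unknown project value also hides a valid global one) is an assumption; the README does not say which order the tool uses.

```python
from typing import Optional

SUPPORTED = {"php", "javascript", "typescript", "html"}  # illustrative subset only

def merge_extra_extensions(global_cfg: dict, project_cfg: dict,
                           supported: set[str] = SUPPORTED) -> dict[str, str]:
    """Project entries override global ones; unknown languages are dropped."""
    merged = {**global_cfg.get("extra_extensions", {}),
              **project_cfg.get("extra_extensions", {})}
    return {ext: lang for ext, lang in merged.items() if lang in supported}

def language_for(filename: str, mapping: dict[str, str]) -> Optional[str]:
    """Pick the longest matching suffix so '.blade.php' beats a shorter match."""
    matches = [ext for ext in mapping if filename.endswith(ext)]
    return mapping[max(matches, key=len)] if matches else None

global_cfg = {"extra_extensions": {".twig": "html", ".mjs": "typescript"}}
project_cfg = {"extra_extensions": {".mjs": "javascript", ".foo": "klingon"}}

mapping = merge_extra_extensions(global_cfg, project_cfg)
print(mapping)  # {'.twig': 'html', '.mjs': 'javascript'}
```

A missing config file corresponds to passing an empty dict, which leaves the other layer's mappings unchanged.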

Persistence

SQLite databases stored at ~/.cache/codebase-memory-mcp/. Persists across restarts (WAL mode, ACID-safe). To reset: rm -rf ~/.cache/codebase-memory-mcp/.
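
Scripts that need to locate the databases (for backups, say) can apply the same rule as the Environment Variables table: use CBM_CACHE_DIR when set, otherwise the default path. A small sketch; `resolve_cache_dir` is a hypothetical helper, and expanding a leading `~` in the override is a convenience added here, not documented behavior.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

def resolve_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory where the SQLite databases are expected to live."""
    env = os.environ if env is None else env
    override = env.get("CBM_CACHE_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".cache" / "codebase-memory-mcp"

print(resolve_cache_dir({"CBM_CACHE_DIR": "/tmp/cbm-data"}))  # /tmp/cbm-data
print(resolve_cache_dir({}))  # default location under the home directory
```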

Troubleshooting

| Problem | Fix |
|---|---|
| /mcp doesn't show the server | Check .mcp.json path is absolute. Restart agent. Test: echo '{}' \| /path/to/binary should output JSON. |
| index_repository fails | Pass absolute path: index_repository(repo_path="/absolute/path") |
| trace_call_path returns 0 results | Use search_graph(name_pattern=".*PartialName.*") first to find the exact name. |
| Queries return wrong project results | Add project="name" parameter. Use list_projects to see names. |
| Binary not found after install | Add to PATH: export PATH="$HOME/.local/bin:$PATH" |
| UI not loading | Ensure you downloaded the ui variant and ran --ui=true. Check http://localhost:9749. |

Language Support

66 languages. Benchmarked against 64 real open-source repositories (78 to 49K nodes):

| Tier | Score | Languages |
|---|---|---|
| Excellent | >= 90% | Lua, Kotlin, C++, Perl, Objective-C, Groovy, C, Bash, Zig, Swift, CSS, YAML, TOML, HTML, SCSS, HCL, Dockerfile |
| Good | 75-89% | Python, TypeScript, TSX, Go, Rust, Java, R, Dart, JavaScript, Erlang, Elixir, Scala, Ruby, PHP, C#, SQL |
| Functional | < 75% | OCaml, Haskell |

Plus: Clojure, F#, Julia, Vim Script, Nix, Common Lisp, Elm, Fortran, CUDA, COBOL, Verilog, Emacs Lisp, MATLAB, Lean 4, FORM, Magma, Wolfram, JSON, XML, Markdown, Makefile, CMake, Protobuf, GraphQL, Vue, Svelte, Meson, GLSL, INI.

Architecture

code
src/
  main.c              Entry point (MCP stdio server + CLI + install/update/config)
  mcp/                MCP server (14 tools, JSON-RPC 2.0, session detection, auto-index)
  cli/                Install/uninstall/update/config (10 agents, hooks, instructions)
  store/              SQLite graph storage (nodes, edges, traversal, search, Louvain)
  pipeline/           Multi-pass indexing (structure → definitions → calls → HTTP links → config → tests)
  cypher/             Cypher query lexer, parser, planner, executor
  discover/           File discovery (.gitignore, .cbmignore, symlink handling)
  watcher/            Background auto-sync (git polling, adaptive intervals)
  traces/             Runtime trace ingestion
  ui/                 Embedded HTTP server + 3D graph visualization
  foundation/         Platform abstractions (threads, filesystem, logging, memory)
internal/cbm/         Vendored tree-sitter grammars (66 languages) + AST extraction engine

License

MIT
