🧠 Memory MCP
Give your AI assistant a persistent second brain
Stop re-explaining your project every session.
Memory MCP learns what matters and keeps it ready — instant recall for the stuff you use most, semantic search for everything else.
The Problem
Every new chat starts from scratch. You explain your architecture again. You paste the same patterns again. Your context window bloats with repetition.
Other memory solutions help, but they still require tool calls for every lookup — adding latency and eating into Claude's thinking budget.
Memory MCP fixes this with a two-tier architecture:
- Hot cache (0ms) — Frequently-used knowledge auto-injected into context before Claude even starts thinking. No tool call needed.
- Cold storage (~50ms) — Everything else, searchable by meaning via semantic similarity.
The system learns what you use and promotes it automatically. Your most valuable knowledge becomes instantly available. No manual curation required.
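To make the two tiers concrete, here is a minimal Python sketch, not the project's actual code: the hot tier behaves like data already at hand, while the cold tier embeds the query and ranks stored memories by cosine similarity. The `lookup` signature, the `embed` callable, and the data shapes are all illustrative assumptions.

```python
import numpy as np

def lookup(query: str,
           hot_cache: dict[str, str],
           cold_store: list[tuple[np.ndarray, str]],
           embed) -> str | None:
    """Illustrative two-tier lookup: hot dict first, then semantic search."""
    # Tier 1: the hot cache is already in context, so a hit is free (0ms).
    if query in hot_cache:
        return hot_cache[query]
    # Tier 2: cold storage. Embed the query and rank stored memories
    # by cosine similarity (~50ms in practice).
    if not cold_store:
        return None
    q = embed(query)

    def cosine(v: np.ndarray) -> float:
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

    vec, text = max(cold_store, key=lambda item: cosine(item[0]))
    return text
```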
Before & After
| 😤 Without Memory MCP | 🎯 With Memory MCP |
|---|---|
| "Let me explain our architecture again..." | Project facts persist and isolate per repo |
| Copy-paste the same patterns every session | Patterns auto-promoted to instant access |
| 500k+ token context windows | Hot cache keeps it lean (~20 items) |
| Tool call latency on every memory lookup | Hot cache: 0ms — already in context |
| Stale information lingers forever | Trust scoring demotes outdated facts |
| Flat list of disconnected facts | Knowledge graph connects related concepts |
Install
```bash
# Install package
uv tool install hot-memory-mcp   # or: pip install hot-memory-mcp

# Add plugin (recommended)
claude plugins add michael-denyer/memory-mcp
```
The plugin gives you auto-configured hooks, slash commands, and the Memory Analyst agent. MLX is auto-detected on Apple Silicon.
<details>
<summary>Manual config (no plugin)</summary>

Add to ~/.claude.json:

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```

See Reference for full configuration options.

</details>

Restart Claude Code. The hot cache auto-populates from your project docs.
First run: the embedding model (~90MB) downloads automatically. This takes 30-60 seconds and only happens once.
How It Works
```mermaid
flowchart LR
    subgraph LLM["Claude"]
        REQ((Request))
    end
    subgraph Hot["HOT CACHE · 0ms"]
        HC[Session context]
        PM[(Promoted memories)]
    end
    subgraph Cold["COLD STORAGE · ~50ms"]
        VS[(Vector search)]
        KG[(Knowledge graph)]
    end
    REQ -->|"auto-injected"| HC
    HC -.->|"draws from"| PM
    REQ -->|"recall()"| VS
    VS <-->|"related"| KG
```
The hot cache (~10 items) is injected into every request; it combines recent recalls, predicted next memories, and the top promoted items. Promoted memories (~20 items) form the backing store of frequently used knowledge. Memories used 3+ times auto-promote; promoted memories that go unused for 14 days demote.
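A rough sketch of that promotion/demotion policy in Python, using the thresholds quoted above (3+ uses, 14 days, ~20 promoted items). The `Memory` fields and the `rebalance` function are hypothetical names for illustration, not the server's internals.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

PROMOTE_USES = 3                    # memories used 3+ times auto-promote
DEMOTE_AFTER = timedelta(days=14)   # unused promoted memories demote
HOT_CAPACITY = 20                   # promoted pool holds ~20 items

@dataclass
class Memory:
    text: str
    uses: int = 0
    last_used: datetime = field(default_factory=datetime.now)
    promoted: bool = False

def rebalance(memories: list[Memory]) -> None:
    """Illustrative promotion/demotion pass over all memories."""
    now = datetime.now()
    for m in memories:
        if not m.promoted and m.uses >= PROMOTE_USES:
            m.promoted = True
        elif m.promoted and now - m.last_used > DEMOTE_AFTER:
            m.promoted = False
    # Cap the promoted pool at ~20 items, most-used first.
    promoted = sorted((m for m in memories if m.promoted),
                      key=lambda m: m.uses, reverse=True)
    for m in promoted[HOT_CAPACITY:]:
        m.promoted = False
```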
What Makes It Different
Most memory systems make you pay a tool-call tax on every lookup. Memory MCP's hot cache bypasses this entirely — your most-used knowledge is already in context when Claude starts thinking.
| Feature | Memory MCP | Generic Memory Servers |
|---|---|---|
| Hot cache | Auto-injected at 0ms | Every lookup = tool call |
| Self-organizing | Learns and promotes automatically | Manual curation required |
| Project-aware | Auto-isolates by git repo | One big pile of memories |
| Knowledge graph | Multi-hop recall across concepts | Flat list of facts |
| Pattern mining | Learns from Claude's outputs | Not available |
| Trust scoring | Outdated info decays and sinks | All memories equal |
| Setup | One command, local SQLite | Often needs cloud setup |
The Engram Insight: Human memory doesn't search — frequently-used patterns are already there. That's what hot cache does for Claude.
Quick Reference
| Slash Command | Tool | Description |
|---|---|---|
| /memory-mcp:remember | remember | Store a memory with semantic embedding |
| /memory-mcp:recall | recall | Search memories by meaning |
| /memory-mcp:hot-cache | promote / demote | Manage promoted memories |
| /memory-mcp:stats | memory_stats | Show statistics |
| /memory-mcp:bootstrap | bootstrap_project | Seed from project docs |
| — | link_memories | Knowledge graph connections |
See Reference for all 14 slash commands and full tool API.
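If you want to exercise these tools outside of Claude, the server speaks standard MCP over stdio, so the official Python SDK can drive it directly. This is a hedged sketch: the tool argument names (`content`, `query`) are guesses, so check the Reference for the real schema.

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch the server the same way the manual config does.
    params = StdioServerParameters(command="memory-mcp")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Store a memory, then search by meaning. Argument names
            # here are assumptions; see Reference for the real schema.
            await session.call_tool(
                "remember",
                arguments={"content": "We deploy via GitHub Actions"},
            )
            result = await session.call_tool(
                "recall",
                arguments={"query": "how do we deploy?"},
            )
            print(result)

asyncio.run(main())
```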
Dashboard
```bash
memory-mcp-cli dashboard   # Opens at http://localhost:8765
```

Browse memories, hot cache, mining candidates, sessions, and knowledge graph.
How to Use
Memory MCP is designed to run as three complementary components:
| Component | Purpose |
|---|---|
| Claude Code Plugin | Hooks, slash commands, and Memory Analyst agent for seamless integration |
| MCP Server | Core memory tools available to Claude via Model Context Protocol |
| Dashboard | Web UI to browse, manage, and debug your memory database |
The plugin is recommended for most users — it auto-configures the MCP server and adds productivity features. Run the dashboard alongside when you want visibility into what's being stored.
Documentation
| Document | Description |
|---|---|
| Reference | Full API, CLI, configuration, MCP resources |
| Troubleshooting | Common issues and solutions |
License
MIT