io.github.kael-bit/engram

Coding & Debugging

by kael-bit

A layered memory system for AI agents, with buffer, working, and core layers, supporting decay and promotion.

README

engram-rs


Memory engine for AI agents. Two axes: time (three-layer decay & promotion) and space (self-organizing topic tree). Important memories get promoted, noise fades, related knowledge clusters automatically.

Most agent memory is a flat store — dump everything in, keyword search to get it back. No forgetting, no organization, no lifecycle. engram-rs adds the part that makes memory actually useful: the ability to forget what doesn't matter and surface what does.

<p align="center"> <img src="docs/engram-quickstart.gif" alt="engram demo — store, context reset, recall" width="720"> </p>

Single Rust binary, one SQLite file, zero external dependencies. No Python, no Redis, no vector DB — curl | bash and it runs. ~10 MB binary, ~100 MB RSS, single-digit ms search latency.

Quick Start

bash
# Install (interactive — will prompt for embedding provider config)
curl -fsSL https://raw.githubusercontent.com/kael-bit/engram-rs/main/install.sh | bash

# Store a memory
curl -X POST http://localhost:3917/memories \
  -d '{"content": "Always run tests before deploying", "tags": ["deploy"]}'

# Recall by meaning
curl -X POST http://localhost:3917/recall \
  -d '{"query": "deployment checklist"}'

# Restore full context (session start)
curl http://localhost:3917/resume

What It Does

Three-Layer Lifecycle

Inspired by the Atkinson–Shiffrin memory model, memories are managed across three layers by importance:

code
Buffer (short-term) → Working (active knowledge) → Core (long-term identity)
      ↓                       ↓                           ↑
   eviction              importance decay           LLM quality gate
  • Buffer: Entry point for all new memories. Temporary staging — evicted when below threshold
  • Working: Promoted via consolidation. Never deleted, importance decays at different rates by kind
  • Core: Promoted through LLM quality gate. Never deleted
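The transitions above can be sketched in a few lines. This is illustrative Python, not the actual Rust implementation; the eviction threshold, the access count, and the `llm_gate` callback are assumptions:

```python
# Illustrative sketch of the three-layer lifecycle. The threshold
# values and the llm_gate stub are assumptions, not engram-rs internals.

EVICT_THRESHOLD = 0.2  # hypothetical buffer eviction cutoff

def consolidate(memory, llm_gate):
    """Move a memory between layers based on importance and the LLM gate."""
    if memory["layer"] == "buffer":
        if memory["importance"] < EVICT_THRESHOLD:
            memory["layer"] = None  # evicted from the buffer
        elif llm_gate(memory):      # "Is this a decision, lesson, or preference?"
            memory["layer"] = "working"
    elif memory["layer"] == "working":
        # sustained access plus a second LLM gate promotes to core
        if memory["access_count"] >= 5 and llm_gate(memory):
            memory["layer"] = "core"  # never deleted from here on
    return memory
```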

LLM Quality Gate

Promotion isn't rule-based guesswork — an LLM evaluates each memory in context and decides whether it genuinely warrants long-term retention.

code
Buffer → [LLM gate: "Is this a decision, lesson, or preference?"] → Working
Working → [sustained access + LLM gate] → Core
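A hedged sketch of what such a gate call might look like (the prompt wording and the `llm` callable are placeholders, not engram-rs's actual prompt or API):

```python
# Hypothetical gate sketch. GATE_PROMPT and llm() are placeholders
# standing in for the batch LLM evaluation described above.

GATE_PROMPT = 'Is this a decision, lesson, or preference? Answer "yes" or "no".\n\n{content}'

def gate(memories, llm):
    """Return the subset of memories the LLM judges worth promoting."""
    return [m for m in memories
            if llm(GATE_PROMPT.format(content=m["content"]))
               .strip().lower().startswith("yes")]
```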

Automatic Decay

Decay is activity-driven — it only fires during active consolidation cycles, not wall-clock time. If the system is idle, memories stay intact.

Exponential decay follows the Ebbinghaus forgetting curve — fast at first, then long-tail. Memories never fully vanish (floor = 0.01), remaining retrievable under precise queries. When a memory is recalled, it gets an activation boost, strengthening frequently-used knowledge.

Kind | Decay rate | Half-life | Use case
episodic | Fastest | ~35 epochs | Events, experiences, time-bound context
semantic | Medium | ~58 epochs | Knowledge, preferences, lessons (default)
procedural | Slowest | ~173 epochs | Workflows, instructions, how-to
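The curve can be reproduced from the half-lives alone. A Python sketch: the half-lives and the 0.01 floor come from the text above, while the per-epoch formulation and the size of the recall boost are assumptions:

```python
import math

# Exponential decay per consolidation epoch, parameterized by the
# half-lives quoted above. Floor of 0.01 means nothing fully vanishes.
HALF_LIFE = {"episodic": 35, "semantic": 58, "procedural": 173}
FLOOR = 0.01

def decay(importance, kind, epochs=1):
    rate = math.log(2) / HALF_LIFE[kind]
    return max(FLOOR, importance * math.exp(-rate * epochs))

def on_recall(importance, boost=0.2):
    # activation boost on recall; the boost value is hypothetical
    return min(1.0, importance + boost)
```

After 35 epochs an untouched episodic memory at importance 1.0 sits at 0.5, matching the quoted half-life; a procedural memory decays roughly five times slower.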

Algorithm Visualizations

Chart | What it shows
<img src="docs/images/chart_scoring.png" width="600"> | Sigmoid score compression. Raw scores are mapped through a sigmoid function, approaching 1.0 asymptotically. High-relevance results remain distinguishable instead of being crushed into the same value.
<img src="docs/images/chart_decay.png" width="600"> | Ebbinghaus forgetting curve. Exponential decay with kind-differentiated rates — episodic memories fade fastest, procedural slowest. Floor at 0.01 means memories never fully vanish; they remain retrievable under precise queries.
<img src="docs/images/chart_bias.png" width="600"> | Kind × layer weight bias. Additive biases adjust memory weight by type and layer. Procedural+core memories rank highest, episodic+buffer lowest — but the spread stays bounded so no single combination dominates.
<img src="docs/images/chart_reinforcement.png" width="600"> | Reinforcement signals. Repetition and access bonuses follow logarithmic saturation. Early interactions matter most; later ones contribute diminishing returns, discriminating between "used occasionally" and "used daily".
<img src="docs/images/chart_lifecycle.png" width="600"> | Use it or lose it. Left: a memory that's never recalled decays into the buffer layer. Right: periodic recall triggers activation boosts that keep the memory in the working layer. Dashed line shows the unrecalled trajectory for comparison.
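Two of these curves are easy to reproduce. In the illustrative Python below, the steepness, midpoint, and bonus coefficient are made-up constants; only the shapes match the descriptions above:

```python
import math

# Sketch of the sigmoid compression and logarithmic reinforcement
# curves. k, m, and coeff are illustrative, not engram-rs constants.

def compress(raw, k=4.0, m=0.5):
    """Sigmoid score compression: approaches 1.0 asymptotically."""
    return 1.0 / (1.0 + math.exp(-k * (raw - m)))

def reinforcement(access_count, coeff=0.1):
    """Logarithmic saturation: early accesses matter most."""
    return coeff * math.log1p(access_count)
```

High raw scores stay below 1.0 but remain ordered, and the marginal reinforcement from the tenth access is far smaller than from the first.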

Semantic Dedup & Merge

Two memories saying the same thing in different words? Detected and merged automatically:

code
"use PostgreSQL for auth" + "auth service runs on Postgres"
→ Merged into one, preserving context from both
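A minimal sketch of the detection step, using the 0.78 cosine threshold quoted in the maintenance section below; the embeddings and the merge strategy here are illustrative:

```python
import math

# Dedup sketch: flag a pair for merging when the cosine similarity of
# their embeddings exceeds 0.78. How engram-rs combines the two texts
# is not shown here; the concatenation below is a stand-in.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def maybe_merge(m1, m2, threshold=0.78):
    if cosine(m1["embedding"], m2["embedding"]) > threshold:
        return {"content": m1["content"] + " | " + m2["content"],
                "embedding": m1["embedding"]}  # keep one vector for the merged entry
    return None
```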

Self-Organizing Topic Tree

Vector clustering groups related memories together, and an LLM names the clusters. No manual tagging required:

code
Memory Architecture
├── Three-layer lifecycle [4]
├── Embedding pipeline [3]
└── Consolidation logic [5]
Deploy & Ops
├── CI/CD procedures [3]
└── Production incidents [2]
User Preferences [6]

The problem this solves: vector search requires asking the right question. Topic trees let agents browse by subject — scan the directory, drill into the right branch.

Triggers

Tag a memory with trigger:deploy, and the agent can recall all deployment lessons before executing:

bash
curl -X POST http://localhost:3917/memories \
  -d '{"content": "LESSON: always backup DB before migration", "tags": ["trigger:deploy", "lesson"]}'

# Pre-deployment check
curl http://localhost:3917/triggers/deploy

Session Recovery

Agent wakes up, calls GET /resume, gets full context back. No file scanning needed:

code
=== Core (24) ===
deploy: test → build → stop → start (procedural)
LESSON: never force-push to main
...

=== Recent ===
switched auth to OAuth2
published API docs

=== Topics (Core: 24, Working: 57, Buffer: 7) ===
kb1: "Deploy Procedures" [5]
kb2: "Auth Architecture" [3]
kb3: "Memory Design" [8]
...

Triggers: deploy, git-push, database-migration
Section | Content | Purpose
Core | Full text of permanent rules and identity | The unforgettable stuff
Recent | Recently changed memories | Short-term continuity
Topics | Topic index (table of contents) | Drill in on demand, no full load
Triggers | Pre-action tags | Auto-recall lessons before risky ops

Agent reads the directory, finds relevant topics, calls POST /topic to expand on demand.

Search & Retrieval

Semantic embeddings + BM25 keyword search with CJK tokenization (jieba). IDF-weighted scoring — rare terms get boosted, common terms auto-downweighted. No stopword lists to maintain.

bash
# Semantic search
curl -X POST http://localhost:3917/recall \
  -d '{"query": "how do we handle auth", "budget_tokens": 2000}'
# Note: min_score defaults to 0.30. Use "min_score": 0.0 to get all results.

# Topic drill-down
curl -X POST http://localhost:3917/topic \
  -d '{"ids": ["kb3"]}'
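The IDF behavior described above ("rare terms get boosted, common terms auto-downweighted") can be sketched as follows; the toy corpus and the exact smoothing variant are illustrative, not the engine's actual formula:

```python
import math

# IDF sketch: a term's weight falls as more documents contain it,
# which is why no stopword list is needed. The +1 smoothing here is
# one common BM25-style variant, assumed for illustration.

def idf(term, docs):
    df = sum(1 for d in docs if term in d)
    return math.log((len(docs) + 1) / (df + 1)) + 1.0

docs = [{"auth", "uses", "oauth2"},
        {"deploy", "uses", "ci"},
        {"uses", "postgres"}]
```

Here "uses" appears in every document and gets the minimum weight, while "oauth2" appears once and is boosted.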

Background Maintenance

Fully automatic, activity-driven — no writes means the cycle is skipped:

Consolidation (every 30 minutes)

  1. Decay — reduce importance of unaccessed memories
  2. Dedup — merge near-identical memories (cosine > 0.78)
  3. Triage — LLM categorizes new Buffer memories
  4. Gate — LLM batch-evaluates promotion candidates
  5. Reconcile — resolve ambiguous similar pairs (results cached)
  6. Topic tree rebuild — re-cluster and name

Topic Distillation — when a topic grows too large (10+ memories), overlapping content is condensed into fewer, richer entries.

Namespace Isolation

Single instance, multiple projects. Use X-Namespace to isolate:

bash
curl -X POST http://localhost:3917/memories \
  -H "X-Namespace: my-project" \
  -d '{"content": "API uses OAuth2 bearer tokens"}'

Install

bash
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/kael-bit/engram-rs/main/install.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/kael-bit/engram-rs/main/install.ps1 | iex

Docker

bash
docker run -d --name engram-rs \
  -p 3917:3917 \
  -v engram-data:/data \
  -e ENGRAM_LLM_URL=https://api.openai.com/v1 \
  -e ENGRAM_LLM_KEY=sk-... \
  ghcr.io/kael-bit/engram-rs:latest

LLM Configuration

engram-rs works without an LLM — keyword search and rule-based lifecycle still function. Adding an LLM unlocks intelligent features (triage, gate, merge, topic naming):

bash
export ENGRAM_LLM_URL=https://api.openai.com/v1
export ENGRAM_LLM_KEY=sk-...

# Any OpenAI-compatible API (Ollama, LM Studio, etc.)
export ENGRAM_LLM_URL=http://localhost:11434/v1

# Anthropic native
export ENGRAM_LLM_PROVIDER=anthropic
export ENGRAM_LLM_KEY=sk-ant-...

Two-tier model config — strong model for judgment, lightweight for text processing:

bash
ENGRAM_GATE_MODEL=gpt-4o          # quality decisions
ENGRAM_LLM_MODEL=gpt-4o-mini     # text processing

For AI Agents

Add to your agent's system prompt:

code
Set up engram-rs (persistent memory) by following the guide at:
https://raw.githubusercontent.com/kael-bit/engram-rs/main/docs/SETUP.md

Integration

Works with Claude Code, Cursor, Windsurf, OpenClaw, and any MCP-compatible tool.

17 MCP tools (MCP docs) | Full HTTP API (Setup guide)

bash
# MCP (Claude Code)
npx engram-rs-mcp

# MCP (Cursor / Windsurf / generic)
{"mcpServers": {"engram": {"command": "npx", "args": ["-y", "engram-rs-mcp"]}}}

Web Dashboard

Built-in web UI at http://localhost:3917/ui for browsing memories, viewing the topic tree, and monitoring LLM usage.

Specs

Binary | ~10 MB
Memory | ~100 MB RSS in production
Storage | SQLite, no external database
Language | Rust
Platforms | Linux, macOS, Windows (x86_64 + aarch64)
License | MIT

License

MIT

<a href="https://glama.ai/mcp/servers/@kael-bit/engram-rs"> <img width="380" height="200" src="https://glama.ai/mcp/servers/@kael-bit/engram-rs/badge" /> </a>

