What is ContextLattice?
Memory, context, and task orchestration for MCP applications and agents, private by default and local-first.
ContextLattice
<p align="center"> <a href="https://contextlattice.io/" target="_blank" rel="noopener noreferrer"> <img src="docs/readme/contextlattice-architecture-readme-v2-2026-04-28.png" alt="ContextLattice architecture overview" width="100%" /> </a> </p> <p align="center"> Private-by-default memory and context orchestration for AI agents. </p> <p align="center"> <a href="https://modelcontextprotocol.io/"><img src="https://img.shields.io/badge/MCP-HTTP%20Gateway-6b7280?style=for-the-badge" alt="MCP HTTP Gateway"></a> <a href="#quickstart"><img src="https://img.shields.io/badge/Deploy-Docker%20Compose-4b5563?style=for-the-badge" alt="Docker Compose"></a> <a href="LICENSE"><img src="https://img.shields.io/badge/License-BSL%201.1-1f2937?style=for-the-badge" alt="BSL 1.1"></a> </p>
What ContextLattice Does
ContextLattice provides a single memory contract for agentic systems:
- Unified write/read contract for memory and context.
- Durable fanout across retrieval/storage lanes.
- Staged retrieval (fast now, deep continuation when needed).
- Agent sessions that turn prior work, objective lineage, graph touches, skills, checkpoints, and handoffs into prompt-ready packages and exportable run cards.
- Go/Rust runtime ownership for the active application path.
- Legacy Python runtime archived under `archive/services/orchestrator_legacy_python` for tooling/test compatibility only.
- Local-first deployment with optional hosted surfaces.
Current Public Baseline
v3.4.2 is the public agent runtime contract baseline: universal adapter lifecycle, native agent sessions, objective runtime state, scoped recall, checkpoints, handoffs, completion flow, runtime telemetry, one-command runtime proof, storage-governance hardening, and local session-store diagnostics behind one local contract.
v4 remains the private tuning lane for experiments that still need benchmark, recall, and soak gates before public promotion.
Public Runtime Stack (v3.4)
- Ingress: `gateway-go`.
- Core memory + retrieval lanes: Go + Rust services.
- Degradation policy: fail-open retrieval with continuation lifecycle.
- Tooling compatibility: MCP + HTTP clients.
- Single-container lite builds (`Dockerfile.hf-lite`) also run `gateway-go` (no Python runtime dependency).
- Public single-container lite vector default: `topic_rollups` only.
- Public local lite core default: `topic_rollups + qdrant`; pgvector and memory-bank spike adapters are not started by default.
- Public local lite advanced: opt-in adapter lab via `gmake mem-up-lite-advanced`.
- Full/operator stacks: Qdrant remains the primary vector-native lane; pgvector stays supported for SQL-co-located vector workloads.
Quickstart
1) Clone and configure
git clone git@github.com:sheawinkler/ContextLattice.git
cd ContextLattice
cp .env.example .env
2) Launch (recommended)
gmake quickstart
gmake quickstart prompts for runtime profile and then launches the selected stack.
3) Verify
curl -fsS http://127.0.0.1:8075/health | jq
scripts/agent/agent-runtime-proof-pack --pretty
scripts/agent/agent-adoption-proof-matrix --skip-provider-smoke --progress --pretty
Expected:
- `/health` returns `{"ok": true, ...}`.
- `agent-runtime-proof-pack` completes bootstrap, scoped recall, checkpoint, handoff, completion, status, prompt context package, and runtime telemetry phases.
- `agent-adoption-proof-matrix` verifies configured agent profiles and reports the skills, context, session, graph, and handoff evidence shaping each run, with trace commands for run-card export.
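The stack can take a little while to come up after launch, so it may help to poll `/health` before running the proof scripts. The following is a minimal bash sketch, not part of the ContextLattice tooling: the function names `health_body_ok` and `wait_for_gateway`, the retry count, and the sleep interval are all illustrative assumptions. Only the URL and the `"ok": true` field come from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with ContextLattice).

# Returns 0 when a /health response body reports "ok": true.
health_body_ok() {
  printf '%s' "$1" | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Polls the health endpoint until it reports ok or the attempts run out.
wait_for_gateway() {
  local url="${1:-http://127.0.0.1:8075/health}"
  local attempts="${2:-30}"
  local i=0 body
  while [ "$i" -lt "$attempts" ]; do
    if body="$(curl -fsS "$url" 2>/dev/null)" && health_body_ok "$body"; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}
```

A possible use is `wait_for_gateway && scripts/agent/agent-runtime-proof-pack --pretty`, which only starts the proof pack once the gateway answers.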
Model Runtime
Task inference defaults to ORCH_INFER_PROVIDER=auto. gateway-go detects the host profile and probes local backends before selecting a route.
- Apple Silicon default priority: `mlx`, `vllm-metal`, `ane_sidecar`, `llama-cpp`, `ollama`.
- CUDA/ROCm default priority: `sglang`, `vllm`, `openai-compatible`, `llama-cpp`, `lmstudio`, `ollama`.
- Generic CPU default priority: `openai-compatible`, `llama-cpp`, `lmstudio`, `ollama`.
- Supported provider ids include `sglang`, `vllm`, `vllm-metal`, `mlx`, `mtplx` (alias for MLX), `openai-compatible`, `lmstudio`, `llama-cpp`, `tgi`, `tensorrt-llm`, `ane_sidecar`, and `ollama`.
- `/v1/inference/runtime-policy` returns live provider health plus resource-aware model guidance. If host memory/VRAM is not identifiable, it falls back to generic local advice: start with Q4/IQ4 7B-9B models, benchmark, then scale up.
- Large Qwen3.6 Dream Mode models are opt-in only; ContextLattice does not bundle or pull them by default. The default GGUF recommendation is `mudler/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled-APEX-MTP-GGUF` for llama.cpp-compatible advanced users. Abliterated variants are private-eval only behind `CONTEXTLATTICE_DREAM_ALLOW_PRIVATE_EVAL_MODELS=true` (`GO_DREAM_ALLOW_UNCENSORED_MODELS=true` remains a legacy alias).
- Inference runtimes must emit final assistant content through their API. Reasoning-only responses fail with repair instructions instead of being accepted. For MLX Qwen thinking templates, use `scripts/inference_mlx_server.sh --model /path/to/mlx/model --template-profile qwen-final-content`, then verify with `scripts/inference_template_conformance.sh --provider mlx --model /path/to/mlx/model`.
- Dream Mode reflects on generated hypotheses by default and performs one bounded deepening pass when the best output misses the sigma target (`GO_DREAM_REFLECT_ENABLED=true`, `GO_DREAM_DEEPEN_ON_WEAK_OUTPUT=true`, `GO_DREAM_REFLECTION_MIN_SCORE=0.74`).
- Ollama remains a compatibility fallback, not the preferred always-on embedding path.
- Local helpers enforce one active LLM backend by default (`CONTEXTLATTICE_SINGLE_ACTIVE_INFER_BACKEND=true`).
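One way to read the priority lists above is "the first healthy backend in priority order wins". The bash sketch below illustrates that reading only; it is not `gateway-go`'s routing code, and the profile labels (`apple-silicon`, `cuda`, `rocm`) and function names are hypothetical. The priority strings themselves are copied from the defaults listed above.

```shell
#!/usr/bin/env bash
# Illustrative only: hypothetical helpers, not the gateway's actual selection logic.

# Maps an assumed host-profile label to the documented default priority list.
default_priority() {
  case "$1" in
    apple-silicon) echo "mlx,vllm-metal,ane_sidecar,llama-cpp,ollama" ;;
    cuda|rocm)     echo "sglang,vllm,openai-compatible,llama-cpp,lmstudio,ollama" ;;
    *)             echo "openai-compatible,llama-cpp,lmstudio,ollama" ;;
  esac
}

# Prints the first provider from a comma-separated priority list ($1)
# that also appears in a space-separated list of healthy providers ($2).
first_available() {
  local priority="$1" healthy=" $2 " p
  local IFS=','
  for p in $priority; do
    case "$healthy" in
      *" $p "*) printf '%s\n' "$p"; return 0 ;;
    esac
  done
  return 1
}
```

For example, `first_available "$(default_priority apple-silicon)" "ollama llama-cpp"` prints `llama-cpp`, because `mlx`, `vllm-metal`, and `ane_sidecar` are not in the healthy set. For the real decision, `/v1/inference/runtime-policy` remains the source of truth.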
Inspect live routing and benchmark configured backends:
scripts/inference_runtime_policy.sh
scripts/benchmark_inference_backends.sh
scripts/inference_template_conformance.sh --provider mlx --model /path/to/mlx/model
Embedding defaults to the Rust fastembed-rs sidecar. Ollama stays available as an explicit compatibility fallback, not the preferred embedding path.
Useful model runtime knobs:
ORCH_INFER_PROVIDER=auto
ORCH_INFER_PROVIDER_PRIORITY=mlx,vllm-metal,ane_sidecar,sglang,vllm,openai-compatible,llama-cpp,ollama
ORCH_INFER_AUTO_PROBE_ENABLED=true
SGLANG_BASE_URL=http://127.0.0.1:30000
VLLM_BASE_URL=http://127.0.0.1:8000
VLLM_METAL_BASE_URL=http://127.0.0.1:8000
MLX_API_BASE=http://127.0.0.1:18087/v1
LLAMA_CPP_BASE_URL=http://127.0.0.1:8080
Agent CLI
Installer and quickstart paths install agent helpers under ~/.contextlattice/bin.
contextlattice_agent_adapter profiles
contextlattice_adopt status --pretty
contextlattice_doctor --agents codex --skip-provider-smoke --pretty
contextlattice_agent_start --soft --compact
contextlattice_agent_trace --session-id <session-id> --tree
contextlattice_pack "what should the next agent know?" --project my-project --pretty
contextlattice_search -h
contextlattice_write -h
contextlattice_checkpoint -h
contextlattice_skills_index search "browser automation" --pretty
- `contextlattice_agent_adapter` is the first-class lifecycle helper for bootstrap, context-pack, checkpoint, handoff, event, and completion flows.
- `contextlattice_adopt` is the zero-friction front door for local readiness, install guidance, profiles, and lifecycle proof; `contextlattice_doctor` combines readiness, proof, and trace evidence in one bounded report.
- `contextlattice_agent_start` runs the lightweight startup guard for agents.
- `contextlattice_agent_trace` renders the bounded run-shaping trail as a terminal tree, JSON, or Markdown run card.
- `contextlattice_pack` compiles a bounded prompt-ready packet with ranked evidence, files to inspect, risks, checks, source coverage, and a `reference_prompt`.
- `contextlattice_checkpoint` writes a checkpoint and verifies readback.
- `contextlattice_skills_index` discovers capabilities without loading every skill into startup context.
- `contextlattice_source_backfill` is an optional development helper, installed with `scripts/install_global_agent_tools.sh --include-dev-python-tools`, for bounded data imports.
- Hook pack details: `docs/agent-hooks.md`.
Agent Runtime Sessions
ContextLattice tracks live agent work as first-class sessions, independent of the runner or model provider.
- Start/list/read sessions through `GET|POST /v1/agents/sessions` and `GET /v1/agents/sessions/{session_id}`.
- Emit normalized events through `POST /v1/agents/sessions/event` or `POST /v1/agents/sessions/{session_id}/events`.
- Inspect a bounded run trace through `GET /v1/agents/sessions/{session_id}/trace`; the trace reports context, skills that may be helpful, source coverage, graph touches, handoffs, checkpoints, and timeline events without raw provider payloads.
- Read live runtime telemetry from `GET /telemetry/agents/runtime`.
- Compile task context through `POST /memory/context-pack`, `POST /tools/context_pack`, or global `contextlattice_pack`; responses include `context_compiler`, ranked evidence, prompt sections, and a bounded `reference_prompt`.
- Watch long-running recall through `scripts/agent/contextlattice-session watch --session-id <id> --continuation-token <token>`; continuation responses include `retrieval_progress.v1`, dashboard status links, and agent-visible steering when async work is ready.
- Preflight, context-pack, and Dream Mode return `objective_runtime_state.v1` with `objective_state`, `action_executed`, `evidence`, `objective_delta`, `risk_or_blocker`, and `next_action`.
- Use `scripts/agent/contextlattice-agent-adapter` or global `contextlattice_agent_adapter` as the first-class product path for agent bootstrap, context-pack, checkpoint, handoff, event, and completion flows.
- Use `scripts/agent/contextlattice-adopt` or global `contextlattice_adopt` before handing ContextLattice to a new agent/account; `doctor` combines gateway health, helper install state, shell PATH, storage posture, session store, profile coverage, runtime-doctor checks, lifecycle proof, and run trace evidence into one bounded report.
- Run `contextlattice_doctor --agents codex --skip-provider-smoke --pretty` for the fastest new-agent adoption proof.
- The same doctor works for other agent profiles: `contextlattice_doctor --agents claude-code --skip-provider-smoke --pretty`, `contextlattice_doctor --agents opencode --skip-provider-smoke --pretty`, or `contextlattice_doctor --agents codex,claude-code,opencode --skip-provider-smoke --pretty`.
- Run `scripts/agent/agent-runtime-proof-pack --pretty` or global `contextlattice_agent_runtime_proof --pretty` for a one-command live proof that bootstrap, scoped recall, checkpoint, handoff, completion, status, and runtime telemetry are wired end to end.
- Use `scripts/agent/contextlattice-session` for CLI start/event/complete/fail/status/runtime/trace flows.
- Use `scripts/agent/agent-run-trace --session-id <id> --tree` or global `contextlattice_agent_trace --session-id <id> --tree` to see the terminal trace, then `--markdown` to export the run card.
- Use `scripts/agent/contextlattice-session sweep-stale-audits --all-projects --pretty` for dry-run-first cleanup of stale objective-runtime audit/preflight sessions; add `--confirm` only after reviewing matches.
- `scripts/agent/contextlattice-pack`, `scripts/agent/contextlattice-dream`, `scripts/agent/writeback`, and compaction hooks auto-start or recover a session when `CONTEXTLATTICE_SESSION_ID` is absent.
- Pass `--session-id` or `CONTEXTLATTICE_SESSION_ID` to force a specific session. Set `CONTEXTLATTICE_AUTO_SESSION_DISABLED=1` to disable automatic session creation.
Canonical event families include session.started, context_pack.completed, retrieval.continuation.progress, retrieval.continuation.ready, retrieval.continuation.degraded, dream.completed, graph.neighbors_returned, graph.edge_touched, decision.made, test.ran, handoff.created, writeback.completed, and session.completed.
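For callers that talk to the HTTP routes directly instead of using `scripts/agent/contextlattice-session`, the calls might look roughly like the bash sketch below. The route paths, the port, and the `test.ran` event family come from this README. The `CL_BASE_URL` variable, the function names, and especially the JSON field name `type` are assumptions for illustration; the request schema is not documented here, and API-key protected routes may need an auth header that this sketch omits.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the request body shape is an assumption, not a documented schema.
CL_BASE_URL="${CL_BASE_URL:-http://127.0.0.1:8075}"

# Builds the per-session events URL from the documented route.
session_events_url() {
  printf '%s/v1/agents/sessions/%s/events\n' "$CL_BASE_URL" "$1"
}

# Posts one event for a session. Assumes simple event names without quotes,
# such as the canonical families listed above (for example test.ran).
emit_event() {
  local session_id="$1" event_type="$2"
  curl -fsS -X POST "$(session_events_url "$session_id")" \
    -H 'Content-Type: application/json' \
    -d "{\"type\": \"$event_type\"}"
}

# Fetches the bounded run trace for a session.
session_trace() {
  curl -fsS "$CL_BASE_URL/v1/agents/sessions/$1/trace"
}
```

With a running gateway, `emit_event "$CONTEXTLATTICE_SESSION_ID" test.ran` followed by `session_trace "$CONTEXTLATTICE_SESSION_ID"` would be the intended flow, subject to the real payload schema.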
Download Installers
- macOS DMG: https://github.com/sheawinkler/ContextLattice/releases/latest/download/ContextLattice-macOS-universal.dmg
- macOS signing/notarization operator notes: `docs/releases/macos-signing-notarization.md`
- Homebrew cask: `brew tap sheawinkler/contextlattice && brew install --cask contextlattice`
- Windows MSI: https://github.com/sheawinkler/ContextLattice/releases/latest/download/ContextLattice-windows-x64.msi
- Linux bundle: https://github.com/sheawinkler/ContextLattice/releases/latest/download/ContextLattice-linux-bootstrap.tar.gz
Resource Profiles
| Profile | CPU | RAM | Storage |
|---|---|---|---|
| Lite core | 2-4 vCPU | 8-12 GB | 25-80 GB |
| Lite advanced | 4-6 vCPU | 12-16 GB | 80-140 GB |
| Full | 6-8 vCPU | 12-20 GB | 100-180 GB |
Memory Graph
- `GET|POST /v1/memory/edges` persists explicit typed relationships.
- `POST /v1/memory/edges/backfill` audits or applies deterministic retroactive edges and opt-in same-project `inferred_related` scoring. It is dry-run by default.
- `POST /v1/memory/neighbors` returns explicit/inferred edge neighbors merged with semantic/topic neighbors.
./scripts/agent/memory-edge-backfill
./scripts/agent/memory-edge-backfill --include-inferred --min-confidence 0.90
./scripts/agent/memory-edge-backfill --write
./scripts/agent/memory-edge-inferred-retrofill --all-projects
./scripts/agent/memory-edge-inferred-retrofill --all-projects --profile exploratory
./scripts/agent/memory-edge-inferred-retrofill --all-projects --profile exploratory --write --confirm-retrofill ALL_PROJECTS
./scripts/agent/memory-edge-inferred-retrofill --project hermes-agent-ultra --corpus disk --profile exploratory
Source Backfill
Bring existing data into ContextLattice without changing the ingest boundary.
Backfill is dry-run by default, writes go through /memory/write, and writes
require --write --confirm-write <project>.
./scripts/agent/source-backfill-memory --source jsonl --path exports/tasks.jsonl --project my-project --pretty
./scripts/agent/source-backfill-memory --source sqlite --path app.db --table notes --project my-project --pretty
./scripts/agent/source-backfill-memory --source parquet --path warehouse/events.parquet --project my-project --pretty
./scripts/agent/source-backfill-memory --source postgres --dsn "$DATABASE_URL" --query "select id,title,body from notes limit 100" --project my-project --pretty
./scripts/agent/source-backfill-memory --source jsonl --path exports/tasks.jsonl --project my-project --write --confirm-write my-project --apply-edges --pretty
Supported adapters: files/directories, JSONL, JSON, CSV, SQLite, DuckDB, Parquet
via DuckDB, and Postgres via optional psycopg. Import caps cover records, row
bytes, document bytes, total bytes, and structured-list items. Secret-like
fields are redacted by default, and graph edge repair is optional and bounded.
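Because writes need both `--write` and a matching `--confirm-write <project>`, a natural workflow is "dry run, review, then retype the project name". The bash sketch below wraps that workflow for a JSONL source. It is an illustrative wrapper, not part of the repository: the function names and the `BACKFILL` variable are hypothetical, while every flag passed to `source-backfill-memory` appears in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the documented backfill command.
BACKFILL="${BACKFILL:-./scripts/agent/source-backfill-memory}"

# Returns 0 only when the typed confirmation exactly matches the project name.
confirmed() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

backfill_jsonl() {
  local path="$1" project="$2" typed
  # Step 1: dry run (the documented default) to review what would be imported.
  "$BACKFILL" --source jsonl --path "$path" --project "$project" --pretty || return 1
  # Step 2: ask the operator to retype the project name before writing.
  read -r -p "Type the project name to write: " typed
  if ! confirmed "$project" "$typed"; then
    echo "aborted: confirmation did not match" >&2
    return 1
  fi
  # Step 3: repeat the import with the documented write flags.
  "$BACKFILL" --source jsonl --path "$path" --project "$project" \
    --write --confirm-write "$project" --pretty
}
```

Calling `backfill_jsonl exports/tasks.jsonl my-project` would show the dry-run report first and write only if `my-project` is typed back at the prompt.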
Skills Index And Quarantine Discovery
ContextLattice exposes active skills as a native Go Skills Index so agents can discover relevant capabilities without loading every SKILL.md into prompt context. In local installs, the active index mounts ${HOME}/.codex/skills read-only by default. Quarantined/vendor skill discovery remains a separate read-only lane and does not auto-load quarantined skills.
- Active index endpoint: `GET|POST /v1/skills/index/search`
- Active index tool: `GET|POST /tools/skills_index_search`
- Active index status/reindex endpoint: `POST /v1/skills/index/reindex` (live native scan; no prompt loading)
- Search endpoint: `GET|POST /v1/skills/quarantine/search`
- Tool alias: `GET|POST /tools/skills_quarantine_search`
- Reindex endpoint: `POST /v1/skills/quarantine/reindex` (off by default; enable explicitly)
Runtime knobs:
ORCH_SKILLS_QUARANTINE_ENABLED=true
ORCH_SKILLS_QUARANTINE_HOST_BIN_DIR=${HOME}/.local/bin
ORCH_SKILLS_INDEX_HOST_ACTIVE_ROOT_DIR=${HOME}/.codex/skills
ORCH_SKILLS_INDEX_HOST_SYSTEM_ROOT_DIR=${HOME}/.codex/skills/.system
ORCH_SKILLS_INDEX_ROOTS=/opt/contextlattice/skills_active:/opt/contextlattice/skills_system
ORCH_SKILLS_QUARANTINE_HOST_ROOT_DIR=${HOME}/.codex/skills_quarantine
ORCH_SKILLS_QUARANTINE_SEARCH_CMD=/opt/contextlattice/skills/bin/codex-skills-quarantine-search
ORCH_SKILLS_QUARANTINE_REINDEX_CMD=/opt/contextlattice/skills/bin/codex-skills-quarantine-reindex
ORCH_SKILLS_QUARANTINE_TIMEOUT_SECS=8
ORCH_SKILLS_QUARANTINE_DEFAULT_LIMIT=20
ORCH_SKILLS_QUARANTINE_MAX_LIMIT=100
ORCH_SKILLS_QUARANTINE_REINDEX_ENABLED=false
CODEX_SKILLS_QUARANTINE_ROOT=/opt/contextlattice/skills_quarantine
CODEX_SKILLS_QUARANTINE_INDEX_DIR=/opt/contextlattice/skills_quarantine/index
CODEX_SKILLS_QUARANTINE_INDEX=/opt/contextlattice/skills_quarantine/index/skills_index.jsonl
Security and Privacy
- Local-first by default.
- API-key protected operational routes.
- Secret-like content redaction controls.
- Premium billing/provider route maps are intentionally kept out of public docs.
Docs Index
- Overview: https://contextlattice.io/
- Architecture: https://contextlattice.io/architecture.html
- Local AI workspace comparison: https://contextlattice.io/local-ai-workspaces.html
- Scaling memory: https://contextlattice.io/scaling-memory.html
- Wiki: https://contextlattice.io/wiki.html
- Installation: https://contextlattice.io/installation.html
- Integrations: https://contextlattice.io/integration.html
- Troubleshooting: https://contextlattice.io/troubleshooting.html
- Updates: https://contextlattice.io/updates.html
- Release notes: `docs/releases/v3.4.14.md`, `docs/releases/v3.4.13.md`, `docs/releases/v3.4.12.md`, `docs/releases/v3.4.11.md`, `docs/releases/v3.4.10.md`, `docs/releases/v3.4.5.md`, `docs/releases/v3.4.2.md`, `docs/releases/v3.4.1.md`
License
Business Source License 1.1 (LICENSE).