CognOS trust scoring (C=p·(1-Ue-Ua)) and session trace storage as MCP tools.
CognOS Session Memory
mcp-name: io.github.base76-research-lab/cognos-session-memory
Verified context injection via epistemic trust scoring for LLMs.
Solves session fragmentation by maintaining verified, high-confidence session context between conversations.
Problem
Large language models suffer from session fragmentation: each new conversation starts without verified context of previous work. This forces repeated explanations, loses decision history, and breaks long-running workflows.
Existing solutions (persistent memory systems, vector retrieval) fall short in one or more ways:
- Lack trust scores before injection → hallucinations propagate
- Don't audit which context was injected → compliance gaps
- Treat all past information equally → noise overwhelms signal
Solution
A plan-mode gateway that:
- Extracts structured context from 3–5 recent traces
- Scores context quality via the CognOS epistemic formula: C = p · (1 − Ue − Ua)
- Injects as a system prompt only if C > threshold
- Flags for manual review if C < threshold
- Audits every context injection with trace IDs → EU AI Act compliance
Architecture
recent_traces (n=5)
↓
extract_context() → ContextField + coverage
↓
compute_trust_score(p, ue, ua) → C, R, decision
↓
if C > threshold:
system_prompt ← inject
else:
flagged_reason ← manual review
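As a concrete illustration of the extract_context() step, the coverage signal (which feeds p) could be the fraction of required fields present in the extracted context. This is a hypothetical sketch: the field names are taken from the example /v1/plan response in this README, and the real plan.py may compute coverage differently.

```python
REQUIRED_FIELDS = ("active_project", "last_decision", "open_questions",
                   "current_output", "recent_models")

def coverage(context: dict) -> float:
    """Fraction of required context fields that are present and non-empty."""
    present = sum(1 for field in REQUIRED_FIELDS if context.get(field))
    return present / len(REQUIRED_FIELDS)
```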
Core Formula
C = p · (1 − Ue − Ua)
R = 1 − C
where:
p = prediction confidence (coverage of required fields)
Ue = epistemic uncertainty (divergence between traces)
Ua = aleatoric uncertainty (mean risk in traces)
Action Gate
R < 0.25 → PASS (inject without review)
0.25 ≤ R < 0.60 → REFINE (inject with caution)
R ≥ 0.60 → ESCALATE (flag for manual review)
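The formula and action gate above fit in a few lines of Python. This is a minimal sketch following the thresholds listed here; the actual trust.py implementation may clamp or weight the signals differently.

```python
def compute_trust_score(p: float, ue: float, ua: float) -> tuple[float, float, str]:
    """CognOS epistemic trust score.

    p  — prediction confidence (coverage of required fields)
    ue — epistemic uncertainty (divergence between traces)
    ua — aleatoric uncertainty (mean risk in traces)
    """
    c = p * (1.0 - ue - ua)           # confidence: C = p · (1 − Ue − Ua)
    c = max(0.0, min(1.0, c))         # keep C in [0, 1]
    r = 1.0 - c                       # risk: R = 1 − C
    if r < 0.25:
        decision = "PASS"             # inject without review
    elif r < 0.60:
        decision = "REFINE"           # inject with caution
    else:
        decision = "ESCALATE"         # flag for manual review
    return c, r, decision
```

For example, p = 0.9, Ue = 0.05, Ua = 0.04 gives C = 0.819 and R = 0.181, so the gate returns PASS.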
API
POST /v1/plan
Extract and score context.
Request:
{
"n": 5,
"trust_threshold": 0.75,
"mode": "auto"
}
Response (if injected):
{
"status": "injected",
"trust_score": 0.82,
"confidence": 0.82,
"risk": 0.18,
"decision": "PASS",
"context": {
"active_project": "CognOS mHC research",
"last_decision": "Verify P1 hypothesis",
"open_questions": ["How does routing entropy scale?"],
"current_output": "exp_008 complete",
"recent_models": ["gpt-4", "claude-3", "mistral"]
},
"system_prompt": "## CognOS Context...",
"trace_ids": ["uuid-1", "uuid-2", ...]
}
Response (if flagged):
{
"status": "flagged",
"trust_score": 0.45,
"decision": "REFINE",
"flagged_reason": "Trust score 0.45 below threshold 0.75. Manual review recommended.",
"trace_ids": [...]
}
Modes
- auto (default) — inject if trust_score ≥ threshold, else flag
- force — always inject (for testing)
- dry_run — compute score but never inject
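The three modes reduce to a small decision function. A sketch under the semantics listed above; the real gateway may implement this differently.

```python
def should_inject(mode: str, trust_score: float, threshold: float) -> bool:
    """Combine mode and trust score into an inject/flag decision."""
    if mode == "force":
        return True                   # always inject (for testing)
    if mode == "dry_run":
        return False                  # compute score but never inject
    return trust_score >= threshold   # auto (default)
```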
Claude Code Integration
As a /compact replacement
# In any Claude Code session:
/save
Claude writes a structured summary, trust-scores it, and persists it to SQLite.
Next session: automatically injected as SESSION_CONTEXT before your first prompt.
See docs/COMPACT_ALTERNATIVE.md for a full comparison.
As an MCP server
Add to ~/.claude/settings.json:
{
"mcpServers": {
"cognos-session-memory": {
"command": "python3",
"args": ["/path/to/cognos-session-memory/mcp_server.py"]
}
}
}
Tools exposed:
| Tool | Description |
|---|---|
| save_session(summary, project?) | Trust-score and persist a session summary |
| load_session(threshold?) | Retrieve last verified context (default threshold: 0.45) |
Quick Start
Installation
git clone https://github.com/base76-research-lab/cognos-session-memory
cd cognos-session-memory
pip install -e .
Run Gateway
python3 -m uvicorn --app-dir src main:app --port 8788
Test /v1/plan (dry_run)
curl -X POST http://127.0.0.1:8788/v1/plan \
-H 'Content-Type: application/json' \
-d '{"n": 5, "mode": "dry_run"}'
Test /v1/plan (auto)
curl -X POST http://127.0.0.1:8788/v1/plan \
-H 'Content-Type: application/json' \
-d '{"n": 5, "trust_threshold": 0.75, "mode": "auto"}'
Modules
- trust.py — CognOS confidence formula, action gate, signal extractors
- trace_store.py — SQLite persistence (write/read/purge)
- plan.py — Context extraction, trust scoring, system prompt building
- main.py — FastAPI gateway + middleware
- mcp_server.py — MCP stdio server (save_session, load_session)
Testing
pytest tests/ -v --cov=src
Documentation
- COMPACT_ALTERNATIVE.md — Why this beats /compact
- PAPER.md — Research paper
Research Paper
See docs/PAPER.md — "Verified Context Injection: Epistemically Scored Session Memory for Large Language Models"
Status: Independent research — Base76 Research Lab, 2026
Author: Björn André Wikström (Base76)
Citation
@software{wikstrom2026cognos,
author = {Wikström, Björn André},
title = {{CognOS Session Memory}: Verified Context Injection via Epistemic Trust Scoring},
year = {2026},
url = {https://github.com/base76-research-lab/cognos-session-memory}
}
License
MIT
Contact
- Author: Björn André Wikström
- Email: bjorn@base76.se
- ORCID: 0009-0000-4015-2357
- GitHub: base76-research-lab