io.github.base76-research-lab/cognos-session-memory

Coding & Debugging

by base76-research-lab

CognOS trust scoring (C=p·(1-Ue-Ua)) and session trace storage as MCP tools.


README

CognOS Session Memory

mcp-name: io.github.base76-research-lab/cognos-session-memory

Verified context injection via epistemic trust scoring for LLMs.

Solves session fragmentation by maintaining verified, high-confidence session context between conversations.

Problem

Large language models suffer from session fragmentation: each new conversation starts without verified context of previous work. This forces repeated explanations, loses decision history, and breaks long-running workflows.

Existing solutions (persistent memory systems, vector retrieval) either:

  • Lack trust scores before injection → hallucinations propagate
  • Don't audit which context was injected → compliance gaps
  • Treat all past information equally → noise overwhelms signal

Solution

A plan-mode gateway that:

  1. Extracts structured context from 3-5 recent traces
  2. Scores context quality via CognOS epistemic formula: C = p · (1 − Ue − Ua)
  3. Injects as system prompt only if C > threshold
  4. Flags for manual review if C < threshold
  5. Audits every context injection with trace IDs → EU AI Act compliance

Architecture

code
recent_traces (n=5)
    ↓
extract_context() → ContextField + coverage
    ↓
compute_trust_score(p, ue, ua) → C, R, decision
    ↓
if C > threshold:
    system_prompt ← inject
else:
    flagged_reason ← manual review

Core Formula

code
C = p · (1 − Ue − Ua)
R = 1 − C

where:
  p   = prediction confidence (coverage of required fields)
  Ue  = epistemic uncertainty (divergence between traces)
  Ua  = aleatoric uncertainty (mean risk in traces)
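The formula above can be sketched directly in Python. Clamping the uncertainty term to [0, 1] is this sketch's choice (the README does not specify behaviour when Ue + Ua > 1); all three inputs are assumed to lie in [0, 1].

```python
def compute_trust_score(p: float, ue: float, ua: float) -> tuple[float, float]:
    """CognOS trust formula: C = p * (1 - Ue - Ua), R = 1 - C.

    p  -- prediction confidence (coverage of required fields)
    ue -- epistemic uncertainty (divergence between traces)
    ua -- aleatoric uncertainty (mean risk in traces)

    The clamp below is an assumption of this sketch: when the
    combined uncertainty exceeds 1, confidence bottoms out at 0.
    """
    c = p * max(0.0, min(1.0, 1.0 - ue - ua))
    r = 1.0 - c
    return c, r
```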

Action Gate

code
R < 0.25         → PASS      (inject without review)
0.25 ≤ R < 0.60  → REFINE    (inject with caution)
R ≥ 0.60         → ESCALATE  (flag for manual review)
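The gate reduces to a three-way threshold check; a minimal sketch:

```python
def action_gate(r: float) -> str:
    """Map risk R = 1 - C to an action per the gate table above."""
    if r < 0.25:
        return "PASS"       # inject without review
    if r < 0.60:
        return "REFINE"     # inject with caution
    return "ESCALATE"       # flag for manual review
```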

API

POST /v1/plan

Extract and score context.

Request:

json
{
  "n": 5,
  "trust_threshold": 0.75,
  "mode": "auto"
}

Response (if injected):

json
{
  "status": "injected",
  "trust_score": 0.82,
  "confidence": 0.82,
  "risk": 0.18,
  "decision": "PASS",
  "context": {
    "active_project": "CognOS mHC research",
    "last_decision": "Verify P1 hypothesis",
    "open_questions": ["How does routing entropy scale?"],
    "current_output": "exp_008 complete",
    "recent_models": ["gpt-4", "claude-3", "mistral"]
  },
  "system_prompt": "## CognOS Context...",
  "trace_ids": ["uuid-1", "uuid-2", ...]
}

Response (if flagged):

json
{
  "status": "flagged",
  "trust_score": 0.45,
  "decision": "REFINE",
  "flagged_reason": "Trust score 0.45 below threshold 0.75. Manual review recommended.",
  "trace_ids": [...]
}

Modes

  • auto (default) — inject if trust_score ≥ threshold, else flag
  • force — always inject (for testing)
  • dry_run — compute score but never inject
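A client consuming `/v1/plan` only needs to branch on `status`; the field names below follow the README's example responses, and this is a sketch rather than an official client:

```python
import json


def handle_plan_response(raw: str) -> str:
    """Branch on a /v1/plan response body.

    Returns the system prompt to prepend when context was injected,
    or the flagged reason when the gate withheld injection.
    """
    result = json.loads(raw)
    if result["status"] == "injected":
        return result["system_prompt"]
    return result["flagged_reason"]
```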

Claude Code Integration

As a /compact replacement

bash
# In any Claude Code session:
/save

Claude writes a structured summary, trust-scores it, and persists it to SQLite. Next session: automatically injected as SESSION_CONTEXT before your first prompt.

See docs/COMPACT_ALTERNATIVE.md for a full comparison.

As an MCP server

Add to ~/.claude/settings.json:

json
{
  "mcpServers": {
    "cognos-session-memory": {
      "command": "python3",
      "args": ["/path/to/cognos-session-memory/mcp_server.py"]
    }
  }
}

Tools exposed:

Tool                              Description
save_session(summary, project?)   Trust-score and persist a session summary
load_session(threshold?)          Retrieve last verified context (default threshold: 0.45)

Quick Start

Installation

bash
git clone https://github.com/base76-research-lab/cognos-session-memory
cd cognos-session-memory
pip install -e .

Run Gateway

bash
python3 -m uvicorn --app-dir src main:app --port 8788

Test /v1/plan (dry_run)

bash
curl -X POST http://127.0.0.1:8788/v1/plan \
  -H 'Content-Type: application/json' \
  -d '{"n": 5, "mode": "dry_run"}'

Test /v1/plan (auto)

bash
curl -X POST http://127.0.0.1:8788/v1/plan \
  -H 'Content-Type: application/json' \
  -d '{"n": 5, "trust_threshold": 0.75, "mode": "auto"}'

Modules

  • trust.py — CognOS confidence formula, action gate, signal extractors
  • trace_store.py — SQLite persistence (write/read/purge)
  • plan.py — Context extraction, trust scoring, system prompt building
  • main.py — FastAPI gateway + middleware
  • mcp_server.py — MCP stdio server (save_session, load_session)

Testing

bash
pytest tests/ -v --cov=src

Documentation

Research Paper

See docs/PAPER.md — "Verified Context Injection: Epistemically Scored Session Memory for Large Language Models"

Status: Independent research, Base76 Research Lab, 2026
Author: Björn André Wikström (Base76)

Citation

bibtex
@software{wikstrom2026cognos,
  author = {Wikström, Björn André},
  title = {{CognOS Session Memory}: Verified Context Injection via Epistemic Trust Scoring},
  year = {2026},
  url = {https://github.com/base76-research-lab/cognos-session-memory}
}

License

MIT

Contact

