sayou

Productivity & Workflow

by pixell-global

A persistent knowledge workspace for AI agents, with versioned files, search, and MCP tools.

README

sayou

A file-system inspired context store for AI agents.

Built to replace the databases of the web era. Open source. File-first. SQL-compatible.

Databases were designed for transactions — they reduce nuance to fit a schema. Agents think deeply, then forget everything when the session ends. sayou is where reasoning persists, context accumulates, and knowledge compounds over time.

  • Files that hold what databases can't — Frontmatter for structure. Markdown for context. Versioned. Auditable.
  • One read. Full context. — Every read accepts a token_budget. Returns summaries with section pointers when content exceeds the budget.
  • Knowledge that compounds — Append-only version history. Every change is a new version. Full audit trail and time-travel reads.
  • Any agent can connect — MCP server, Python library, or CLI. Optional REST API with pip install sayou[api].
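
The token-budget behavior described above can be pictured with a small standalone sketch. It is not sayou's implementation: the function names, the four-characters-per-token estimate, and the shape of the returned dict are all assumptions made for illustration.

```python
import re


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumption, not sayou's tokenizer)."""
    return max(1, len(text) // 4)


def read_with_budget(markdown: str, token_budget: int) -> dict:
    """Return the full text if it fits the budget, otherwise a section outline.

    Hypothetical helper: each outline entry carries a heading, the line it
    starts on (the "pointer"), and the first line of body text as a summary.
    """
    if estimate_tokens(markdown) <= token_budget:
        return {"truncated": False, "content": markdown}

    sections = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        if re.match(r"#{1,6} ", line):
            sections.append({"heading": line.lstrip("#").strip(), "line": number, "summary": ""})
        elif sections and line.strip() and not sections[-1]["summary"]:
            sections[-1]["summary"] = line.strip()[:80]
    return {"truncated": True, "sections": sections}


doc = "# A\nalpha text\n\n## B\n" + "beta " * 200
outline = read_with_budget(doc, token_budget=50)
full = read_with_budget(doc, token_budget=1000)
```

A caller that receives an outline can then request only the section it needs, which is the point of returning pointers rather than a cut-off prefix.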

Quick Start

Claude Code (recommended)

From within Claude Code:

code
/plugin install sayou@pixell-global

Or from the terminal:

bash
claude plugin install sayou@pixell-global

One command. This installs the plugin with lifecycle hooks (workspace context on session start, passive activity capture, session summaries) and skills (/ws, /save, /recall). If sayou isn't installed yet, the plugin auto-installs it on first run.

Cloud mode: To sync your workspace via Sayou Drive, run sayou auth after installing and paste your API key from Settings.

pip install

bash
pip install sayou && sayou init --claude

This installs sayou and configures ~/.claude/mcp.json. You get the 11 MCP tools but no hooks or skills. You can also use --cursor or --windsurf, or run sayou init without flags to get the config snippet to paste manually.

To verify either method, run sayou status — you should see your workspace path, database location, and 11 tools registered. If you see errors, jump to Troubleshooting.

Try It

Open Claude Code and paste any of these prompts. Each one triggers a different MCP tool — no setup beyond the Quick Start above.

| Prompt | What happens | Tool |
| --- | --- | --- |
| "Save a note about our Q1 goals with status: draft" | Creates a versioned file with YAML frontmatter | workspace_write |
| "What files are in my workspace?" | Lists all files and subfolders | workspace_list |
| "Search for anything tagged status: draft" | Filters files by frontmatter metadata | workspace_search |
| "Read the Q1 goals note" | Retrieves file content with token-budget awareness | workspace_read |
| "Update the status to reviewed" | Writes a new version (old version preserved) | workspace_write |
| "Show me the diff between versions" | Compares two versions with unified diff | workspace_history |
| "Store a setting: theme = dark" | Saves a key-value pair (with optional TTL) | workspace_kv |
| "Find all markdown files in research/" | Matches files by glob pattern | workspace_glob |
| "Search file contents for the word 'budget'" | Grep-style content search with context | workspace_grep |

Every write creates a new version. Nothing is overwritten. You can always go back.
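
As a mental model of "every write creates a new version", here is a minimal in-memory sketch. The class name and methods are hypothetical and say nothing about how sayou actually stores history.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VersionedFile:
    """Append-only history: a write adds a version and never replaces one."""

    versions: List[str] = field(default_factory=list)

    def write(self, content: str) -> int:
        """Append a new version and return its 1-based version number."""
        self.versions.append(content)
        return len(self.versions)

    def read(self, version: Optional[int] = None) -> str:
        """Read the latest version, or an earlier one by number (time travel)."""
        if not self.versions:
            raise LookupError("file has no versions")
        if version is None:
            return self.versions[-1]
        if not 1 <= version <= len(self.versions):
            raise LookupError(f"no such version: {version}")
        return self.versions[version - 1]


note = VersionedFile()
first = note.write("status: draft")
second = note.write("status: reviewed")
```

After the second write, `note.read()` returns the reviewed text while `note.read(1)` still returns the draft.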

What It Looks Like

A short conversation showing sayou in action across sessions:


You: Research competitor pricing for our enterprise tier and save your findings.

Claude Code: I'll research this and save my findings to the workspace.

[Uses workspace_write to create research/competitor-pricing.md]

Saved to research/competitor-pricing.md with frontmatter: status: draft, topic: competitor-pricing


You (next day, new session): What did we learn about competitor pricing?

Claude Code: Let me check our workspace.

[Uses workspace_search with query "competitor pricing"]

Found research/competitor-pricing.md (saved yesterday):

  • Competitor A: $99/seat/month
  • Competitor B: $149/seat/month (includes support)
  • Competitor C: $79/seat/month (limited features)

You: Update the status to "reviewed" and show me the diff.

Claude Code: [Uses workspace_write to create version 2, then workspace_history to diff]

diff
- status: draft
+ status: reviewed
+ reviewer_notes: |
+   Competitor B is closest to our positioning.

The key insight: Claude Code remembered the research across sessions because sayou persisted it as a versioned file — not a chat message that disappears.
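
The diff in the conversation above is a unified diff between two versions. A comparable diff can be produced with Python's standard difflib; this sketch is only an illustration, and the `@v1`/`@v2` labels are made up rather than taken from sayou's output.

```python
import difflib


def version_diff(old: str, new: str, path: str = "research/competitor-pricing.md") -> str:
    """Return a unified diff between two versions of the same file."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"{path}@v1",  # hypothetical label for the older version
        tofile=f"{path}@v2",    # hypothetical label for the newer version
        lineterm="",
    )
    return "\n".join(lines)


v1 = "status: draft\ntopic: competitor-pricing"
v2 = "status: reviewed\ntopic: competitor-pricing"
diff_text = version_diff(v1, v2)
```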

Setup for Other Editors

Cursor

bash
sayou init --cursor

This adds sayou to .cursor/mcp.json in your current working directory.

Windsurf

bash
sayou init --windsurf

This adds sayou to ~/.codeium/windsurf/mcp_config.json.

Any MCP-compatible client

sayou is a standard MCP server. Run sayou init (no flag) to get the config snippet, then paste it into your editor's MCP config. The entry is always the same — just "command": "sayou".
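
If you maintain the MCP config by hand, the merge amounts to adding one entry. The helper below is a hypothetical sketch, not something sayou ships. It assumes the config uses a top-level "mcpServers" object, as in the Production Deployment example, and it leaves other servers untouched.

```python
import json


def add_sayou_entry(config_text: str) -> str:
    """Add a sayou entry to MCP config JSON, keeping any existing servers."""
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # Keep an existing sayou entry (it may carry env settings) instead of overwriting it.
    servers.setdefault("sayou", {"command": "sayou"})
    return json.dumps(config, indent=2)


existing = '{"mcpServers": {"other": {"command": "other-server"}}}'
merged_text = add_sayou_entry(existing)
```

Check your editor's documentation for the config file location and exact schema before writing the result back.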

MCP Tools

The agent gets 11 tools (12 with embeddings enabled):

| Tool | Description |
| --- | --- |
| workspace_write | Write or update a file (text or binary with YAML frontmatter) |
| workspace_read | Read latest or specific version, with optional line range |
| workspace_list | List files and subfolders with auto-generated index |
| workspace_search | Search by full-text query, frontmatter filters, or chunk-level |
| workspace_delete | Soft-delete a file (history preserved) |
| workspace_history | Version history with timestamps, or diff between two versions |
| workspace_glob | Find files matching a glob pattern |
| workspace_grep | Search file contents with context lines |
| workspace_kv | Key-value store (get/set/list/delete with optional TTL) |
| workspace_links | File links and knowledge graph (get or add links) |
| workspace_chunks | Chunk outline or read a specific chunk by index |
| workspace_semantic_search | Vector similarity search (requires SAYOU_EMBEDDING_PROVIDER) |
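
The "optional TTL" on workspace_kv means an entry can expire on its own. The sketch below shows one common way such a store behaves. The class is hypothetical and in-memory only, and sayou's real expiry semantics may differ.

```python
import time


class TTLStore:
    """Tiny key-value store where entries may carry a time-to-live in seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}

    def set(self, key, value, ttl=None):
        """Store a value; with ttl=None the entry never expires."""
        expires = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires)

    def get(self, key, default=None):
        """Return the value, or default if the key is missing or expired."""
        item = self._items.get(key)
        if item is None:
            return default
        value, expires = item
        if expires is not None and self._clock() >= expires:
            del self._items[key]  # drop expired entries lazily on read
            return default
        return value


now = [0.0]
store = TTLStore(clock=lambda: now[0])
store.set("config.theme", "dark")
store.set("cache.token", "abc", ttl=10)
```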

Python API

python
import asyncio
from sayou import Workspace

async def main():
    async with Workspace() as ws:
        # Write a file with YAML frontmatter
        await ws.write("notes/hello.md", """\
---
status: active
tags: [demo, quickstart]
---
# Hello from sayou
This file is versioned and searchable.
""")

        # Read it back
        doc = await ws.read("notes/hello.md")
        print(doc["content"])

        # Search by frontmatter
        results = await ws.search(filters={"status": "active"})
        print(f"Found {results['total']} active files")

asyncio.run(main())

See examples/quickstart.py for a runnable version.

CLI

bash
# File operations
sayou file read notes/hello.md
sayou file write notes/hello.md "# Hello World"
sayou file list /
sayou file search --query "hello" --filter status=active

# KV store
sayou kv set config.theme '"dark"'
sayou kv get config.theme

# Cloud authentication
sayou auth            # Connect to Sayou Drive (interactive)
sayou auth status     # Show current mode (cloud/local)
sayou auth logout     # Disconnect from Sayou Drive

# Diagnostics
sayou init      # Initialize local setup
sayou status    # Show diagnostic info

Examples

| Example | What it shows |
| --- | --- |
| quickstart.py | Hello World: write, read, search, list in 30 lines |
| kv_config.py | KV store for config, feature flags, caching with TTL |
| version_control.py | Version history, diff, time-travel reads |
| file_operations.py | Move, copy, binary files, glob patterns |
| multi_agent.py | Multi-agent collaboration with shared workspace |
| research_agent.py | All methods exercised; the comprehensive reference |

Reference Agent

sayou ships with a reference agent server — a multi-turn assistant that can search, read, write, and research using your workspace. It's a complete working example of building an agent on sayou.

Quick start

bash
# Install with agent dependencies (quoted so shells such as zsh do not glob the brackets)
pip install "sayou[agent]"

# Configure (copy and fill in your OpenAI key)
cp agent/.env.example .env

# Run the agent server
python -m sayou.agent

The agent runs on http://localhost:9008 with a streaming SSE endpoint at POST /chat/stream.
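
A client of the streaming endpoint has to split the response into server-sent events. The request body and event payloads for POST /chat/stream are not documented here, so this sketch covers only the generic SSE framing (data: lines, blank-line separators) and treats each payload as opaque text. The function name is hypothetical.

```python
def parse_sse(lines):
    """Yield the data payload of each server-sent event from an iterable of text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield "\n".join(data)
                data = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comment lines are ignored here.


sample = ["data: hello\n", "\n", ": keep-alive\n", "data: a\n", "data: b\n", "\n"]
events = list(parse_sse(sample))
```

With a real HTTP client you would pass the response's line iterator to `parse_sse` instead of a list.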

What the agent can do

| Capability | How it works |
| --- | --- |
| Answer questions | Searches workspace first, falls back to web search |
| Research topics | Multiple web searches, extracts facts, saves structured findings |
| Store knowledge | Writes files with YAML frontmatter, section headings, source citations |
| Execute code | Optional E2B sandbox for Python and bash (set SAYOU_AGENT_E2B_API_KEY) |

Evaluate the agent

bash
# Start agent in one terminal
python -m sayou.agent

# Quick pass/fail eval
python -m sayou.agent.benchmarks.eval

# Detailed scoring (0-10 per capability)
python -m sayou.agent.benchmarks.eval_full

Architecture

code
Client → FastAPI (port 9008)
         ↓
      Orchestrator
         ├─ LLMProvider (OpenAI streaming + tool calls)
         ├─ ToolFactory
         │  ├─ workspace_search/read/list/write (→ sayou SDK)
         │  ├─ web_search (→ Tavily API, optional)
         │  └─ execute_bash/python (→ E2B sandbox, optional)
         └─ SandboxManager (per-session isolation, auto-cleanup)

SAMB: Structured Agent Memory Benchmark

sayou includes SAMB — an open benchmark for evaluating memory systems on real agentic workflows. Existing benchmarks (LOCOMO, LongMemEval, DMR) test conversation recall. SAMB tests what agents actually need: recalling decisions, retrieving artifact contents, and connecting knowledge across sessions.

What SAMB measures

| Dimension | What it tests |
| --- | --- |
| Decision reasoning | "Why was bcrypt chosen over Argon2?" |
| Artifact content | "What endpoints are in the API docs?" |
| Cross-session | "How does session 3's auth decision affect session 5's implementation?" |
| Fact recall | "What was the monthly GCP cost estimate?" |
| Temporal | "What changed between the first and second architecture review?" |

10 scenarios, 62 sessions, 131 QA pairs across 7 question types. Each scenario simulates a multi-session professional project (auth system design, cloud migration, email campaigns, incident response, etc.) with realistic conversations, decisions, and artifacts.

Run the benchmark

bash
# Prerequisites: pip install sayou mem0ai zep-cloud
# Requires: OPENAI_API_KEY (for judge/answer models)
#           ZEP_API_KEY (for zep adapter)

# Run all adapters on all scenarios
python -m benchmarks.runner.cli

# Specific adapters
python -m benchmarks.runner.cli --adapter sayou mem0

# Specific scenarios
python -m benchmarks.runner.cli --adapter sayou --scenario 01 03 08

# Verbose output (per-question scores)
python -m benchmarks.runner.cli --verbose

# Override judge/answer models
python -m benchmarks.runner.cli --judge-model gpt-4o --answer-model gpt-4o

Results are saved to benchmarks/results/ as JSON with full per-question breakdowns.

Available adapters

| Adapter | System | Retrieval approach |
| --- | --- | --- |
| sayou | sayou workspace | FTS5 + grep + file read (agentic, multi-tool) |
| mem0 | mem0 | LLM fact extraction + embedding search (agentic) |
| zep | Zep Cloud | Knowledge graph + temporal edges (agentic) |
| oracle | Baseline | Direct access to source sessions (upper bound) |
| no_memory | Baseline | No retrieval (lower bound) |

Methodology

Each adapter uses agentic retrieval — an LLM generates multiple search queries rather than a single-shot lookup. This gives every system a fair chance at finding relevant information.

Scoring: LLM-judged (gpt-4o-mini) on a 0–3 scale, normalized to percentage. Task-type questions add holistic scoring (1–5) and evidence coverage (per-item FOUND/MISSING). Statistical significance via bootstrap confidence intervals with Bonferroni correction.
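
To make the scoring arithmetic concrete, here is a rough sketch of normalizing 0–3 judge scores to a percentage and computing a percentile bootstrap interval with a Bonferroni-adjusted alpha. It is not the benchmark's code: the function names, resample count, and percentile method are assumptions, and METHODOLOGY.md remains the authority on what SAMB actually does.

```python
import random


def normalize(scores, max_score=3):
    """Convert judge scores on a 0..max_score scale to a percentage."""
    return 100.0 * sum(scores) / (max_score * len(scores))


def bootstrap_ci(scores, alpha=0.05, comparisons=1, resamples=2000, seed=0):
    """Percentile bootstrap interval for the normalized score.

    Bonferroni correction: alpha is divided by the number of comparisons,
    which widens the interval when several adapters are compared at once.
    """
    rng = random.Random(seed)
    adjusted = alpha / comparisons
    means = sorted(
        normalize([rng.choice(scores) for _ in scores]) for _ in range(resamples)
    )
    low = means[int(adjusted / 2 * resamples)]
    high = means[min(resamples - 1, int((1 - adjusted / 2) * resamples))]
    return low, high


judge_scores = [0, 1, 2, 3, 3, 2, 1, 0]  # made-up scores for illustration
point = normalize(judge_scores)
low, high = bootstrap_ci(judge_scores)
wide_low, wide_high = bootstrap_ci(judge_scores, comparisons=10)
```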

Full methodology: benchmarks/dataset/METHODOLOGY.md
Dataset card: benchmarks/dataset/DATASET_CARD.md

Installation Options

bash
# Basic (MCP server + CLI + SQLite)
pip install sayou

# With REST API support (quotes stop shells such as zsh from globbing the brackets)
pip install "sayou[api]"

# With S3 storage
pip install "sayou[s3]"

# With reference agent server
pip install "sayou[agent]"

# Full installation (all features)
pip install "sayou[all]"

Production Deployment

For team/production use with MySQL + S3:

json
{
  "mcpServers": {
    "sayou": {
      "command": "sayou",
      "env": {
        "SAYOU_ORG_ID": "my-org",
        "SAYOU_USER_ID": "alice",
        "SAYOU_DATABASE_URL": "mysql+aiomysql://user:pass@host/sayou",
        "SAYOU_S3_BUCKET_NAME": "my-bucket",
        "SAYOU_S3_ACCESS_KEY_ID": "...",
        "SAYOU_S3_SECRET_ACCESS_KEY": "..."
      }
    }
  }
}

Install with all backends: pip install sayou[all]

Storage Backends

| Backend | Config | Use case |
| --- | --- | --- |
| SQLite + local disk (default) | No config needed | Local dev, single-machine agents, MCP server |
| MySQL + S3 | Set database_url, S3 credentials | Production, multi-agent, shared workspaces |

Troubleshooting

Verify your setup

bash
sayou status

This shows your workspace path, database location, storage backend, and tool count. If everything is working, you'll see 11 tools registered.

Common issues

| Problem | Cause | Fix |
| --- | --- | --- |
| Claude Code doesn't see sayou tools | MCP config not loaded | Restart Claude Code after editing ~/.claude/mcp.json |
| sayou: command not found | Not on PATH | Run pip install sayou again, or use full path in MCP config: "command": "/path/to/sayou" |
| sayou status shows 0 tools | Server didn't initialize | Run sayou init first, then check for errors in output |
| Files not persisting | Wrong workspace path | Check sayou status for the workspace path (default is ~/.sayou/) |
| Import errors on startup | Missing optional dependency | Install the extra you need: pip install sayou[api], sayou[s3], or sayou[all] |
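
For the "command not found" case, one way to find the full path to put in the MCP config is to ask Python where the executable lives. This is a small sketch using the standard library. The helper name is hypothetical, and it falls back to the bare command name when nothing is found on PATH.

```python
import shutil


def mcp_command(name: str = "sayou") -> dict:
    """Return an MCP server entry, using an absolute path when the command is on PATH."""
    resolved = shutil.which(name)  # None if the executable is not found
    return {"command": resolved or name}


entry = mcp_command()
missing = mcp_command("definitely-not-a-real-binary-xyz")
```

If `entry["command"]` is still just "sayou", the executable is not on the PATH of the Python process that ran the check, which usually points to a different virtual environment.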

Get help

What sayou is NOT

  • Not a vector database. Pinecone, Weaviate, and Chroma store embeddings for similarity search. sayou stores structured files that agents read, write, and reason over.
  • Not a memory layer. Mem0 and similar tools store conversation snippets. sayou stores work product — research, client records, project documentation — that compounds over time.
  • Not a sandbox. E2B provides ephemeral execution environments. sayou provides persistent storage that outlives any single execution.
  • Not a filesystem. AgentFS intercepts syscalls to virtualize file operations. sayou is a knowledge workspace with versioning and indexing.

Philosophy

Read PHILOSOPHY.md for the founding vision and design principles.

Contributing

See CONTRIBUTING.md.

License

Apache 2.0 — See LICENSE

<!-- mcp-name: io.github.pixell-global/sayou -->
