Verification Agent
proof-agent
by andreagriffiths11
Adversarial verification of AI-generated work. Spawns an independent verifier to check for false claims, broken code, and security issues.
Installation
claude skill add --url https://github.com/openclaw/skills
Documentation
Proof Agent
Independent adversarial verification for AI work. The worker and the verifier are always separate agents — self-verification is not verification.
When to Verify
Verify automatically when:
- Subagent changed 3+ files
- ANY changed file matches: *auth*, *secret*, *permission*, Dockerfile, *.env*
- User explicitly asks for verification
Skip verification for:
- Formatting-only changes (whitespace, linting fixes)
- .gitignore changes
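The trigger and skip rules above can be sketched as a small decision function. This is a minimal illustration, not the skill's actual implementation: the function name `should_verify`, the `formatting_only` flag, matching against the file's base name only, and letting an explicit user request override the skip rules are all assumptions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns and threshold taken from the rules above.
SENSITIVE_PATTERNS = ["*auth*", "*secret*", "*permission*", "Dockerfile", "*.env*"]
SKIP_NAMES = {".gitignore"}
MIN_FILES_CHANGED = 3


def should_verify(changed_files, user_requested=False, formatting_only=False):
    """Return True when the rules above call for verification (hypothetical helper)."""
    # Assumption: an explicit user request always wins over the skip rules.
    if user_requested:
        return True
    # Skip rule: formatting-only changes (whitespace, linting fixes).
    if formatting_only:
        return False
    # Skip rule: .gitignore changes do not count toward any trigger.
    relevant = [f for f in changed_files if PurePosixPath(f).name not in SKIP_NAMES]
    # Trigger: any sensitive file name, regardless of how many files changed.
    for path in relevant:
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS):
            return True
    # Trigger: 3+ changed files.
    return len(relevant) >= MIN_FILES_CHANGED
```

Matching on the base name means a file such as `config/.env.local` triggers verification, while a directory merely named `auth` does not; a stricter variant could test every path component.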
How to Verify
- Spawn an independent verifier subagent — the worker CANNOT verify its own work
- Give the verifier ONLY: the original request, files changed, and approach taken
- Do NOT share the worker's self-assessment or test results
- The verifier must run its own commands and provide evidence
- If no subagent ran (manual changes or user says "verify this"), use git diff output as the approach summary
Verification Prompt
Use this prompt when spawning the verifier subagent:
VERIFICATION REQUEST
## Original Request
{what was asked}
## Files Changed
{list of files}
## Approach Taken
{what the worker did — or git diff summary if no subagent ran}
## Your Job
You are an independent verifier. The worker who made these changes CANNOT verify their own work — only you can assign a verdict.
### Review Checklist
1. Correctness: Does the code actually do what was requested?
2. Bugs & Edge Cases: Regressions, unhandled errors, missed cases?
3. Security: Vulnerabilities, exposed secrets, permission issues?
4. Build: Does it build/compile/lint cleanly?
5. Facts: Are any claims, version numbers, or URLs verifiable? Check them.
### Rules
- For EVERY check, include the actual command you ran and its output
- Do NOT take the worker's word for anything
- Do NOT give PASS without running at least 3 verification commands
- You have NO information about the worker's test results — verify independently
## Verdict
Assign EXACTLY ONE:
PASS — All checks passed. Every claim backed by command output.
FAIL — Issues found. List each: file, line, what's wrong, severity (critical/major/minor).
PARTIAL — Some passed, some unverifiable. List both with evidence.
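One way to fill the three variable sections of this prompt is sketched below. The function name `build_verification_request` is hypothetical; the fixed sections ("Your Job", the checklist, the rules, and the verdict instructions) would be appended verbatim after its output.

```python
def build_verification_request(original_request, files_changed, approach):
    """Fill the variable sections of the verification prompt (hypothetical helper).

    There is deliberately no parameter for the worker's self-assessment or
    test results, so they cannot leak into the verifier's context.
    """
    if not files_changed:
        raise ValueError("at least one changed file is required")
    # One bullet per changed file.
    file_list = "\n".join(f"- {path}" for path in files_changed)
    return (
        "VERIFICATION REQUEST\n"
        "## Original Request\n"
        f"{original_request.strip()}\n"
        "## Files Changed\n"
        f"{file_list}\n"
        "## Approach Taken\n"
        f"{approach.strip()}\n"
    )
```

Keeping the worker's own results out of the function signature is a design choice that enforces the "do not share" rule structurally rather than by convention.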
Verdicts
- PASS — All checks passed with evidence
- FAIL — Issues found. Report to user with specifics. Retry up to 3 times if fixable.
- PARTIAL — Some checks passed, others couldn't be verified. Report what's unverifiable.
After Verification
- PASS: Report summary to user, proceed
- FAIL: Report issues to user. If auto-fixable, spawn worker to fix, then re-verify (max 3 attempts)
- PARTIAL: Report to user, let them decide whether to proceed
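The verdict handling above amounts to a bounded fix-and-re-verify loop, sketched here under stated assumptions. `run_verifier` and `run_fix` are hypothetical callables supplied by the orchestrator, and "max 3 attempts" is read as three verification runs in total; the skill may count attempts differently.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    PARTIAL = "PARTIAL"


MAX_ATTEMPTS = 3  # assumption: counts verification runs, not fix attempts


def verify_with_retries(run_verifier, run_fix, auto_fixable=True,
                        max_attempts=MAX_ATTEMPTS):
    """Verify, and on FAIL ask the worker to fix and then re-verify.

    run_verifier() returns (Verdict, issues); run_fix(issues) spawns the
    worker again. Both are hypothetical. Returns (final verdict, runs used).
    """
    attempts = 0
    while True:
        attempts += 1
        verdict, issues = run_verifier()
        # PASS proceeds; PARTIAL is handed to the user to decide.
        if verdict is not Verdict.FAIL:
            return verdict, attempts
        # FAIL that is not auto-fixable, or out of attempts: report to user.
        if not auto_fixable or attempts >= max_attempts:
            return verdict, attempts
        run_fix(issues)
```

Each call to `run_verifier` should spawn a fresh verifier, so that no run inherits conclusions from an earlier one.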
Scripts
scripts/verify.sh [base-ref]
Automatically extracts the git diff, changed files, and commit messages, and flags sensitive files. Outputs a filled-in verification prompt ready to send to the verifier subagent. Default base: HEAD~1.
bash scripts/verify.sh # verify last commit
bash scripts/verify.sh main # verify all changes since main
scripts/fact-check.sh <file> [file2 ...]
Extracts and validates factual claims from files:
- URLs → HTTP status check
- npm packages → registry version lookup
- GitHub Actions → tag/SHA existence check
bash scripts/fact-check.sh src/content/articles/en/my-article.md
bash scripts/fact-check.sh .github/workflows/*.yml
Returns exit code 1 if any checks fail.
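The URL check can be illustrated in Python as follows. This is a sketch of the idea, not a port of scripts/fact-check.sh: the regular expression, the use of HEAD requests, treating any status of 400 or above as a failure, and all function names are assumptions.

```python
import re
import urllib.error
import urllib.request

# Assumed URL pattern; the real script may extract URLs differently.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]\"']+")


def extract_urls(text):
    """Return unique URLs in order of first appearance."""
    seen = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def http_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (OSError, ValueError):
        return None


def check_urls(text, fetch_status=http_status):
    """Return (exit_code, failures); exit code is 1 if any URL check fails."""
    failures = []
    for url in extract_urls(text):
        status = fetch_status(url)
        if status is None or status >= 400:
            failures.append((url, status))
    return (1 if failures else 0), failures
```

Passing `fetch_status` as a parameter lets the check be exercised without network access, for example with a dictionary lookup in tests.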
Configuration
Projects can customize these defaults via proof-agent.yaml in the repo root (loaded by proof_agent/config.py):
thresholds:
  min_files_changed: 3
always_verify:
  - "**/*auth*"
  - "**/*secret*"
  - "**/*permission*"
  - "**/Dockerfile"
  - "**/*.env*"
never_verify:
  - "**/.gitignore"
retry:
  max_attempts: 3
  escalate_on_max: true
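A loader for this file might merge user settings over the defaults roughly as follows. This is not the contents of proof_agent/config.py, which is not shown here: the class and function names are hypothetical, and the sketch assumes `thresholds`, `always_verify`, `never_verify`, and `retry` are top-level keys. It takes an already-parsed dictionary, such as the result of PyYAML's `yaml.safe_load`.

```python
from dataclasses import dataclass, field


@dataclass
class ProofAgentConfig:
    """Defaults mirror the values shown above (hypothetical class)."""
    min_files_changed: int = 3
    always_verify: list = field(default_factory=lambda: [
        "**/*auth*", "**/*secret*", "**/*permission*",
        "**/Dockerfile", "**/*.env*",
    ])
    never_verify: list = field(default_factory=lambda: ["**/.gitignore"])
    max_attempts: int = 3
    escalate_on_max: bool = True


def config_from_dict(raw):
    """Build a config from a parsed YAML mapping, falling back to defaults."""
    raw = raw or {}  # an empty or missing file parses to None
    defaults = ProofAgentConfig()
    thresholds = raw.get("thresholds") or {}
    retry = raw.get("retry") or {}
    return ProofAgentConfig(
        min_files_changed=int(
            thresholds.get("min_files_changed", defaults.min_files_changed)),
        always_verify=list(raw.get("always_verify", defaults.always_verify)),
        never_verify=list(raw.get("never_verify", defaults.never_verify)),
        max_attempts=int(retry.get("max_attempts", defaults.max_attempts)),
        escalate_on_max=bool(
            retry.get("escalate_on_max", defaults.escalate_on_max)),
    )
```

Falling back key by key means a project can override a single value, such as `retry.max_attempts`, without restating the rest of the file.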
Key Principle
The worker and verifier must be separate agents. Self-verification is not verification.
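Related Skills
Claude API
by anthropics
For development work that integrates the Claude API, the Anthropic SDK, or the Agent SDK: automatically detects the project language and provides matching examples and default configuration, so an LLM application can be set up quickly.
✎ If you want to bring Claude's capabilities into an application or agent, claude-api gets you started quickly, is compatible with the Anthropic and Agent SDKs, and offers a clear, low-friction integration path.
Computer Vision
by alirezarezvani
Focused on object detection, image segmentation, and shipping vision systems. Covers YOLO, DETR, Mask R-CNN, SAM, and similar approaches; suited to custom-dataset training, inference optimization, and ONNX/TensorRT deployment.
✎ Ties object detection, image segmentation, and inference deployment into one engineering pipeline, covering mainstream frameworks and approaches such as YOLO, DETR, and SAM, which makes shipping vision AI much easier.
Agent Workflow Design
by alirezarezvani
For production-grade multi-agent orchestration: lays out five workflow designs (sequential, parallel, hierarchical, event-driven, and consensus) and covers handoff, state management, fault-tolerant retries, context budgets, and cost optimization. Suited to building complex AI collaboration systems.
✎ Helps you unify multi-agent workflow design, orchestration, and automation so that complex workflows land more reliably; a good fit for teams that want strong control.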
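Related MCP Servers
Knowledge Graph Memory
Editor's Pick, by Anthropic
Memory is a persistent memory system based on a local knowledge graph that lets AI remember long-term context.
✎ Makes up for the "can't remember" weakness of AI and agents by accumulating long-term context in a local knowledge graph; multi-turn conversations get smarter and the data stays more under your control.
Sequential Thinking
Editor's Pick, by Anthropic
Sequential Thinking is a reference server that lets AI solve complex problems through a dynamic chain of thought.
✎ This server shows how to make Claude reason step by step like a human, which suits developers learning how MCP implements chain-of-thought. Note that it is only a reference example; don't expect to use it directly in production.
PraisonAI
Editor's Pick, by mervinpraison
PraisonAI is a low-code AI agent framework that supports self-reflection and multiple LLMs.
✎ If you need to quickly stand up a team of AI agents that runs 24/7 to handle complex tasks (such as automated research or code generation), PraisonAI's low-code design and multi-platform integrations (such as Telegram) make it very quick to pick up. As an unofficial project, though, its ecosystem may be less mature than mainstream frameworks such as LangChain, so it suits developers willing to try something new.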