What is Evidra?
Fail-closed policy guardrails for AI agents running kubectl, terraform, helm, and argocd.
Evidra
Flight recorder and reliability scoring for infrastructure automation
Evidra records intent, outcome, and refusal for every infrastructure mutation — across MCP agents, CI pipelines, A2A agents, and scripts. The append-only evidence chain enables risk assessment, behavioral signal detection, and reliability scoring.
CLI and MCP are the authoritative analytics surfaces today.
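The append-only evidence chain can be pictured as a log in which every entry commits to the one before it. The source does not describe Evidra's actual storage format, so the following is only a minimal sketch of the general idea. The field names (`kind`, `payload`, `prev`, `hash`) and the use of SHA-256 are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def _digest(kind, payload, prev):
    """Hash the entry body in a stable (sorted-key) JSON encoding."""
    body = {"kind": kind, "payload": payload, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(chain, kind, payload):
    """Append an entry whose hash covers the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"kind": kind, "payload": payload, "prev": prev,
             "hash": _digest(kind, payload, prev)}
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Return True only if no entry was altered, removed, or reordered."""
    prev = GENESIS
    for entry in chain:
        expected = _digest(entry["kind"], entry["payload"], entry["prev"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


chain = []
append_entry(chain, "prescribe", {"command": "kubectl apply -f deploy.yaml"})
append_entry(chain, "report", {"exit_code": 0})
```

Because each hash depends on the previous one, editing an earlier entry invalidates everything after it, which is what makes an append-only log useful as evidence.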
Two ways to use it:
| | What | How |
|---|---|---|
| DevOps MCP Server | All-in-one: kubectl/helm/terraform/aws with smart output + auto-evidence | evidra-mcp as your agent's MCP server |
| Flight Recorder | Add evidence to any existing workflow — no MCP required | evidra record, evidra import, webhooks, or proxy mode |
Quick Start — MCP Server
{
  "mcpServers": {
    "evidra": {
      "command": "evidra-mcp",
      "args": ["--evidence-dir", "~/.evidra/evidence"]
    }
  }
}
Your agent gets seven default DevOps tools: run_command, collect_diagnostics, write_file, describe_tool, prescribe_smart, report, and get_event. The normal path is still run_command with automatic evidence recording for mutations. Use describe_tool only when you want the full explicit-control schema for prescribe_smart or report. Add --full-prescribe when you also want artifact-aware prescribe_full.
Quick Start — CLI (No MCP)
# Wrap any command — evidence recorded automatically
evidra record -f deploy.yaml -- kubectl apply -f deploy.yaml
# Import from CI pipelines
evidra import --input record.json
# View reliability scorecard
evidra scorecard --period 30d
Works with any agent framework, CI system, or script. No MCP required.
Security boundary: Evidra does not sandbox the wrapped command. Treat it with the same trust model as direct shell execution.
# Install
brew install samebits/tap/evidra
What Your Agent Gets
Smart output — fewer tokens, same information
Agent: run_command("kubectl get deployment web -n bench")
# Without evidra-mcp (raw JSON): ~2,400 tokens
{"apiVersion":"apps/v1","metadata":{"managedFields":[...],...},"spec":{...},"status":{...}}
# With evidra-mcp (smart output): ~40 tokens
deployment/web (bench): 0/2 ready | image: nginx:99.99 | Available=False
Auto-evidence for mutations — zero agent code
Agent: run_command("kubectl apply -f fix.yaml")
→ evidra auto-prescribes (intent recorded)
→ kubectl executes
→ evidra auto-reports (outcome recorded)
→ smart output returned to agent
Read-only commands (get, describe, logs) execute directly — no overhead.
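The flow above can be sketched as a thin wrapper that records intent and outcome only for mutations. This is an illustration, not Evidra's implementation: the function names, the tuple-based log, and the rule for finding the verb are all assumptions. Only the read-only verbs (get, describe, logs) come from the text.

```python
import shlex

READ_ONLY_VERBS = {"get", "describe", "logs"}  # verbs named in the text above


def is_read_only(command):
    """Treat a command as read-only if its first non-flag argument is a known verb.

    Assumption: anything that cannot be classified is treated as a mutation,
    so evidence is recorded when in doubt.
    """
    tokens = shlex.split(command)
    args = [t for t in tokens[1:] if not t.startswith("-")]
    return bool(args) and args[0] in READ_ONLY_VERBS


def run_with_evidence(command, execute, log):
    """Run `command` via `execute`, recording prescribe/report around mutations."""
    if is_read_only(command):
        return execute(command)  # no evidence, no overhead
    log.append(("prescribe", command))          # intent recorded
    exit_code = execute(command)                # the tool runs
    log.append(("report", command, exit_code))  # outcome recorded
    return exit_code
```

Passing `execute` in as a callable keeps the sketch testable without a cluster. In practice it would be whatever actually launches the tool.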
Skills — tested on real infrastructure
Install the Evidra skill to give your agent operational discipline: diagnosis before fix, safety boundaries, domain-specific patterns. Skills are tested on 62 real scenarios via infra-bench before shipping — skills that hurt performance don't ship.
7 default tools, plus optional Full Prescribe
| Tool | Description |
|---|---|
| run_command | Execute kubectl, helm, terraform, aws — with smart output |
| collect_diagnostics | Gather pods, describe output, events, and recent logs for one workload |
| write_file | Write config or manifest files under the current workspace or temp directories |
| describe_tool | Show the full schema for deferred protocol tools when you want explicit control |
| prescribe_smart | Smart Prescribe with deferred schema loading; use describe_tool first when needed |
| report | Record outcome; full explicit schema available via describe_tool |
| get_event | Look up evidence |
Enable --full-prescribe to add Full Prescribe when your agent has artifact bytes and you want artifact-aware explicit intent capture.
Most agents only need run_command. Use collect_diagnostics when the model would otherwise spend multiple turns on get / describe / events / logs. Use write_file for agent-authored manifests or Terraform snippets without leaving the MCP surface. Use describe_tool only when you deliberately want the explicit prescribe_smart / report flow instead of the default auto-evidence path.
Why Not Just kubectl-mcp-server?
| | kubectl-mcp-server | evidra-mcp |
|---|---|---|
| Tools | 270 specialized | 7 default tools + optional Full Prescribe |
| Output | Raw JSON (~2400 tokens) | Smart summary (~40 tokens) |
| Evidence | None | Auto prescribe/report for mutations |
| Security | Open | Command allowlist + blocked subcommands |
| Skills | None | Bench-tested, installable |
| Scoring | None | Reliability scorecards + behavioral signals |
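The "command allowlist + blocked subcommands" row can be illustrated with a small check. The allowlist below reuses the four tools that run_command is documented to execute. The blocked pairs are hypothetical examples, since the source does not list which subcommands Evidra actually blocks.

```python
import shlex

ALLOWED_TOOLS = {"kubectl", "helm", "terraform", "aws"}  # tools named for run_command
# Hypothetical examples only; the real blocked list is not given in this document.
BLOCKED_SUBCOMMANDS = {("kubectl", "exec"), ("terraform", "destroy")}


def check_command(command):
    """Return (allowed, reason) for a command line.

    Any argument matching a blocked pair is rejected, which is deliberately
    conservative: a resource that happens to be named "exec" is refused too.
    """
    tokens = shlex.split(command)
    if not tokens or tokens[0] not in ALLOWED_TOOLS:
        return False, "tool not on allowlist"
    for token in tokens[1:]:
        if (tokens[0], token) in BLOCKED_SUBCOMMANDS:
            return False, f"blocked subcommand: {token}"
    return True, "ok"
```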
For Platform Teams
Self-hosted analytics
docker compose up --build -d
Centralize evidence across agents, pipelines, and controllers:
- Which agents retry the same operation?
- Which scenarios cause the most failures?
- How does model X compare to model Y on real infrastructure?
CI/CD integration
# Wrap any command — CLI records prescribe/execute/report
evidra record -f deploy.yaml -- kubectl apply -f deploy.yaml
# Import completed operations
evidra import --input record.json
# View reliability scorecard
evidra scorecard --period 30d
References: Self-hosted setup · CLI reference · API reference
For Agent Benchmarking
Test which skills and tools actually improve your agent. 62 real scenarios on real Kubernetes clusters.
# Baseline — no skill
infra-bench certify --track cka --model sonnet --provider bifrost
# With role skill
infra-bench certify --track cka --model sonnet --role k8s-admin
# Result: skills help L1 (75% fewer turns) but break L2 diagnosis
Bench repo: evidra-infra-bench | Dashboard: lab.evidra.cc/bench
Intelligence Layer
From the evidence chain, Evidra computes:
- Risk assessment — pluggable pipeline with multiple assessors
- Behavioral signals — protocol violations, retry loops, blast radius, drift detection
- Reliability scorecards — 0-100 score with band and confidence
Eight behavioral signals are documented in the Signal specification.
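As an illustration of how one such signal might be derived from recorded evidence, here is a minimal retry-loop detector. The event shape (`operation`, `exit_code`) and the threshold of three failures are assumptions. The Signal specification, not this sketch, defines how Evidra actually computes its signals.

```python
from collections import Counter


def detect_retry_loops(events, threshold=3):
    """Return operations that failed at least `threshold` times, sorted by name.

    `events` is assumed to be a list of dicts with "operation" and "exit_code".
    """
    failures = Counter(e["operation"] for e in events if e["exit_code"] != 0)
    return sorted(op for op, count in failures.items() if count >= threshold)


events = [
    {"operation": "kubectl apply -f fix.yaml", "exit_code": 1},
    {"operation": "kubectl apply -f fix.yaml", "exit_code": 1},
    {"operation": "helm upgrade web ./chart", "exit_code": 0},
    {"operation": "kubectl apply -f fix.yaml", "exit_code": 1},
]
loops = detect_retry_loops(events)
```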
Explicit Protocol (Advanced)
For agents that want full control over evidence recording:
prescribe_smart / prescribe_full → canonicalize artifact → assess risk → record intent
execute → run the command (or decline to act)
report → record verdict, exit code, or refusal reason
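The three steps above can be sketched as one function in which the agent sees a risk assessment before deciding to act. Everything here is hypothetical scaffolding: the risk levels, the `max_risk` parameter, and the dict-based evidence records are assumptions, not Evidra's schema.

```python
RISK_ORDER = ["low", "medium", "high"]  # hypothetical risk levels


def run_explicit(command, assess_risk, execute, evidence, max_risk="medium"):
    """Prescribe, then either execute and report, or decline and report a refusal."""
    risk = assess_risk(command)
    evidence.append({"type": "prescribe", "command": command, "risk": risk})

    if RISK_ORDER.index(risk) > RISK_ORDER.index(max_risk):
        # Declining to act is still recorded, with a reason.
        reason = f"risk {risk} exceeds {max_risk}"
        evidence.append({"type": "report", "command": command, "refusal": reason})
        return None

    exit_code = execute(command)
    evidence.append({"type": "report", "command": command, "exit_code": exit_code})
    return exit_code
```

Every prescribe is paired with exactly one report, whether the command ran or was refused, so the record stays complete either way.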
Three evidence modes:
| Mode | How | Agent awareness |
|---|---|---|
| Proxy Observed | Auto prescribe/report via observed mutation-style tool calls | None needed |
| Smart Prescribe | Agent calls prescribe_smart + report | Minimal (~30 tokens) |
| Full Prescribe | Agent calls prescribe_full with artifact | Full artifact (~300 tokens) |
Most users should use Proxy Observed or the default DevOps surface. Smart Prescribe and Full Prescribe are for teams that want agents to see risk assessments before executing.
Proxy Mode — Wrap Mutation-Oriented MCP Servers
Add evidence to an existing MCP server — zero agent changes:
{
  "mcpServers": {
    "infra": {
      "command": "evidra-mcp",
      "args": ["--proxy", "--", "npx", "-y", "@anthropic/mcp-server-kubernetes"]
    }
  }
}
The proxy records evidence when it sees run_command or other mutation-shaped MCP tool calls it can classify heuristically. Unclassified or read-only tool calls pass through without evidence.
Docs
- MCP Setup Guide
- Skill Setup Guide
- CLI Reference
- API Reference
- Architecture
- Protocol Specification
- Executor Contract
- Supported Tools
Development
make build
make test
make lint
make test-mcp-inspector # MCP protocol compliance tests
Environment Variables
| Variable | Description |
|---|---|
| EVIDRA_EVIDENCE_DIR | Evidence storage path (default: ~/.evidra/evidence) |
| EVIDRA_SIGNING_MODE | strict (default) or optional (dev mode) |
| EVIDRA_SIGNING_KEY | Base64 Ed25519 signing key |
| EVIDRA_ENVIRONMENT | Environment label (production, staging) |
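A wrapper script that needs the same settings could read them as follows. The variable names and defaults come from the table above. The `EvidraEnv` class and `load_env` helper are hypothetical names for this sketch, and Evidra's own validation rules may differ.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvidraEnv:
    evidence_dir: str
    signing_mode: str
    signing_key: Optional[str]
    environment: Optional[str]


def load_env(environ=None):
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if environ is None else environ
    mode = env.get("EVIDRA_SIGNING_MODE", "strict")
    if mode not in ("strict", "optional"):
        raise ValueError(f"unsupported EVIDRA_SIGNING_MODE: {mode}")
    return EvidraEnv(
        evidence_dir=os.path.expanduser(
            env.get("EVIDRA_EVIDENCE_DIR", "~/.evidra/evidence")),
        signing_mode=mode,
        signing_key=env.get("EVIDRA_SIGNING_KEY"),   # Base64 Ed25519 key, if set
        environment=env.get("EVIDRA_ENVIRONMENT"),   # e.g. production, staging
    )
```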
License
Licensed under the Apache License 2.0.