Evidra

DevOps

by vitas

Fail-closed policy guardrails for AI agents that run kubectl, terraform, helm, and argocd.

Flight recorder and reliability scoring for infrastructure automation

Evidra records intent, outcome, and refusal for every infrastructure mutation — across MCP agents, CI pipelines, A2A agents, and scripts. The append-only evidence chain enables risk assessment, behavioral signal detection, and reliability scoring.
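To make the idea of an append-only evidence chain concrete, here is a minimal sketch in Python. It assumes each entry is linked to its predecessor by a SHA-256 hash; that linking scheme, the field names, and the helpers `append_entry` and `verify_chain` are illustrative assumptions, not Evidra's actual storage format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(kind, payload, prev):
    """Hash the entry body together with the previous entry's hash."""
    body = {"kind": kind, "payload": payload, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(chain, kind, payload):
    """Append an entry (hypothetical format) linked to the current chain tip."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"kind": kind, "payload": payload, "prev": prev,
             "hash": _digest(kind, payload, prev)}
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Return True only if every entry still matches its recorded hash and link."""
    prev = GENESIS
    for entry in chain:
        expected = _digest(entry["kind"], entry["payload"], entry["prev"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

With a layout like this, editing any earlier entry changes its hash and breaks every later link, which is what lets downstream scoring trust the history it reads.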

CLI and MCP are the authoritative analytics surfaces today.

Two ways to use it:

| | What | How |
| --- | --- | --- |
| DevOps MCP Server | All-in-one: kubectl/helm/terraform/aws with smart output + auto-evidence | evidra-mcp as your agent's MCP server |
| Flight Recorder | Add evidence to any existing workflow; no MCP required | evidra record, evidra import, webhooks, or proxy mode |

Quick Start — MCP Server

```json
{
  "mcpServers": {
    "evidra": {
      "command": "evidra-mcp",
      "args": ["--evidence-dir", "~/.evidra/evidence"]
    }
  }
}
```

Your agent gets seven default DevOps tools: run_command, collect_diagnostics, write_file, describe_tool, prescribe_smart, report, and get_event. The normal path is still run_command with automatic evidence recording for mutations. Use describe_tool only when you want the full explicit-control schema for prescribe_smart or report. Add --full-prescribe when you also want artifact-aware prescribe_full.

Quick Start — CLI (No MCP)

```bash
# Wrap any command; evidence is recorded automatically
evidra record -f deploy.yaml -- kubectl apply -f deploy.yaml

# Import from CI pipelines
evidra import --input record.json

# View reliability scorecard
evidra scorecard --period 30d
```

Works with any agent framework, CI system, or script. No MCP required.

Security boundary: Evidra does not sandbox the wrapped command. Treat it with the same trust model as direct shell execution.

```bash
# Install
brew install samebits/tap/evidra
```

What Your Agent Gets

Smart output — fewer tokens, same information

```text
Agent: run_command("kubectl get deployment web -n bench")

# Without evidra-mcp (raw JSON): ~2,400 tokens
{"apiVersion":"apps/v1","metadata":{"managedFields":[...],...},"spec":{...},"status":{...}}

# With evidra-mcp (smart output): ~40 tokens
deployment/web (bench): 0/2 ready | image: nginx:99.99 | Available=False
```

Auto-evidence for mutations — zero agent code

```text
Agent: run_command("kubectl apply -f fix.yaml")
  → evidra auto-prescribes (intent recorded)
  → kubectl executes
  → evidra auto-reports (outcome recorded)
  → smart output returned to agent
```

Read-only commands (get, describe, logs) execute directly — no overhead.
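The split between read-only commands and mutations can be sketched as follows. This is an illustration only: the verb list comes from the sentence above, while `is_read_only`, `run_with_evidence`, and the event fields are hypothetical names, and the real classifier in evidra-mcp may use different rules.

```python
import shlex

# Verbs named in the README as read-only; anything else is treated as a mutation.
READ_ONLY_VERBS = {"get", "describe", "logs"}


def is_read_only(command):
    """Classify by the first non-flag token after the tool name."""
    tokens = shlex.split(command)
    for token in tokens[1:]:
        if not token.startswith("-"):
            return token in READ_ONLY_VERBS
    return False  # no verb found: err on the side of recording evidence


def run_with_evidence(command, execute, record):
    """Run `command`; wrap mutations in prescribe/report events."""
    if is_read_only(command):
        return execute(command)  # direct execution, nothing recorded
    record({"type": "prescribe", "command": command})
    exit_code = execute(command)
    record({"type": "report", "command": command, "exit_code": exit_code})
    return exit_code
```

In this sketch a command with a valued flag before the verb (for example `kubectl -n bench get pods`) is treated as a mutation. That over-records rather than under-records, which seems the safer default for a first pass.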

Skills — tested on real infrastructure

Install the Evidra skill to give your agent operational discipline: diagnosis before fix, safety boundaries, domain-specific patterns. Skills are tested on 62 real scenarios via infra-bench before shipping — skills that hurt performance don't ship.

7 default tools, plus optional Full Prescribe

| Tool | Description |
| --- | --- |
| run_command | Execute kubectl, helm, terraform, aws, with smart output |
| collect_diagnostics | Gather pods, describe output, events, and recent logs for one workload |
| write_file | Write config or manifest files under the current workspace or temp directories |
| describe_tool | Show the full schema for deferred protocol tools when you want explicit control |
| prescribe_smart | Smart Prescribe with deferred schema loading; use describe_tool first when needed |
| report | Record outcome; full explicit schema available via describe_tool |
| get_event | Look up evidence |

Enable --full-prescribe to add Full Prescribe when your agent has artifact bytes and you want artifact-aware explicit intent capture.

Most agents only need run_command. Use collect_diagnostics when the model would otherwise spend multiple turns on get / describe / events / logs. Use write_file for agent-authored manifests or Terraform snippets without leaving the MCP surface. Use describe_tool only when you deliberately want the explicit prescribe_smart / report flow instead of the default auto-evidence path.

Why Not Just kubectl-mcp-server?

| | kubectl-mcp-server | evidra-mcp |
| --- | --- | --- |
| Tools | 270 specialized | 7 default tools + optional Full Prescribe |
| Output | Raw JSON (~2,400 tokens) | Smart summary (~40 tokens) |
| Evidence | None | Auto prescribe/report for mutations |
| Security | Open | Command allowlist + blocked subcommands |
| Skills | None | Bench-tested, installable |
| Scoring | None | Reliability scorecards + behavioral signals |

For Platform Teams

Self-hosted analytics

```bash
docker compose up --build -d
```

Centralize evidence across agents, pipelines, and controllers:

  • Which agents retry the same operation?
  • Which scenarios cause the most failures?
  • How does model X compare to model Y on real infrastructure?

CI/CD integration

```bash
# Wrap any command; the CLI records prescribe/execute/report
evidra record -f deploy.yaml -- kubectl apply -f deploy.yaml

# Import completed operations
evidra import --input record.json

# View reliability scorecard
evidra scorecard --period 30d
```

References: Self-hosted setup · CLI reference · API reference

For Agent Benchmarking

Test which skills and tools actually improve your agent. 62 real scenarios on real Kubernetes clusters.

```bash
# Baseline: no skill
infra-bench certify --track cka --model sonnet --provider bifrost

# With role skill
infra-bench certify --track cka --model sonnet --role k8s-admin

# Result: skills help L1 (75% fewer turns) but break L2 diagnosis
```

Bench repo: evidra-infra-bench | Dashboard: lab.evidra.cc/bench

Intelligence Layer

From the evidence chain, Evidra computes:

  • Risk assessment — pluggable pipeline with multiple assessors
  • Behavioral signals — protocol violations, retry loops, blast radius, drift detection
  • Reliability scorecards — 0-100 score with band and confidence

Eight behavioral signals are documented in the Signal specification.
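As a rough illustration of how a behavioral signal could be derived from recorded outcomes, the sketch below flags retry loops. The definition used here (the same actor failing the same command at least `threshold` times) and the record fields `actor`, `command`, and `exit_code` are assumptions for the example; the Signal specification is the authority on how Evidra actually defines each signal.

```python
from collections import Counter


def retry_loops(reports, threshold=3):
    """Return sorted (actor, command) pairs with at least `threshold` failures.

    `reports` is an iterable of dicts with hypothetical keys:
    actor, command, exit_code.
    """
    failures = Counter(
        (r["actor"], r["command"]) for r in reports if r["exit_code"] != 0
    )
    return sorted(pair for pair, count in failures.items() if count >= threshold)
```

A query like this is one way to answer "which agents retry the same operation?" once evidence from several agents is centralized.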

Explicit Protocol (Advanced)

For agents that want full control over evidence recording:

```text
prescribe_smart / prescribe_full  →  canonicalize artifact → assess risk → record intent
execute    →  run the command (or decline to act)
report     →  record verdict, exit code, or refusal reason
```

Three evidence modes:

| Mode | How | Agent awareness |
| --- | --- | --- |
| Proxy Observed | Auto prescribe/report via observed mutation-style tool calls | None needed |
| Smart Prescribe | Agent calls prescribe_smart + report | Minimal (~30 tokens) |
| Full Prescribe | Agent calls prescribe_full with artifact | Full artifact (~300 tokens) |

Most users should use Proxy Observed or the default DevOps surface. Smart Prescribe and Full Prescribe are for teams that want agents to see risk assessments before executing.
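The prescribe, execute, report sequence described above, including the option to decline, might look like this from the agent's side. It is a sketch under stated assumptions: `run_protocol`, the numeric risk scale, the `max_risk` threshold, and the evidence fields are hypothetical and do not reflect the real prescribe_smart or report schemas (use describe_tool for those).

```python
def run_protocol(command, assess_risk, execute, max_risk, evidence):
    """Record intent, then either execute or decline, then record the outcome."""
    # prescribe: assess risk and record intent before anything runs
    risk = assess_risk(command)
    evidence.append({"type": "prescribe", "command": command, "risk": risk})

    # decline: a refusal is still reported, with a reason
    if risk > max_risk:
        evidence.append({
            "type": "report",
            "verdict": "declined",
            "reason": f"risk {risk} exceeds limit {max_risk}",
        })
        return None

    # execute, then report the verdict and exit code
    exit_code = execute(command)
    evidence.append({
        "type": "report",
        "verdict": "success" if exit_code == 0 else "failure",
        "exit_code": exit_code,
    })
    return exit_code
```

The point of the shape is that every prescribe is paired with a report, whether the command ran, failed, or was refused.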

Proxy Mode — Wrap Mutation-Oriented MCP Servers

Add evidence to an existing MCP server — zero agent changes:

```json
{
  "mcpServers": {
    "infra": {
      "command": "evidra-mcp",
      "args": ["--proxy", "--", "npx", "-y", "@anthropic/mcp-server-kubernetes"]
    }
  }
}
```

The proxy records evidence when it sees run_command or other mutation-shaped MCP tool calls it can classify heuristically. Unclassified or read-only tool calls pass through without evidence.

Development

```bash
make build
make test
make lint
make test-mcp-inspector    # MCP protocol compliance tests
```

Environment Variables

| Variable | Description |
| --- | --- |
| EVIDRA_EVIDENCE_DIR | Evidence storage path (default: ~/.evidra/evidence) |
| EVIDRA_SIGNING_MODE | strict (default) or optional (dev mode) |
| EVIDRA_SIGNING_KEY | Base64 Ed25519 signing key |
| EVIDRA_ENVIRONMENT | Environment label (production, staging) |
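A wrapper script that needs the same settings could resolve them as in the sketch below. The variable names and defaults come from the table above; `load_settings` and the returned dictionary keys are hypothetical, and this is not how Evidra itself parses its configuration.

```python
import os
from pathlib import Path


def load_settings(env=None):
    """Resolve Evidra-related settings from an environment mapping."""
    env = os.environ if env is None else env

    mode = env.get("EVIDRA_SIGNING_MODE", "strict")  # documented default
    if mode not in ("strict", "optional"):
        raise ValueError(f"unsupported EVIDRA_SIGNING_MODE: {mode!r}")

    evidence_dir = env.get("EVIDRA_EVIDENCE_DIR", "~/.evidra/evidence")
    return {
        "evidence_dir": Path(evidence_dir).expanduser(),
        "signing_mode": mode,
        "signing_key": env.get("EVIDRA_SIGNING_KEY"),    # Base64 Ed25519 key, if set
        "environment": env.get("EVIDRA_ENVIRONMENT"),    # e.g. production, staging
    }
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests.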

License

Licensed under the Apache License 2.0.
