io.github.grahamnotgrant/blacksmith

Coding & Debugging

by grahamnotgrant

MCP server for Blacksmith CI - query runs, analyze test failures, detect flaky tests.

Blacksmith MCP

An MCP server that connects Claude to your Blacksmith CI data. Query workflow runs, analyze test failures, detect flaky tests, and monitor usage—all through natural conversation.

Why?

Debugging CI failures usually means clicking through dashboards, copying run IDs, and piecing together information across multiple pages. With this MCP, you can just ask:

  • "Why did the last CI run fail?"
  • "Which tests are flaky this week?"
  • "Compare test failures between main and my PR"
  • "What's using the most cache storage?"

Claude handles the API calls and gives you actionable insights.

Quick Start

If you're logged into Blacksmith in Chrome, setup takes two commands: register the server, then set your org.

```bash
# Add to Claude Code
claude mcp add blacksmith -- npx blacksmith-mcp

# Set your org (run once)
export BLACKSMITH_ORG="your-org-name"
```

The MCP automatically extracts your session from Chrome cookies. No manual token copying needed.

Installation

Option 1: Claude Code CLI

```bash
claude mcp add blacksmith -- npx blacksmith-mcp
```

Option 2: Project Configuration

Add to your .mcp.json:

```json
{
  "mcpServers": {
    "blacksmith": {
      "type": "stdio",
      "command": "npx",
      "args": ["blacksmith-mcp"],
      "env": {
        "BLACKSMITH_ORG": "your-org-name"
      }
    }
  }
}
```

Option 3: Global Install

```bash
npm install -g blacksmith-mcp
```

Configuration

Authentication

Automatic (recommended): Log into app.blacksmith.sh in Chrome. The MCP extracts your session cookie automatically.

Manual: Set the BLACKSMITH_SESSION_COOKIE environment variable to your session cookie value.

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| BLACKSMITH_ORG | Yes | Your Blacksmith organization name |
| BLACKSMITH_SESSION_COOKIE | No | Session cookie (auto-extracted from Chrome if not set) |
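
The precedence described above (an explicitly set cookie wins, otherwise the server falls back to Chrome) can be sketched as follows. This is a minimal illustration, not the server's actual code: `resolveConfig`, `BlacksmithConfig`, and the `cookieSource` field are hypothetical names, and only the two environment variable names come from this README.

```typescript
// Hypothetical sketch of the configuration rules in the table above.
// Only BLACKSMITH_ORG and BLACKSMITH_SESSION_COOKIE are real names;
// every other identifier here is an assumption made for illustration.
type Env = Record<string, string | undefined>;

interface BlacksmithConfig {
  org: string;
  // "env": the cookie was supplied via BLACKSMITH_SESSION_COOKIE.
  // "chrome": no cookie was supplied, so auto-extraction would be needed.
  cookieSource: "env" | "chrome";
  sessionCookie?: string;
}

function resolveConfig(env: Env): BlacksmithConfig {
  const org = env["BLACKSMITH_ORG"]?.trim();
  if (!org) {
    // The org is the only required setting.
    throw new Error("BLACKSMITH_ORG is required");
  }

  const cookie = env["BLACKSMITH_SESSION_COOKIE"]?.trim();
  if (cookie) {
    return { org, cookieSource: "env", sessionCookie: cookie };
  }

  // No cookie set: the caller should attempt Chrome extraction.
  return { org, cookieSource: "chrome" };
}
```

Taking the environment as a parameter keeps the function easy to test; in Node it would be called as `resolveConfig(process.env)`.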

Available Tools

Workflow Runs

| Tool | Description |
| --- | --- |
| list_runs | List workflow runs with filters (status, branch, workflow, actor, PR) |
| get_run | Get run details including all jobs |
| list_jobs | List jobs for a workflow run |
| get_job | Get job details (steps, timing, runner info) |
| get_job_logs | Get raw log output for a job |

Test Analytics

| Tool | Description |
| --- | --- |
| get_job_tests | Get all test results for a job |
| get_failed_tests | Get failed tests with full error messages |
| get_failures_by_pattern | Group failures by error pattern (e.g., "Cannot read properties") |
| compare_test_runs | Compare failures between two runs (find regressions) |
| get_flaky_tests | Detect tests that fail intermittently |
| get_slow_tests | Find tests exceeding duration threshold |
| get_test_history | Track a specific test's pass/fail history |
| get_trends | Analyze trends: duration, failure rate, test count |
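
To make the idea behind get_failures_by_pattern concrete, here is a minimal sketch of grouping failures by a substring of their error message. How the server actually matches patterns is not documented here; substring matching, the `FailedTest` shape, the `"other"` bucket, and the function name `groupFailuresByPattern` are all assumptions.

```typescript
// Hypothetical sketch: group failed tests by the first known pattern
// that appears in their error message. Not the server's implementation.
interface FailedTest {
  name: string;
  error: string;
}

function groupFailuresByPattern(
  failures: FailedTest[],
  patterns: string[],
): Record<string, string[]> {
  // Null-prototype object so a pattern like "constructor" cannot
  // collide with inherited Object properties.
  const groups: Record<string, string[]> = Object.create(null);

  for (const failure of failures) {
    // Failures that match no pattern fall into the "other" bucket.
    let key = "other";
    for (const pattern of patterns) {
      if (failure.error.indexOf(pattern) !== -1) {
        key = pattern;
        break;
      }
    }
    (groups[key] = groups[key] || []).push(failure.name);
  }
  return groups;
}

const groupedFailures = groupFailuresByPattern(
  [
    { name: "loads profile", error: "TypeError: Cannot read properties of undefined (reading 'theme')" },
    { name: "connects to db", error: "Error: connect ETIMEDOUT" },
    { name: "renders avatar", error: "TypeError: Cannot read properties of undefined (reading 'avatar')" },
  ],
  ["Cannot read properties", "ETIMEDOUT"],
);
// groupedFailures["Cannot read properties"] -> ["loads profile", "renders avatar"]
// groupedFailures["ETIMEDOUT"] -> ["connects to db"]
```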

Usage & Billing

| Tool | Description |
| --- | --- |
| get_current_usage | Current core usage (active vs max) |
| get_invoice_amount | Current billing period amount |
| get_usage_summary | Billable minutes vs free tier |
| get_cache_stats | Cache size, entries by repository |
| get_cache_entries | Detailed cache entries for a repo |

Organization

| Tool | Description |
| --- | --- |
| list_orgs | List accessible organizations |
| get_org_status | Org status (personal, onboarded, region) |
| search_logs | Search logs across all jobs |

Example Conversations

Debug a Failed Run

You: Why did the last CI run fail?

Claude: Uses list_runs(status="failure", limit=1) then get_failed_tests

The last run failed with 34 test failures. The main issues are:

  • Connection timeout (12 tests) - Tests in api/ are timing out connecting to the test database
  • Undefined property (8 tests) - user.settings.theme is undefined in the new dark mode code
  • Assertion failures (14 tests) - Snapshot mismatches in UI components
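
Behind a step like the one above, an MCP client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below builds that request for `list_runs`. The argument names `status` and `limit` are taken from the call shown in this conversation; the full input schema of the tool is not documented here, so treat the payload as an assumption. `buildToolCall` and `ToolCallRequest` are hypothetical helper names.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  toolName: string,
  toolArgs: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: toolArgs },
  };
}

// Assumed arguments, mirroring list_runs(status="failure", limit=1).
const listRunsCall = buildToolCall(1, "list_runs", { status: "failure", limit: 1 });
const serializedListRunsCall = JSON.stringify(listRunsCall);
```

In practice Claude issues these requests for you; the sketch only shows what travels over the stdio transport.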

Find Flaky Tests

You: Which tests are flaky?

Claude: Uses get_flaky_tests(days=7, min_runs=5)

Found 3 flaky tests:

| Test | Flake Rate | Pattern |
| --- | --- | --- |
| should handle concurrent requests | 40% (4/10 runs) | Fails on CI, passes locally |
| renders loading state | 20% (2/10 runs) | Race condition with setTimeout |
| validates auth token | 30% (3/10 runs) | Intermittent network timeout |
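
The flake rates in the table are failures divided by total runs (4 of 10 is 40%). A minimal sketch of that calculation follows. The server's real detection logic is not documented here: the rule "at least `minRuns` runs, with both passes and failures observed" is an assumption, as are the `TestRun` shape and the name `findFlakyTests`. The `days` window is not modeled; the input is assumed to be pre-filtered.

```typescript
// Hypothetical sketch of flaky-test detection. Not the server's code.
interface TestRun {
  test: string;
  passed: boolean;
}

interface FlakyTest {
  test: string;
  failures: number;
  runs: number;
  flakeRate: number; // failures / runs, between 0 and 1 exclusive
}

function findFlakyTests(runs: TestRun[], minRuns: number): FlakyTest[] {
  // Null-prototype object so test names cannot collide with Object properties.
  const counts: Record<string, { failures: number; runs: number }> = Object.create(null);

  for (const run of runs) {
    const entry = counts[run.test] || (counts[run.test] = { failures: 0, runs: 0 });
    entry.runs += 1;
    if (!run.passed) entry.failures += 1;
  }

  const flaky: FlakyTest[] = [];
  for (const test of Object.keys(counts)) {
    const { failures, runs: total } = counts[test];
    // Assumed rule: enough samples, and the test both passed and failed.
    // A test that always fails is broken, not flaky.
    if (total >= minRuns && failures > 0 && failures < total) {
      flaky.push({ test, failures, runs: total, flakeRate: failures / total });
    }
  }

  // Highest flake rate first.
  return flaky.sort((a, b) => b.flakeRate - a.flakeRate);
}
```

The `minRuns` threshold mirrors the `min_runs=5` argument in the call above: with too few samples, a single failure would produce a misleadingly high rate.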

Compare Branches

You: Did my PR introduce new test failures?

Claude: Uses compare_test_runs with PR run vs base branch

Your PR introduced 2 new failures:

  • UserProfile.test.tsx: "Cannot read property 'avatar' of undefined"
  • api/users.test.ts: Expected 200, got 404

5 previously failing tests now pass (nice!).
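
A comparison like this reduces to set differences over the failing test names of the two runs. The sketch below shows the idea under stated assumptions: test names are unique identifiers, and `compareFailures` and `RunComparison` are hypothetical names rather than the server's API.

```typescript
// Hypothetical sketch of comparing failures between a base run and a PR run.
interface RunComparison {
  newFailures: string[];  // failing in the PR run only (regressions)
  fixed: string[];        // failing in the base run only
  stillFailing: string[]; // failing in both runs
}

function compareFailures(baseFailed: string[], headFailed: string[]): RunComparison {
  const inBase = (test: string): boolean => baseFailed.indexOf(test) !== -1;
  const inHead = (test: string): boolean => headFailed.indexOf(test) !== -1;

  return {
    newFailures: headFailed.filter((test) => !inBase(test)),
    fixed: baseFailed.filter((test) => !inHead(test)),
    stillFailing: headFailed.filter((test) => inBase(test)),
  };
}

const branchComparison = compareFailures(
  ["legacy export", "date parsing"],                            // failed on base
  ["date parsing", "UserProfile.test.tsx", "api/users.test.ts"], // failed on PR
);
// branchComparison.newFailures  -> ["UserProfile.test.tsx", "api/users.test.ts"]
// branchComparison.fixed        -> ["legacy export"]
// branchComparison.stillFailing -> ["date parsing"]
```

One caveat with this simple version: a test that was deleted or skipped in the PR run would show up as "fixed", so a fuller comparison would also check that the test actually ran and passed.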

Development

```bash
# Install dependencies
pnpm install

# Build
pnpm build

# Development mode (watch)
pnpm dev

# Test with MCP Inspector
npx @modelcontextprotocol/inspector node dist/index.js
```

Troubleshooting

Session Expired

If you see SESSION_EXPIRED, your Blacksmith session has expired. Simply log back into app.blacksmith.sh in Chrome and retry.

Cookie Extraction Failed

The automatic cookie extraction requires:

  • macOS with Chrome installed
  • Being logged into Blacksmith in Chrome
  • Chrome not running with a locked profile

If it fails, set BLACKSMITH_SESSION_COOKIE manually.

No Organization Set

Ask Claude to call the list_orgs tool to see the organizations you can access, then set BLACKSMITH_ORG to your org name.

API Notes

This MCP uses Blacksmith's internal web API, which is undocumented. The API was reverse-engineered from the Blacksmith web app and may change without notice.

License

MIT

Contributing

Contributions welcome! Please open an issue first to discuss proposed changes.
