Opik MCP Server
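Platforms & services, by comet-ml
Interact with Opik prompts, traces, and metrics through the Model Context Protocol for unified access and analysis.
MCP connects Opik's prompts, traces, and metrics in one place, which cuts down on context switching and fragmented data and makes analyzing and debugging AI applications easier.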
README
opik-mcp
Migrating from the old npx opik-mcp? The TypeScript server is deprecated and sunsets on 2026-11-15. Swap npx -y opik-mcp for uvx opik-mcp@latest in your MCP client config. Full guide: legacy/typescript/MIGRATION.md.
Model Context Protocol server for Opik + Ollie. Plug your AI host (Claude Code, Cursor, VS Code Copilot, MCP Inspector) directly into your Opik workspace — read traces, log scores, save prompt versions, and ask Ollie investigative questions, all from the chat.
Built for LLM engineers who already run Opik and want to drive it from the same AI assistant they code with.
You: "Why did the experiment 'gpt-4o-rerank-v3' regress on factuality?"
Claude: → ask_ollie → reads experiment + traces → "Three traces failed because…"
You: "Score trace 7f2e… 0.9 on helpfulness with reason 'great recovery'."
Claude: → write(score.create) → done
Install
opik-mcp is a Python package (requires Python 3.13+). The recommended way to
run it is uvx, which fetches and runs the latest published version on demand —
no global install, no virtualenv juggling.
Install uv once:
curl -LsSf https://astral.sh/uv/install.sh | sh # macOS / Linux
# or: brew install uv
You'll need two things from your Opik workspace:
- OPIK_API_KEY: get it from comet.com/api/my/settings/.
- OPIK_WORKSPACE: your workspace name (lowercase, as it appears in the URL). E.g. https://www.comet.com/acme-ai/... → OPIK_WORKSPACE=acme-ai. Optional: defaults to default (the Opik SDK convention), which is correct for local/OSS installs; cloud users with a named workspace should set it. COMET_WORKSPACE is accepted as a deprecated alias.
Pre-release note: opik-mcp (Python) is not yet published to PyPI. Until the first PyPI release lands, replace uvx opik-mcp in any snippet below with: uvx --from git+https://github.com/comet-ml/opik-mcp.git opik-mcp
OPIK_WORKSPACE is optional. Omit the OPIK_WORKSPACE line/key in any snippet below and the server uses the default workspace (correct for local/OSS installs). Set it only if you connect to a named cloud workspace.
Claude Code
Add the server with one command:
claude mcp add --transport stdio opik-mcp \
--env OPIK_API_KEY=<your-key> \
--env OPIK_WORKSPACE=<your-workspace> \
-- uvx opik-mcp
Or edit ~/.claude.json directly:
{
"mcpServers": {
"opik-mcp": {
"type": "stdio",
"command": "uvx",
"args": ["opik-mcp"],
"env": {
"OPIK_API_KEY": "<your-key>",
"OPIK_WORKSPACE": "<your-workspace>"
}
}
}
}
Restart Claude Code. Verify with /mcp — opik-mcp should appear as connected.
Then, in the chat, ask: "list my Opik projects" — Claude will call the list
tool and you'll see your workspace's projects.
Cursor
Edit ~/.cursor/mcp.json (global) or .cursor/mcp.json (project), or open
Cmd+Shift+J → Features → Model Context Protocol:
{
"mcpServers": {
"opik-mcp": {
"type": "stdio",
"command": "uvx",
"args": ["opik-mcp"],
"env": {
"OPIK_API_KEY": "<your-key>",
"OPIK_WORKSPACE": "<your-workspace>"
}
}
}
}
Reload Cursor; the green dot next to opik-mcp in the MCP panel confirms the
connection. Ask in chat: "list my Opik projects".
Cursor 60s timeout. Cursor enforces a hard tool-call timeout that doesn't reset on progress notifications. Long ask_ollie turns will fail on Cursor. See Known host limits.
VS Code Copilot
.vscode/mcp.json in your workspace (or User Settings JSON):
{
"servers": {
"opik-mcp": {
"type": "stdio",
"command": "uvx",
"args": ["opik-mcp"],
"env": {
"OPIK_API_KEY": "<your-key>",
"OPIK_WORKSPACE": "<your-workspace>"
}
}
}
}
Reload the window; the Copilot Chat MCP indicator shows opik-mcp once
the server is reachable. Ask in chat: "list my Opik projects".
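
The three host configs above differ only in their top-level key (mcpServers for Claude Code and Cursor, servers for VS Code). The following is a minimal Python sketch that renders the entry for either shape. The helper name build_host_config and its parameters are hypothetical and are not part of opik-mcp; only the JSON layout comes from the snippets above.

```python
import json


def build_host_config(api_key, workspace=None, top_level_key="mcpServers"):
    """Return an MCP host config dict in the shape shown above (hypothetical helper)."""
    env = {"OPIK_API_KEY": api_key}
    # OPIK_WORKSPACE is optional; omitting it selects the "default" workspace.
    if workspace:
        env["OPIK_WORKSPACE"] = workspace
    server = {"type": "stdio", "command": "uvx", "args": ["opik-mcp"], "env": env}
    return {top_level_key: {"opik-mcp": server}}


if __name__ == "__main__":
    # VS Code expects "servers"; Claude Code and Cursor expect "mcpServers".
    config = build_host_config("<your-key>", "<your-workspace>", "servers")
    print(json.dumps(config, indent=2))
```

Writing the result to the host's config file is left out on purpose, since each host keeps other settings in the same file that a blind overwrite would destroy.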
MCP Inspector (manual testing)
OPIK_API_KEY=<your-key> OPIK_WORKSPACE=<your-workspace> \
npx @modelcontextprotocol/inspector uvx opik-mcp
Self-hosted Opik
Add COMET_URL_OVERRIDE (and OPIK_URL if Opik lives at a non-default path) to
the same env block in your host config:
{
"mcpServers": {
"opik-mcp": {
"type": "stdio",
"command": "uvx",
"args": ["opik-mcp"],
"env": {
"OPIK_API_KEY": "<your-key>",
"COMET_URL_OVERRIDE": "https://opik.your-company.com",
"OPIK_MCP_ANALYTICS_SOURCE": ""
}
}
}
}
ask_ollie and run_experiment are available on Comet Cloud only — on
self-hosted those calls will fail at dispatch, so use read / list / write
directly. Setting OPIK_MCP_ANALYTICS_SOURCE="" opts your install out of the
cloud-Comet source label on telemetry events.
Tools
opik-mcp exposes a small, outcome-oriented surface — six tools that cover
the full lifecycle (read → annotate → curate → author → iterate).
| Tool | Purpose |
|---|---|
| read | Universal read by id / name / opik:// URI |
| list | Universal list with optional name filter + pagination |
| ask_ollie | Investigate / synthesize via the Opik in-product assistant |
| write | Universal write: log traces/spans, score, comment, save prompts, manage test suites & experiments |
| schema | Introspect write-operation schemas (used by the LLM to construct valid payloads) |
| run_experiment | Run an evaluation experiment end-to-end via Ollie |
read
One tool for any "show me X" question. Takes an entity_type plus an id
(UUID or, for nameable types, a name) or a full opik:// URI. Composite reads
(trace, prompt) inline their children so a single call returns the full
picture.
Supported entities: project, trace, span, test_suite, experiment,
prompt. Name-based lookup is available for project, experiment, prompt,
test_suite (slower — two API calls — and may return multiple matches).
read(entity_type="trace", id="7f2e3c8a-…")
read(entity_type="project", id="demo") # name lookup
read(entity_type="trace", id="opik://traces/7f2e3c8a-…")
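
To make the URI form concrete, here is a small Python sketch that splits an opik:// URI into its collection and id. The layout opik://<collection>/<id> is an assumption inferred from the single example above; the server may accept other forms, and parse_opik_uri is a hypothetical helper, not an opik-mcp function.

```python
def parse_opik_uri(uri):
    """Split 'opik://<collection>/<id>' into (collection, id).

    The URI layout is assumed from the one example in this README.
    """
    prefix = "opik://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an opik:// URI: {uri!r}")
    collection, sep, entity_id = uri[len(prefix):].partition("/")
    if not collection or not sep or not entity_id:
        raise ValueError(f"expected opik://<collection>/<id>, got {uri!r}")
    return collection, entity_id


if __name__ == "__main__":
    print(parse_opik_uri("opik://traces/example-id"))
```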
list
Browse a collection with optional name filter and pagination. Project-scoped
types (trace, test_suite_item, prompt_version) require their parent UUID.
list(entity_type="experiment", page=1, size=25)
list(entity_type="experiment", name="rerank") # name substring filter
list(entity_type="trace", project_id="<project-uuid>") # traces of one project
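
Walking a whole collection means requesting pages until one comes back short. The sketch below shows that loop in Python under stated assumptions: list_page is a hypothetical callable wrapping the list tool, pages are 1-based (as in the page=1 example), and each call returns a plain Python list. The real tool's response envelope is not described here, so adapt the unpacking accordingly.

```python
def iter_all(list_page, entity_type, size=25, **filters):
    """Yield every item of a collection by walking pages in order.

    `list_page` is a hypothetical callable that wraps the `list` tool and
    returns one page of items as a list.
    """
    page = 1
    while True:
        items = list_page(entity_type=entity_type, page=page, size=size, **filters)
        yield from items
        # A page shorter than `size` (including an empty one) is the last page.
        if len(items) < size:
            return
        page += 1
```

Because it is a generator, a caller can stop early (for example after the first match) without fetching the remaining pages.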
ask_ollie
For investigative questions, cross-entity synthesis, or anything that needs Opik domain expertise. Ollie has direct read access to your workspace and can execute writes (scores, comments, test-suite items, prompt versions) mid-stream when asked.
ask_ollie(query="Why are spans in project 'demo' slower this week than last?")
ask_ollie(query="Compare experiments A and B on factuality. Score the bottom 5 traces of A 0.2 with reason.")
Returns the assistant's final text plus a thread_id. Pass it back on
follow-ups to preserve context — Ollie has no memory across threads.
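
Threading follow-ups can be sketched as a thin wrapper that remembers the last thread_id. Everything about the call shape here is an assumption: ask is a hypothetical callable wrapping the ask_ollie tool, the follow-up parameter is assumed to be named thread_id, and the result is assumed to be a dict with text and thread_id keys.

```python
class OllieThread:
    """Keep one Ollie conversation going by replaying its thread_id."""

    def __init__(self, ask):
        self._ask = ask  # hypothetical callable wrapping the ask_ollie tool
        self.thread_id = None

    def send(self, query):
        kwargs = {"query": query}
        # Only follow-ups carry a thread_id; the first call starts a new thread.
        if self.thread_id is not None:
            kwargs["thread_id"] = self.thread_id
        result = self._ask(**kwargs)
        self.thread_id = result["thread_id"]  # assumed key name
        return result["text"]  # assumed key name
```

Creating a new OllieThread starts from a clean slate, which matches the note above that Ollie has no memory across threads.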
YOLO mode (default). Writes Ollie performs mid-stream execute without a
per-action confirmation. Each auto-approval is logged as a JSON audit row on
the opik_mcp.audit Python logger. To require confirmation instead, set
OPIK_MCP_AUTO_APPROVE=disabled — Ollie's confirm requests then surface as
typed errors you can manually re-issue.
Available on Comet Cloud only.
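
If you embed the server in your own Python process, the audit rows mentioned above can be captured with a standard logging handler. This is a sketch, assuming each audit message is a JSON string and is emitted at INFO level or above; only the logger name opik_mcp.audit comes from this README, and AuditCollector is a hypothetical helper.

```python
import json
import logging


class AuditCollector(logging.Handler):
    """Collect rows emitted on the opik_mcp.audit logger."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def emit(self, record):
        message = record.getMessage()
        try:
            self.rows.append(json.loads(message))
        except json.JSONDecodeError:
            # Keep non-JSON messages instead of dropping them silently.
            self.rows.append({"raw": message})


def attach_audit_collector():
    """Attach a collector to the audit logger and return it."""
    collector = AuditCollector()
    logger = logging.getLogger("opik_mcp.audit")
    logger.addHandler(collector)
    logger.setLevel(logging.INFO)  # assumed level; adjust if rows are missing
    return collector
```

When the host launches the server over stdio, the handler has to live inside the server process, so this approach fits custom launchers rather than a plain uvx invocation.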
write
Universal write dispatcher. Pass operation + data and the dispatcher
validates the payload, applies the right REST verb, and returns the
backend response.
Operations:
| Operation | What it does |
|---|---|
| trace.create | Log a single trace (or a batch). Parent for spans / scores / comments. |
| trace.update | Finalize or amend an existing trace. |
| span.create | Log a span on an existing trace (or a batch). |
| score.create | Attach a numeric feedback score to a trace, span, or thread. |
| comment.create | Attach a free-text comment to a trace, span, or thread. |
| prompt_version.save | Save a new prompt version (creates the prompt by name if missing). |
| test_suite.create | Create an evaluation test suite. |
| test_suite_item.upsert | Upsert items into a test suite (always the envelope shape). |
| experiment.create | Create an experiment scoped to a test suite. |
| experiment_item.create | Attach trace + dataset_item rows to an experiment. |
write(operation="score.create", data={
"target": "trace",
"target_id": "7f2e3c8a-…",
"name": "helpfulness",
"value": 0.9,
"reason": "great recovery"
})
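
A small builder can catch obvious mistakes before the payload reaches the dispatcher. The sketch below mirrors the field names of the example above; the allowed targets come from the score.create description, while treating reason as optional is an assumption. The authoritative shape is whatever schema(operation="score.create") returns, and build_score_payload is a hypothetical helper.

```python
def build_score_payload(target, target_id, name, value, reason=None):
    """Build the `data` dict for write(operation="score.create") (hypothetical helper)."""
    # score.create is described as attaching to a trace, span, or thread.
    if target not in {"trace", "span", "thread"}:
        raise ValueError(f"unsupported score target: {target!r}")
    payload = {
        "target": target,
        "target_id": target_id,
        "name": name,
        "value": float(value),
    }
    # Assumption: `reason` may be omitted when there is nothing to explain.
    if reason is not None:
        payload["reason"] = reason
    return payload
```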
schema
Inspect the exact JSON shape and required fields of any write operation before
you call it — useful when you're not sure what data should look like. Returns
the schema, OAuth scope, and one validated example. Pure lookup, no backend
call.
schema(operation="score.create")
schema(operation="prompt_version.save")
run_experiment
Run an evaluation experiment end-to-end via Ollie. Takes a single
experiment_config dict that mirrors Opik's experiment shape (prompt, test
suite, scorers); Ollie executes the run and writes results back as an Opik
experiment.
run_experiment(experiment_config={
"test_suite_name": "qa-eval-v2",
"prompt_name": "welcome-msg",
# … see `schema(operation="experiment.create")` for the full shape
})
Available on Comet Cloud only.
Configuration
Every setting is an environment variable. Required ones in bold.
Identity / endpoint
| Variable | Default | Notes |
|---|---|---|
| OPIK_API_KEY | (none) | Required for ask_ollie and any authenticated read/write. |
| OPIK_WORKSPACE | default | Workspace name. Optional; falls back to default (Opik SDK convention). Cloud users with a named workspace should set it. |
| COMET_WORKSPACE | (none) | Deprecated alias for OPIK_WORKSPACE (backward compat). OPIK_WORKSPACE wins if both are set. |
| COMET_WORKSPACE_ID | (none) | Optional workspace UUID. Stamped into analytics events when set so BI can join on a stable id rather than the (mutable) workspace name. |
| COMET_URL_OVERRIDE | https://www.comet.com | Set to your self-hosted Comet host, or https://dev.comet.com for staging. |
| OPIK_URL | derived from COMET_URL_OVERRIDE + /opik/api | Override only if Opik lives on a different host/path than the Comet UI. |
| OPIK_DEFAULT_PROJECT_NAME | unset | When set, the per-session instructions blob tells the LLM to pass this as project_name on every tool call unless the user names a different project. |
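
The precedence rules in this table can be summarized in a few lines of Python. This is a sketch of the documented behaviour, not the server's actual code: the function names are hypothetical, and treating an empty string as unset and trimming a trailing slash are assumptions.

```python
import os


def resolve_workspace(env=None):
    """OPIK_WORKSPACE wins, then the deprecated COMET_WORKSPACE, then 'default'."""
    env = os.environ if env is None else env
    return env.get("OPIK_WORKSPACE") or env.get("COMET_WORKSPACE") or "default"


def resolve_opik_url(env=None):
    """Use OPIK_URL if set, else derive it from COMET_URL_OVERRIDE + /opik/api."""
    env = os.environ if env is None else env
    if env.get("OPIK_URL"):
        return env["OPIK_URL"]
    base = env.get("COMET_URL_OVERRIDE") or "https://www.comet.com"
    # Assumption: a trailing slash on the base is not significant.
    return base.rstrip("/") + "/opik/api"


if __name__ == "__main__":
    print(resolve_workspace(), resolve_opik_url())
```
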
Server / transport
| Variable | Default | Notes |
|---|---|---|
| OPIK_MCP_TRANSPORT | stdio | stdio for host-launched, streamable-http to listen on a port. |
| OPIK_MCP_HOST | 127.0.0.1 | uvicorn bind host (streamable-http only). |
| OPIK_MCP_PORT | 8080 | uvicorn bind port (streamable-http only). |
| OPIK_MCP_RELOAD | false | true to enable uvicorn --reload (dev only). |
| OPIK_MCP_AS_URL | unset | OAuth Authorization Server URL, advertised in /.well-known/oauth-protected-resource (RFC 9728) and used as the proxy target for AS-discovery probes. Required for MCP hosts to bootstrap the OAuth dance over HTTP. |
| OPIK_MCP_RESOURCE_URI | unset | Canonical public URI of this server, advertised as resource in the protected-resource metadata and used to derive the WWW-Authenticate hint. |
| OPIK_MCP_LOG_LEVEL | INFO | stderr logger threshold. |
Choosing a transport
opik-mcp performs no local credential validation on HTTP transport: any
well-formed Authorization: Bearer … (an Opik API key or an opik_mcp_at_…
OAuth access token) is forwarded verbatim to opik-backend, which is the
single point of auth enforcement. Pick the transport by deployment shape:
| Scenario | Transport |
|---|---|
| MCP client and Opik on the same machine (local OSS install) | stdio (recommended — simplest, no port, no OAuth setup) |
| Local MCP client → remote Opik (Comet cloud / self-hosted) | stdio with OPIK_API_KEY, or HTTP with OAuth (OPIK_MCP_AS_URL pointing at the backend) |
| Hosted opik-mcp behind the same edge as opik-backend | HTTP — bearers are validated by the backend per request |
Note for local OSS installs: the OSS backend does not authenticate requests,
so an HTTP opik-mcp in front of it is as open as the OSS REST API itself.
Keep the default 127.0.0.1 bind (and prefer stdio) on shared networks.
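
A deployment script can flag a risky bind before the server starts. The sketch below is a hypothetical pre-flight check, not something opik-mcp ships: it reads OPIK_MCP_TRANSPORT and OPIK_MCP_HOST with the defaults listed earlier and returns a warning when HTTP transport is bound to a non-loopback address.

```python
import ipaddress


def is_loopback_bind(host):
    """Return True when `host` only accepts connections from this machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Other hostnames cannot be judged without DNS; treat them as exposed.
        return False


def check_http_bind(env):
    """Return a warning string for an exposed HTTP bind, else None."""
    transport = env.get("OPIK_MCP_TRANSPORT", "stdio")
    host = env.get("OPIK_MCP_HOST", "127.0.0.1")
    if transport == "streamable-http" and not is_loopback_bind(host):
        return f"opik-mcp would listen on {host}; an unauthenticated OSS backend would be reachable"
    return None
```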
Ollie / long calls
| Variable | Default | Notes |
|---|---|---|
| OPIK_MCP_AUTO_APPROVE | enabled | disabled to require a per-action approval before Ollie's mid-stream writes proceed. On hosts that advertise the MCP elicitation capability the user sees a yes/no prompt; on hosts without it the request surfaces as a typed error you can manually re-issue. |
| OPIK_MCP_ELICIT_TIMEOUT_SECONDS | 60 | How long Ollie's mid-stream confirmation prompt may wait for the user before being treated as a cancel. 0 disables the bound (debug only). |
| OPIK_MCP_POD_READY_TIMEOUT_S | 120 | Ollie pod cold-start poll cap. |
| OPIK_MCP_POD_READY_INTERVAL_S | 2 | Cold-start poll interval. |
| OPIK_MCP_HEARTBEAT_INTERVAL_S | 15.0 | Watchdog cadence: emits a notifications/progress tick when the pod is silent, keeping host timeouts at bay. |
| OPIK_MCP_STREAM_IDLE_TIMEOUT_S | 300.0 | Hard ceiling on pod silence before ask_ollie aborts. 0 disables (debug only). |
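
The interplay of the heartbeat interval and the idle ceiling can be illustrated with a short asyncio loop. This is a conceptual sketch only, not opik-mcp's implementation: events arrive on an asyncio.Queue with None as an end marker, and on_progress stands in for sending a notifications/progress tick. All names are hypothetical.

```python
import asyncio


async def relay_with_heartbeat(queue, on_progress, heartbeat_s=15.0, idle_timeout_s=300.0):
    """Drain `queue` until a None sentinel, ticking while the producer is silent."""
    received = []
    silent_for = 0.0
    while True:
        try:
            item = await asyncio.wait_for(queue.get(), timeout=heartbeat_s)
        except asyncio.TimeoutError:
            silent_for += heartbeat_s
            # An idle timeout of 0 disables the ceiling, as in the table above.
            if idle_timeout_s and silent_for >= idle_timeout_s:
                raise TimeoutError(f"no events for {silent_for:.2f}s")
            on_progress("heartbeat")
            continue
        silent_for = 0.0  # any real event resets the silence counter
        if item is None:
            return received
        received.append(item)
        on_progress("event")
```

Heartbeats only help on hosts that reset their timeout on progress, which is why the defaults cannot rescue long calls on every client.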
Telemetry
Anonymous usage events (event type + timing only — no query content). A SHA-256
digest of your API key is included so support can find your account; the raw
key never leaves the process. Opt out: OPIK_MCP_ANALYTICS_ENABLED=false.
| Variable | Default | Notes |
|---|---|---|
| OPIK_MCP_ANALYTICS_ENABLED | true | Set to false to disable all telemetry. |
| OPIK_MCP_ANALYTICS_URL | https://stats.comet.com/notify/event/ | Override for staging. |
| OPIK_MCP_ANALYTICS_ENVIRONMENT | prod | Tag on every event (prod / staging / dev). |
| OPIK_MCP_ANALYTICS_SOURCE | comet.com | Receiver uses this to mark on_prem=False. On-prem installs should override to "" or their own domain. |
| OPIK_MCP_ANALYTICS_CONNECT_TIMEOUT_S | 5.0 | HTTP connect timeout. |
| OPIK_MCP_ANALYTICS_TOTAL_TIMEOUT_S | 10.0 | HTTP total request timeout. |
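
For readers who want to see what a key digest looks like, the sketch below hashes a key with SHA-256 from the standard library. The README only states that a SHA-256 digest is sent; the UTF-8 encoding, the hex output, and the absence of a salt are assumptions, and api_key_digest is a hypothetical helper.

```python
import hashlib


def api_key_digest(api_key):
    """Return a hex SHA-256 digest of the key; the raw key is never returned."""
    # Assumptions: UTF-8 encoding, no salt, full-length hex digest.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(api_key_digest("<your-key>"))
```

The digest is deterministic, so the same key always maps to the same value, which is what lets support match events to an account without seeing the key.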
Known host limits
The MCP spec lets hosts reset their tool-call timeout on
notifications/progress — opik-mcp emits one per Ollie SSE event plus a
15-second watchdog heartbeat. Reality is uneven:
- Claude Code: no documented tool-call timeout; heartbeat keeps the call alive until message_end. Recommended.
- Cursor: hard 60s timeout that does not reset on progress (upstream bug). Long Ollie turns will fail. Keep ask_ollie queries focused.
- MCP Inspector: MAX_TOTAL_TIMEOUT bounds total duration (default 60s). Raise it in the Inspector UI for long operations.
If a call gets stuck, set OPIK_MCP_LOG_LEVEL=DEBUG — heartbeat failures
(usually host disconnects) are logged on opik_mcp.ask_ollie at debug level.
Troubleshooting
- OPIK_API_KEY is required to use ask_ollie: the variable isn't reaching the server process. In Claude Code / Cursor / VS Code, env vars only take effect when set inside the env block of the MCP server config, not in your shell. Restart the host after editing.
- ask_ollie returns "pod not ready" after 2 minutes: the Ollie pod cold-start exceeded OPIK_MCP_POD_READY_TIMEOUT_S. Retry; the second call usually hits a warm pod.
- ask_ollie / run_experiment fails with a dispatch error on self-hosted Opik: those tools are available on Comet Cloud only. Use read / list / write directly on self-hosted.
- Cursor call times out at 60s: this is Cursor's known bug, not an opik-mcp issue. Either shorten the Ollie query, or run the same operation on Claude Code, which has no hard cap.
Development
git clone git@github.com:comet-ml/opik-mcp.git
cd opik-mcp
make install # uv sync --extra dev
make check # lint + typecheck + test
make run-dev # uvicorn with --reload + DEBUG logs
make inspect # MCP Inspector against the running server
Common targets:
| Target | What it does |
|---|---|
| make install | uv sync --extra dev |
| make run | Run the MCP server (stdio by default). |
| make run-dev | Run with DEBUG logging + uvicorn --reload. |
| make dev | Run via mcp dev (Inspector dev-mode wrapper). |
| make inspect | Launch MCP Inspector against a running server. |
| make test | uv run pytest -q. |
| make test-live | Live end-to-end against dev.comet.com (set OPIK_API_KEY + OPIK_WORKSPACE). |
| make lint | ruff check + format check. |
| make format | ruff format + ruff check --fix. |
| make typecheck | mypy. |
| make check | lint + typecheck + test. |
Repo layout:
opik-mcp/
├── src/opik_mcp/ ← server, tools, ask_ollie, analytics
├── tests/ ← pytest suites
├── scripts/ ← live-BE smoke + MCP-session smoke
├── legacy/typescript/ ← deprecated v2 TS server
├── pyproject.toml
└── Makefile
Get help
- Open an issue for bugs and feature requests
- Opik docs for SDK / backend documentation
- Comet community Slack for questions
Upgrading from v2? The legacy TypeScript server still ships on npm as opik-mcp@^2 (npx -y opik-mcp); source is preserved under legacy/typescript/. See legacy/typescript/DEPRECATED.md for the support policy.
License
Apache-2.0.