Dataiku MCP Server
MCP server for Dataiku DSS REST APIs, focused on flow analysis and reliable day-to-day operations (projects, datasets, recipes, jobs, scenarios, folders, variables, connections, and code environments).
Cursor one-click install includes placeholder environment values. Update `DATAIKU_URL`, `DATAIKU_API_KEY`, and optionally `DATAIKU_PROJECT_KEY` after adding the server.
What You Get
- Deterministic normalized flow maps (`project.map`) with recipe subtypes and connectivity.
- Summary-first outputs with explicit raw/detail toggles where needed.
- Broad test coverage (unit + live integration + optional destructive integration suite).
- Strong error taxonomy in responses: `not_found`, `forbidden`, `validation`, `transient`, `unknown`, with retry hints.
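To make the taxonomy concrete, here is a minimal sketch of how HTTP status codes could map to these categories. The `categorize` helper is hypothetical and illustrative only, not the server's actual implementation:

```typescript
// Illustrative sketch only: maps an HTTP status code to the error
// categories this server reports. Function and type names are assumptions.
type ErrorCategory = "not_found" | "forbidden" | "validation" | "transient" | "unknown";

function categorize(status: number): ErrorCategory {
  if (status === 404) return "not_found";
  if (status === 401 || status === 403) return "forbidden";
  if (status === 400 || status === 422) return "validation";
  // 429 and 5xx responses are typically worth retrying with backoff.
  if (status === 429 || status >= 500) return "transient";
  return "unknown";
}

// Only transient errors should carry a retry hint.
const retryable = (status: number): boolean => categorize(status) === "transient";
```

The key design point is that `transient` is the only category a client should retry automatically; the others indicate a caller-side fix.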
Tool Coverage
- `project`: `list`, `get`, `metadata`, `flow`, `map`
- `dataset`: `list`, `get`, `schema`, `preview`, `metadata`, `download`, `create`, `update`, `delete`
- `recipe`: `list`, `get`, `create`, `update`, `delete`, `download`
- `job`: `list`, `get`, `log`, `build`, `buildAndWait`, `wait`, `abort`
- `scenario`: `list`, `run`, `status`, `get`, `create`, `update`, `delete`
- `managed_folder`: `list`, `get`, `contents`, `download`, `upload`, `delete_file`
- `variable`: `get`, `set`
- `connection`: `infer`
- `code_env`: `list`, `get`
Prerequisites
- Node.js 20+
- npm
- Dataiku DSS URL + API key
Quick Start
```shell
npm ci
npm run build
```
Run as a local CLI after build:
```shell
node dist/index.js
```
Use directly from npm (after publish):
```shell
npx -y dataiku-mcp
```
Local Build And Testing
Recommended local workflow from repo root:
```shell
# install deps
npm ci

# static checks
npm run check

# unit tests
npm test

# build distribution
npm run build

# run MCP server locally (dev)
npm start
```
Optional live DSS integration tests:
```shell
# requires DATAIKU_URL, DATAIKU_API_KEY, DATAIKU_PROJECT_KEY in .env
npm run test:integration

# includes destructive actions (create/update/delete)
DATAIKU_MCP_DESTRUCTIVE_TESTS=1 npm run test:integration
```
Repository Layout
- `src/`: MCP server and tool implementations.
- `tests/`: unit + integration test suites.
- `examples/`: demos, fixtures, artifacts, and ad-hoc local scripts.
- `bin/`: package executable entrypoint.
- `dist/`: compiled output (generated).
Create a local env file:
```shell
cp .env.example .env
# then edit .env
```
Run directly in dev:
```shell
npm start
```
Example scripts and sample outputs are kept under `examples/` to avoid root-level clutter.
Environment Variables
- `DATAIKU_URL`: DSS base URL
- `DATAIKU_API_KEY`: DSS API key
- `DATAIKU_PROJECT_KEY` (optional): default project key
- `DATAIKU_REQUEST_TIMEOUT_MS` (optional): per-attempt request timeout in milliseconds (default: `30000`)
- `DATAIKU_RETRY_MAX_ATTEMPTS` (optional): max attempts for retry-enabled requests (`GET` only; default: `4`, cap: `10`)
- `DATAIKU_DEBUG_LATENCY` (optional): set to `1`/`true` to include per-tool timing diagnostics in `structuredContent.debug.latency` (off by default)
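As a sketch of how the documented defaults and the retry cap could be applied, consider the following. `parseEnv` and the config shape are hypothetical helpers for illustration, assuming only the behavior stated in the list above:

```typescript
// Hypothetical helper: applies the documented defaults (30000 ms timeout,
// 4 retry attempts) and the hard cap of 10 attempts.
interface ClientConfig {
  timeoutMs: number;    // per-attempt request timeout
  maxAttempts: number;  // GET-only retries, capped at 10
  debugLatency: boolean;
}

function parseEnv(env: Record<string, string | undefined>): ClientConfig {
  const timeoutMs = Number(env.DATAIKU_REQUEST_TIMEOUT_MS ?? 30000);
  // Clamp to the documented cap so a misconfigured value cannot retry forever.
  const maxAttempts = Math.min(Number(env.DATAIKU_RETRY_MAX_ATTEMPTS ?? 4), 10);
  const debugLatency =
    env.DATAIKU_DEBUG_LATENCY === "1" || env.DATAIKU_DEBUG_LATENCY === "true";
  return { timeoutMs, maxAttempts, debugLatency };
}
```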
MCP Client Setup Guide
Use this server command in clients (npm package):
```json
{
  "command": "npx",
  "args": ["-y", "dataiku-mcp"],
  "env": {
    "DATAIKU_URL": "https://your-dss-instance.app.dataiku.io",
    "DATAIKU_API_KEY": "your_api_key",
    "DATAIKU_PROJECT_KEY": "YOUR_PROJECT_KEY"
  }
}
```
Windows note: if your MCP client launches commands without a shell, use `npx.cmd`:
```json
{
  "command": "npx.cmd",
  "args": ["-y", "dataiku-mcp"],
  "env": {
    "DATAIKU_URL": "https://your-dss-instance.app.dataiku.io",
    "DATAIKU_API_KEY": "your_api_key",
    "DATAIKU_PROJECT_KEY": "YOUR_PROJECT_KEY"
  }
}
```
You can also run TypeScript directly during development:
```json
{
  "command": "npx",
  "args": ["tsx", "/absolute/path/to/Dataiku_MCP/src/index.ts"],
  "env": {
    "DATAIKU_URL": "https://your-dss-instance.app.dataiku.io",
    "DATAIKU_API_KEY": "your_api_key",
    "DATAIKU_PROJECT_KEY": "YOUR_PROJECT_KEY"
  }
}
```
Claude Desktop
- Open Claude Desktop -> `Settings` -> `Developer` -> `Edit Config`.
- Add this under `mcpServers` in `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "dataiku": {
      "command": "npx",
      "args": ["-y", "dataiku-mcp"],
      "env": {
        "DATAIKU_URL": "https://your-dss-instance.app.dataiku.io",
        "DATAIKU_API_KEY": "your_api_key",
        "DATAIKU_PROJECT_KEY": "YOUR_PROJECT_KEY"
      }
    }
  }
}
```
Cursor
Cursor supports both project-scoped and global MCP config:
- Project: `.cursor/mcp.json`
- Global: `~/.cursor/mcp.json`
Example:
```json
{
  "mcpServers": {
    "dataiku": {
      "command": "npx",
      "args": ["-y", "dataiku-mcp"],
      "env": {
        "DATAIKU_URL": "https://your-dss-instance.app.dataiku.io",
        "DATAIKU_API_KEY": "your_api_key",
        "DATAIKU_PROJECT_KEY": "YOUR_PROJECT_KEY"
      }
    }
  }
}
```
Cline (VS Code extension)
- Open Cline -> MCP Servers -> Configure MCP Servers.
- Add this server block in `cline_mcp_settings.json`:
```json
{
  "mcpServers": {
    "dataiku": {
      "command": "npx",
      "args": ["-y", "dataiku-mcp"],
      "env": {
        "DATAIKU_URL": "https://your-dss-instance.app.dataiku.io",
        "DATAIKU_API_KEY": "your_api_key",
        "DATAIKU_PROJECT_KEY": "YOUR_PROJECT_KEY"
      }
    }
  }
}
```
Codex / project-level MCP config
This repo already includes a project-scoped MCP file at `.mcp.json`.
The checked-in `.mcp.json` uses `node node_modules/tsx/dist/cli.mjs src/index.ts` for cross-platform startup (including Windows); run `npm ci` first.
NPM Release Workflow
This repo includes a manual GitHub Actions release workflow:
- Workflow file: `.github/workflows/release.yml`
- Trigger: `Actions` -> `Release NPM Package` -> `Run workflow`
Inputs:
- `bump`: `patch | minor | major`
- `version`: optional exact version (overrides `bump`)
- `publish`: whether to publish to npm
Required repository configuration:
- GitHub variable: `NPM_RELEASE_ENABLED=true`
- Optional variable: `NPM_PUBLISH_ACCESS=public`
- Trusted publisher configured on npmjs.com for this package/repo/workflow
The workflow will:
- Install dependencies, run checks/tests, and build.
- Bump the package version and create a git tag.
- Push the commit + tag to `main`.
- Publish to npm with GitHub OIDC trusted publishing (if `publish=true`).
- Create a GitHub Release with generated notes.
Trusted publishing setup (npm):
- Open `https://www.npmjs.com/package/dataiku-mcp` -> `Settings` -> `Trusted Publisher`.
- Choose `GitHub Actions`.
- Set:
  - Organization or user: `clssck`
  - Repository: `Dataiku_MCP`
  - Workflow filename: `release.yml`
- Save.
Official MCP Registry
This repo is configured for MCP Registry publishing:
- Metadata file: `server.json`
- Workflow: `.github/workflows/publish-mcp-registry.yml`
- Required package field: `mcpName` in `package.json`
Server namespace: `io.github.clssck/dataiku-mcp`
Publish paths:
- Manual: run `Publish to MCP Registry` in GitHub Actions.
- Automatic: run the npm release workflow with `publish=true` (it triggers the MCP Registry publish).
Validation notes:
- `server.json.name` must match `package.json.mcpName`.
- `server.json.packages[].identifier` + `version` must reference a real npm publish.
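These two rules can be checked before publishing with a small pre-flight script. The sketch below is a hypothetical validator, not part of the repo; the interfaces model only the fields named above:

```typescript
// Hypothetical pre-publish check for the MCP Registry validation rules.
interface ServerJson {
  name: string;
  packages: { identifier: string; version: string }[];
}
interface PackageJson {
  name: string;
  version: string;
  mcpName?: string;
}

function registryIssues(server: ServerJson, pkg: PackageJson): string[] {
  const issues: string[] = [];
  if (server.name !== pkg.mcpName) {
    issues.push("server.json.name must match package.json.mcpName");
  }
  // Each packages[] entry should reference the npm package being published.
  for (const p of server.packages) {
    if (p.identifier !== pkg.name || p.version !== pkg.version) {
      issues.push(`packages[] entry ${p.identifier}@${p.version} does not match ${pkg.name}@${pkg.version}`);
    }
  }
  return issues;
}
```

Running such a check in CI before the registry publish step catches mismatches that would otherwise fail validation at publish time.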
Recommended Verification Prompt
After adding the server in a client, run:
- `project` with `{ "action": "map", "projectKey": "YOUR_PROJECT_KEY" }` (defaults to `maxNodes=300`, `maxEdges=600`; override as needed)
You should receive a flow summary in text and normalized nodes, edges, stats, roots, and leaves under `structuredContent.map`.
When truncation limits are applied (default `maxNodes=300`, `maxEdges=600`), `structuredContent.truncation` reports before/after node and edge counts and whether truncation occurred.
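The truncation behavior can be sketched as follows. This is an illustrative model only, assuming the documented limits and a truncation report with before/after counts; the field names and helper are assumptions, not the server's actual code:

```typescript
// Illustrative sketch of deterministic truncation with before/after reporting.
interface Graph {
  nodes: string[];
  edges: [string, string][];
}

function truncate(graph: Graph, maxNodes = 300, maxEdges = 600) {
  // Sort first so the kept subset is deterministic across runs.
  const nodes = [...graph.nodes].sort().slice(0, maxNodes);
  const keep = new Set(nodes);
  // Drop edges that reference truncated nodes, then apply the edge cap.
  const edges = graph.edges
    .filter(([from, to]) => keep.has(from) && keep.has(to))
    .slice(0, maxEdges);
  return {
    map: { nodes, edges },
    truncation: {
      truncated: nodes.length < graph.nodes.length || edges.length < graph.edges.length,
      nodesBefore: graph.nodes.length,
      nodesAfter: nodes.length,
      edgesBefore: graph.edges.length,
      edgesAfter: edges.length,
    },
  };
}
```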
Notes
- `project.map` returns a compact text summary; the full normalized graph is in `structuredContent.map`.
- Arrays in normalized map output are deterministically sorted to reduce diff churn.
- `job.wait` and `job.buildAndWait` include `structuredContent.normalizedState` with one of `terminalSuccess | terminalFailure | timeout | nonTerminal` while preserving the raw DSS `state`.
- With `DATAIKU_DEBUG_LATENCY=1`, responses include per-tool and per-API-call latency metrics under `structuredContent.debug.latency`.
- List-style responses are token-bounded by default; use `limit`/`offset` (and action-specific caps like `maxNodes`, `maxEdges`, `maxKeys`, `maxPackages`) to page or expand results when needed.
- `dataset.get` and `job.get` are summary-first by default; pass `includeDefinition=true` to include full DSS JSON in `structuredContent.definition`.
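The state normalization described above can be sketched as a small mapping. The raw DSS state strings used here (`DONE`, `FAILED`, `ABORTED`, `RUNNING`) are assumptions for illustration; only the four normalized values come from the docs:

```typescript
// Illustrative mapping from raw DSS job states to normalizedState.
// The raw state strings are assumptions, not a confirmed DSS enum.
type NormalizedState = "terminalSuccess" | "terminalFailure" | "timeout" | "nonTerminal";

function normalizeJobState(rawState: string, timedOut: boolean): NormalizedState {
  // A client-side wait timeout wins over whatever state was last observed.
  if (timedOut) return "timeout";
  if (rawState === "DONE") return "terminalSuccess";
  if (rawState === "FAILED" || rawState === "ABORTED") return "terminalFailure";
  // Anything else (e.g. a still-running job) is not terminal yet.
  return "nonTerminal";
}
```

Keeping the raw `state` alongside `normalizedState` lets clients branch on the normalized value while still logging the exact DSS status.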
Sources
- MCP local server connection docs: https://modelcontextprotocol.io/docs/develop/connect-local-servers
- Cursor MCP docs: https://cursor.com/docs/context/mcp
- Cline MCP docs: https://docs.cline.bot/mcp/configuring-mcp-servers