MinerU

Productivity and Workflow

by linxule

MinerU document parsing API with OCR and batch processing for PDF, image, DOCX, and PPTX files.

README

mineru-mcp

MCP server for MinerU document parsing API — extract text, tables, and formulas from PDFs, DOCs, and images.

Features

  • VLM model — 90%+ accuracy for complex documents
  • Pipeline model — Fast processing for simple documents
  • Local file upload — Upload files from disk for batch parsing
  • Batch processing — Parse up to 200 documents at once
  • Download & rename — Extract markdown with original filenames
  • Page ranges — Extract specific pages only
  • 109 language OCR support
  • Optimized for Claude Code — 73% token reduction vs alternatives

Tools

| Tool | Description |
| --- | --- |
| mineru_parse | Parse a document URL |
| mineru_status | Check task progress, get download URL |
| mineru_batch | Parse multiple URLs (max 200) |
| mineru_batch_status | Get batch results with pagination |
| mineru_upload_batch | Upload local files for batch parsing |
| mineru_download_results | Download results as named markdown files |

Installation

Requires Node.js 18+ and a MinerU API key.

CLI Install (one-liner)

bash
# Claude Code
claude mcp add mineru-mcp -e MINERU_API_KEY=your-api-key -- npx -y mineru-mcp

# Codex CLI (OpenAI)
codex mcp add mineru --env MINERU_API_KEY=your-api-key -- npx -y mineru-mcp

# Gemini CLI (Google)
gemini mcp add -e MINERU_API_KEY=your-api-key mineru npx -y mineru-mcp

Claude Desktop

Add to your claude_desktop_config.json:

| OS | Config path |
| --- | --- |
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |

json
{
  "mcpServers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}

VS Code

Add to .vscode/mcp.json (workspace) or open Command Palette > MCP: Open User Configuration (global):

json
{
  "servers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}

Note: VS Code uses "servers" as the top-level key, not "mcpServers". Other VS Code forks (Trae, Void, PearAI, etc.) typically use this same format.
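
The difference in top-level keys can be captured in a few lines. A minimal sketch, assuming a hypothetical `wrapConfig` helper (not part of mineru-mcp) and covering only the clients whose nested JSON layout is shown in this README:

```typescript
// Hypothetical helper, not part of mineru-mcp: wraps one server entry under
// the top-level key that each client expects.
type Client =
  | "vscode"
  | "claude-desktop"
  | "cursor"
  | "windsurf"
  | "cline"
  | "gemini-cli";

function wrapConfig(
  client: Client,
  entry: object
): Record<string, Record<string, object>> {
  // VS Code uses "servers"; the other clients listed use "mcpServers".
  const topLevelKey = client === "vscode" ? "servers" : "mcpServers";
  return { [topLevelKey]: { mineru: entry } };
}

const entry = {
  command: "npx",
  args: ["-y", "mineru-mcp"],
  env: { MINERU_API_KEY: "your-api-key" },
};

const vscodeConfig = wrapConfig("vscode", entry);
const cursorConfig = wrapConfig("cursor", entry);
```

Serializing either result with `JSON.stringify(vscodeConfig, null, 2)` gives a file body in the shape shown in the corresponding section.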

Cursor

Add to ~/.cursor/mcp.json (global) or .cursor/mcp.json (project):

json
{
  "mcpServers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json (Windows: %USERPROFILE%\.codeium\windsurf\mcp_config.json):

json
{
  "mcpServers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}

Cline

Click the MCP Servers icon in the Cline panel > Configure > Advanced MCP Settings, then add:

json
{
  "mcpServers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}

Cherry Studio

In Settings > MCP Servers > Add Server, set Type to STDIO, Command to npx, Args to -y mineru-mcp, and add environment variable MINERU_API_KEY. Or paste in JSON/Code mode:

json
{
  "mineru": {
    "name": "MinerU",
    "command": "npx",
    "args": ["-y", "mineru-mcp"],
    "env": {
      "MINERU_API_KEY": "your-api-key"
    },
    "isActive": true
  }
}

Witsy

In Settings > MCP Servers, add a new server with Type: stdio, Command: npx, Args: -y mineru-mcp, and set environment variable MINERU_API_KEY to your API key.

Codex CLI (TOML config)

Alternatively, edit ~/.codex/config.toml directly:

toml
[mcp_servers.mineru]
command = "npx"
args = ["-y", "mineru-mcp"]

[mcp_servers.mineru.env]
MINERU_API_KEY = "your-api-key"

Gemini CLI (JSON config)

Alternatively, edit ~/.gemini/settings.json directly:

json
{
  "mcpServers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}

Windows

On Windows, npx requires a shell wrapper. Replace "command": "npx" with:

json
{
  "command": "cmd",
  "args": ["/c", "npx", "-y", "mineru-mcp"],
  "env": {
    "MINERU_API_KEY": "your-api-key"
  }
}

For CLI tools on Windows:

bash
claude mcp add mineru-mcp -e MINERU_API_KEY=your-api-key -- cmd /c npx -y mineru-mcp
codex mcp add mineru --env MINERU_API_KEY=your-api-key -- cmd /c npx -y mineru-mcp
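
The Windows substitution can also be applied programmatically when generating config files. A rough sketch, assuming a hypothetical `serverEntry` helper (not part of mineru-mcp) that receives a platform string in the style of Node.js `process.platform`, where `"win32"` denotes Windows:

```typescript
// Hypothetical helper, not part of mineru-mcp: builds a stdio server entry
// and applies the `cmd /c` wrapper on Windows.
interface ServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function serverEntry(platform: string, apiKey: string): ServerEntry {
  const npxArgs = ["-y", "mineru-mcp"];
  const isWindows = platform === "win32";
  return {
    command: isWindows ? "cmd" : "npx",
    // On Windows, npx itself becomes an argument to cmd.
    args: isWindows ? ["/c", "npx", ...npxArgs] : npxArgs,
    env: { MINERU_API_KEY: apiKey },
  };
}

const windowsEntry = serverEntry("win32", "your-api-key");
const macEntry = serverEntry("darwin", "your-api-key");
```

Here `windowsEntry` matches the JSON fragment above, while `macEntry` keeps the plain `npx` form used in the other sections.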

ChatGPT

ChatGPT only supports remote MCP servers over HTTPS, so local stdio servers like this one are not directly supported. To use it with ChatGPT, you would need to deploy the server behind a public URL with an HTTP transport.

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| MINERU_API_KEY | (required) | Your MinerU API Bearer token |
| MINERU_BASE_URL | https://mineru.net/api/v4 | API base URL |
| MINERU_DEFAULT_MODEL | pipeline | Default model: pipeline or vlm |

Get your API key at mineru.net
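
A sketch of how these variables and their defaults could be resolved. The `resolveConfig` function and `MineruConfig` type are illustrative assumptions; the server's actual validation logic is not documented here:

```typescript
// Illustrative only: resolves the three documented variables with their defaults.
type Model = "pipeline" | "vlm";

interface MineruConfig {
  apiKey: string;
  baseUrl: string;
  defaultModel: Model;
}

function isModel(value: string): value is Model {
  return value === "pipeline" || value === "vlm";
}

function resolveConfig(env: Record<string, string | undefined>): MineruConfig {
  const apiKey = env["MINERU_API_KEY"];
  if (!apiKey) {
    throw new Error("MINERU_API_KEY is required");
  }
  const model = env["MINERU_DEFAULT_MODEL"] ?? "pipeline";
  if (!isModel(model)) {
    throw new Error(`MINERU_DEFAULT_MODEL must be "pipeline" or "vlm", got "${model}"`);
  }
  return {
    apiKey,
    baseUrl: env["MINERU_BASE_URL"] ?? "https://mineru.net/api/v4",
    defaultModel: model,
  };
}

// With only the key set, both defaults from the table apply.
const config = resolveConfig({ MINERU_API_KEY: "your-api-key" });
```

In a Node.js process you would typically pass `process.env` as the argument.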

Usage

Parse a single URL

typescript
mineru_parse({
  url: "https://example.com/document.pdf",
  model: "vlm",        // optional: "pipeline" (default) or "vlm" (90%+ accuracy)
  pages: "1-10,15",    // optional: page ranges
  ocr: true,           // optional: enable OCR (pipeline only)
  formula: true,       // optional: formula recognition
  table: true,         // optional: table recognition
  language: "en",      // optional: language code
  formats: ["html"]    // optional: extra export formats
})
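
The `pages` value combines single pages and ranges. As an illustration of how a string such as `"1-10,15"` expands, here is a small sketch; `expandPages` is a hypothetical helper, and the syntax it accepts is inferred only from the example above:

```typescript
// Hypothetical helper: expands a page-range string into sorted page numbers.
function expandPages(spec: string): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    // Accept either "N" or "N-M", with optional surrounding whitespace.
    const match = /^\s*(\d+)(?:\s*-\s*(\d+))?\s*$/.exec(part);
    if (!match) {
      throw new Error(`Invalid page range: "${part}"`);
    }
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${part}"`);
    }
    for (let page = start; page <= end; page++) {
      pages.add(page);
    }
  }
  return Array.from(pages).sort((a, b) => a - b);
}

// Pages 1 through 10, then 15: 11 pages in total.
const selected = expandPages("1-10,15");
```

Counting the expanded pages before submitting can help when checking a request against the page limits listed later in this README.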

Check task progress

typescript
mineru_status({
  task_id: "abc-123",
  format: "concise"    // optional: "concise" (default) or "detailed"
})

Concise output: done | abc-123 | https://cdn-mineru.../result.zip
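
If you need the fields individually, the concise line can be split on its `|` separators. A minimal sketch, assuming that every state uses the same layout as the example above (only the `done` case is documented) and using a placeholder URL:

```typescript
// Hypothetical parser for the concise status line; the field names are
// illustrative and not taken from the server's code.
interface ConciseStatus {
  state: string;
  taskId: string;
  downloadUrl?: string;
}

function parseConciseStatus(line: string): ConciseStatus {
  const [state = "", taskId = "", downloadUrl] = line
    .split("|")
    .map((field) => field.trim());
  if (state === "" || taskId === "") {
    throw new Error(`Unrecognized status line: "${line}"`);
  }
  // The download URL is treated as optional in case unfinished tasks omit it.
  return downloadUrl ? { state, taskId, downloadUrl } : { state, taskId };
}

// Placeholder URL for illustration only.
const parsed = parseConciseStatus("done | abc-123 | https://example.com/result.zip");
```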

Batch parse URLs

typescript
mineru_batch({
  urls: ["https://example.com/doc1.pdf", "https://example.com/doc2.pdf"],
  model: "vlm"
})

Check batch progress

typescript
mineru_batch_status({
  batch_id: "batch-123",
  limit: 10,           // optional: max results (default: 10)
  offset: 0,           // optional: skip first N results
  format: "concise"    // optional: "concise" or "detailed"
})
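
To walk through a large batch, `offset` advances by `limit` on each call. A sketch of the argument objects involved, assuming you already know how many files the batch contains (for example, the number of URLs you submitted); `pageRequests` is a hypothetical helper:

```typescript
// Hypothetical helper: builds one mineru_batch_status argument object per page.
interface BatchStatusArgs {
  batch_id: string;
  limit: number;
  offset: number;
}

function pageRequests(
  batchId: string,
  totalFiles: number,
  limit = 10
): BatchStatusArgs[] {
  if (limit < 1) {
    throw new Error("limit must be at least 1");
  }
  const pages: BatchStatusArgs[] = [];
  for (let offset = 0; offset < totalFiles; offset += limit) {
    pages.push({ batch_id: batchId, limit, offset });
  }
  return pages;
}

// A 25-file batch with the default limit needs offsets 0, 10, and 20.
const statusPages = pageRequests("batch-123", 25);
```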

Upload local files

typescript
mineru_upload_batch({
  directory: "/path/to/pdfs",  // scan directory for supported files
  // OR pass an explicit file list instead of `directory`:
  // files: ["/path/to/doc1.pdf", "/path/to/doc2.pdf"],
  model: "vlm",        // optional
  formula: true,       // optional
  table: true,         // optional
  language: "en",      // optional
  formats: ["html"]    // optional
})

Returns batch_id for tracking. Each file's original name is preserved via data_id (spaces become underscores).

Download results as markdown

typescript
mineru_download_results({
  batch_id: "batch-123",       // from mineru_upload_batch or mineru_batch
  output_dir: "/path/to/output",
  overwrite: false             // optional: overwrite existing files
})

Output filenames are derived from data_id (e.g., my_paper_title.md). Spaces in original filenames become underscores.
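
To predict which file a given upload will produce, the naming rule can be sketched as follows. Only the space-to-underscore rule is stated above; dropping the original extension is inferred from the `my_paper_title.md` example, and `expectedOutputName` is a hypothetical helper:

```typescript
// Hypothetical helper: predicts the markdown filename for an uploaded file.
function expectedOutputName(originalPath: string): string {
  // Keep only the last path segment (handles both / and \ separators).
  const fileName = originalPath.split(/[\\/]/).pop() ?? originalPath;
  // Assumption: the original extension is dropped before ".md" is added.
  const stem = fileName.replace(/\.[^.]+$/, "");
  // Documented rule: spaces become underscores.
  return `${stem.replace(/ /g, "_")}.md`;
}

const outputName = expectedOutputName("/path/to/my paper title.pdf");
// outputName is "my_paper_title.md"
```

How other special characters in filenames are handled is not covered here, so treat the result as an estimate for such cases.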

Typical local file workflow

code
mineru_upload_batch → mineru_batch_status (poll) → mineru_download_results
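
The polling step could be scripted along these lines. This is a sketch, not the server's own code: `pollUntilDone` is a hypothetical helper, `isDone` stands in for whatever calls `mineru_batch_status` and inspects the result, and the interval and attempt limits are arbitrary:

```typescript
// Hypothetical polling loop. Both callbacks are supplied by the caller, so
// the sketch makes no assumption about how tools are invoked or how to wait.
async function pollUntilDone(
  isDone: () => Promise<boolean>,
  sleep: (ms: number) => Promise<void>,
  intervalMs = 5000,
  maxAttempts = 60
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await isDone()) {
      return attempt; // number of status checks that were needed
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Batch not finished after ${maxAttempts} checks`);
}
```

Passing `sleep` in as a parameter keeps the loop easy to test, because a test can supply a function that resolves immediately. Once the promise resolves, `mineru_download_results` is the next step in the workflow.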

Supported Formats

  • PDF, DOC, DOCX, PPT, PPTX
  • PNG, JPG, JPEG
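
A pre-flight filter based on this list might look like the sketch below. `isSupportedFile` is a hypothetical helper, and treating extensions case-insensitively is an assumption:

```typescript
// Extensions taken from the list above.
const SUPPORTED_EXTENSIONS = [
  "pdf", "doc", "docx", "ppt", "pptx", "png", "jpg", "jpeg",
];

// Hypothetical helper: checks a filename against the supported extensions.
function isSupportedFile(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return false; // no extension at all
  }
  const extension = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(extension) !== -1;
}

const candidates = ["report.PDF", "slides.pptx", "notes.txt", "scan.jpeg"];
const accepted = candidates.filter(isSupportedFile);
// accepted is ["report.PDF", "slides.pptx", "scan.jpeg"]
```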

Limits

  • Single file: 200MB max, 600 pages max
  • Daily quota: 2000 pages at high priority
  • Batch: max 200 files per request
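
Larger collections therefore need to be split across several requests. A minimal sketch of that split, with `chunkForBatches` as a hypothetical helper and example.com URLs as placeholders:

```typescript
// Documented limit: at most 200 files per batch request.
const MAX_BATCH_FILES = 200;

// Hypothetical helper: splits a list into groups no larger than `size`.
function chunkForBatches<T>(items: T[], size = MAX_BATCH_FILES): T[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// 450 placeholder URLs split into groups of 200, 200, and 50.
const urls = Array.from(
  { length: 450 },
  (_, i) => `https://example.com/doc${i + 1}.pdf`
);
const batches = chunkForBatches(urls);
```

Each group can then be passed as the `urls` argument of a separate `mineru_batch` call.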

License

MIT
