Mineru Document Parsing Server

Name: Mineru Document Parsing Server
Rating: 0.1 (1 review)
Author: demomagic

Category: Efficiency and Workflow

Provide powerful document parsing capabilities by integrating with the Mineru API. Enable single and batch file parsing with support for multiple formats, OCR, formula recognition, and table recognition. Monitor parsing task status in real time to process documents in various languages efficiently.

What is Mineru Document Parsing Server?

Mineru MCP Server

A Model Context Protocol (MCP) document parsing server that integrates with the Mineru API to provide powerful document parsing capabilities.

Features

Single File Parsing: Create a document parsing task from a file URL
Batch File Parsing: Upload and parse multiple files in one batch
Task Status Monitoring: Query parsing progress and results in real time
Multi-format Support: PDF, DOC, DOCX, PPT, PPTX, PNG, JPG, JPEG and other formats
OCR Functionality: Optional OCR text recognition
Formula Recognition: Recognize mathematical formulas
Table Recognition: Recognize table structure
Multi-language Support: Chinese, English and other languages

Installation

bash

npm install

Configuration

Before using the server, configure the Mineru API key:

typescript

const config = {
  mineruApiKey: "your-mineru-api-bearer-token", // Mineru API Bearer token
  mineruBaseUrl: "https://mineru.net/api/v4" // Mineru API base URL
};
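
To keep the token out of source control, the same object could be built from environment variables. This is a minimal sketch only: the variable names `MINERU_API_KEY` and `MINERU_BASE_URL` and the helpers `loadConfig` and `authHeaders` are assumptions made for illustration and are not part of this server. The header format follows the usual Bearer token convention, which the comment in the config above suggests but does not spell out.

```typescript
// Hypothetical helpers; the README only documents the `config` object literal.
interface MineruConfig {
  mineruApiKey: string;
  mineruBaseUrl: string;
}

// `MINERU_API_KEY` and `MINERU_BASE_URL` are assumed names, not documented ones.
function loadConfig(env: Record<string, string | undefined>): MineruConfig {
  const key = env["MINERU_API_KEY"];
  if (!key) {
    throw new Error("MINERU_API_KEY is not set");
  }
  return {
    mineruApiKey: key,
    // Fall back to the base URL shown in the configuration above.
    mineruBaseUrl: env["MINERU_BASE_URL"] ?? "https://mineru.net/api/v4",
  };
}

// Bearer tokens are conventionally sent in the Authorization header.
function authHeaders(config: MineruConfig): Record<string, string> {
  return { Authorization: `Bearer ${config.mineruApiKey}` };
}
```

In Node.js this would typically be called as `loadConfig(process.env)`.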

Available Tools

1. create_parsing_task

Create a document parsing task for a single file

Parameters:

url (required): File URL
is_ocr (optional): Enable OCR, default false
enable_formula (optional): Enable formula recognition, default true
enable_table (optional): Enable table recognition, default true
language (optional): Document language, default "ch"
page_ranges (optional): Page ranges, e.g., "1-10,15-20"
model_version (optional): Model version, either "v1" or "v2"
extra_formats (optional): Additional export formats, any of "docx", "html", "latex"
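
The parameter list above can be written down as a type, with the documented defaults applied on the client side before the tool is called. This is a sketch under assumptions: `ParsingTaskParams`, `isValidPageRanges` and `withDefaults` are hypothetical names, and the page range check only covers the "1-10,15-20" shape shown above (single pages such as "5" are assumed to be acceptable as well). The server may apply its own defaults and validation regardless.

```typescript
// Hypothetical type mirroring the documented create_parsing_task parameters.
interface ParsingTaskParams {
  url: string;
  is_ocr?: boolean;
  enable_formula?: boolean;
  enable_table?: boolean;
  language?: string;
  page_ranges?: string;
  model_version?: "v1" | "v2";
  extra_formats?: Array<"docx" | "html" | "latex">;
}

// Accepts comma-separated entries of the form "N" or "N-M" with 1 <= N <= M.
function isValidPageRanges(value: string): boolean {
  return value.split(",").every((part) => {
    const match = /^(\d+)(?:-(\d+))?$/.exec(part.trim());
    if (!match) return false;
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    return start >= 1 && end >= start;
  });
}

// Fills in the defaults listed above; explicitly passed values win.
function withDefaults(params: ParsingTaskParams) {
  if (params.page_ranges !== undefined && !isValidPageRanges(params.page_ranges)) {
    throw new Error(`Invalid page_ranges: ${params.page_ranges}`);
  }
  return {
    is_ocr: false,
    enable_formula: true,
    enable_table: true,
    language: "ch",
    ...params,
  };
}
```

Validating `page_ranges` locally avoids a round trip that would otherwise end in a parameter error.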

2. get_task_status

Query parsing task status

Parameters:

task_id (required): Task ID

3. create_batch_parsing_task

Create a batch file upload parsing task (for local file uploads)

Parameters:

files (required): Array of files; each entry contains name, is_ocr, page_ranges and other properties
enable_formula (optional): Enable formula recognition
enable_table (optional): Enable table recognition
language (optional): Document language
model_version (optional): Model version
extra_formats (optional): Additional export formats

4. create_batch_url_parsing_task

Create a batch URL parsing task (for remote file URLs)

Parameters:

files (required): Array of files; each entry contains url, is_ocr, page_ranges and other properties
enable_formula (optional): Enable formula recognition
enable_table (optional): Enable table recognition
language (optional): Document language
model_version (optional): Model version
extra_formats (optional): Additional export formats
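
The split between per-file properties (inside `files`) and shared options (at the top level) can be illustrated with a small request builder. The sketch below is not part of the server: `BatchUrlFile`, `BatchUrlRequest` and `buildBatchUrlRequest` are hypothetical names, and only the fields listed above are modeled.

```typescript
// Hypothetical types covering only the documented fields.
interface BatchUrlFile {
  url: string;
  is_ocr?: boolean;
  page_ranges?: string;
}

interface BatchUrlRequest {
  files: BatchUrlFile[];
  enable_formula?: boolean;
  enable_table?: boolean;
  language?: string;
  model_version?: string;
  extra_formats?: string[];
}

// Applies the same per-file options to every URL and attaches shared options once.
function buildBatchUrlRequest(
  urls: string[],
  perFile: Omit<BatchUrlFile, "url"> = {},
  shared: Omit<BatchUrlRequest, "files"> = {},
): BatchUrlRequest {
  if (urls.length === 0) {
    throw new Error("At least one URL is required");
  }
  const files = urls.map((url) => ({ ...perFile, url }));
  return { ...shared, files };
}
```

The result has the shape expected by `create_batch_url_parsing_task`, for example `buildBatchUrlRequest(urls, { is_ocr: true }, { language: "en" })`.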

5. get_batch_task_results

Query batch parsing task results (supports both URL batch parsing and local upload batch parsing)

Parameters:

batch_id (required): Batch task ID (from create_batch_url_parsing_task or create_batch_parsing_task)

Usage Examples

Single File Parsing

typescript

// Create parsing task
const taskResult = await create_parsing_task({
  url: "https://example.com/document.pdf",
  is_ocr: true,
  enable_formula: true,
  language: "en"
});

// Query task status
const status = await get_task_status({
  task_id: taskResult.task_id
});
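
A single status query rarely returns the finished result, so callers usually poll. The following is a minimal polling sketch, not the server's own logic. The `state` field and the terminal values `"done"` and `"failed"` are assumptions about the response shape, which this README does not document; adjust them to what `get_task_status` actually returns.

```typescript
// Assumed minimal response shape; the real response may carry more fields.
interface TaskStatus {
  state: string;
}

interface PollOptions {
  intervalMs?: number;       // delay between status checks
  maxAttempts?: number;      // give up after this many checks
  terminalStates?: string[]; // assumed values, see the note above
}

// Calls fetchStatus repeatedly until a terminal state is seen or attempts run out.
async function pollUntilFinished<T extends TaskStatus>(
  fetchStatus: () => Promise<T>,
  options: PollOptions = {},
): Promise<T> {
  const {
    intervalMs = 2000,
    maxAttempts = 60,
    terminalStates = ["done", "failed"],
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (terminalStates.includes(status.state)) {
      return status;
    }
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Task did not finish after ${maxAttempts} status checks`);
}
```

With the example above, this could be used as `pollUntilFinished(() => get_task_status({ task_id: taskResult.task_id }))`.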

Batch File Upload Parsing

typescript

// Create batch upload task
const batchResult = await create_batch_parsing_task({
  files: [
    { name: "document1.pdf", is_ocr: true },
    { name: "document2.docx" }
  ],
  enable_formula: true,
  language: "ch"
});

// Query batch task results (applicable to both batch parsing methods)
const batchStatus = await get_batch_task_results({
  batch_id: batchResult.batch_id
});

Batch URL Parsing

typescript

// Create batch URL parsing task
const batchUrlResult = await create_batch_url_parsing_task({
  files: [
    { url: "https://example.com/doc1.pdf", is_ocr: true },
    { url: "https://example.com/doc2.docx" }
  ],
  enable_formula: true,
  language: "en"
});

// Query batch task results (applicable to both batch parsing methods)
const batchUrlStatus = await get_batch_task_results({
  batch_id: batchUrlResult.batch_id
});

Development

bash

npm run dev

Important Notes

A single file cannot exceed 200MB or 600 pages
Each account has a daily quota of 2000 pages at the highest parsing priority
Due to network restrictions, foreign URLs such as GitHub and AWS may time out
Batch upload file links are valid for 24 hours
There is no need to set a Content-Type header when uploading files
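
The size, page count and format limits can be checked before a task is submitted. The sketch below is a client-side pre-flight check under stated assumptions: `FileInfo` and `preflightProblems` are hypothetical names, "200MB" is read as 200 × 1024 × 1024 bytes, and the extension list contains only the formats named in this README (the server may accept others).

```typescript
const MAX_FILE_BYTES = 200 * 1024 * 1024; // assumes "200MB" means 200 MiB
const MAX_PAGES = 600;
const SUPPORTED_EXTENSIONS = ["pdf", "doc", "docx", "ppt", "pptx", "png", "jpg", "jpeg"];

// Hypothetical description of a file about to be submitted.
interface FileInfo {
  name: string;
  sizeBytes: number;
  pageCount?: number; // omit when the page count is not known locally
}

// Returns a list of human-readable problems; an empty list means no issue was found.
function preflightProblems(file: FileInfo): string[] {
  const problems: string[] = [];

  const dot = file.name.lastIndexOf(".");
  const extension = dot === -1 ? "" : file.name.slice(dot + 1).toLowerCase();
  if (SUPPORTED_EXTENSIONS.indexOf(extension) === -1) {
    problems.push(`unsupported or missing extension: "${extension}"`);
  }
  if (file.sizeBytes === 0) {
    problems.push("file is empty");
  }
  if (file.sizeBytes > MAX_FILE_BYTES) {
    problems.push("file is larger than 200MB");
  }
  if (file.pageCount !== undefined && file.pageCount > MAX_PAGES) {
    problems.push("file has more than 600 pages");
  }
  return problems;
}
```

A check like this only mirrors the documented limits; the server remains the authority and may still reject a file that passes.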

Common Error Codes

Error Code	Description	Solution
A0202	Token error	Check if the Token is correct, or replace with a new Token
A0211	Token expired	Replace with a new Token
-500	Parameter error	Ensure parameter types and Content-Type are correct
-10001	Service exception	Please try again later
-10002	Request parameter error	Check request parameter format
-60001	Failed to generate upload URL	Please try again later
-60002	Failed to get matching file format	File type detection failed; ensure the requested filename and link have the correct extension and that the file is one of pdf, doc, docx, ppt, pptx, png, jp(e)g
-60003	File read failed	Check if the file is corrupted and re-upload
-60004	Empty file	Please upload a valid file
-60005	File size exceeds limit	Check the file size; the maximum supported size is 200MB
-60006	File page count exceeds limit	Please split the file and try again
-60007	Model service temporarily unavailable	Please try again later or contact technical support
-60008	File read timeout	Check if URL is accessible
-60009	Task submission queue is full	Please try again later
-60010	Parsing failed	Please try again later
-60011	Failed to get valid file	Please ensure the file has been uploaded
-60012	Task not found	Please ensure task_id is valid and not deleted
-60013	No permission to access this task	You can only access tasks that you submitted
-60014	Delete running task	Running tasks cannot be deleted
-60015	File conversion failed	Convert the file to PDF manually and upload it
-60016	File conversion failed	Conversion to the specified format failed; try a different export format or retry
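
The table suggests three broad reactions: replace the token, retry later, or fix the request or file. A client could encode that as below. This is a sketch under assumptions: `classifyError` and the `ErrorAction` names are hypothetical, the grouping is derived only from the "Solution" column above, and the code is accepted as either a string or a number because the README does not state which type the API returns.

```typescript
type ErrorAction = "refresh_token" | "retry" | "fix_request";

// Codes whose solution is to check or replace the token.
const TOKEN_CODES = ["A0202", "A0211"];

// Codes whose solution is "Please try again later".
const RETRYABLE_CODES = ["-10001", "-60001", "-60007", "-60009", "-60010"];

// Maps an error code from the table above to a coarse client-side action.
function classifyError(code: string | number): ErrorAction {
  const key = String(code);
  if (TOKEN_CODES.indexOf(key) !== -1) {
    return "refresh_token";
  }
  if (RETRYABLE_CODES.indexOf(key) !== -1) {
    return "retry";
  }
  // Everything else needs a change to the parameters, the file, or the task ID.
  return "fix_request";
}
```

Code -60016 is left in the `fix_request` group because its solution offers both a retry and a change of export format.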

License

ISC
