io.github.xt765/mcp-document-converter

Productivity & Workflow

by xt765

Converts PDF, DOCX, HTML, Markdown, and Text into content suited to AI assistant context injection.

README

MCP Document Converter

MCP (Model Context Protocol) Document Converter: an MCP tool for converting documents between multiple formats, so that AI agents can transform documents easily.

Language: English (README.md) | 中文 (README.zh-CN.md)

  • GitHub: https://github.com/xt765/mcp-document-converter
  • Gitee: https://gitee.com/xt765/mcp-document-converter
  • PyPI: https://pypi.org/project/mcp-document-converter/
  • MCP Registry: https://registry.modelcontextprotocol.io/v0.1/servers?search=io.github.xt765/mcp-document-converter
  • MCP Marketplace: https://mcp-marketplace.io/server/io-github-xt765-mcp-document-converter
  • Author blog (CSDN): https://blog.csdn.net/Yunyi_Chi

Requires Python 3.10+. Released under the MIT License.

Features

  • Multi-format Support: Supports 5 mainstream document formats: Markdown, HTML, DOCX, PDF, and Text
  • Bidirectional Conversion: Any format can be converted to any other format (5×5=25 conversion combinations)
  • MCP Protocol: Complies with the MCP standard and can be used as a tool by AI assistants such as Trae IDE
  • Plugin Architecture: Easy to extend with new parsers and renderers
  • Syntax Highlighting: HTML and PDF outputs support code syntax highlighting
  • Style Customization: Support for custom CSS styles
  • Metadata Preservation: Preserves document title, author, creation time, and other metadata during conversion
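
The 25 combinations come from pairing each of the five formats with each of the five as a target. A minimal sketch of that enumeration, using the identifiers that `convert_document` accepts for `target_format`. Assuming the same identifiers are valid source format names is a guess on our part:

```python
from itertools import product

# Format identifiers as listed for the target_format argument.
FORMATS = ["markdown", "html", "docx", "pdf", "text"]

# Every (source, target) pair, same-format pairs included: 5 x 5 = 25.
pairs = list(product(FORMATS, repeat=2))

for source, target in pairs:
    print(f"{source} -> {target}")
```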

📚 Documentation

User Guide · API Reference · Contributing · Changelog · License


Architecture

mermaid
flowchart TB
    subgraph Parsers["Parsers"]
        MD[Markdown]
        DOCX1[DOCX]
        HTML1[HTML]
        PDF1[PDF]
        TXT1[Text]
    end

    subgraph IR["Intermediate Representation (IR)"]
        DT[Document Tree]
        META[Metadata]
        ASSETS[Assets]
    end

    subgraph Renderers["Renderers"]
        HTML2[HTML]
        PDF2[PDF]
        MD2[Markdown]
        DOCX2[DOCX]
        TXT2[Text]
    end

    MD --> IR
    DOCX1 --> IR
    HTML1 --> IR
    PDF1 --> IR
    TXT1 --> IR
    
    IR --> HTML2
    IR --> PDF2
    IR --> MD2
    IR --> DOCX2
    IR --> TXT2

Core Components

  1. DocumentIR (Intermediate Representation): Unified abstraction for all documents, containing document tree, metadata, assets, etc.
  2. BaseParser (Parser Base Class): Defines the parser interface, parses various formats into DocumentIR
  3. BaseRenderer (Renderer Base Class): Defines the renderer interface, renders DocumentIR into various formats
  4. ConverterRegistry (Registry): Manages all parsers and renderers, provides format lookup and auto-matching
  5. DocumentConverter (Conversion Engine): Coordinates parsers and renderers to complete document conversion
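
To make the division of labour concrete, here is a toy version of the parse, IR, render pipeline. It is only an illustration of the pattern: `ToyIR`, `parse_text`, `render_markdown`, `render_html`, and `convert` are hypothetical stand-ins written for this sketch, not the package's classes.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Callable, Dict, List


@dataclass
class ToyIR:
    """Stand-in for DocumentIR: a title plus a flat list of paragraphs."""
    title: str = ""
    paragraphs: List[str] = field(default_factory=list)


def parse_text(source: str) -> ToyIR:
    """Toy parser: first line is the title, blank lines separate paragraphs."""
    lines = source.strip().splitlines()
    title = lines[0] if lines else ""
    body = "\n".join(lines[1:])
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    return ToyIR(title=title, paragraphs=paragraphs)


def render_markdown(doc: ToyIR) -> str:
    parts = [f"# {doc.title}"] if doc.title else []
    parts.extend(doc.paragraphs)
    return "\n\n".join(parts)


def render_html(doc: ToyIR) -> str:
    parts = [f"<h1>{escape(doc.title)}</h1>"] if doc.title else []
    parts.extend(f"<p>{escape(p)}</p>" for p in doc.paragraphs)
    return "\n".join(parts)


# Toy registry: one table of parsers, one table of renderers.
PARSERS: Dict[str, Callable[[str], ToyIR]] = {"text": parse_text}
RENDERERS: Dict[str, Callable[[ToyIR], str]] = {
    "markdown": render_markdown,
    "html": render_html,
}


def convert(source: str, source_format: str, target_format: str) -> str:
    """Look up a parser and a renderer, then go source -> IR -> target."""
    ir = PARSERS[source_format](source)
    return RENDERERS[target_format](ir)


result = convert("Title\n\nFirst paragraph.\n\nSecond paragraph.", "text", "html")
print(result)
```

The point of routing everything through one intermediate representation is that N formats need N parsers and N renderers rather than N×N dedicated converters.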

Supported Formats

Input Formats (Parsers)

| Format | Extensions | MIME Type | Features |
| --- | --- | --- | --- |
| Markdown | .md, .markdown, .mdown, .mkd | text/markdown | YAML Front Matter, GFM extensions |
| HTML | .html, .htm | text/html | Semantic tag parsing |
| DOCX | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | Styles, tables, images |
| PDF | .pdf | application/pdf | Text extraction and structure recognition |
| Text | .txt, .text | text/plain | Auto encoding detection and structure recognition |
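
Source formats are auto-detected from the file extension when `source_format` is omitted. A lookup along the following lines would be enough for the extensions in the table. This is a sketch under that assumption: `EXTENSION_TO_FORMAT` and `detect_format` are hypothetical names, and the real detection logic may differ.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the input-format table; format names as used by the tools.
EXTENSION_TO_FORMAT = {
    ".md": "markdown",
    ".markdown": "markdown",
    ".mdown": "markdown",
    ".mkd": "markdown",
    ".html": "html",
    ".htm": "html",
    ".docx": "docx",
    ".pdf": "pdf",
    ".txt": "text",
    ".text": "text",
}


def detect_format(path: str) -> Optional[str]:
    """Return the format name for a path, or None if the extension is unknown."""
    return EXTENSION_TO_FORMAT.get(Path(path).suffix.lower())


print(detect_format("report.DOCX"))
print(detect_format("notes.mkd"))
print(detect_format("archive.rtf"))
```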

Output Formats (Renderers)

| Format | Extension | MIME Type | Features |
| --- | --- | --- | --- |
| HTML | .html | text/html | Beautiful styling, code highlighting, responsive design |
| Markdown | .md | text/markdown | Standard Markdown format, YAML Front Matter |
| DOCX | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | Word document format, style preservation |
| PDF | .pdf | application/pdf | Generated with WeasyPrint, pagination support |
| Text | .txt | text/plain | Plain text, basic formatting preserved |

Conversion Matrix

mermaid
flowchart LR
    subgraph Sources["Source Formats"]
        MD_S[Markdown]
        HTML_S[HTML]
        DOCX_S[DOCX]
        PDF_S[PDF]
        TXT_S[Text]
    end

    subgraph Targets["Target Formats"]
        MD_T[Markdown]
        HTML_T[HTML]
        DOCX_T[DOCX]
        PDF_T[PDF]
        TXT_T[Text]
    end

    MD_S --> Targets
    HTML_S --> Targets
    DOCX_S --> Targets
    PDF_S --> Targets
    TXT_S --> Targets

Installation

Using pip (Recommended)

bash
pip install mcp-document-converter

From Source

bash
git clone https://github.com/xt765/mcp-document-converter.git
cd mcp-document-converter
pip install -e .

MCP Tools

This server provides the following tools:

convert_document

Convert a document from one format to another.

Arguments:

  • source_path (string, required): Path to the source document.
  • target_format (string, required): Target format (html, pdf, markdown, docx, text).
  • output_path (string, optional): Path for the output file.
  • source_format (string, optional): Format of the source file (auto-detected if not provided).
  • options (object, optional): Additional options like template, css, and preserve_metadata.
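
Before sending a call it can help to check the arguments against the list above. The checker below is a client-side sketch only: `check_arguments` is a hypothetical helper, and the server may validate differently or more strictly.

```python
from typing import Any, Dict, List

TARGET_FORMATS = {"html", "pdf", "markdown", "docx", "text"}
REQUIRED = ("source_path", "target_format")
OPTIONAL = ("output_path", "source_format", "options")


def check_arguments(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems: List[str] = []
    for name in REQUIRED:
        if not isinstance(args.get(name), str) or not args[name]:
            problems.append(f"{name} is required and must be a non-empty string")
    target = args.get("target_format")
    if isinstance(target, str) and target and target not in TARGET_FORMATS:
        problems.append(f"target_format must be one of {sorted(TARGET_FORMATS)}")
    for name in ("output_path", "source_format"):
        if name in args and not isinstance(args[name], str):
            problems.append(f"{name} must be a string")
    if "options" in args and not isinstance(args["options"], dict):
        problems.append("options must be an object")
    for name in sorted(set(args) - set(REQUIRED) - set(OPTIONAL)):
        problems.append(f"unknown argument: {name}")
    return problems


print(check_arguments({"source_path": "document.md", "target_format": "html"}))
print(check_arguments({"target_format": "rtf"}))
```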

Configuration

Using in Trae IDE / Claude Desktop

Add the following to your MCP configuration file:

Option 1: Using PyPI (Recommended)

json
{
  "mcpServers": {
    "mcp-document-converter": {
      "command": "uvx",
      "args": [
        "mcp-document-converter"
      ]
    }
  }
}

Option 2: Using GitHub repository

json
{
  "mcpServers": {
    "mcp-document-converter": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/xt765/mcp-document-converter",
        "mcp-document-converter"
      ]
    }
  }
}

Option 3: Using Gitee repository (Faster access in China)

json
{
  "mcpServers": {
    "mcp-document-converter": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://gitee.com/xt765/mcp-document-converter",
        "mcp-document-converter"
      ]
    }
  }
}

Option 4: Using pip (Manual installation)

First install the package:

bash
pip install mcp-document-converter

Then add to configuration:

json
{
  "mcpServers": {
    "mcp-document-converter": {
      "command": "mcp-document-converter",
      "args": []
    }
  }
}
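
If you would rather script the change than edit the file by hand, the entry can be merged into an existing configuration without disturbing other servers. This is a sketch only: `add_server` is a hypothetical helper, and the location and name of the configuration file depend on your client, so the demo below writes to a temporary directory.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def add_server(config_path: Path, name: str, command: str, args: List[str]) -> Dict[str, Any]:
    """Insert or overwrite one entry under "mcpServers", keeping the others."""
    config: Dict[str, Any] = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file; point config_path at your client's file instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp_config.json"
    demo = add_server(demo_path, "mcp-document-converter", "uvx", ["mcp-document-converter"])
    print(json.dumps(demo, indent=2))
```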

Using in Cherry Studio

Cherry Studio is an open-source desktop AI client that can integrate external tools through the MCP protocol.

Configuration Example: [Screenshot: Cherry Studio Configuration]

Usage Example: [Screenshot: Cherry Studio Usage]

Usage

As an MCP Tool

After configuration, AI assistants can directly call the following tools:

1. convert_document (Recommended)

Use a unified interface to convert any supported document type.

python
# Markdown to HTML
convert_document(
    source_path="document.md",
    target_format="html"
)

# HTML to PDF
convert_document(
    source_path="document.html",
    target_format="pdf"
)

# DOCX to Markdown
convert_document(
    source_path="document.docx",
    target_format="markdown"
)

# Conversion with options
convert_document(
    source_path="document.md",
    target_format="html",
    output_path="output.html",
    options={
        "css": "custom.css",
        "preserve_metadata": True
    }
)

2. list_supported_formats

List all supported document formats.

python
list_supported_formats()

3. get_conversion_matrix

Get the complete format conversion matrix.

python
get_conversion_matrix()

4. can_convert

Check if conversion from source format to target format is supported.

python
can_convert(source_format="markdown", target_format="pdf")

5. get_format_info

Get detailed information about a specific format.

python
get_format_info(format="markdown")
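
The calls above are written the way an assistant would issue them. To call the same tools from your own script, one option is the stdio client in the official `mcp` Python SDK. The sketch below assumes that SDK is installed and that `uvx mcp-document-converter` starts the server as in Option 1. `build_arguments` and `convert_via_mcp` are hypothetical helper names, and the shape of the returned result is not documented here, so the code simply returns it.

```python
import asyncio
from typing import Any, Dict, Optional


def build_arguments(
    source_path: str,
    target_format: str,
    output_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the argument object for convert_document."""
    arguments: Dict[str, Any] = {
        "source_path": source_path,
        "target_format": target_format,
    }
    if output_path is not None:
        arguments["output_path"] = output_path
    return arguments


async def convert_via_mcp(arguments: Dict[str, Any]) -> Any:
    """Start the server over stdio and call convert_document once."""
    # Imported here so build_arguments stays usable without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command="uvx", args=["mcp-document-converter"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await session.call_tool("convert_document", arguments=arguments)
```

To run a conversion, call `asyncio.run(convert_via_mcp(build_arguments("document.md", "html", "output.html")))` and inspect the returned object.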

As a Python Library

python
from mcp_document_converter import DocumentConverter
from mcp_document_converter.registry import get_registry
from mcp_document_converter.parsers import MarkdownParser, HTMLParser
from mcp_document_converter.renderers import HTMLRenderer, PDFRenderer

# Register parsers and renderers
registry = get_registry()
registry.register_parser(MarkdownParser())
registry.register_parser(HTMLParser())
registry.register_renderer(HTMLRenderer())
registry.register_renderer(PDFRenderer())

# Create converter
converter = DocumentConverter(registry)

# Convert document
result = converter.convert(
    source="input.md",
    target_format="html",
    output_path="output.html"
)

if result.success:
    print(f"✅ Conversion successful: {result.output_path}")
else:
    print(f"❌ Conversion failed: {result.error_message}")

Tool Interface Details

convert_document

Convert a document from one format to another.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| source_path | string | Yes | Source file path, supports absolute or relative paths |
| target_format | string | Yes | Target format: html, pdf, markdown, docx, text |
| output_path | string | No | Output file path (optional, defaults to source filename) |
| source_format | string | No | Source format (optional, auto-detected from file extension) |
| options | object | No | Conversion options |

Options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| template | string | - | Template name |
| css | string | - | Custom CSS styles |
| preserve_metadata | boolean | true | Whether to preserve metadata |
| extract_images | boolean | true | Whether to extract images |

Example:

json
{
  "source_path": "/path/to/document.md",
  "target_format": "html",
  "output_path": "/path/to/output.html",
  "options": {
    "css": "body { font-family: Arial; }",
    "preserve_metadata": true
  }
}
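
Since `preserve_metadata` and `extract_images` default to true, a caller only needs to pass the options it wants to change. A minimal sketch of how such defaults are typically merged follows. `DEFAULT_OPTIONS` and `resolve_options` are hypothetical names, and representing the "-" defaults as `None` is an assumption.

```python
from typing import Any, Dict, Optional

# Defaults from the options table; "-" is modelled as None (an assumption).
DEFAULT_OPTIONS: Dict[str, Any] = {
    "template": None,
    "css": None,
    "preserve_metadata": True,
    "extract_images": True,
}


def resolve_options(options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Overlay caller-supplied options on the defaults without mutating either."""
    resolved = dict(DEFAULT_OPTIONS)
    resolved.update(options or {})
    return resolved


print(resolve_options({"css": "body { font-family: Arial; }"}))
```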

Extension Development

Adding a New Parser

python
from typing import List, Union
from pathlib import Path
from mcp_document_converter.core.parser import BaseParser
from mcp_document_converter.core.ir import DocumentIR, Node, NodeType

class MyParser(BaseParser):
    @property
    def supported_extensions(self) -> List[str]:
        return [".myext"]
    
    @property
    def format_name(self) -> str:
        return "myformat"
    
    @property
    def mime_types(self) -> List[str]:
        return ["application/x-myformat"]
    
    def parse(self, source: Union[str, Path, bytes], **options) -> DocumentIR:
        # Read source file
        content = self._read_source(source)
        
        # Parse into DocumentIR
        document = DocumentIR()
        document.title = "My Document"
        
        # Add content nodes
        document.add_node(Node(
            type=NodeType.PARAGRAPH,
            content=[Node(type=NodeType.TEXT, content="Hello World")]
        ))
        
        return document

Adding a New Renderer

python
from typing import Any
from mcp_document_converter.core.renderer import BaseRenderer
from mcp_document_converter.core.ir import DocumentIR

class MyRenderer(BaseRenderer):
    @property
    def output_extension(self) -> str:
        return ".myext"
    
    @property
    def format_name(self) -> str:
        return "myformat"
    
    @property
    def mime_type(self) -> str:
        return "application/x-myformat"
    
    def render(self, document: DocumentIR, **options: Any) -> str:
        # Render DocumentIR to target format
        parts = []
        
        if document.title:
            parts.append(f"# {document.title}")
        
        for node in document.content:
            # Render each node
            pass
        
        return "\n".join(parts)
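
The `pass` in the loop above is where per-node rendering goes. Because nodes nest (a paragraph holding text nodes, as in the parser example), a recursive walk is the usual approach. The sketch below uses local stand-ins for `Node` and `NodeType` so that it runs on its own. The real classes in `mcp_document_converter.core.ir` may carry more fields and node types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union


class NodeType(Enum):
    """Stand-in with only the two node types used in the parser example."""
    PARAGRAPH = "paragraph"
    TEXT = "text"


@dataclass
class Node:
    """Stand-in node: content is either a string or a list of child nodes."""
    type: NodeType
    content: Union[str, List["Node"]]


def render_node(node: Node) -> str:
    """Flatten one node to plain text by walking its children recursively."""
    if isinstance(node.content, str):
        return node.content
    return "".join(render_node(child) for child in node.content)


def render_nodes(nodes: List[Node]) -> str:
    """Render top-level nodes, separated by blank lines."""
    return "\n\n".join(render_node(node) for node in nodes)


tree = [
    Node(NodeType.PARAGRAPH, [Node(NodeType.TEXT, "Hello "), Node(NodeType.TEXT, "World")]),
    Node(NodeType.PARAGRAPH, [Node(NodeType.TEXT, "Second paragraph")]),
]
print(render_nodes(tree))
```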

Registering Extensions

python
from mcp_document_converter.registry import get_registry

# Register new parser and renderer
registry = get_registry()
registry.register_parser(MyParser())
registry.register_renderer(MyRenderer())

Testing

bash
# Run all tests
python tests/test_conversion.py

# Run specific test
python tests/test_conversion.py::test_markdown_to_html

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| MCP_CONVERTER_LOG_LEVEL | Log level | INFO |
| MCP_CONVERTER_TEMP_DIR | Temporary files directory | System temp directory |
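
For reference, here is how settings like these are commonly read, with the documented defaults applied when a variable is unset. `load_settings` is a hypothetical helper. How the server itself parses the values, including what it does with an unrecognised log level, is not specified here.

```python
import logging
import os
import tempfile
from pathlib import Path
from typing import Any, Dict, Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Read the two variables, falling back to INFO and the system temp dir."""
    env = os.environ if env is None else env
    level_name = env.get("MCP_CONVERTER_LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        # Unknown names fall back to INFO in this sketch.
        level = logging.INFO
    temp_dir = Path(env.get("MCP_CONVERTER_TEMP_DIR") or tempfile.gettempdir())
    return {"log_level": level, "temp_dir": temp_dir}


settings = load_settings()
print(settings)
```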

Dependencies

Core Dependencies

  • mcp >= 1.26.0 - MCP protocol implementation
  • pydantic >= 2.12.5 - Data validation

Parser Dependencies

  • markdown >= 3.5.0 - Markdown parsing
  • beautifulsoup4 >= 4.12.0 - HTML parsing
  • python-docx >= 1.1.0 - DOCX parsing
  • pypdf >= 6.7.4 - PDF parsing
  • chardet >= 5.0.0 - Encoding detection
  • pyyaml >= 6.0.0 - YAML parsing

Renderer Dependencies

  • weasyprint >= 60.0 - PDF rendering
  • pygments >= 2.17.0 - Code highlighting
  • jinja2 >= 3.1.6 - Template engine
  • reportlab >= 4.0.0 - PDF generation

Development Dependencies

  • pytest >= 7.0.0 - Testing framework
  • pytest-asyncio >= 0.21.0 - Async testing support
  • pytest-cov >= 4.0.0 - Coverage reporting
  • basedpyright >= 1.0.0 - Type checking
  • ruff >= 0.1.0 - Linting and formatting

License

MIT License

Contributing

Issues and Pull Requests are welcome!
