io.github.pedro-rivas/android-puppeteer-mcp

by pedro-rivas

An MCP server for Android automation, supporting UI interaction, screenshot capture, and device control.

README

mcp-name: io.github.pedro-rivas/android-puppeteer-mcp

<div align="center"> <h1>Android Puppeteer</h1> <a href="https://github.com/pedro-rivas/android-puppeteer-mcp/blob/main/LICENSE"> <img src="https://img.shields.io/badge/license-MIT-green" alt="License"> </a> <img src="https://img.shields.io/badge/python-3.10%2B-blue" alt="Python"> <img src="https://img.shields.io/badge/platform-Android%2010+-blue" alt="Platform"> <img src="https://img.shields.io/badge/mcp-server-purple" alt="MCP Server"> </div> <br>

Android Puppeteer is a lightweight, visual-first MCP (Model Context Protocol) server that enables AI agents to interact with Android devices through intelligent UI element detection and automated interactions. Built on uiautomator2, it provides comprehensive Android automation capabilities including visual element detection, touch interactions, text input, and video recording.

🎥 Watch the demo in action

Features

  • Visual Element Detection: Automatically detects and annotates interactive UI elements with numbered overlays for precise targeting.

  • Comprehensive Touch Interactions: Support for tap, long press, swipe, scroll, and drag gestures with coordinate-based precision.

  • Multi-Device Support: Connect to multiple Android devices or emulators simultaneously with device-specific targeting.

  • Video Recording Integration: Built-in screen recording capabilities using scrcpy for documentation and testing workflows.

  • Real-Time UI Analysis: Live UI hierarchy parsing and element information extraction for dynamic interaction strategies.

  • MCP Protocol Integration: Seamless integration with Claude Desktop and other MCP-compatible AI platforms.

Supported Operating Systems

  • Android 10+
  • Windows, macOS, Linux (host systems)

Installation

Prerequisites

  • Python 3.10+
  • uiautomator2
  • Android 10+ (Emulator or Physical Device)
  • ADB (Android Debug Bridge)
  • scrcpy (for video recording features)

Getting Started

  1. Clone the repository
shell
git clone https://github.com/pedro-rivas/android-puppeteer-mcp.git
cd android-puppeteer-mcp
  2. Install dependencies
shell
uv python install 3.10
uv sync
  3. Set up the Android device
shell
# Enable USB debugging on your Android device
# For an emulator, ensure it is running
adb devices  # Verify device connection
  4. Connect to the MCP server

  5. Locate your Claude Desktop configuration file:

    • Windows: %APPDATA%\Claude\claude_desktop_config.json
    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  6. Add the following JSON to your Claude Desktop config:

    json
    {
      "mcpServers": {
        "android-puppeteer": {
          "command": "path/to/uv",
          "args": [
            "--directory",
            "path/to/android-puppeteer",
            "run",
            "puppeteer.py"
          ]
        }
      }
    }
    

    Replace:

    • path/to/uv with the actual path to your uv executable
    • path/to/android-puppeteer with the absolute path to the directory where you cloned this repo
  7. Restart Claude Desktop

After Claude Desktop restarts, "android-puppeteer" should be listed as an available integration.
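
The manual config edit described above can also be scripted. The following is a minimal sketch and not part of this project: `add_server_entry` and `update_config_file` are hypothetical helpers that merge the `android-puppeteer` entry into an existing configuration without disturbing other servers. The paths you pass in are placeholders, exactly as in the JSON above.

```python
import json
from pathlib import Path


def add_server_entry(config: dict, uv_path: str, repo_dir: str) -> dict:
    """Return a copy of `config` with the android-puppeteer entry added.

    Hypothetical helper for illustration; it mirrors the JSON shown above.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["android-puppeteer"] = {
        "command": uv_path,
        "args": ["--directory", repo_dir, "run", "puppeteer.py"],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(config_file: Path, uv_path: str, repo_dir: str) -> None:
    """Read the config file if it exists, add the entry, and write it back."""
    if config_file.exists():
        existing = json.loads(config_file.read_text(encoding="utf-8"))
    else:
        existing = {}
    updated = add_server_entry(existing, uv_path, repo_dir)
    config_file.write_text(json.dumps(updated, indent=2), encoding="utf-8")
```

Working on a copy keeps any servers that are already configured intact. It is worth backing up the config file before letting a script rewrite it.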


Available Tools

Android Puppeteer provides the following tools for comprehensive Android device interaction:

Device Management

  • list_emulators: List all available Android emulators and devices with their status and dimensions
  • get_device_dimensions: Get the screen dimensions of a specific Android device
  • get_ui_elements_info: Get detailed information about all interactive UI elements on screen

Visual Interaction

  • take_screenshot: Capture annotated screenshots with numbered UI element overlays
  • press: Tap on specific coordinates with optional long press duration
  • long_press: Perform long press gestures on specific coordinates
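
To illustrate how a numbered overlay can be turned into coordinates for press, here is a minimal sketch. It assumes the UI hierarchy is available as UI Automator XML, where each node carries a `bounds="[left,top][right,bottom]"` attribute and a `clickable` flag. The function name and the numbering scheme are hypothetical and may differ from what the server does internally.

```python
import re
import xml.etree.ElementTree as ET

# Matches the UI Automator bounds format: [left,top][right,bottom]
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def clickable_centers(hierarchy_xml: str) -> dict[int, tuple[int, int]]:
    """Map a 1-based element number to the center (x, y) of each clickable node."""
    root = ET.fromstring(hierarchy_xml)
    centers = {}
    number = 1
    for node in root.iter("node"):
        if node.get("clickable") != "true":
            continue
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue  # skip nodes with missing or malformed bounds
        left, top, right, bottom = map(int, match.groups())
        centers[number] = ((left + right) // 2, (top + bottom) // 2)
        number += 1
    return centers


# Hand-written sample hierarchy, not captured from a real device
SAMPLE = """
<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" clickable="false" bounds="[0,0][1080,2400]">
    <node class="android.widget.Button" text="OK" clickable="true" bounds="[100,200][300,400]" />
    <node class="android.widget.EditText" clickable="true" bounds="[0,1000][1080,1200]" />
  </node>
</hierarchy>
""".strip()

print(clickable_centers(SAMPLE))  # {1: (200, 300), 2: (540, 1100)}
```

With a mapping like this, "tap element 1" becomes a press at (200, 300).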

Navigation & Input

  • press_back: Press the hardware back button
  • swipe: Perform directional or custom coordinate swipes
  • type_text: Type text into focused input fields with optional text clearing
  • scroll_element: Scroll specific UI elements in any direction
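
A directional swipe ultimately has to become a pair of screen coordinates. The sketch below shows one plausible mapping from a direction and the device dimensions (such as those reported by get_device_dimensions) to start and end points. The function name, the default `span`, and the convention that "down" moves the finger downward are all assumptions for illustration, not the server's documented behavior.

```python
def swipe_points(width: int, height: int, direction: str, span: float = 0.5):
    """Return ((x1, y1), (x2, y2)) for a centered swipe covering `span` of the screen."""
    if not 0 < span <= 1:
        raise ValueError("span must be in (0, 1]")
    cx, cy = width // 2, height // 2
    # Half of the swipe length along each axis
    dx, dy = int(width * span / 2), int(height * span / 2)
    moves = {
        "up": ((cx, cy + dy), (cx, cy - dy)),
        "down": ((cx, cy - dy), (cx, cy + dy)),
        "left": ((cx + dx, cy), (cx - dx, cy)),
        "right": ((cx - dx, cy), (cx + dx, cy)),
    }
    try:
        return moves[direction]
    except KeyError:
        raise ValueError(f"unknown direction: {direction!r}") from None


print(swipe_points(1080, 2400, "down"))  # ((540, 600), (540, 1800))
```

Keeping the swipe centered and shorter than the full screen avoids starting on the screen edges, where system gestures may intercept the touch.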

Recording & Documentation

  • record_video: Start screen recording with customizable quality settings
  • stop_video: Stop active screen recordings and save to local storage

Usage Examples

Basic Device Interaction

python
# Take an annotated screenshot
screenshot = await take_screenshot()

# Tap at the coordinates of an element identified in the annotated screenshot (for example, element 5)
await press(x=500, y=300)

# Type text into an input field
await type_text("Hello, Android!")

# Swipe to scroll down
await swipe(direction="down")

Multi-Device Automation

python
# List available devices
devices = await list_emulators()

# Target specific device
await take_screenshot(device_id="emulator-5554")
await press(x=200, y=400, device_id="emulator-5554")

Video Recording Workflow

python
# Start recording
await record_video(filename="test_session.mp4")

# Perform automation steps
await press(x=300, y=500)
await type_text("Automated test input")

# Stop recording
await stop_video()
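
Since recording relies on scrcpy, a recording session presumably comes down to launching a scrcpy process and stopping it later. The sketch below is an assumption about how such a command might be assembled, not the project's code: `build_record_command` is a hypothetical name, and the flags used (`--serial`, `--record`, `--no-playback`, `--video-bit-rate`) are those of scrcpy 2.1 or later, so older releases may need different names.

```python
from pathlib import Path


def build_record_command(filename, device_id=None, bit_rate=None, videos_dir="videos"):
    """Build an argument list that records the device screen to videos_dir/filename."""
    output = Path(videos_dir) / filename
    # --no-playback records without opening a mirror window (scrcpy 2.1+)
    cmd = ["scrcpy", "--no-playback", f"--record={output}"]
    if device_id is not None:
        cmd.append(f"--serial={device_id}")  # pick one device when several are connected
    if bit_rate is not None:
        cmd.append(f"--video-bit-rate={bit_rate}")  # e.g. "8M"
    return cmd


print(build_record_command("test_session.mp4", device_id="emulator-5554"))
```

The resulting list can be passed to `subprocess.Popen`. Stopping the recording then means interrupting that process so scrcpy can finalize the file.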

Project Structure

code
android-puppeteer/
    puppeteer.py         # Main MCP server implementation
    main.py              # Entry point
    pyproject.toml       # Project configuration
    ss/                  # Screenshots directory
    videos/              # Video recordings directory
    README.md            # This file

Important Notes

  • Device Permissions: Ensure USB debugging is enabled on target Android devices
  • Network Access: Some features require network connectivity for device communication
  • Storage: Screenshot and video files are saved locally in ss/ and videos/ directories
  • Performance: Response times depend on device performance and network latency

Troubleshooting

Common Issues

  1. Device not found: Verify ADB connection with adb devices
  2. Permission denied: Check USB debugging and device authorization
  3. Screenshot failures: Ensure device screen is unlocked and accessible
  4. Video recording issues: Verify scrcpy installation and device compatibility
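
The first two issues can usually be told apart from the output of adb devices, which prints one line per device with its serial and state. Here is a minimal sketch of reading that output. `parse_adb_devices` is a hypothetical helper, and the sample text is hand-written rather than captured from a real session.

```python
def parse_adb_devices(output: str) -> dict[str, str]:
    """Parse the text printed by `adb devices` into {serial: state}."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue  # skip the header, blank lines, and adb daemon notices
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


# Hand-written sample; the second serial is made up
SAMPLE = """List of devices attached
emulator-5554\tdevice
R58M12345AB\tunauthorized
"""

print(parse_adb_devices(SAMPLE))
# {'emulator-5554': 'device', 'R58M12345AB': 'unauthorized'}
```

A state of `device` means the device is ready. `unauthorized` means the USB debugging prompt on the device has not been accepted yet, and `offline` means the device is connected but not responding. An empty result corresponds to the "Device not found" case.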

Debug Mode

Run the server directly for debugging:

shell
uv run puppeteer.py

License

This project is licensed under the MIT License. See LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests and ensure code quality
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Star this repo if you find it useful!
