Computer Use

Productivity and workflow

by domdomegg

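Controls a computer through screenshot, mouse, and keyboard automation. Suited to UI-level operations and interactive tasks.

It chains screenshot recognition with mouse and keyboard actions to complete on-screen tasks automatically, much as a person would. This is especially useful for automating workflows that have no API.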


README

computer-use-mcp

💻 A Model Context Protocol (MCP) server that lets Claude control your computer. It is very similar to Anthropic's computer use, but easy to set up and use locally.

Here's Claude Haiku 4.5 changing my desktop background (4x speed):

https://github.com/user-attachments/assets/cd0bc190-52c4-49db-b3bc-4b8a74544789

[!WARNING] At the time of writing, models make frequent mistakes and are vulnerable to prompt injections. Because this MCP server gives the model complete control of your computer, it could do a lot of damage. Treat it like giving a hyperactive toddler access to your computer: you probably want to supervise it closely, and consider running it only in a sandboxed user account.

Installation

<details> <summary><strong>Claude Code</strong></summary>

Run:

```bash
claude mcp add --scope user --transport stdio computer-use -- npx -y computer-use-mcp
```

This installs the server at user scope (available in all projects). To install locally (current directory only), omit --scope user.

</details> <details> <summary><strong>Claude Desktop</strong></summary>

(Recommended) Via manual .dxt installation

  1. Find the latest dxt build in the GitHub Actions history (the top one)
  2. In the 'Artifacts' section, download the computer-use-mcp-dxt file
  3. Rename the .zip file to .dxt
  4. Double-click the .dxt file to open with Claude Desktop
  5. Click "Install"

(Advanced) Alternative: Via JSON configuration

  1. Install Node.js
  2. Open Claude Desktop and go to Settings → Developer
  3. Click "Edit Config" to open your claude_desktop_config.json file
  4. Add the following configuration to the "mcpServers" section:

```json
{
  "mcpServers": {
    "computer-use": {
      "command": "npx",
      "args": [
        "-y",
        "computer-use-mcp"
      ]
    }
  }
}
```

  5. Save the file and restart Claude Desktop
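
If the config file already lists other servers, the entry has to be merged in rather than pasted over them. The following is a minimal Python sketch of that merge, not part of this project: `add_server` is a hypothetical helper, and the demo writes to a throwaway temporary file rather than your real `claude_desktop_config.json`.

```python
import copy
import json
import tempfile
from pathlib import Path

# The server entry shown in the JSON configuration above.
SERVER_ENTRY = {"command": "npx", "args": ["-y", "computer-use-mcp"]}


def add_server(config_path: Path, name: str = "computer-use") -> dict:
    """Merge SERVER_ENTRY into the config file, keeping any existing servers."""
    config = {}
    if config_path.exists() and config_path.read_text().strip():
        config = json.loads(config_path.read_text())
    # Create the "mcpServers" section if it is missing, then add our entry.
    config.setdefault("mcpServers", {})[name] = copy.deepcopy(SERVER_ENTRY)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a temporary file that already contains another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(json.dumps({"mcpServers": {"other": {"command": "node"}}}))
    merged = add_server(demo_path)
    on_disk = json.loads(demo_path.read_text())
```

Back up the real file before pointing a script like this at it, and restart Claude Desktop afterwards as in the last step above.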
</details> <details> <summary><strong>Cursor</strong></summary>

(Recommended) Via one-click install

  1. Click Install MCP Server

(Advanced) Alternative: Via JSON configuration

Create either a global (~/.cursor/mcp.json) or project-specific (.cursor/mcp.json) configuration file:

```json
{
  "mcpServers": {
    "computer-use": {
      "command": "npx",
      "args": ["-y", "computer-use-mcp"]
    }
  }
}
```
</details> <details> <summary><strong>Cline</strong></summary>

(Recommended) Via marketplace

  1. Click the "MCP Servers" icon in the Cline extension
  2. Search for "Computer Use" and click "Install"
  3. Follow the prompts to install the server

(Advanced) Alternative: Via JSON configuration

  1. Click the "MCP Servers" icon in the Cline extension
  2. Click on the "Installed" tab, then the "Configure MCP Servers" button at the bottom
  3. Add the following configuration to the "mcpServers" section:
```json
{
  "mcpServers": {
    "computer-use": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "computer-use-mcp"]
    }
  }
}
```
</details>

Tips

This should just work out of the box.

However, to get best results:

  • Use a model good at computer use - I recommend the latest Claude models.
  • Use a small, common resolution - 720p works particularly well. On macOS, you can use displayoverride-mac to do this. If you can't use a different resolution, try zooming in to active windows.
  • Install and enable the Rango browser extension. This enables keyboard navigation for websites, which is far more reliable than Claude trying to click coordinates. You can bump up the font size setting in Rango to make the hints more visible.
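
To see why resolution matters when the model clicks by coordinates, it helps to look at the arithmetic of mapping a point chosen on a downscaled screenshot back to the physical screen. The sketch below is illustrative only: `fit_within` and `to_screen` are hypothetical helpers, and nothing here claims to describe how this server handles scaling internally.

```python
def fit_within(width: int, height: int, max_width: int = 1280, max_height: int = 720):
    """Return (scaled_width, scaled_height, scale) that fits inside the 720p box."""
    # Never upscale: a screen already at or below 720p keeps scale 1.0.
    scale = min(max_width / width, max_height / height, 1.0)
    return round(width * scale), round(height * scale), scale


def to_screen(x: int, y: int, scale: float):
    """Map a point on the scaled screenshot back to real screen pixels."""
    return round(x / scale), round(y / scale)


# A 2560x1440 display shrinks by half to fit 1280x720.
scaled_w, scaled_h, scale = fit_within(2560, 1440)
# A click at the centre of the screenshot lands at the centre of the display.
centre = to_screen(640, 360, scale)
```

With a scale of 0.5, each pixel of error in the model's chosen coordinate becomes two pixels on the real display, which is one reason a natively small resolution or keyboard navigation tends to be more forgiving.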

How it works

The server implements a computer use tool that is nearly identical to the one in Anthropic's official computer use guide, with some extra nudging to prefer keyboard shortcuts.

It talks to your computer using nut.js.
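
As a rough picture of what such a tool does, the Python sketch below routes a few action names from Anthropic's computer use tool (`mouse_move`, `left_click`, `key`, `type`) to an input backend. Everything else is an assumption for illustration: `RecordingBackend` and `dispatch` are hypothetical, the backend only logs calls instead of driving nut.js, and the real server is a Node.js project whose code may be organized quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingBackend:
    """Hypothetical stand-in for the real input layer; it only logs calls."""
    calls: list = field(default_factory=list)

    def move(self, x, y):
        self.calls.append(("move", x, y))

    def click(self):
        self.calls.append(("click",))

    def press(self, keys):
        self.calls.append(("press", keys))

    def type_text(self, text):
        self.calls.append(("type", text))


def dispatch(backend, action, coordinate=None, text=None):
    """Validate one tool call and forward it to the backend."""
    if action == "mouse_move":
        if coordinate is None or len(coordinate) != 2:
            raise ValueError("mouse_move requires an [x, y] coordinate")
        backend.move(*coordinate)
    elif action == "left_click":
        backend.click()
    elif action == "key":
        if not text:
            raise ValueError("key requires text such as 'ctrl+s'")
        # Split a combination like "ctrl+s" into its individual keys.
        backend.press(text.split("+"))
    elif action == "type":
        if text is None:
            raise ValueError("type requires text")
        backend.type_text(text)
    else:
        raise ValueError(f"unsupported action: {action}")


# Demo: save a file with a shortcut instead of hunting for a menu item.
backend = RecordingBackend()
dispatch(backend, "key", text="ctrl+s")
dispatch(backend, "mouse_move", coordinate=[100, 200])
dispatch(backend, "left_click")
```

The keyboard path needs no coordinates at all, which is why shortcuts tend to be more robust than clicks.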

Contributing

Pull requests are welcomed on GitHub! To get started:

  1. Install Git and Node.js
  2. Clone the repository
  3. Install dependencies with npm install
  4. Run npm run test to run tests
  5. Build with npm run build

Releases

Versions follow the semantic versioning spec.

To release:

  1. Use npm version <major | minor | patch> to bump the version
  2. Run git push --follow-tags to push with tags
  3. Wait for GitHub Actions to publish to the NPM registry.
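
For reference, the bump rules from the semantic versioning spec that step 1 relies on can be sketched in a few lines of Python. `bump` is a hypothetical helper for illustration; it handles plain `MAJOR.MINOR.PATCH` strings only, ignores pre-release and build metadata, and is not how `npm version` is implemented.

```python
def bump(version: str, part: str) -> str:
    """Return the next version after bumping the given part."""
    major, minor, patch = (int(n) for n in version.split("."))
    if part == "major":
        # Breaking change: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards-compatible fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")


next_patch = bump("1.4.2", "patch")
next_minor = bump("1.4.2", "minor")
next_major = bump("1.4.2", "major")
```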

