io.github.ofershap/real-browser

Search and retrieval

by ofershap

Combines an MCP server with a Chrome extension to give AI agents control of a browser that runs your real sessions.

README

<p align="center"> <img src="assets/logo.png" alt="Real Browser MCP" width="100" height="100" /> </p> <h1 align="center">real-browser-mcp</h1> <p align="center"> <strong>The missing piece in AI coding: your agent can now see your REAL browser.</strong> </p> <p align="center"> <a href="https://chromewebstore.google.com/detail/real-browser-mcp/fkkimpklpgedomcheiojngaaaicmaidi"><img src="https://img.shields.io/badge/Chrome_Extension-4285F4?style=for-the-badge&logo=googlechrome&logoColor=white" alt="Chrome Extension" /></a> &nbsp; <a href="https://www.npmjs.com/package/real-browser-mcp"><img src="https://img.shields.io/badge/MCP_Server-CB3837?style=for-the-badge&logo=npm&logoColor=white" alt="MCP Server" /></a> &nbsp; <a href="cursor://anysphere.cursor-deeplink/mcp/install?name=real-browser&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInJlYWwtYnJvd3Nlci1tY3AiXX0="><img src="https://img.shields.io/badge/Add_to_Cursor-6366f1?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCI+PHBhdGggZD0iTTEyIDJMMiA3bDEwIDUgMTAtNS0xMC01ek0yIDE3bDEwIDUgMTAtNS0xMC01LTEwIDV6TTIgMTJsMTAgNSAxMC01LTEwLTUtMTAgNXoiIGZpbGw9IndoaXRlIi8+PC9zdmc+" alt="Add to Cursor" /></a> &nbsp; <a href="#-teach-your-agent"><img src="https://img.shields.io/badge/🧠_Agent_Rules-22c55e?style=for-the-badge" alt="Agent Rules" /></a> </p> <p align="center"> <a href="https://github.com/ofershap/real-browser-mcp/actions/workflows/ci.yml"><img src="https://github.com/ofershap/real-browser-mcp/actions/workflows/ci.yml/badge.svg" alt="CI" /></a> <a href="https://www.npmjs.com/package/real-browser-mcp"><img src="https://img.shields.io/npm/v/real-browser-mcp.svg" alt="npm version" /></a> <a href="https://www.npmjs.com/package/real-browser-mcp"><img src="https://img.shields.io/npm/dm/real-browser-mcp.svg" alt="npm downloads" /></a> <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" 
alt="License: MIT" /></a> <a href="https://www.typescriptlang.org/"><img src="https://img.shields.io/badge/TypeScript-strict-blue" alt="TypeScript" /></a> </p>

You ship a fix. Your agent says "done, please verify." You alt-tab to Chrome, navigate to the page, log in, click around, find the bug.

Your agent just wrote the code. It could also verify it. It already has your browser open right there. It just can't see it.

Now it can.

<p align="center"> <img src="assets/preview.png" alt="Real Browser MCP" width="100%" /> </p>

Quick Start

Two parts:

  • MCP server - runs on your machine, talks to your AI agent
  • Chrome extension - sits in your browser, executes the commands

1. Add the MCP server

Cursor (one click):

<img src="https://cursor.com/deeplink/mcp-install-dark.svg" alt="Install in Cursor" height="32" />

Or add manually in Cursor Settings > MCP > "Add new MCP server":

json
{
  "mcpServers": {
    "real-browser": {
      "command": "npx",
      "args": ["-y", "real-browser-mcp"]
    }
  }
}
<details> <summary>Claude Desktop, Windsurf, or other MCP clients</summary>

Claude Desktop: Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows). Add the same JSON block.

Windsurf: Settings > MCP. Same config.

Any MCP-compatible client works.
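
If your client config already lists other servers, the entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that. `addRealBrowserServer` and the `some-other-server` entry are hypothetical names used for illustration and are not part of real-browser-mcp. Only the `real-browser` entry itself comes from the JSON block above.

```typescript
// Shape of an MCP client config file, following the JSON block above.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper (not shipped with real-browser-mcp): returns a copy of
// the config with the "real-browser" entry added and other servers untouched.
function addRealBrowserServer(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "real-browser": { command: "npx", args: ["-y", "real-browser-mcp"] },
    },
  };
}

// Example: a config that already lists another (made-up) server.
const existing: McpConfig = {
  mcpServers: { "some-other-server": { command: "node", args: ["server.js"] } },
};
const merged = addRealBrowserServer(existing);
```

Serializing `merged` with `JSON.stringify(merged, null, 2)` gives text you could write back to the client's config file.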

</details>

2. Install the Chrome extension

<img src="https://developer.chrome.com/static/docs/webstore/branding/image/iNEddTyWiMfLSwFD6qGq.png" alt="Available in the Chrome Web Store" height="58" />

Or load from source:

bash
git clone https://github.com/ofershap/real-browser-mcp.git

Then:

  1. Open chrome://extensions and enable Developer mode (toggle in the top right)
  2. Click Load unpacked and select the extension/ folder from the cloned repo

Click the Real Browser MCP icon in your toolbar.

Green dot = connected. Gray = waiting for server.

Done. Your agent can see your browser.


How Others Compare

| | Real Browser MCP | Playwright MCP | Chrome DevTools MCP |
|---|---|---|---|
| Uses your existing browser | Yes | No, launches new | Partial, needs debug port |
| Sessions and cookies | Already there | Fresh profile | Manual setup |
| Works behind corporate SSO | Yes | No | Depends |
| Setup | Extension + MCP config | Headless browser | Chrome with `--remote-debugging-port` |

🧠 Teach Your Agent

The agent can use all 18 tools out of the box, but it works better when it knows when and how to chain them. A config file teaches the right workflow - snapshot first, then act, then verify.

Run one command:

bash
npx real-browser-mcp --setup cursor

This installs:

  • ~/.cursor/rules/real-browser-mcp.mdc - teaches the snapshot-first workflow, how to handle dropdowns, when to use screenshots vs snapshots
  • ~/.cursor/commands/check-browser.md - adds /check-browser to your Cursor chat

After that, type /check-browser in any chat. Or just say "check the result in my browser" and the agent knows what to do.

<details> <summary>Claude Code setup</summary>
bash
npx real-browser-mcp --setup claude

Adds an AGENTS.md to your project root. Claude Code auto-discovers it.

</details>

See agent-config/ for manual installation or to customize the rules.
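
The snapshot-first workflow described at the top of this section can be pictured as an ordered list of tool calls. The sketch below is illustrative only: the tool names come from the tool list in this README, but `planClickAndVerify`, the `ToolStep` type, and the argument shapes (`text`, `selector`) are assumptions, not the server's documented schema.

```typescript
// One step in a tool-call plan. Tool names are taken from this README;
// the argument shapes are assumptions made for illustration.
interface ToolStep {
  tool: string;
  args: Record<string, unknown>;
  purpose: "observe" | "act" | "verify";
}

// Hypothetical planner: snapshot first, then act, then verify.
function planClickAndVerify(buttonText: string, expectSelector: string): ToolStep[] {
  return [
    // 1. Observe: get element refs before touching anything.
    { tool: "browser_snapshot", args: {}, purpose: "observe" },
    // 2. Act: click the button by its visible text.
    { tool: "browser_click_text", args: { text: buttonText }, purpose: "act" },
    // 3. Verify: wait for the expected element, then snapshot again.
    { tool: "browser_wait", args: { selector: expectSelector }, purpose: "verify" },
    { tool: "browser_snapshot", args: {}, purpose: "verify" },
  ];
}

// Example: click "Save" and confirm that a (made-up) success toast appears.
const plan = planClickAndVerify("Save", ".toast-success");
```

The point of the ordering is that the agent never acts on a page it has not looked at, and never reports success without a second look.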


What It Can Do

18 tools. Grouped by purpose.

See

| Tool | What it does |
|---|---|
| browser_snapshot | Accessibility tree with element refs. Compact mode (default) returns only interactive elements |
| browser_screenshot | Capture what's on screen |
| browser_text | Extract raw text from page or element |
| browser_find | Query elements by CSS selector |

Interact

| Tool | What it does |
|---|---|
| browser_click | Click by ref or CSS selector |
| browser_click_text | Click by visible text. Works through React portals and overlays |
| browser_type | Type into inputs and contenteditable fields |
| browser_press_key | Key combos (Enter, Escape, Ctrl+A) |
| browser_scroll | Scroll pages and virtual containers |
| browser_hover | Trigger tooltips and dropdowns |
| browser_select | Pick from native `<select>` dropdowns |
| browser_wait | Wait for elements to appear or disappear |

Navigate

| Tool | What it does |
|---|---|
| browser_navigate | Go to a URL in the active tab |
| browser_tabs | List, create, close, or focus tabs |

Debug

| Tool | What it does |
|---|---|
| browser_console | Console output (log, warn, error) |
| browser_network | XHR/fetch requests with status codes |
| browser_evaluate | Run JavaScript via Chrome DevTools Protocol |
| browser_handle_dialog | Handle alert/confirm/prompt dialogs |

Configuration

| Env var | Default | What it does |
|---|---|---|
| WS_PORT | 7225 | WebSocket port for extension connection |

Connection drops are handled automatically with exponential backoff (1s to 30s), ping/pong health checks every 10s, and per-tool timeouts (5s for clicks, 60s for navigation).
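
To make those numbers concrete, here is a minimal sketch of a capped exponential backoff and a per-tool timeout lookup. It is not the project's actual code. The 1s and 30s bounds and the 5s and 60s timeouts come from the paragraph above. The doubling factor, the mapping of "clicks" to `browser_click` and "navigation" to `browser_navigate`, and the 10s fallback are assumptions.

```typescript
const BASE_DELAY_MS = 1_000;  // 1s lower bound, as stated above
const MAX_DELAY_MS = 30_000;  // 30s upper bound, as stated above

// Delay before reconnect attempt `attempt` (0-based).
// Doubling on each attempt is an assumption; only the bounds are documented.
function reconnectDelayMs(attempt: number): number {
  const safeAttempt = Math.max(0, Math.floor(attempt));
  return Math.min(BASE_DELAY_MS * 2 ** safeAttempt, MAX_DELAY_MS);
}

// Per-tool timeouts. The 5s and 60s values are documented; which tool names
// they attach to is assumed here.
const TOOL_TIMEOUT_MS: Record<string, number> = {
  browser_click: 5_000,
  browser_navigate: 60_000,
};
const DEFAULT_TIMEOUT_MS = 10_000; // assumption: fallback is not documented

function timeoutFor(tool: string): number {
  return TOOL_TIMEOUT_MS[tool] ?? DEFAULT_TIMEOUT_MS;
}

// Delays for the first seven attempts:
// 1000, 2000, 4000, 8000, 16000, 30000, 30000
const delays = [0, 1, 2, 3, 4, 5, 6].map(reconnectDelayMs);
```

Capping the delay keeps a long outage from pushing the retry interval past 30s, so the extension reconnects soon after the server comes back.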

<details> <summary>Multiple Chrome profiles</summary>

Run two server instances on different ports:

json
{
  "mcpServers": {
    "browser-work": {
      "command": "npx", "args": ["-y", "real-browser-mcp"]
    },
    "browser-personal": {
      "command": "npx", "args": ["-y", "real-browser-mcp"],
      "env": { "WS_PORT": "9333" }
    }
  }
}

Update the port in each extension popup to match.
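
As a rough illustration of how a `WS_PORT` override with a 7225 default might be resolved, consider the sketch below. `resolveWsPort` is a hypothetical helper, and falling back to the default on an invalid value is an assumption. The README does not say how the real server treats bad input.

```typescript
const DEFAULT_WS_PORT = 7225; // default from the Configuration table

// Hypothetical helper: read WS_PORT from an env-like object, falling back to
// the default when it is unset, empty, or not a valid TCP port (assumption).
function resolveWsPort(env: Record<string, string | undefined>): number {
  const raw = env["WS_PORT"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_WS_PORT;
  const port = Number(raw);
  const valid = isFinite(port) && Math.floor(port) === port && port >= 1 && port <= 65535;
  return valid ? port : DEFAULT_WS_PORT;
}

// The two instances from the config above would resolve to different ports.
const workPort = resolveWsPort({});                        // 7225
const personalPort = resolveWsPort({ WS_PORT: "9333" });   // 9333
```

In a Node process you would pass `process.env` as the argument.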

</details>
<details> <summary><strong>Architecture</strong></summary>

Everything stays on your machine. The extension connects to the MCP server via WebSocket on localhost. No cloud, no proxy, nothing leaves your browser.

code
real-browser-mcp/
├── mcp-server/          MCP server (npm package, TypeScript)
│   └── src/tools/       One file per tool, registry pattern
├── extension/           Chrome extension (Manifest V3, plain JS)
│   ├── background.js    Service worker, WebSocket client, tool handlers
│   ├── content.js       Console capture
│   └── popup/           Connection status UI
├── agent-config/        Pre-built configs for Cursor + Claude Code
│   ├── cursor/          Rules and commands
│   ├── skills/          Browser automation skill
│   └── setup.mjs        One-command installer
└── tests/               Bridge + registry tests

Stack: TypeScript (strict) · MCP SDK · WebSocket · Chrome Extension Manifest V3 · Vitest

</details> <details> <summary><strong>Development</strong></summary>
bash
git clone https://github.com/ofershap/real-browser-mcp.git
cd real-browser-mcp
npm install
npm run build
npm test

| Command | What it does |
|---|---|
| npm run build | Compile TypeScript |
| npm run dev | Watch mode |
| npm test | Run tests |
| npm run typecheck | Type check without emitting |
| npm run setup:cursor | Install Cursor rule + command |

</details>

FAQ

<details> <summary>Does it work with my logged-in sessions?</summary>

That's the whole point. The extension runs inside your actual Chrome - same cookies, same sessions, same local storage. No re-authentication needed.

</details> <details> <summary>Does it send data anywhere?</summary>

No. The MCP server and extension talk over WebSocket on localhost. Nothing leaves your machine. There's no analytics, no telemetry, no cloud component. See the project's privacy policy for details.

</details> <details> <summary>Which AI clients work?</summary>

Any MCP-compatible client: Cursor, Claude Desktop, Claude Code, Windsurf, Cline, and anything else that speaks MCP.

</details> <details> <summary>Can I use it with multiple Chrome profiles?</summary>

Yes. Run two MCP server instances on different ports. See Configuration for the setup.

</details> <details> <summary>How is this different from Playwright MCP or browser-use?</summary>

They launch a new browser instance from scratch - no state, no cookies, no sessions. You have to replay the full login flow every time. This connects to the browser you already have open with everything already loaded.

</details>

Contributing

Bug reports, feature requests, and PRs welcome. Open an issue first for larger changes.

Author

Made by ofershap


License

MIT © Ofer Shapira
