Intercept

by bighippoman

Give your AI the ability to read the web: fetch any URL as clean markdown, with a multi-tier fallback pipeline.

README

intercept-mcp

Give your AI the ability to read the web. One command, no API keys required.

Without it, your AI hits a URL and gets a 403, a wall, or a wall of raw HTML. With intercept, it almost always gets the content — clean markdown, ready to use.

Handles tweets, YouTube videos, arXiv papers, PDFs, and regular web pages. If the first strategy fails, it tries up to 8 more before giving up.

Works with any MCP client: Claude Code, Claude Desktop, Codex, Cursor, Windsurf, Cline, and more.

Install

Claude Code

```bash
claude mcp add intercept -s user -- npx -y intercept-mcp
```

Codex

```bash
codex mcp add intercept -- npx -y intercept-mcp
```

Cursor

Settings → MCP → Add Server:

```json
{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}
```

Windsurf

Settings → MCP → Add Server → same JSON config as above.

Claude Desktop

Add to your claude_desktop_config.json:

```json
{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}
```

Other MCP clients

Any client that supports stdio MCP servers can run npx -y intercept-mcp.

No API keys needed for the fetch tool.

How it works

URLs are processed in three stages:

1. Site-specific handlers

Known URL patterns are routed to dedicated handlers before the fallback pipeline:

| Pattern | Handler | What you get |
| --- | --- | --- |
| `twitter.com/*/status/*`, `x.com/*/status/*` | Twitter/X | Tweet text, author, media, engagement stats |
| `youtube.com/watch?v=*`, `youtu.be/*` | YouTube | Title, channel, duration, views, description |
| `arxiv.org/abs/*`, `arxiv.org/pdf/*` | arXiv | Paper metadata, authors, abstract, categories |
| `*.pdf` | PDF | Extracted text (text-layer PDFs only) |
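
The routing step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `routes` table, `pickHandler`, and the handler names are hypothetical, and the regular expressions are one plausible reading of the patterns in the table.

```typescript
// Hypothetical handler names; the real project's internals may differ.
type HandlerName = "twitter" | "youtube" | "arxiv" | "pdf";

interface Route {
  name: HandlerName;
  matches: (url: URL) => boolean;
}

// Treat "www.example.com" and "example.com" as the same host.
const stripWww = (host: string): string => host.replace(/^www\./, "");

// Order matters: arXiv PDF links must be claimed by "arxiv" before the generic "pdf" rule.
const routes: Route[] = [
  {
    name: "twitter",
    matches: (u) => {
      const host = stripWww(u.hostname);
      return (host === "twitter.com" || host === "x.com") &&
        /^\/[^/]+\/status\/[^/]+/.test(u.pathname);
    },
  },
  {
    name: "youtube",
    matches: (u) => {
      const host = stripWww(u.hostname);
      if (host === "youtu.be") return u.pathname.length > 1;
      return host === "youtube.com" && u.pathname === "/watch" && u.searchParams.has("v");
    },
  },
  {
    name: "arxiv",
    matches: (u) =>
      stripWww(u.hostname) === "arxiv.org" && /^\/(abs|pdf)\/.+/.test(u.pathname),
  },
  {
    name: "pdf",
    matches: (u) => u.pathname.toLowerCase().endsWith(".pdf"),
  },
];

// Returns the first matching handler name, or null to signal "use the fallback pipeline".
function pickHandler(rawUrl: string): HandlerName | null {
  const url = new URL(rawUrl);
  const route = routes.find((r) => r.matches(url));
  return route ? route.name : null;
}
```

A `null` result corresponds to the "no handler matches" case, where the URL would go to the fallback pipeline.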

2. Fallback pipeline

If no handler matches (or the handler returns nothing), the URL enters the multi-tier pipeline:

| Tier | Fetcher | Strategy |
| --- | --- | --- |
| 1 | Jina Reader | Clean text extraction service |
| 2 | Wayback + Codetabs | Archived version + CORS proxy (run in parallel) |
| 3 | Raw fetch | Direct GET with browser headers |
| 4 | RSS, CrossRef, Semantic Scholar, HN, Reddit | Metadata / discussion fallbacks |
| 5 | OG Meta | Open Graph tags (guaranteed fallback) |

Tier 2 fetchers run in parallel. When both succeed, the higher quality result wins. All other tiers run sequentially.
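
To make the tier semantics concrete, here is a small sketch of how such a pipeline could be driven. All names (`Fetcher`, `Tier`, `runTier`, `runPipeline`) and the numeric `quality` field are assumptions made for illustration; the README does not describe the real data structures.

```typescript
interface FetchResult {
  source: string;
  markdown: string;
  quality: number; // hypothetical score, higher is better
}

// A fetcher resolves to a result, or to null when it produced nothing usable.
type Fetcher = (url: string) => Promise<FetchResult | null>;

interface Tier {
  fetchers: Fetcher[];
  parallel: boolean; // true only for tier 2 in the table above
}

async function runTier(tier: Tier, url: string): Promise<FetchResult | null> {
  if (tier.parallel) {
    // Start every fetcher at once; a rejected fetcher counts as "no result".
    const results = await Promise.all(
      tier.fetchers.map((f) => f(url).catch(() => null)),
    );
    let best: FetchResult | null = null;
    for (const r of results) {
      if (r !== null && (best === null || r.quality > best.quality)) best = r;
    }
    return best;
  }
  // Sequential tier: the first fetcher that returns content wins.
  for (const f of tier.fetchers) {
    const r = await f(url).catch(() => null);
    if (r !== null) return r;
  }
  return null;
}

// Walks tiers in order and stops at the first one that yields content.
// maxTier mirrors the fetch tool's optional parameter of the same name.
async function runPipeline(
  tiers: Tier[],
  url: string,
  maxTier: number = tiers.length,
): Promise<FetchResult | null> {
  for (const tier of tiers.slice(0, maxTier)) {
    const r = await runTier(tier, url);
    if (r !== null) return r;
  }
  return null;
}
```

In this sketch a lower `maxTier` only truncates the list of tiers, which is one simple way to trade coverage for speed.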

3. Caching

Results are cached in-memory for the session (max 100 entries, LRU eviction). Failed URLs are also cached to prevent re-attempting known-dead URLs.
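
A cache with these properties can be sketched in a few lines using the insertion order of `Map`. The `LruCache` class and `CacheEntry` type below are hypothetical names, and this is only one common way to get LRU eviction, not necessarily how intercept-mcp implements it.

```typescript
// A cached value is either fetched content or a recorded failure (negative caching).
type CacheEntry =
  | { ok: true; markdown: string }
  | { ok: false; error: string };

class LruCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly maxEntries: number;

  constructor(maxEntries: number = 100) {
    this.maxEntries = maxEntries;
  }

  get(url: string): CacheEntry | undefined {
    const entry = this.entries.get(url);
    if (entry === undefined) return undefined;
    // Re-insert so this key becomes the most recently used; Map keeps insertion order.
    this.entries.delete(url);
    this.entries.set(url, entry);
    return entry;
  }

  set(url: string, entry: CacheEntry): void {
    this.entries.delete(url);
    this.entries.set(url, entry);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Storing failures as ordinary entries means a known-dead URL is answered from memory instead of re-running every tier.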

Tools

fetch

Fetch a URL and return its content as clean markdown.

  • url (string, required) — URL to fetch
  • maxTier (number, optional, 1-5) — Stop at this tier for speed-sensitive cases

search

Search the web and return results.

  • query (string, required) — Search query
  • count (number, optional, 1-20, default 5) — Number of results

Uses Brave Search API if BRAVE_API_KEY is set, otherwise falls back to SearXNG.
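
MCP clients normally build tool calls for you, but it can help to see what a call to each tool looks like on the wire. The sketch below uses the standard MCP `tools/call` JSON-RPC envelope; the `toolCall` helper and the example URL and query are hypothetical, while the tool names and argument names come from the lists above.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps tool arguments in the request envelope.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// fetch: stop after tier 2 for a faster answer.
const fetchRequest = toolCall(1, "fetch", {
  url: "https://example.com/article",
  maxTier: 2,
});

// search: ask for five results (the documented default).
const searchRequest = toolCall(2, "search", {
  query: "model context protocol",
  count: 5,
});

console.log(JSON.stringify(fetchRequest));
console.log(JSON.stringify(searchRequest));
```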

Environment variables

| Variable | Required | Description |
| --- | --- | --- |
| BRAVE_API_KEY | No | Brave Search API key (free tier: 2,000 queries/month) |
| SEARXNG_URL | No | Self-hosted SearXNG instance URL |

The search tool needs at least one backend configured. Public SearXNG instances are rate-limited and unreliable in practice. A free Brave Search API key (2,000 queries/month) is the realistic zero-cost option. Set SEARXNG_URL only if you run your own instance.

The fetch tool works without any keys.
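
The backend preference described in this section (Brave first, then SearXNG) can be sketched as below. `pickSearchBackend` and the `SearchBackend` type are hypothetical, and returning `null` when neither variable is set is an assumption of this sketch; the real server may behave differently in that case.

```typescript
type SearchBackend =
  | { kind: "brave"; apiKey: string }
  | { kind: "searxng"; baseUrl: string };

// Takes the environment as a plain record so the function is easy to test.
function pickSearchBackend(
  env: Record<string, string | undefined>,
): SearchBackend | null {
  const braveKey = (env["BRAVE_API_KEY"] ?? "").trim();
  if (braveKey !== "") return { kind: "brave", apiKey: braveKey };

  const searxngUrl = (env["SEARXNG_URL"] ?? "").trim();
  if (searxngUrl !== "") {
    // Drop trailing slashes so paths can be appended consistently.
    return { kind: "searxng", baseUrl: searxngUrl.replace(/\/+$/, "") };
  }

  // Assumption: no backend configured means search is unavailable.
  return null;
}
```

In a Node.js process this would typically be called as `pickSearchBackend(process.env)`.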

URL normalization

Incoming URLs are automatically cleaned:

  • Strips 60+ tracking params (UTM, click IDs, analytics, A/B testing, etc.)
  • Removes hash fragments
  • Upgrades to HTTPS
  • Cleans AMP artifacts
  • Preserves functional params (ref, format, page, offset, limit)
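
A simplified version of these cleaning steps, using the standard `URL` API, might look like this. The `TRACKING_PARAMS` set is a small illustrative subset (the README does not list the 60+ parameters), `normalizeUrl` is a hypothetical name, and AMP cleanup is left out because its rules are not described here.

```typescript
// Illustrative subset only; the real list is said to cover 60+ parameters.
const TRACKING_PARAMS = new Set(["fbclid", "gclid", "msclkid", "mc_cid", "mc_eid"]);

function normalizeUrl(rawUrl: string): string {
  const url = new URL(rawUrl);

  // Upgrade plain HTTP to HTTPS.
  if (url.protocol === "http:") url.protocol = "https:";

  // Remove the hash fragment.
  url.hash = "";

  // Collect keys first so searchParams is not mutated while being walked.
  const keys: string[] = [];
  url.searchParams.forEach((_value, key) => {
    keys.push(key);
  });

  for (const key of keys) {
    const lower = key.toLowerCase();
    if (lower.startsWith("utm_") || TRACKING_PARAMS.has(lower)) {
      url.searchParams.delete(key);
    }
  }
  // Anything not matched above (ref, format, page, offset, limit, ...) is kept.
  return url.toString();
}
```

Because only known tracking keys are deleted, functional parameters such as `page` or `ref` pass through unchanged.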

Content quality detection

Each fetcher result is scored for quality. Automatic fail on:

  • CAPTCHA / Cloudflare challenges
  • Login walls
  • HTTP error pages in body
  • Content under 200 characters
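
The automatic-fail rules above could be approximated with a few pattern checks. Everything in this sketch except the 200-character threshold is an assumption: the marker phrases, the `checkQuality` name, and the reason labels are hypothetical, and a real detector would need to be more careful about false positives (for example, an article that merely mentions CAPTCHAs).

```typescript
interface QualityVerdict {
  ok: boolean;
  reason?: string;
}

// Hypothetical marker phrases; the actual patterns are not documented in the README.
const CHALLENGE_MARKERS = [/captcha/i, /checking your browser/i, /just a moment/i];
const LOGIN_MARKERS = [/sign in to continue/i, /log in to continue/i];
const ERROR_MARKERS = [/^\s*(403 forbidden|404 not found|access denied)/i];

// From the README: content under 200 characters is an automatic fail.
const MIN_LENGTH = 200;

function checkQuality(content: string): QualityVerdict {
  const text = content.trim();
  if (text.length < MIN_LENGTH) return { ok: false, reason: "too-short" };
  if (CHALLENGE_MARKERS.some((re) => re.test(text))) {
    return { ok: false, reason: "challenge" };
  }
  if (LOGIN_MARKERS.some((re) => re.test(text))) {
    return { ok: false, reason: "login-wall" };
  }
  if (ERROR_MARKERS.some((re) => re.test(text))) {
    return { ok: false, reason: "error-page" };
  }
  return { ok: true };
}
```

A failed verdict would let the pipeline discard the result and move on to the next fetcher or tier.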

Requirements

  • Node.js >= 18
  • No API keys required for basic use (fetch only)
