Exa Websets

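Search & Retrieval

by exa-labs

Create and manage collections of companies, people, and papers; automatically discover and verify matching entities; enrich them with custom fields such as CEO or funding amount; and keep them current with scheduled searches.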

Core Features (16 tools)

create_webset

Create a new Webset collection. Websets are collections of web entities (companies, people, papers) that can be automatically searched, verified, and enriched with custom data.

IMPORTANT PARAMETER FORMATS:

  • searchCriteria: MUST be array of objects like [{description: "..."}] (NOT array of strings)
  • enrichments: Each must have description field, optional format and options
  • enrichment options: MUST be array of objects like [{label: "..."}] (NOT array of strings)

Example call:

json
{
  "name": "AI Startups",
  "searchQuery": "AI startups in San Francisco",
  "searchCriteria": [{"description": "Founded after 2020"}],
  "enrichments": [
    {"description": "CEO name", "format": "text"},
    {"description": "Company stage", "format": "options", "options": [{"label": "Seed"}, {"label": "Series A"}]}
  ]
}
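The object-wrapping rules above are easy to get wrong when criteria and option labels start out as plain strings. The following is a minimal Python sketch of how a caller might build the create_webset arguments; the helper names as_criteria and options_enrichment are hypothetical and not part of the server.

```python
from typing import Any


def as_criteria(descriptions: list[str]) -> list[dict[str, str]]:
    """Wrap plain strings as [{"description": ...}] objects."""
    return [{"description": d} for d in descriptions]


def options_enrichment(description: str, labels: list[str]) -> dict[str, Any]:
    """Build an "options" enrichment whose options are [{"label": ...}] objects."""
    return {
        "description": description,
        "format": "options",
        "options": [{"label": label} for label in labels],
    }


# Same arguments as the example call, assembled from plain strings.
create_webset_args = {
    "name": "AI Startups",
    "searchQuery": "AI startups in San Francisco",
    "searchCriteria": as_criteria(["Founded after 2020"]),
    "enrichments": [
        {"description": "CEO name", "format": "text"},
        options_enrichment("Company stage", ["Seed", "Series A"]),
    ],
}
```

Building the nested objects in one place keeps the "array of strings" mistake out of every call site.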

list_websets

List all websets in your account. Returns a paginated list of webset collections with their current status and item counts.

get_webset

Get details about a specific webset by ID or externalId. Returns full webset information including status, item count, and metadata.

update_webset

Update a webset's metadata. Use this to add or update custom key-value pairs associated with the webset.

delete_webset

Delete a webset and all its items. This action is permanent and cannot be undone.

list_webset_items

List all items in a webset. Returns entities (companies, people, papers) that have been discovered and verified in the collection.

get_item

Get a specific item from a webset by its ID. Returns detailed information about the item including all enrichment data.

create_search

Create a new search to find and add items to a webset. The search will discover entities matching your query and criteria.

IMPORTANT PARAMETER FORMATS:

  • entity: MUST be an object like {type: "company"} (NOT a string)
  • criteria: MUST be array of objects like [{description: "..."}] (NOT array of strings)

Example call:

json
{
  "websetId": "webset_123",
  "query": "AI startups in San Francisco",
  "entity": {"type": "company"},
  "criteria": [{"description": "Founded after 2020"}],
  "count": 10
}

get_search

Get details about a specific search, including its status, progress, and results found.

cancel_search

Cancel a running search operation. This will stop the search from finding more items.

create_enrichment

Create a new enrichment for a webset. Enrichments automatically extract custom data from each item using AI agents (e.g., 'company revenue', 'CEO name', 'funding amount').

IMPORTANT PARAMETER FORMATS:

  • options (when format is "options"): MUST be array of objects like [{label: "..."}] (NOT array of strings)

Example call (text format):

json
{"websetId": "webset_123", "description": "CEO name", "format": "text"}

Example call (options format):

json
{
  "websetId": "webset_123",
  "description": "Company stage",
  "format": "options",
  "options": [{"label": "Seed"}, {"label": "Series A"}]
}

get_enrichment

Get details about a specific enrichment, including its status and progress.

update_enrichment

Update an enrichment's metadata. You can associate custom key-value pairs with the enrichment.

delete_enrichment

Delete an enrichment from a webset. This will remove all enriched data for this enrichment from all items.

cancel_enrichment

Cancel a running enrichment operation. This will stop the enrichment from processing more items.

create_monitor

Create a monitor to automatically update a webset on a schedule. Monitors run search operations to find new items.

IMPORTANT PARAMETER FORMATS:

  • cron: MUST be 5-field format "minute hour day month weekday" (e.g., "0 9 * * 1")
  • entity: MUST be an object like {type: "company"} (NOT a string)
  • criteria: MUST be array of objects like [{description: "..."}] (NOT array of strings)

Example call:

json
{
  "websetId": "webset_123",
  "cron": "0 9 * * 1",
  "query": "New AI startups",
  "entity": {"type": "company"},
  "criteria": [{"description": "Founded in last 30 days"}],
  "count": 10
}

README

Exa Websets MCP Server

A Model Context Protocol (MCP) server that integrates Exa's Websets API with Claude Desktop, Cursor, Windsurf, and other MCP-compatible clients.

What are Websets?

Websets are collections of web entities (companies, people, research papers) that can be automatically discovered, verified, and enriched with custom data. Think of them as smart, self-updating spreadsheets powered by AI web research.

Key capabilities:

  • 🔍 Automated Search: Find entities matching natural language criteria
  • 📊 Data Enrichment: Extract custom information using AI agents
  • 🔄 Monitoring: Schedule automatic updates to keep collections fresh
  • 🎯 Verification: AI validates that entities meet your criteria
  • 🔗 Webhooks: Real-time notifications for collection updates

Available Tools

This MCP server provides the following tools:

Webset Management

| Tool | Description |
| --- | --- |
| create_webset | Create a new webset collection with optional search and enrichments |
| list_websets | List all your websets with pagination support |
| get_webset | Get details about a specific webset |
| update_webset | Update a webset's metadata |
| delete_webset | Delete a webset and all its items |

Item Management

| Tool | Description |
| --- | --- |
| list_webset_items | List all items (entities) in a webset |
| get_item | Get a specific item from a webset with all enrichment data |

Search Operations

| Tool | Description |
| --- | --- |
| create_search | Create a new search to find and add items to a webset |
| get_search | Get details about a specific search including status and progress |
| cancel_search | Cancel a running search operation |

Enrichment Operations

| Tool | Description |
| --- | --- |
| create_enrichment | Add a new data enrichment to extract custom information |
| get_enrichment | Get details about a specific enrichment |
| cancel_enrichment | Cancel a running enrichment operation |

Monitoring

| Tool | Description |
| --- | --- |
| create_monitor | Set up automated monitoring to keep the webset updated |

Installation

Installing via Smithery

To install Exa Websets automatically via Smithery:

bash
npx -y @smithery/cli install @exa-labs/websets-mcp-server

Prerequisites

Using Claude Code (Recommended)

The quickest way to set up Websets MCP:

bash
claude mcp add websets -e EXA_API_KEY=YOUR_API_KEY -- npx -y websets-mcp-server

Replace YOUR_API_KEY with your Exa API key.

Using NPX

bash
# Install globally
npm install -g websets-mcp-server

# Or run directly with npx
npx websets-mcp-server

Configuration

Claude Desktop Configuration

  1. Enable Developer Mode

    • Open Claude Desktop
    • Click the menu → Enable Developer Mode
    • Go to Settings → Developer → Edit Config
  2. Add to configuration file:

    macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

    Windows: %APPDATA%\Claude\claude_desktop_config.json

    json
    {
      "mcpServers": {
        "websets": {
          "command": "npx",
          "args": [
            "-y",
            "websets-mcp-server"
          ],
          "env": {
            "EXA_API_KEY": "your-api-key-here"
          }
        }
      }
    }
    
  3. Restart Claude Desktop

    • Completely quit Claude Desktop
    • Start it again
    • Look for the 🔌 icon to verify connection
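If the config file already lists other MCP servers, the websets entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge in Python; add_websets_server is a hypothetical helper, and reading and writing the actual file at the paths above is left out.

```python
import json
from copy import deepcopy


def add_websets_server(config: dict, api_key: str) -> dict:
    """Return a copy of a Claude Desktop config with the websets entry added."""
    updated = deepcopy(config)
    servers = updated.setdefault("mcpServers", {})
    servers["websets"] = {
        "command": "npx",
        "args": ["-y", "websets-mcp-server"],
        "env": {"EXA_API_KEY": api_key},
    }
    return updated


# Illustrative existing config with one unrelated server already present.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
merged = add_websets_server(existing, "your-api-key-here")
print(json.dumps(merged, indent=2))
```

Working on a copy means the original config is untouched if the result is never written back.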

Cursor and Claude Code Configuration

Use the HTTP-based configuration:

json
{
  "mcpServers": {
    "websets": {
      "type": "http",
      "url": "https://mcp.exa.ai/websets",
      "headers": {}
    }
  }
}

Tool Schema Reference

⚠️ Important for AI Callers: See TOOL_SCHEMAS.md for exact parameter formats and examples.

Key Schema Rules:

  • criteria must be an array of objects: [{description: "..."}] (NOT an array of strings)
  • entity must be an object: {type: "company"} (NOT a string)
  • options must be an array of objects: [{label: "..."}] (NOT an array of strings)

These formats ensure consistency across all tools and match the Websets API specification.
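A caller can check these three rules before sending a tool call. The following Python sketch is illustrative only: schema_problems is a hypothetical helper, it covers just the rules listed above, and it does not replace the full definitions in TOOL_SCHEMAS.md.

```python
def _is_list_of_objects(value, required_key: str) -> bool:
    """True when value is a list whose items are dicts containing required_key."""
    return isinstance(value, list) and all(
        isinstance(item, dict) and required_key in item for item in value
    )


def schema_problems(args: dict) -> list[str]:
    """Return human-readable problems for the three key schema rules."""
    problems = []
    for key in ("criteria", "searchCriteria"):
        value = args.get(key)
        if value is not None and not _is_list_of_objects(value, "description"):
            problems.append(f"{key} must be an array of {{description: ...}} objects")
    entity = args.get("entity")
    if entity is not None and not (isinstance(entity, dict) and "type" in entity):
        problems.append("entity must be an object like {type: 'company'}")
    options = args.get("options")
    if options is not None and not _is_list_of_objects(options, "label"):
        problems.append("options must be an array of {label: ...} objects")
    return problems


good = {"entity": {"type": "company"}, "criteria": [{"description": "Founded after 2020"}]}
bad = {"entity": "company", "criteria": ["Founded after 2020"], "options": ["Seed"]}
print(schema_problems(good))
print(schema_problems(bad))
```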

Usage Examples

Once configured, you can ask Claude to interact with Websets:

Creating a Webset

code
Create a webset of AI startups in San Francisco with 20 companies. 
Add enrichments for revenue, employee count, and funding stage.

Listing and Viewing Websets

code
List all my websets and show me the details of the one called "AI Startups"

Managing Items

code
Show me the first 10 items from my "AI Startups" webset with all their enrichment data

Setting Up Monitoring

code
Create a monitor for my "AI Startups" webset that searches for new companies 
every Monday at 9am using the cron schedule "0 9 * * 1"

Advanced Enrichments

code
Add an enrichment to my webset that extracts the company's latest product launch 
and the CEO's LinkedIn profile

Example Workflow

Here's a complete workflow for building a company research database:

  1. Create the collection:

    code
    Create a webset called "SaaS Companies" that searches for 
    "B2B SaaS companies with $10M+ revenue"
    
  2. Add enrichments:

    code
    Add enrichments to extract: annual recurring revenue, number of customers, 
    primary market segment, and tech stack used
    
  3. Set up monitoring:

    code
    Create a weekly monitor that searches for new companies and refreshes 
    enrichment data for existing ones
    
  4. View results:

    code
    Show me all items with their enrichment data, sorted by revenue
    

Tool Details

create_webset

Creates a new webset collection with optional automatic population and enrichments.

Parameters:

  • name (optional): Name for the webset
  • description (optional): Description of what the webset contains
  • externalId (optional): Your own identifier (max 300 chars)
  • searchQuery (optional): Natural language query to find entities
  • searchCount (optional): Number of entities to find (default: 10, min: 1)
  • searchCriteria (optional): Additional filtering criteria
  • enrichments (optional): Array of enrichments to extract

Example:

json
{
  "name": "Tech Unicorns",
  "searchQuery": "Technology companies valued over $1 billion",
  "searchCount": 50,
  "searchCriteria": [
    {"description": "Valued at over $1 billion"},
    {"description": "Technology sector"}
  ],
  "enrichments": [
    {
      "description": "Current company valuation in USD",
      "format": "number"
    },
    {
      "description": "Names of company founders",
      "format": "text"
    },
    {
      "description": "Company stage",
      "format": "options",
      "options": [
        {"label": "Series A"},
        {"label": "Series B"},
        {"label": "Series C+"},
        {"label": "Public"}
      ]
    }
  ]
}

create_enrichment

Adds a new data enrichment to extract custom information from each webset item.

Parameters:

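  • websetId: The ID of the webset
  • description: Detailed description of what to extract
  • format (optional): Output format, e.g. "text", "number", or "options"
  • options (when format is "options"): Array of objects like [{label: "..."}]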

Example:

json
{
  "websetId": "webset_abc123",
  "description": "Total number of full-time employees as of the most recent data"
}

create_monitor

Sets up automated monitoring with a cron schedule.

Parameters:

  • websetId: The ID of the webset
  • cron: Cron expression (e.g., "0 9 * * 1" for Mondays at 9am)
  • behavior: Either "search" (find new items) or "refresh" (update existing)
  • name (optional): Name for the monitor
  • enabled (optional): Start enabled (default: true)

Common cron schedules:

  • 0 9 * * 1 - Every Monday at 9am
  • 0 0 * * * - Daily at midnight
  • 0 */6 * * * - Every 6 hours
  • 0 9 * * 1-5 - Weekdays at 9am
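Because the cron value must use the 5-field format, a quick shape check before calling create_monitor can catch 6-field or shorthand expressions early. This is a simplified Python sketch, not the server's own validation: it only checks the field count and allowed characters, and it deliberately rejects name-based values such as MON.

```python
import re

# Digits plus the usual cron operators: * / , -
_FIELD = re.compile(r"^[0-9*/,\-]+$")


def is_five_field_cron(expr: str) -> bool:
    """Shape check only: exactly five whitespace-separated numeric/operator fields."""
    fields = expr.split()
    return len(fields) == 5 and all(_FIELD.match(field) for field in fields)


for schedule in ("0 9 * * 1", "0 0 * * *", "0 */6 * * *", "0 9 * * 1-5"):
    print(schedule, is_five_field_cron(schedule))
```

The check does not verify value ranges, so an expression like "99 99 * * *" still passes and would have to be rejected elsewhere.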

API Endpoints

The server connects to Exa's Websets API at https://api.exa.ai/v0/websets.

Full API documentation: docs.exa.ai/reference/websets

Advanced Configuration

Enable Specific Tools Only

To enable only certain tools, pass a comma-separated list of tool names with the --tools argument:

json
{
  "mcpServers": {
    "websets": {
      "command": "npx",
      "args": [
        "-y",
        "websets-mcp-server",
        "--tools=create_webset,list_websets,list_webset_items"
      ],
      "env": {
        "EXA_API_KEY": "your-api-key-here"
      }
    }
  }
}
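A typo in the --tools list is easy to miss, so it can help to generate the args array from a checked list of names. The Python sketch below assumes the 16 tool names documented on this page are the complete set; server_args is a hypothetical helper.

```python
KNOWN_TOOLS = {
    "create_webset", "list_websets", "get_webset", "update_webset", "delete_webset",
    "list_webset_items", "get_item",
    "create_search", "get_search", "cancel_search",
    "create_enrichment", "get_enrichment", "update_enrichment",
    "delete_enrichment", "cancel_enrichment",
    "create_monitor",
}


def server_args(tools: list[str]) -> list[str]:
    """Build the npx args array, rejecting names not documented on this page."""
    unknown = [tool for tool in tools if tool not in KNOWN_TOOLS]
    if unknown:
        raise ValueError(f"unknown tool names: {unknown}")
    return ["-y", "websets-mcp-server", "--tools=" + ",".join(tools)]


print(server_args(["create_webset", "list_websets", "list_webset_items"]))
```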

Debug Mode

Enable debug logging to troubleshoot issues:

json
{
  "mcpServers": {
    "websets": {
      "command": "npx",
      "args": [
        "-y",
        "websets-mcp-server",
        "--debug"
      ],
      "env": {
        "EXA_API_KEY": "your-api-key-here"
      }
    }
  }
}

Troubleshooting

Connection Issues

  1. Verify your API key is valid
  2. Ensure there are no spaces or quotes around the API key
  3. Completely restart your MCP client (not just close the window)
  4. Check the MCP logs for error messages

API Rate Limits

The Websets API enforces rate limits that depend on your plan:

  • Check your plan limits at exa.ai/dashboard
  • Use pagination for large websets
  • Monitor API usage in your dashboard
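Paging through a large webset usually means following a cursor until none is returned. The Python sketch below shows that loop under stated assumptions: the response fields data and nextCursor are guesses at the page shape, and fetch_page stands in for whatever actually calls list_webset_items.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield items page by page until the (assumed) nextCursor field is empty."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("data", [])
        cursor = page.get("nextCursor")
        if not cursor:
            break


# Fake two-page response used only to exercise the loop.
_PAGES = {
    None: {"data": [{"id": "a"}, {"id": "b"}], "nextCursor": "p2"},
    "p2": {"data": [{"id": "c"}], "nextCursor": None},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return _PAGES[cursor]


ids = [item["id"] for item in iter_items(fake_fetch)]
print(ids)
```

Using a generator keeps only one page in memory at a time, which matters for websets with many items.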

Common Errors

  • 401 Unauthorized: Invalid or missing API key
  • 404 Not Found: Webset ID doesn't exist or was deleted
  • 422 Unprocessable: Invalid query or criteria format
  • 429 Rate Limited: Too many requests, wait and retry
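The "wait and retry" advice for 429 responses is commonly implemented as exponential backoff, while the other errors are surfaced immediately. The following Python sketch assumes a hypothetical send callable that returns a (status, body) pair; the delays and attempt count are illustrative, not values recommended by Exa.

```python
import time
from typing import Callable

ERROR_HINTS = {
    401: "Invalid or missing API key",
    404: "Webset ID doesn't exist or was deleted",
    422: "Invalid query or criteria format",
}


def call_with_retry(
    send: Callable[[], tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry on 429 with exponential backoff; raise at once on other known errors."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 429:
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            continue
        if status in ERROR_HINTS:
            raise RuntimeError(f"{status}: {ERROR_HINTS[status]}")
        return body
    raise RuntimeError("429: still rate limited after retries")


# Simulated responses: two rate-limit replies, then success.
responses = iter([(429, {}), (429, {}), (200, {"id": "webset_123"})])
delays: list[float] = []
result = call_with_retry(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Passing sleep as a parameter lets the backoff be exercised without actually waiting.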

Resources

Development

Building from Source

bash
git clone https://github.com/exa-labs/websets-mcp-server.git
cd websets-mcp-server
npm install
npm run build

Project Structure

code
websets-mcp-server/
├── src/
│   ├── index.ts              # Main server setup
│   ├── types.ts              # TypeScript type definitions
│   ├── tools/                # MCP tool implementations
│   │   ├── config.ts         # API configuration
│   │   ├── createWebset.ts
│   │   ├── listWebsets.ts
│   │   ├── getWebset.ts
│   │   ├── updateWebset.ts
│   │   ├── deleteWebset.ts
│   │   ├── listItems.ts
│   │   ├── createEnrichment.ts
│   │   ├── createMonitor.ts
│   │   └── ...
│   └── utils/
│       ├── api.ts            # Shared API client and error handling
│       └── logger.ts         # Logging utilities
├── package.json
└── tsconfig.json

License

MIT

Contributing

Contributions welcome! Please open an issue or PR at github.com/exa-labs/websets-mcp-server.

Support
