Exa Websets

Search & Retrieval

by exa-labs

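Create and manage collections of companies, people, and papers; automatically discover and verify relevant entities; enrich them with custom fields such as CEO or funding amount; and keep them up to date with scheduled searches.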


Core features (16 tools)

create_webset

Create a new Webset collection. Websets are collections of web entities (companies, people, papers) that can be automatically searched, verified, and enriched with custom data.

IMPORTANT PARAMETER FORMATS:

  • searchCriteria: MUST be an array of objects like [{description: "..."}] (NOT an array of strings)
  • enrichments: each entry MUST have a description field; format and options are optional
  • enrichment options: MUST be an array of objects like [{label: "..."}] (NOT an array of strings)

Example call (no top-level name field, since the README below states that the webset itself does not accept one):

json
{
  "searchQuery": "AI startups in San Francisco",
  "searchCriteria": [{"description": "Founded after 2020"}],
  "enrichments": [
    {"description": "CEO name", "format": "text"},
    {"description": "Company stage", "format": "options", "options": [{"label": "Seed"}, {"label": "Series A"}]}
  ]
}

list_websets

List all websets in your account. Returns a paginated list of webset collections with their current status and item counts.

get_webset

Get details about a specific webset by ID or externalId. Returns full webset information including status, item count, and metadata.

update_webset

Update a webset's metadata. Use this to add or update custom key-value pairs associated with the webset.

delete_webset

Delete a webset and all its items. This action is permanent and cannot be undone.

list_webset_items

List all items in a webset. Returns entities (companies, people, papers) that have been discovered and verified in the collection.

get_item

Get a specific item from a webset by its ID. Returns detailed information about the item including all enrichment data.

create_search

Create a new search to find and add items to a webset. The search discovers entities matching your query and criteria.

IMPORTANT PARAMETER FORMATS:

  • entity: MUST be an object like {type: "company"} (NOT a string)
  • criteria: MUST be an array of objects like [{description: "..."}] (NOT an array of strings)

Example call:

json
{
  "websetId": "webset_123",
  "query": "AI startups in San Francisco",
  "entity": {"type": "company"},
  "criteria": [{"description": "Founded after 2020"}],
  "count": 10
}

get_search

Get details about a specific search, including its status, progress, and results found.

cancel_search

Cancel a running search operation. This will stop the search from finding more items.

create_enrichment

Create a new enrichment for a webset. Enrichments automatically extract custom data from each item using AI agents (e.g., 'company revenue', 'CEO name', 'funding amount').

IMPORTANT PARAMETER FORMATS:

  • options (when format is "options"): MUST be an array of objects like [{label: "..."}] (NOT an array of strings)

Example call (text format):

json
{"websetId": "webset_123", "description": "CEO name", "format": "text"}

Example call (options format):

json
{"websetId": "webset_123", "description": "Company stage", "format": "options", "options": [{"label": "Seed"}, {"label": "Series A"}]}

get_enrichment

Get details about a specific enrichment, including its status and progress.

update_enrichment

Update an enrichment's metadata. You can associate custom key-value pairs with the enrichment.

delete_enrichment

Delete an enrichment from a webset. This will remove all enriched data for this enrichment from all items.

cancel_enrichment

Cancel a running enrichment operation. This will stop the enrichment from processing more items.

create_monitor

Create a monitor to automatically update a webset on a schedule. Monitors run search operations to find new items.

IMPORTANT PARAMETER FORMATS:

  • cron: MUST be in 5-field format "minute hour day month weekday" (e.g., "0 9 * * 1")
  • entity: MUST be an object like {type: "company"} (NOT a string)
  • criteria: MUST be an array of objects like [{description: "..."}] (NOT an array of strings)

Example call:

json
{
  "websetId": "webset_123",
  "cron": "0 9 * * 1",
  "query": "New AI startups",
  "entity": {"type": "company"},
  "criteria": [{"description": "Founded in last 30 days"}],
  "count": 10
}
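The 5-field cron rule for create_monitor can be checked before a call is made. The following is a minimal sketch, not part of the server: isFiveFieldCron is a hypothetical helper that only checks the field count and the characters used, not whether each number is in range, and it rejects month or weekday names such as MON.

```typescript
// Hypothetical helper (an assumption, not the server's own validation):
// checks that a schedule has the 5-field shape "minute hour day month weekday".
// It validates field count and allowed characters only, not numeric ranges.
function isFiveFieldCron(expr: string): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    return false;
  }
  // Digits, "*", ",", "-" and "/" cover the common numeric cron syntax.
  return fields.every((field) => /^[\d*,\/-]+$/.test(field));
}
```

A 6-field expression with a leading seconds field, or a macro such as "@weekly", fails this check.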

README

Exa Websets MCP Server


A Model Context Protocol (MCP) server that integrates Exa's Websets API with Claude Desktop, Cursor, Windsurf, and other MCP-compatible clients.

What are Websets?

Websets are collections of web entities (companies, people, research papers) that can be automatically discovered, verified, and enriched with custom data. Think of them as smart, self-updating spreadsheets powered by AI web research.

Key capabilities:

  • 🔍 Automated Search: Find entities matching natural language criteria
  • 📊 Data Enrichment: Extract custom information using AI agents
  • 🎯 Verification: AI validates that entities meet your criteria
  • 🔗 Webhooks: Real-time notifications for collection updates
  • 📥 Imports: Bring your own CSV data into Websets for enrichment or scoping

Available Tools

This MCP server provides the following tools:

Webset Management

Tool              Description
create_webset     Create a new webset collection with optional search and enrichments
list_websets      List all your websets with pagination support
get_webset        Get details about a specific webset
update_webset     Update a webset's title and/or metadata
delete_webset     Delete a webset and all its items
preview_webset    Preview how a search query will be interpreted before creating a webset

Item Management

Tool                Description
list_webset_items   List all items (entities) in a webset
get_item            Get a specific item from a webset with all enrichment data

Search Operations

Tool            Description
create_search   Create a new search to find and add items to a webset
get_search      Get details about a specific search including status and progress
cancel_search   Cancel a running search operation

Enrichment Operations

Tool                Description
create_enrichment   Add a new data enrichment to extract custom information
get_enrichment      Get details about a specific enrichment
cancel_enrichment   Cancel a running enrichment operation

Webhooks

Tool             Description
create_webhook   Subscribe to real-time HTTP callbacks for webset events
get_webhook      Get details about a specific webhook
update_webhook   Update a webhook's URL, events, or metadata
delete_webhook   Delete a webhook
list_webhooks    List all webhooks in your account

Imports

Tool            Description
create_import   Create an import to upload your own CSV data into Websets
get_import      Get details about a specific import including upload URL
list_imports    List all imports in your account

Events

Tool          Description
list_events   List system events (search, enrichment, webset lifecycle, etc.)

Installation

Installing via Smithery

To install Exa Websets automatically via Smithery:

bash
npx -y @smithery/cli install @exa-labs/websets-mcp-server

Prerequisites

Using Claude Code (Recommended)

The quickest way to set up Websets MCP:

bash
claude mcp add websets -e EXA_API_KEY=YOUR_API_KEY -- npx -y websets-mcp-server

Replace YOUR_API_KEY with your Exa API key.

Using NPX

bash
# Install globally
npm install -g websets-mcp-server

# Or run directly with npx
npx websets-mcp-server

Configuration

Claude Desktop Configuration

  1. Enable Developer Mode

    • Open Claude Desktop
    • Click the menu → Enable Developer Mode
    • Go to Settings → Developer → Edit Config
  2. Add to configuration file:

    macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

    Windows: %APPDATA%\Claude\claude_desktop_config.json

    json
    {
      "mcpServers": {
        "websets": {
          "command": "npx",
          "args": [
            "-y",
            "websets-mcp-server"
          ],
          "env": {
            "EXA_API_KEY": "your-api-key-here"
          }
        }
      }
    }
    
  3. Restart Claude Desktop

    • Completely quit Claude Desktop
    • Start it again
    • Look for the 🔌 icon to verify connection

Cursor and Claude Code Configuration

Use the HTTP-based configuration. Pass your Exa API key as a Bearer token in the Authorization header (or as an ?exaApiKey=... query parameter as a fallback):

json
{
  "mcpServers": {
    "websets": {
      "type": "http",
      "url": "https://websetsmcp.exa.ai/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_EXA_API_KEY"
      }
    }
  }
}
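If the client configuration is generated by a script, the two ways of passing the key described above could be sketched as follows. The function names are hypothetical; only the URL, the Authorization header, and the exaApiKey query parameter come from this README.

```typescript
const WEBSETS_MCP_URL = "https://websetsmcp.exa.ai/mcp";

// Preferred form: the Exa API key travels as a Bearer token in the header.
// (Hypothetical helper name; the returned shape mirrors the JSON above.)
function httpEntryWithHeader(apiKey: string) {
  return {
    type: "http",
    url: WEBSETS_MCP_URL,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Fallback form: the key is appended as the exaApiKey query parameter.
// encodeURIComponent guards against characters that are unsafe in a URL.
function httpEntryWithQuery(apiKey: string) {
  return {
    type: "http",
    url: `${WEBSETS_MCP_URL}?exaApiKey=${encodeURIComponent(apiKey)}`,
  };
}
```

The header form keeps the key out of URLs, which tend to end up in logs and shell history, so the query form is best kept as a fallback.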

Tool Schema Reference

⚠️ Important for AI Callers: See TOOL_SCHEMAS.md for exact parameter formats and examples.

Key Schema Rules:

  • criteria must be an array of objects: [{description: "..."}] (NOT an array of strings)
  • entity must be an object: {type: "company"} (NOT a string)
  • options must be an array of objects: [{label: "..."}] (NOT an array of strings)

These formats ensure consistency across all tools and match the Websets API specification.
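The three schema rules translate directly into TypeScript types. The sketch below is illustrative only: the type names and the to* helpers are hypothetical and are not exported by the server. The helpers show one way a caller could wrap plain strings into the required object shapes before invoking a tool.

```typescript
// Object shapes from the schema rules above (type names are assumptions).
type Criterion = { description: string };
type SearchEntity = { type: string; description?: string };
type EnrichmentOption = { label: string };

// Wrap plain strings into [{description: "..."}].
function toCriteria(items: Array<string | Criterion>): Criterion[] {
  return items.map((c) => (typeof c === "string" ? { description: c } : c));
}

// Wrap a plain string into {type: "..."}.
function toEntity(entity: string | SearchEntity): SearchEntity {
  return typeof entity === "string" ? { type: entity } : entity;
}

// Wrap plain strings into [{label: "..."}].
function toOptions(items: Array<string | EnrichmentOption>): EnrichmentOption[] {
  return items.map((o) => (typeof o === "string" ? { label: o } : o));
}
```

For example, toCriteria(["Founded after 2020"]) yields [{description: "Founded after 2020"}], and values already in object form pass through unchanged.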

Usage Examples

Once configured, you can ask Claude to interact with Websets:

Creating a Webset

code
Create a webset of AI startups in San Francisco with 20 companies. 
Add enrichments for revenue, employee count, and funding stage.

Listing and Viewing Websets

code
List all my websets and show me the details of the one called "AI Startups"

Managing Items

code
Show me the first 10 items from my "AI Startups" webset with all their enrichment data

Adding More Items

code
Run another search on my "AI Startups" webset for 20 more companies focused on
enterprise voice agents, appending to the existing items

Advanced Enrichments

code
Add an enrichment to my webset that extracts the company's latest product launch 
and the CEO's LinkedIn profile

Example Workflow

Here's a complete workflow for building a company research database:

  1. Create the collection:

    code
    Create a webset called "SaaS Companies" that searches for 
    "B2B SaaS companies with $10M+ revenue"
    
  2. Add enrichments:

    code
    Add enrichments to extract: annual recurring revenue, number of customers, 
    primary market segment, and tech stack used
    
  3. Subscribe to events:

    code
    Create a webhook to https://example.com/hook subscribed to
    webset.search.completed and webset.enrichment.completed
    
  4. View results:

    code
    Show me all items with their enrichment data, sorted by revenue
    

Tool Details

create_webset

Creates a new webset collection with optional automatic population and enrichments.

Parameters:

  • externalId (optional): Your own identifier for the webset (max 300 chars)
  • searchQuery (optional): Natural language query to find entities
  • searchCount (optional): Number of entities to find (default: 10, min: 1)
  • searchEntity (optional): Entity type for the search, e.g. {type: "company"}. For "custom" type include a description.
  • searchCriteria (optional): Additional filtering criteria — [{description: "..."}] (max 5)
  • searchBehavior (optional): "override" (default) replaces existing items, "append" adds to them
  • searchExclude (optional): Imports/websets whose results to exclude — [{source: "webset"|"import", id: "..."}]
  • searchScope (optional): Scope the search to existing imports or websets — [{source: "import"|"webset", id: "..."}]; enables hop searches with a relationship object
  • searchRecall (optional): Whether to compute recall metrics for the search
  • searchMaxPeoplePerCompany (optional): Soft cap on people-per-employer for person searches
  • searchMetadata (optional): Key-value metadata to associate with the search
  • enrichments (optional): Data enrichments to automatically extract for each item
  • metadata (optional): Key-value metadata to associate with the webset
  • excludes (optional): Global excludes — sources whose results are omitted across all operations on this webset

Note: there is no top-level name or description parameter on the webset itself. Use update_webset with title after creation, or metadata to attach arbitrary key-value pairs.

Example:

json
{
  "externalId": "tech-unicorns-2024",
  "searchQuery": "Technology companies valued over $1 billion",
  "searchCount": 50,
  "searchEntity": {"type": "company"},
  "searchCriteria": [
    {"description": "Valued at over $1 billion"},
    {"description": "Technology sector"}
  ],
  "enrichments": [
    {
      "description": "Current company valuation in USD",
      "format": "number"
    },
    {
      "description": "Names of company founders",
      "format": "text"
    },
    {
      "description": "Company stage",
      "format": "options",
      "options": [
        {"label": "Series A"},
        {"label": "Series B"},
        {"label": "Series C+"},
        {"label": "Public"}
      ]
    }
  ]
}
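The limits listed in the parameter descriptions (externalId up to 300 characters, searchCount at least 1, at most 5 searchCriteria) can be checked locally before calling the tool. This is a minimal sketch under stated assumptions: checkCreateWebset and its type are hypothetical, only a subset of the parameters is modelled, and the requirement that searchCount be a whole number is an assumption rather than something the README states.

```typescript
// Subset of the create_webset parameters (type name is an assumption).
type CreateWebsetParams = {
  externalId?: string;
  searchQuery?: string;
  searchCount?: number;
  searchCriteria?: Array<{ description: string }>;
};

// Hypothetical pre-flight check: returns a list of problems, empty if none.
function checkCreateWebset(params: CreateWebsetParams): string[] {
  const problems: string[] = [];
  if (params.externalId !== undefined && params.externalId.length > 300) {
    problems.push("externalId exceeds 300 characters");
  }
  if (params.searchCount !== undefined) {
    // Whole-number requirement is an assumption; the README only gives min: 1.
    const isWhole = Math.floor(params.searchCount) === params.searchCount;
    if (!isWhole || params.searchCount < 1) {
      problems.push("searchCount must be a whole number >= 1");
    }
  }
  if (params.searchCriteria !== undefined && params.searchCriteria.length > 5) {
    problems.push("searchCriteria allows at most 5 entries");
  }
  return problems;
}
```

Returning a list instead of throwing on the first problem lets a caller report every issue in one pass.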

create_enrichment

Adds a new data enrichment to extract custom information from each webset item.

Parameters:

  • websetId: The ID of the webset
  • description: Detailed description of what to extract
  • format (optional): One of "text", "date", "number", "options", "email", "phone", "url" — auto-selected if omitted
  • options (optional): When format is "options", the choices the enrichment agent picks from — [{label: "..."}]
  • metadata (optional): Key-value metadata to associate with this enrichment

Example:

json
{
  "websetId": "webset_abc123",
  "description": "Total number of full-time employees as of the most recent data",
  "format": "number"
}
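The format values listed above map naturally onto a TypeScript union type. The sketch below is hypothetical and not taken from the server source; in particular, treating options as required whenever format is "options" is an assumption, since the README marks the parameter as optional.

```typescript
// The seven formats listed in the parameter description.
const ENRICHMENT_FORMATS = ["text", "date", "number", "options", "email", "phone", "url"] as const;
type EnrichmentFormat = (typeof ENRICHMENT_FORMATS)[number];

// Parameter shape (type name is an assumption; metadata omitted for brevity).
type CreateEnrichmentParams = {
  websetId: string;
  description: string;
  format?: EnrichmentFormat;
  options?: Array<{ label: string }>;
};

// Narrow an arbitrary string to one of the known formats.
function isEnrichmentFormat(value: string): value is EnrichmentFormat {
  return (ENRICHMENT_FORMATS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical pre-flight check: returns a list of problems, empty if none.
function checkEnrichment(params: CreateEnrichmentParams): string[] {
  const problems: string[] = [];
  // Assumption: an "options" enrichment without choices is not useful.
  if (params.format === "options" && (!params.options || params.options.length === 0)) {
    problems.push('format "options" needs at least one {label} entry');
  }
  return problems;
}
```

Omitting format entirely passes the check, matching the documented behaviour that the format is auto-selected when it is left out.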

Monitors (scheduled refresh/search) are exposed by the underlying Websets API but are not currently surfaced as MCP tools in this server. Configure monitors directly via the Websets API or websets.exa.ai.

API Endpoints

The server connects to Exa's Websets API at https://api.exa.ai/websets/v0.

Full API documentation: docs.exa.ai/reference/websets

Advanced Configuration

Enable Specific Tools Only

To enable only certain tools, pass a comma-separated list of tool names with the --tools argument:

json
{
  "mcpServers": {
    "websets": {
      "command": "npx",
      "args": [
        "-y",
        "websets-mcp-server",
        "--tools=create_webset,list_websets,list_webset_items"
      ],
      "env": {
        "EXA_API_KEY": "your-api-key-here"
      }
    }
  }
}

Debug Mode

Enable debug logging to troubleshoot issues:

json
{
  "mcpServers": {
    "websets": {
      "command": "npx",
      "args": [
        "-y",
        "websets-mcp-server",
        "--debug"
      ],
      "env": {
        "EXA_API_KEY": "your-api-key-here"
      }
    }
  }
}
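When the configuration is generated programmatically, the args array for the two variants above could be assembled as in the sketch below. buildServerArgs is a hypothetical helper, and combining --tools with --debug in one invocation is an assumption; the README only shows each flag on its own.

```typescript
// Hypothetical helper: builds the "args" array for the npx-based config.
// Combining --tools and --debug in one call is an assumption.
function buildServerArgs(tools: string[], debug: boolean = false): string[] {
  const args = ["-y", "websets-mcp-server"];
  if (tools.length > 0) {
    // Tool names are joined with commas, as in the example above.
    args.push(`--tools=${tools.join(",")}`);
  }
  if (debug) {
    args.push("--debug");
  }
  return args;
}
```

Calling buildServerArgs([]) reproduces the default configuration with every tool enabled.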

Troubleshooting

Connection Issues

  1. Verify your API key is valid
  2. Ensure there are no spaces or quotes around the API key
  3. Completely restart your MCP client (not just close the window)
  4. Check the MCP logs for error messages

API Rate Limits

The Websets API enforces rate limits that depend on your plan. To stay within them:

  • Check your plan limits at exa.ai/dashboard
  • Use pagination for large websets
  • Monitor API usage in your dashboard

Common Errors

  • 401 Unauthorized: Invalid or missing API key
  • 404 Not Found: Webset ID doesn't exist or was deleted
  • 422 Unprocessable: Invalid query or criteria format
  • 429 Rate Limited: Too many requests, wait and retry
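A client wrapper might turn these status codes into messages and retry only on 429. The following is a minimal sketch: the function names are hypothetical, and the backoff base (1 second) and cap (30 seconds) are assumed values, not limits published by Exa.

```typescript
// Messages mirror the error list above.
function describeStatus(code: number): string {
  switch (code) {
    case 401:
      return "Unauthorized: invalid or missing API key";
    case 404:
      return "Not Found: webset ID doesn't exist or was deleted";
    case 422:
      return "Unprocessable: invalid query or criteria format";
    case 429:
      return "Rate Limited: too many requests, wait and retry";
    default:
      return `Unexpected status ${code}`;
  }
}

// Of the listed errors only 429 is transient; the others need a fix in the request.
function isRetryable(code: number): boolean {
  return code === 429;
}

// Exponential backoff with a cap. baseMs and capMs are assumed defaults.
function backoffMs(attempt: number, baseMs: number = 1000, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

With these defaults the wait doubles from 1 second on the first retry (attempt 0) until it reaches the 30-second cap.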

Resources

Development

Building from Source

bash
git clone https://github.com/exa-labs/websets-mcp-server.git
cd websets-mcp-server
npm install
npm run build

Project Structure

code
websets-mcp-server/
├── src/
│   ├── index.ts              # Main server setup
│   ├── types.ts              # TypeScript type definitions
│   ├── tools/                # MCP tool implementations
│   │   ├── config.ts         # API configuration
│   │   ├── createWebset.ts
│   │   ├── listWebsets.ts
│   │   ├── getWebset.ts
│   │   ├── updateWebset.ts
│   │   ├── deleteWebset.ts
│   │   ├── listItems.ts
│   │   ├── createEnrichment.ts
│   │   ├── createSearch.ts
│   │   ├── createWebhook.ts
│   │   ├── createImport.ts
│   │   └── ...
│   └── utils/
│       ├── api.ts            # Shared API client and error handling
│       └── logger.ts         # Logging utilities
├── package.json
└── tsconfig.json

License

MIT

Contributing

Contributions welcome! Please open an issue or PR at github.com/exa-labs/websets-mcp-server.

Support

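FAQ

What is Exa Websets?

It creates and manages collections of companies, people, and papers; automatically discovers and verifies relevant entities; enriches them with custom fields such as CEO or funding amount; and keeps them up to date with scheduled searches.

Which tools does Exa Websets provide?

It provides 16 tools, including create_webset, list_websets, and get_webset.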
