MLB Stats Server
Platforms & Services, by etweisberg
What is MLB Stats Server?
Provides structured access to Major League Baseball statistics through an MCP server. It can query detailed data from Statcast, Fangraphs, and Baseball Reference, and generate visualizations for in-depth analysis.
Core features (46 tools)
get_stats
get_schedule - Get list of games for a given date/range and/or team/opponent.
get_player_stats - Returns a list of current season or career stat data for a given player.
get_standings - Returns a dict of standings data for a given league/division and season.
get_team_leaders - Returns a Python list of stat leader data for a given team.
lookup_player - Get data about players based on first, last, or full name.
get_boxscore - Get a formatted boxscore for a given game.
get_team_roster - Get the roster for a given team.
get_game_pace - Returns data about pace of game for a given season (back to 1999).
get_meta - Get available values from StatsAPI for use in other queries, or look up descriptions for values found in API results. For example, to get a list of leader categories to use when calling team_leaders(): statsapi.meta('leagueLeaderTypes')
get_available_endpoints - Get MLB StatsAPI endpoints directly.
get_notes - Get additional notes on an endpoint.
get_game_scoring_play_data - Returns a dictionary of scoring plays for a given game containing 3 keys:
  - home: home team data
  - away: away team data
  - plays: sorted list of scoring play data
get_last_game - Get the gamePk (game_id) for the given team's most recent completed game.
get_league_leader_data - Returns a list of stat leaders overall or for a given league (103=AL, 104=NL).
get_linescore - Get formatted linescore data for a specific MLB game.
get_next_game - Get the game ID for a team's next scheduled game.
get_game_highlight_data - Returns a list of highlight data for a given game.
get_statcast_data - Pulls Statcast play-level data from Baseball Savant for a given date range. If no arguments are provided, this will return yesterday's Statcast data. If one date is provided, it will return that date's Statcast data.
  Inputs:
  - start_dt: YYYY-MM-DD. The first date for which you want Statcast data.
  - end_dt: YYYY-MM-DD. The last date for which you want Statcast data.
  - team: optional (defaults to None). City abbreviation of the team you want data for (e.g. SEA or BOS).
  - verbose: bool (defaults to True). Whether to print updates on query progress.
  - parallel: bool (defaults to True). Whether to parallelize HTTP requests in large queries.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_batter_data - Pulls Statcast pitch-level data from Baseball Savant for a given batter.
  Arguments:
  - start_dt: YYYY-MM-DD. The first date for which you want a player's Statcast data.
  - end_dt: YYYY-MM-DD. The final date for which you want data.
  - player_id: INT. The player's MLBAM ID. Find this via the get_playerid_lookup tool, finding the correct player, and selecting their key_mlbam.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_pitcher_data - Pulls Statcast pitch-level data from Baseball Savant for a given pitcher.
  Arguments:
  - start_dt: YYYY-MM-DD. The first date for which you want a player's Statcast data.
  - end_dt: YYYY-MM-DD. The final date for which you want data.
  - player_id: INT. The player's MLBAM ID. Find this by calling the get_playerid_lookup tool, finding the correct player, and selecting their key_mlbam.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_batter_exitvelo_barrels - Retrieves batted ball data for all batters in a given year.
  Arguments:
  - year: The year for which you wish to retrieve batted ball data. Format: YYYY.
  - minBBE: The minimum number of batted ball events for each player. If a player falls below this threshold, they will be excluded from the results. If no value is specified, only qualified batters will be returned.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_pitcher_exitvelo_barrels - Retrieves batted ball against data for all qualified pitchers in a given year.
  Arguments:
  - year: The year for which you wish to retrieve batted ball against data. Format: YYYY.
  - minBBE: The minimum number of batted ball against events for each pitcher. If a player falls below this threshold, they will be excluded from the results. If no value is specified, only qualified pitchers will be returned.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_batter_expected_stats - Retrieves expected stats based on quality of batted ball contact in a given year.
  Arguments:
  - year: The year for which you wish to retrieve expected stats data. Format: YYYY.
  - minPA: The minimum number of plate appearances for each player. If a player falls below this threshold, they will be excluded from the results. If no value is specified, only qualified batters will be returned.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_pitcher_expected_stats - Retrieves expected stats based on quality of batted ball contact against in a given year.
  Arguments:
  - year: The year for which you wish to retrieve expected stats data. Format: YYYY.
  - minPA: The minimum number of plate appearances against for each pitcher. If a player falls below this threshold, they will be excluded from the results. If no value is specified, only qualified pitchers will be returned.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_batter_percentile_ranks - Retrieves percentile ranks for batters in a given year.
  Arguments:
  - year: The year for which you wish to retrieve percentile data. Format: YYYY.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_pitcher_percentile_ranks - Retrieves percentile ranks for each player in a given year, including batters with 2.1 PA per team game and 1.25 for pitchers. It includes percentiles on expected stats, batted ball data, and spin rates, among others.
  Arguments:
  - year: The year for which you wish to retrieve percentile data. Format: YYYY.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_batter_pitch_arsenal - Retrieves outcome data for batters split by the pitch type in a given year.
  Arguments:
  - year: The year for which you wish to retrieve pitch arsenal data. Format: YYYY.
  - minPA: The minimum number of plate appearances for each player. If a player falls below this threshold, they will be excluded from the results. If no value is specified, the default number of plate appearances is 25.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_pitcher_pitch_arsenal - Retrieves high level stats on each pitcher's arsenal in a given year.
  Arguments:
  - year: The year for which you wish to retrieve expected stats data. Format: YYYY.
  - minP: The minimum number of pitches thrown. If a player falls below this threshold, they will be excluded from the results. If no value is specified, only qualified pitchers will be returned.
  - arsenal_type: The type of stat to retrieve for the pitchers' arsenals. Options include ["average_speed", "n_", "average_spin"], where "n_" corresponds to the percentage share for each pitch. If no value is specified, it will default to average speed.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
get_statcast_single_game - Pulls Statcast play-level data from Baseball Savant for a single game, identified by its MLB game ID (game_pk in Statcast data).
  Inputs:
  - game_pk: 6-digit integer MLB game ID to retrieve.
  - start_row: optional (defaults to None). Starting row index for truncating large results (0-based, inclusive).
  - end_row: optional (defaults to None). Ending row index for truncating large results (0-based, exclusive).
  Use start_row and end_row to limit response size when dealing with large datasets.
create_strike_zone_plot - Plots pitches overlaid on a strike zone using Statcast data.
  Args:
  - data (pandas.DataFrame): DataFrame of Statcast pitcher data.
  - title (str, default ''): Optional. Title of plot.
  - colorby (str, default 'pitch_type'): Optional. Which category to color the marks with: 'pitch_type', 'pitcher', 'description', or a column within data.
  - legend_title (str, default based on colorby): Optional. Title for the legend.
  - annotation (str, default 'pitch_type'): Optional. What to annotate in the marker: 'pitch_type', 'release_speed', 'effective_speed', 'launch_speed', or something else in the data.
create_spraychart_plot - Produces a spraychart using Statcast data overlaid on the specified stadium.
  Args:
  - data (pandas.DataFrame): DataFrame of Statcast batter data.
  - team_stadium (str): Team whose stadium the hits will be overlaid on.
  - title (str, default ''): Optional. Title of plot.
  - size (int, default 100): Optional. Size of hit circles on plot.
  - colorby (str, default 'events'): Optional. Which category to color the marks with: 'events', 'player', or a column within data.
  - legend_title (str, default based on colorby): Optional. Title for the legend.
  - width (int, default 500): Optional. Width of plot (not counting the legend).
  - height (int, default 500): Optional. Height of plot.
create_bb_profile_plot - Plots a given Statcast parameter split by bb_type.
  Args:
  - df (pandas.DataFrame): DataFrame of Statcast batter data (retrieved through statcast, statcast_batter, etc).
  - parameter (str, default 'launch_angle'): Optional. Parameter to plot.
create_teams_plot - Plots a scatter plot with each MLB team.
  Args:
  - data (pandas.DataFrame): DataFrame of Fangraphs team data (retrieved through team_batting or team_pitching).
  - x_axis (str): Stat name to be plotted as the x_axis of the chart.
  - y_axis (str): Stat name to be plotted as the y_axis of the chart.
  - title (str, default None): Optional. Title of the plot.
get_pitching_stats_bref - Get all pitching stats for a set season. If no argument is supplied, gives stats for current season to date.
get_pitching_stats_range - Get all pitching stats for a set time range. This can be the past week, the month of August, anything. Just supply the start and end date in YYYY-MM-DD format.
get_pitching_stats - Get season-level pitching data from FanGraphs.
  Args:
  - start_season: First season to retrieve data from.
  - end_season: Final season to retrieve data from. If None, returns only start_season.
  - league: Either "all", "nl", "al", or "mnl".
  - qual: Minimum number of plate appearances to be included.
  - ind: 1 for individual season level, 0 for aggregate data.
  Returns: Dictionary containing pitching stats from FanGraphs.
get_playerid_lookup - Look up player IDs (MLB AM, bbref, retrosheet, FG) for a given player.
  Args:
  - last (str, required): Player's last name.
  - first (str, optional): Player's first name. Defaults to None.
  - fuzzy (bool, optional): In case of typos, returns players with names close to input. Defaults to False.
  Returns: pd.DataFrame of player IDs, name, and years played.
reverse_lookup_player - Retrieve a table of player information given a list of player IDs.
  Args:
  - player_ids (list): List of player IDs.
  - key_type (str): Name of the key type being looked up (one of "mlbam", "retro", "bbref", or "fangraphs").
  Returns: pandas.core.frame.DataFrame
get_schedule_and_record - Retrieve a team's game-level results for a given season, including win/loss/tie result, score, attendance, and winning/losing/saving pitcher. If the season is incomplete, it will provide scheduling information for future games.
  Arguments:
  - season: Integer. The season for which you want a team's record data.
  - team: String. The abbreviation of the team for which you are requesting data (e.g. "PHI", "BOS", "LAD").
get_player_splits - Returns a dataframe of all split stats for a given player. If player_info is True, this will also return a dictionary that includes player position, handedness, height, weight, and team.
get_pybaseball_standings - Returns a pandas DataFrame of the standings for a given MLB season, or the most recent standings if no season is specified.
  Arguments:
  - season (int): The year of the season.
get_team_batting - Get season-level batting statistics for a specific team (from Baseball-Reference).
  Arguments:
  - team (str): The team abbreviation (e.g. 'NYY' for Yankees) of the team you want data for.
  - start_season (int): First season you want data for (or the only season if you do not specify an end_season).
  - end_season (int): Final season you want data for.
get_team_fielding - Get season-level fielding statistics for a specific team (from Baseball-Reference).
  Arguments:
  - team (str): The team abbreviation (e.g. 'NYY' for Yankees) of the team you want data for.
  - start_season (int): First season you want data for (or the only season if you do not specify an end_season).
  - end_season (int): Final season you want data for.
get_team_pitching - Get season-level pitching statistics for a specific team (from Baseball-Reference).
  Arguments:
  - team (str): The team abbreviation (e.g. 'NYY' for Yankees) of the team you want data for.
  - start_season (int): First season you want data for (or the only season if you do not specify an end_season).
  - end_season (int): Final season you want data for.
get_top_prospects - Retrieves the top prospects by team or leaguewide. It can return top prospect pitchers, batters, or both.
  Arguments:
  - team: The team name for which you wish to retrieve top prospects. If not specified, the function will return leaguewide top prospects.
  - playerType: Either "pitchers" or "batters". If not specified, the function will return top prospects for both pitchers and batters.
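Many of the Statcast tools above accept start_row and end_row to truncate large results. The sketch below only illustrates the documented semantics (0-based, start inclusive, end exclusive, both defaulting to None). The helper truncate_rows and the sample data are hypothetical and are not taken from the server's source.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def truncate_rows(
    rows: Sequence[T],
    start_row: Optional[int] = None,
    end_row: Optional[int] = None,
) -> List[T]:
    """Hypothetical helper: keep rows from start_row (inclusive) to end_row (exclusive)."""
    # None on either side means "no bound", which matches Python slice semantics.
    return list(rows[start_row:end_row])


# Ten stand-in rows; a real Statcast result would hold pitch or play records.
pitches = [f"pitch_{i}" for i in range(10)]

first_page = truncate_rows(pitches, start_row=0, end_row=3)   # rows 0, 1, 2
second_page = truncate_rows(pitches, start_row=3, end_row=6)  # rows 3, 4, 5
everything = truncate_rows(pitches)                           # no truncation
```

Because the end index is exclusive, consecutive pages can reuse the previous end_row as the next start_row without overlapping or skipping rows.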
README
MLB Stats MCP Server
A Python project that creates a Model Context Protocol (MCP) server for accessing MLB statistics through the MLB Stats API and the pybaseball library, which covers Statcast, Fangraphs, and Baseball Reference data. The server provides structured API access to baseball statistics for use with MCP-compatible clients.
Project Structure
- mlb_stats_mcp/ - Main package directory
  - server.py - Core MCP server implementation
  - tools/ - MCP tool implementations
    - mlb_statsapi_tools.py - MLB StatsAPI tool definitions
    - statcast_tools.py - Statcast data tool definitions
    - pybaseball_plotting_tools.py - Additional pybaseball tools provided for generating matplotlib plots and returning base64 encoded images
    - pybaseball_supp_tools.py - Supplemental pybaseball functions for interfacing with fangraphs, baseball reference, and other data sources
  - utils/ - Utility modules
    - logging_config.py - Logging configuration
    - images.py - Functions related to handling plot images
- tests/ - Test suite for verifying server functionality
- pyproject.toml - Project configuration and dependencies
- .pre-commit-config.yaml - Pre-commit hooks configuration
- .github/ - GitHub Actions workflows
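The plotting tools are described as returning base64 encoded images. As a rough illustration of that round trip, the sketch below encodes raw image bytes to a base64 string and decodes them again. The function names and the stand-in bytes are assumptions for illustration; the actual helpers in utils/images.py may look quite different.

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    """Hypothetical helper: turn raw image bytes into a base64 text string."""
    return base64.b64encode(image_bytes).decode("ascii")


def decode_image(encoded: str) -> bytes:
    """Hypothetical helper: recover the raw image bytes on the client side."""
    return base64.b64decode(encoded)


# Stand-in payload: the 8-byte PNG signature followed by placeholder bytes.
# A real plot would be rendered by matplotlib and saved to an in-memory buffer.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
fake_png = PNG_SIGNATURE + b"placeholder-image-data"

encoded = encode_image(fake_png)
restored = decode_image(encoded)
```

Base64 keeps the payload as plain ASCII text, which is convenient when an image has to travel inside a JSON message.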
Tools
Setup
- Install uv if you haven't already:
curl -LsSf https://astral.sh/uv/install.sh | sh
- Create and activate a virtual environment:
uv venv
source .venv/bin/activate # On Unix/macOS
# or
.venv\Scripts\activate # On Windows
- Install dependencies:
uv pip install -e .
Installing via Smithery
To install MLB Stats Server for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @etweisberg/mlb-mcp --client claude
Running Tests
The project includes comprehensive pytest tests for the MCP server functionality:
uv run pytest -v
Tests verify all MLB StatsAPI tools work correctly with the MCP protocol, establishing connections, making API calls, and processing responses.
Environment Variables
The project uses environment variables stored in .env to configure settings.
Set ANTHROPIC_API_KEY in .env to enable the MCP server.
Logging Configuration
The MLB Stats MCP Server supports configurable logging via environment variables:
- MLB_STATS_LOG_LEVEL - Sets the logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- MLB_STATS_LOG_FILE - Path to log file (if not set, logs to stdout)
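To make the two variables concrete, here is a minimal sketch of how environment-driven logging of this kind could be wired up with the standard library. It is not the project's logging_config.py: the function name, the logger name, the INFO fallback, and the format string are all assumptions.

```python
import logging
import os
import sys
from typing import Mapping, Optional


def configure_logging(env: Optional[Mapping[str, str]] = None) -> logging.Logger:
    """Hypothetical sketch: build a logger from MLB_STATS_LOG_LEVEL / MLB_STATS_LOG_FILE."""
    env = os.environ if env is None else env

    # Assumed fallback: INFO when the variable is unset or not a valid level name.
    level_name = env.get("MLB_STATS_LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        level = logging.INFO

    # Log to the given file if set, otherwise to stdout as the README describes.
    log_file = env.get("MLB_STATS_LOG_FILE")
    handler: logging.Handler
    if log_file:
        handler = logging.FileHandler(log_file)
    else:
        handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )

    logger = logging.getLogger("mlb_stats_mcp_example")  # assumed logger name
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


logger = configure_logging({"MLB_STATS_LOG_LEVEL": "DEBUG"})
logger.debug("logging configured")
```

Passing a plain dict instead of os.environ keeps the sketch easy to try without touching the real environment.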
Claude Desktop Integration
To connect this MCP server to Claude Desktop, add an entry under the mcpServers key of your claude_desktop_config.json file. Here's a template configuration:
"mcp-baseball-stats": {
  "command": "{PATH_TO_UV}",
  "args": [
    "--directory",
    "{PROJECT_DIRECTORY}",
    "run",
    "python",
    "-m",
    "mlb_stats_mcp.server"
  ],
  "env": {
    "MLB_STATS_LOG_FILE": "{LOG_FILE_PATH}",
    "MLB_STATS_LOG_LEVEL": "DEBUG"
  }
}
Replace the following placeholders:
- {PATH_TO_UV}: Path to your uv installation (e.g., ~/.local/bin/uv)
- {PROJECT_DIRECTORY}: Path to your project directory
- {LOG_FILE_PATH}: Path where you want to store the log file
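If you prefer to generate the entry rather than edit JSON by hand, the sketch below fills in the placeholders and merges the result into a parsed config under Claude Desktop's top-level mcpServers key. The helper names and the example paths are hypothetical; substitute your own values.

```python
import json


def build_server_entry(
    uv_path: str, project_dir: str, log_file: str, log_level: str = "DEBUG"
) -> dict:
    """Fill the template's placeholders with concrete values."""
    return {
        "command": uv_path,
        "args": [
            "--directory", project_dir,
            "run", "python", "-m", "mlb_stats_mcp.server",
        ],
        "env": {
            "MLB_STATS_LOG_FILE": log_file,
            "MLB_STATS_LOG_LEVEL": log_level,
        },
    }


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with the entry added under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


# Stand-in for the parsed contents of claude_desktop_config.json.
existing = {"mcpServers": {}}

# Example paths only; these are assumptions, not values from the project.
entry = build_server_entry(
    uv_path="/home/me/.local/bin/uv",
    project_dir="/home/me/mlb-mcp",
    log_file="/tmp/mlb_stats.log",
)
updated = add_server(existing, "mcp-baseball-stats", entry)
print(json.dumps(updated, indent=2))
```

The merge copies the existing mapping instead of mutating it, so any other servers already configured are preserved and the original dict stays unchanged.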
Technologies Used
- mcp[cli] - Model Context Protocol SDK for tool definition
- mlb-statsapi - Python wrapper for the MLB Stats API
- httpx - HTTP client for making API requests
- pytest and pytest-asyncio - Test frameworks
- uv - Fast Python package manager and installer
Linting
This project uses Ruff for linting and code formatting, with pre-commit hooks to ensure code quality.
Setup Pre-commit Hooks
- Install pre-commit:
pip install pre-commit
- Initialize pre-commit hooks:
pre-commit install
Now, the linting checks will run automatically whenever you commit code. You can also run them manually:
pre-commit run --all-files
Linting Configuration
Linting rules are configured in the pyproject.toml file under the [tool.ruff] section. The project follows PEP 8 style guidelines with some customizations.
CI Integration
GitHub Actions workflows automatically run tests, linting, and pre-commit checks on all pull requests and pushes to the main branch.
FAQ
What tools does MLB Stats Server provide?
It provides 46 tools, including get_stats, get_schedule, and get_player_stats.