io.github.johnzfitch/pyghidra-lite

Name: io.github.johnzfitch/pyghidra-lite
Rating: 1.7 (33 reviews)
Author: johnzfitch

Coding & Debugging


Token-efficient Ghidra RE: decompilation, Swift/ObjC, ELF/Mach-O, async progress



pyghidra-lite

Token-efficient MCP server for Ghidra-based reverse engineering. Analyze ELF, Mach-O, and PE binaries with Swift, Objective-C, and Hermes support.

Bottom line: a lean, security-first Ghidra MCP. It is read-only by default — analysis tools never mutate your binaries or the server's configuration (which is frozen for the life of the process). The one tool that writes, annotate (rename / comment / prototype), is opt-in (--allow-write) and human-confirmed: every change is approved by you through an MCP elicitation prompt before it's committed, and it fails closed if your client can't ask. You get an analyst-agent that can persist its findings — under supervision — without giving up the read-only safety story.

Quick Start

1. Prerequisites

JDK 21+ and Ghidra 11.x are required.

bash

# macOS
brew install openjdk@21
brew install ghidra

# Ubuntu/Debian
sudo apt install openjdk-21-jdk
# Download Ghidra from https://ghidra-sre.org

# Arch Linux
sudo pacman -S jdk21-openjdk
yay -S ghidra

A Ghidra installation from Homebrew (brew install ghidra), or one placed in /opt/ghidra or ~/ghidra, is found automatically. Set GHIDRA_INSTALL_DIR only for non-standard paths.
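The lookup described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the function name `find_ghidra_dir` is hypothetical, the order in which the default paths are tried is assumed, and the Homebrew location is left out because the README does not state it.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence

# Default locations named in the README; the search order is an assumption.
DEFAULT_DIRS = ("/opt/ghidra", "~/ghidra")


def find_ghidra_dir(env: Mapping[str, str] = os.environ,
                    candidates: Sequence[str] = DEFAULT_DIRS) -> Optional[Path]:
    """Return the Ghidra directory to use, or None if nothing is found."""
    override = env.get("GHIDRA_INSTALL_DIR")
    if override:
        # An explicit setting wins; a wrong value is reported as "not found"
        # instead of silently falling back to a different installation.
        path = Path(override).expanduser()
        return path if path.is_dir() else None
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_dir():
            return path
    return None
```

Calling `find_ghidra_dir()` with no arguments reads the real environment; passing a plain dict makes the behaviour easy to check in isolation.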

2. Install pyghidra-lite

bash

pip install pyghidra-lite

3. Add to Claude Code

Create .mcp.json in your project (or edit ~/.claude.json for a global configuration):

json

{
  "mcpServers": {
    "pyghidra-lite": {
      "command": "pyghidra-lite"
    }
  }
}

4. Use it

code

You: Analyze the binary at /path/to/binaries/app

Claude: [calls load, info, code...]

Installation

PyPI (recommended)

bash

pip install pyghidra-lite

Arch Linux (AUR)

bash

yay -S python-pyghidra-lite

From source

bash

git clone https://github.com/johnzfitch/pyghidra-lite
cd pyghidra-lite
pip install -e .

MCP Configuration

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

json

{
  "mcpServers": {
    "pyghidra-lite": {
      "command": "uvx",
      "args": ["pyghidra-lite"]
    }
  }
}

uvx auto-installs pyghidra-lite from PyPI on first run. Ghidra is auto-detected; set GHIDRA_INSTALL_DIR in env if needed:

json

{
  "mcpServers": {
    "pyghidra-lite": {
      "command": "uvx",
      "args": ["pyghidra-lite"],
      "env": {
        "GHIDRA_INSTALL_DIR": "/path/to/ghidra"
      }
    }
  }
}

Claude Code

Create .mcp.json in your project (or edit ~/.claude.json for a global configuration):

json

{
  "mcpServers": {
    "pyghidra-lite": {
      "command": "pyghidra-lite"
    }
  }
}
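If the config file already lists other MCP servers, hand-editing risks clobbering them. The following is a minimal sketch of a merge helper; `add_server` is a hypothetical name and not part of pyghidra-lite. It only assumes the `mcpServers` layout shown above.

```python
import json
from pathlib import Path
from typing import Optional, Sequence


def add_server(config_path: Path,
               name: str = "pyghidra-lite",
               command: str = "pyghidra-lite",
               args: Optional[Sequence[str]] = None) -> dict:
    """Add or update one MCP server entry, keeping all other entries."""
    if config_path.exists():
        config = json.loads(config_path.read_text())
    else:
        config = {}
    entry = {"command": command}
    if args:
        entry["args"] = list(args)
    # setdefault keeps servers that are already configured.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_server(Path(".mcp.json"), args=["serve"])` would write the direct-mode entry described in the next section.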

Direct mode (skip proxy)

For single-session use or debugging, run the server directly:

json

{
  "mcpServers": {
    "pyghidra-lite": {
      "command": "pyghidra-lite",
      "args": ["serve"]
    }
  }
}

With explicit Ghidra path

json

{
  "mcpServers": {
    "pyghidra-lite": {
      "command": "pyghidra-lite",
      "args": [
        "serve",
        "--ghidra-dir", "/path/to/ghidra"
      ]
    }
  }
}

Restrict to specific paths

By default, pyghidra-lite can load binaries from any path (the MCP client handles permissions). Use --restrict-path to lock down access:

json

{
  "mcpServers": {
    "pyghidra-lite": {
      "command": "pyghidra-lite",
      "args": [
        "serve",
        "--restrict-path", "/home/user/binaries",
        "--restrict-path", "/opt/targets"
      ]
    }
  }
}
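To make the effect of --restrict-path concrete, here is a sketch of the kind of check an allow-list implies. It is an assumption about the general technique, not the server's actual implementation; `is_allowed` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable


def is_allowed(candidate: str, allowed_roots: Iterable[str]) -> bool:
    """True if candidate resolves to a location inside one of the roots."""
    # resolve() collapses ".." segments and follows symlinks, so a path
    # cannot escape a root by traversal tricks.
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        try:
            resolved.relative_to(Path(root).resolve())
            return True
        except ValueError:
            continue
    return False
```

With the roots from the config above, `/opt/targets/app` would pass, while `/opt/targets/../etc/passwd` and `/opt/targets2/app` would not.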

Shared HTTP transport (network access)

The HTTP/SSE transports are shared and apply DNS-rebinding protection (Host/Origin validation). Binding to a non-loopback address additionally requires both --restrict-path and a bearer token:

bash

pyghidra-lite serve -t streamable-http --host 0.0.0.0 \
  --restrict-path /opt/targets \
  --auth-token "$PYGHIDRA_LITE_AUTH_TOKEN" \
  --allowed-host re.example.com:8000   # if fronted under another hostname

Clients then send Authorization: Bearer <token> on every request. Terminate TLS at a reverse proxy for remote access.
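On the client side, attaching the token looks roughly like the sketch below. It only builds the request object so the header can be inspected; the `/mcp` endpoint path and the helper name `authorized_request` are assumptions, and a real MCP client library would handle the protocol itself.

```python
import urllib.request


def authorized_request(url: str, token: str,
                       body: bytes = b"{}") -> urllib.request.Request:
    """Build a POST request carrying the bearer token."""
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical URL; the hostname matches the --allowed-host example above.
req = authorized_request("https://re.example.com:8000/mcp", "example-token")
```

Nothing is sent until the request is passed to `urllib.request.urlopen`, so the sketch is safe to run without a server.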

Tools (9)

pyghidra-lite provides 8 read-only analysis tools plus 1 opt-in write tool, all auto-detecting format (ELF/Mach-O/PE) and language (Swift/ObjC/Hermes):

Tool	Purpose	Key Parameters
`load`	Import and analyze binary	`path`, `profile?`, `fresh?`, `bootstrap?`, `bootstrap_mode?`
`delete`	Remove binary and cancel jobs	`name`
`binaries`	List binaries + job status	`jobs?`, `rank_sources?`
`info`	Binary overview	`binary`, `detail?` (summary/full/format/sections/entropy)
`functions`	List/search functions	`binary`, `query?`, `type?` (all/swift/objc/imports/exports)
`code`	Decompile or disassemble	`binary`, `target`, `what?` (decompile/asm), `cfg?`
`xrefs`	References and call graphs	`binary`, `target`, `direction?`, `depth?`, `diff?`
`search`	Find strings, bytes, symbols	`binary`, `query`, `type?`, `mode?`, `bg?`
`annotate` 🔒	Rename / comment / set prototype	`binary`, `target`, `action`, `name?`/`comment?`/`prototype?`

🔒 annotate is the only tool that writes. It is disabled unless the server is started with --allow-write, and every change requires interactive confirmation (MCP elicitation) before it is committed — clients that can't confirm get a preview only. See Writing back.

Examples

python

# Import and analyze
load("/path/to/binary", profile="fast")

# Version-track from a prior build, including synthetic IDs for unnamed code
load("/path/to/new.bin", profile="deep", bootstrap="old.bin", bootstrap_mode="all")

# Get overview with full triage
info("mybinary", detail="full")

# List Swift functions
functions("mybinary", type="swift")

# Decompile with CFG
code("mybinary", "main", cfg=True)

# Search strings in background
search("mybinary", ["password", "api_key"], bg=True)

# Get cross-references
xrefs("mybinary", "malloc", depth=2)

Auto-Detection

All tools automatically detect:

Format: ELF, Mach-O, PE
Language: Swift, Objective-C, Hermes/React Native
Runtime: Bun, Node.js, Electron, PyInstaller

Use the type and detail parameters to access format/language-specific features.
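For intuition, format detection of this kind usually starts from the file's magic bytes. The sketch below illustrates that general technique only; it is not how Ghidra or pyghidra-lite performs detection, and `detect_format` is a hypothetical helper.

```python
# Well-known magic numbers at offset 0 of each container format.
MAGICS = {
    b"\x7fELF": "ELF",
    b"MZ": "PE",                      # DOS stub; a full check also reads the PE header
    b"\xfe\xed\xfa\xce": "Mach-O",    # 32-bit, big-endian
    b"\xce\xfa\xed\xfe": "Mach-O",    # 32-bit, little-endian
    b"\xfe\xed\xfa\xcf": "Mach-O",    # 64-bit, big-endian
    b"\xcf\xfa\xed\xfe": "Mach-O",    # 64-bit, little-endian
    b"\xca\xfe\xba\xbe": "Mach-O",    # universal binary (same magic as Java class files)
}


def detect_format(header: bytes) -> str:
    """Classify a file from its first bytes; 'unknown' if nothing matches."""
    for magic, name in MAGICS.items():
        if header.startswith(magic):
            return name
    return "unknown"
```

Language and runtime detection (Swift, Objective-C, Hermes, Bun, and so on) needs deeper inspection of sections and symbols, which this sketch does not attempt.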

Bootstrap Modes

bootstrap_mode="named": transfer only meaningful source names (default).
bootstrap_mode="all": also assign stable synthetic labels to source FUN_* functions during transfer, which is useful for large version-to-version bootstrap workflows where uniqueness matters more than semantics.
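The difference between the two modes can be pictured with a small sketch. It shows only which source names are eligible for transfer; matching functions between the old and new binary is a separate step not modelled here. The function name and the `old_<addr>` label scheme are hypothetical, chosen purely for illustration.

```python
from typing import Dict


def names_to_transfer(source_names: Dict[int, str],
                      mode: str = "named") -> Dict[int, str]:
    """Map source function addresses to the labels a transfer would carry."""
    if mode not in ("named", "all"):
        raise ValueError(f"unknown bootstrap mode: {mode!r}")
    result = {}
    for addr, name in source_names.items():
        if not name.startswith("FUN_"):
            result[addr] = name                  # meaningful name: always kept
        elif mode == "all":
            result[addr] = f"old_{addr:08x}"     # hypothetical synthetic label
    return result
```

In `named` mode auto-generated FUN_* entries are skipped; in `all` mode each gets a stable label derived from its source address, so it stays unique across the transfer.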

Writing back

By default pyghidra-lite is read-only — no tool mutates your binaries. To let an agent persist findings (rename a function, attach a comment, fix a prototype), start the server with --allow-write:

bash

pyghidra-lite serve --allow-write          # or PYGHIDRA_LITE_ALLOW_WRITE=1

Then the annotate tool becomes usable:

python

annotate("mybinary", target="FUN_00401000", action="rename", name="parse_header")
annotate("mybinary", target="parse_header", action="comment", comment="validates the v2 header")
annotate("mybinary", target="parse_header", action="prototype", prototype="int parse_header(char *buf, int len)")

Every call is human-confirmed: the server sends an MCP elicitation prompt showing the exact old -> new change, and only commits if you accept. If the server was started without --allow-write, the tool refuses; if your MCP client doesn't support elicitation, the tool returns a preview with applied: false and writes nothing (fail closed). Confirmed changes are written in a single Ghidra transaction and saved to the on-disk project.
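The decision logic in that paragraph can be summarised as a small gate. This is a sketch of the described behaviour under assumptions, not the server's code: `annotate_gate`, `Result`, and the callback signatures are hypothetical, and `confirm=None` stands in for a client that cannot handle elicitation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    applied: bool
    detail: str


def annotate_gate(allow_write: bool,
                  confirm: Optional[Callable[[str], bool]],
                  old: str,
                  new: str,
                  commit: Callable[[], None]) -> Result:
    """Apply a change only if writes are enabled and a human accepts it."""
    if not allow_write:
        return Result(False, "refused: server started without --allow-write")
    change = f"{old} -> {new}"
    if confirm is None:
        # Client cannot ask the user: fail closed and return a preview only.
        return Result(False, f"preview: {change}")
    if not confirm(change):
        return Result(False, f"declined: {change}")
    commit()
    return Result(True, f"committed: {change}")
```

Every path except the last returns `applied=False` without calling `commit`, which is the fail-closed property the README describes.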

Audit journal. Because MCP elicitation ultimately trusts the client (an autonomous "auto-approve" client can self-confirm), every write is recorded in annotate_audit.jsonl next to the projects — and every declined or failed attempt is logged too. Each line records old -> new, so the journal is both an accountability trail and an undo log; a flood of entries is your signal that an auto-agent is churning, and the server also nudges (ctx.warning) as write volume climbs. The journal is fail-closed and hardened: a write is recorded before it's applied (if it can't be journaled, it isn't committed), the file is created 0o600 and opened with O_NOFOLLOW (a symlinked journal is refused), and it rotates by size so it can't grow without bound.
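The hardening measures named above map onto standard POSIX calls. The sketch below shows one way to combine them; it is an illustration rather than the project's implementation, the helper names are hypothetical, and size-based rotation is omitted for brevity.

```python
import json
import os
from typing import Callable


def append_journal(path: str, record: dict) -> None:
    """Append one JSON line; refuse symlinks, create the file as 0o600."""
    # O_NOFOLLOW makes open() fail if the final path component is a symlink.
    flags = os.O_WRONLY | os.O_APPEND | os.O_CREAT | os.O_NOFOLLOW
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, (json.dumps(record) + "\n").encode("utf-8"))
        os.fsync(fd)  # make sure the entry is on disk before continuing
    finally:
        os.close(fd)


def journaled_write(path: str, record: dict, apply: Callable[[], None]) -> None:
    """Journal first, then apply; if journaling raises, apply never runs."""
    append_journal(path, record)
    apply()
```

Because the journal entry is written before `apply()` is called, a failure to record the change (a symlinked journal, a full disk) prevents the change itself.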

Analysis Profiles

Profile	Use Case
`fast`	Quick triage; disables 20 slow analyzers (the server default)
`default`	Balanced; full Ghidra analysis (despite its name, not what the server uses by default)
`deep`	Thorough analysis for obfuscated code

The server defaults to the fast profile to stay within MCP timeout limits. To run a deeper analysis when needed, reload the binary with a heavier profile and fresh=True:

python

# Default import uses fast profile
load("/path/to/binary")

# Re-analyze with deep profile
load("/path/to/binary", profile="deep", fresh=True)

Token Efficiency

pyghidra-lite is designed for minimal token usage:

Compact output by default: functions(binary, type="all") returns minimal {name, addr} pairs
Opt-in detail: use info(detail="full"), code(cfg=True), or richer type/what modes only when needed
Progress reporting: large imports report progress every 10% or 60s
Truncated strings: long strings are capped at 500 chars
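Two of these measures are simple enough to sketch directly. The helpers below are hypothetical and only illustrate the idea; the 500-character cap and the {name, addr} shape come from the list above, while the extra fields in the usage example are invented for the demonstration.

```python
from typing import Dict, List

MAX_STRING_CHARS = 500  # cap stated in the README


def truncate(text: str, limit: int = MAX_STRING_CHARS) -> str:
    """Cap a string at `limit` characters."""
    return text[:limit]


def compact(functions: List[Dict]) -> List[Dict]:
    """Keep only the name and address of each function record."""
    return [{"name": f["name"], "addr": f["addr"]} for f in functions]
```

For instance, `compact([{"name": "main", "addr": "0x401000", "size": 120}])` drops the `size` field, so a listing of thousands of functions costs far fewer tokens than full records would.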

Architecture

By default, pyghidra-lite runs as a lightweight stdio proxy (~10MB) that forwards to a persistent shared HTTP backend (~500MB JVM). Multiple sessions share a single JVM instead of each spawning their own.

code

Claude Code session 1 ──stdio──> proxy ──┐
Claude Code session 2 ──stdio──> proxy ──┼──HTTP──> shared backend (1 JVM)
Claude Code session 3 ──stdio──> proxy ──┘        localhost:19101

The proxy auto-starts the backend on first use, and the backend exits on its own after 30 minutes of inactivity. A file lock prevents concurrent proxy starts from spawning duplicate backends.
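A common way to implement such a guard on Unix is an exclusive, non-blocking `flock`. The sketch below shows that technique under assumptions: `try_acquire` is a hypothetical helper, the lock file location is up to the caller, and the project may use a different locking mechanism.

```python
import fcntl
import os
from typing import Optional


def try_acquire(lock_path: str) -> Optional[int]:
    """Return an fd holding the lock, or None if another holder exists."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

A proxy that gets an fd back would start the backend and release the lock by closing the fd once the backend is up; a proxy that gets `None` would simply wait for the backend another proxy is already starting.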

Command	What it does
`pyghidra-lite`	Stdio proxy (default) -- auto-starts backend
`pyghidra-lite serve`	Direct stdio server (1 JVM per session)
`pyghidra-lite serve -t streamable-http`	Start persistent HTTP backend manually
`pyghidra-lite stop`	Stop the shared backend

Set PYGHIDRA_LITE_NO_AUTOSTART=1 to disable auto-start (useful with systemd).

Multi-Agent Support

Each binary gets its own Ghidra project, enabling:

Parallel analysis of different binaries
Shared results across agents
Persistent analysis (survives restarts)
Content-addressed storage (same binary = same analysis)

Projects are stored in ~/.local/share/pyghidra-lite/projects/.
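Content-addressed storage means the project location is derived from the binary's bytes rather than its file name. A minimal sketch of that idea follows; the choice of SHA-256, the 16-character prefix, and the helper name `project_dir_for` are all assumptions, and the real on-disk layout may differ.

```python
import hashlib
from pathlib import Path

PROJECTS_ROOT = Path("~/.local/share/pyghidra-lite/projects").expanduser()


def project_dir_for(binary: Path, root: Path = PROJECTS_ROOT) -> Path:
    """Derive a project directory from the binary's content hash."""
    digest = hashlib.sha256()
    with binary.open("rb") as fh:
        # Read in 1 MiB chunks so large binaries are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return root / digest.hexdigest()[:16]  # assumed naming scheme
```

Two copies of the same file under different names map to the same directory, which is what lets separate agents reuse one analysis.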

License

MIT
