io.github.somacoffeekyoto/imgx

by somacoffeekyoto

AI image generation and editing via CLI or MCP, with text-based edits and support for multiple providers.

README

imgx-mcp

AI image generation and editing MCP server. Works with Claude Code, Gemini CLI, Cursor, Windsurf, and any MCP-compatible tool.

Generate images from text, edit existing images with text instructions, iterate on results — all from your AI coding environment.

What sets imgx-mcp apart

  • No prompt engineering — Your AI agent keeps conversation context and auto-constructs optimized prompts. Say what you need; the agent handles prompt structure, model selection, and platform-specific sizing
  • 24 editing techniques built in — Atmosphere, composition, style transfer, element manipulation, and trending styles — bundled as a Skill your agent applies on demand
  • Session management with undo/redo — Edit iteratively, step back to any point, branch off, or switch between parallel sessions — version control for images

Quick start

Add to your tool's MCP config (.mcp.json, settings.json, etc.):

json
{
  "mcpServers": {
    "imgx": {
      "command": "npx",
      "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
      "env": { "GEMINI_API_KEY": "your-key" }
    }
  }
}

That's it. Your AI agent can now generate and edit images.

Windows: Replace "command": "npx" with "command": "cmd" and prepend "/c", "npx" to the args array, so that the server is launched as cmd /c npx ....
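The Windows adjustment can be applied mechanically. The following is a minimal Python sketch, assuming an entry shaped like the Quick start JSON and the cmd /c npx form shown in the Windows configuration later in this README. The helper name to_windows_entry is hypothetical and is not part of imgx-mcp.

```python
import copy
import json


def to_windows_entry(entry: dict) -> dict:
    """Return a copy of an npx-based server entry that launches through cmd /c."""
    wrapped = copy.deepcopy(entry)  # do not mutate the caller's config
    if wrapped.get("command") == "npx":
        wrapped["command"] = "cmd"
        wrapped["args"] = ["/c", "npx", *wrapped.get("args", [])]
    return wrapped


# Entry copied from the Quick start config.
quick_start = {
    "command": "npx",
    "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
    "env": {"GEMINI_API_KEY": "your-key"},
}

windows_entry = to_windows_entry(quick_start)
print(json.dumps({"mcpServers": {"imgx": windows_entry}}, indent=2))
```

The env section is carried over unchanged; only command and args differ between platforms.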

Skill (Claude Code)

For Claude Code users, imgx-mcp includes an image-generation skill — a guided prompt that teaches Claude how to use the MCP tools effectively. With the skill installed, type /image-generation to start a guided workflow.

Install the skill

Copy the skill directory from the npm package or GitHub repository to your project:

bash
# From a global npm install (run `npm install -g imgx-mcp` first;
# `npm root -g` does not point at the npx cache)
mkdir -p .claude/skills
cp -r "$(npm root -g)/imgx-mcp/skills/." .claude/skills/

# Or from the GitHub repository
curl -sL https://raw.githubusercontent.com/somacoffeekyoto/imgx-mcp/main/skills/image-generation/SKILL.md \
  -o .claude/skills/image-generation/SKILL.md --create-dirs
curl -sL https://raw.githubusercontent.com/somacoffeekyoto/imgx-mcp/main/skills/image-generation/references/providers.md \
  -o .claude/skills/image-generation/references/providers.md --create-dirs

Or place skill files manually:

code
your-project/
  .mcp.json                              ← MCP server config (Quick start above)
  .claude/
    skills/
      image-generation/
        SKILL.md                         ← skill prompt
        references/
          providers.md                   ← provider reference

The skill files are included in the npm package under skills/ and in the GitHub repository.

Personal skill (all projects): Place in ~/.claude/skills/image-generation/ instead of .claude/skills/.

Claude Desktop

Claude Desktop supports skills via ZIP upload:

  1. Download image-generation-skill.zip from the repository (or find it in the npm package under dist/)
  2. In Claude Desktop: Settings > Profile > Customize > Skills > Add Skill
  3. Upload the ZIP

Update the skill by re-downloading and re-uploading the ZIP after new releases.

What the Skill brings

The MCP server gives the AI the ability to generate and edit images. The Skill adds the knowledge of how to use those tools well — so you don't need to learn prompt syntax, model specifications, or service-specific parameters.

  • Automatic prompt construction — Say "I need a cover image." The AI builds a structured prompt using the Subject-Context-Style framework: what to show, where to place it, how it should look
  • 24 editing techniques — Atmosphere adjustment, composition changes, element manipulation, style transfer. "Make it warmer" or "add depth of field" — the AI selects the right instruction for the model
  • Intelligent model selection — Starts with the free model. Suggests paid upgrades only when your needs exceed free tier capabilities, and explains what changes
  • Platform-aware sizing — "Twitter OGP" or "App Store screenshot" — the AI picks the correct aspect ratio and resolution. Covers social media, OGP, app stores, print, and blog platforms
  • Trending style templates — Ghibli, action figure in box, 3D clay, pixel art, chibi, and more. Name the style and the AI applies the right prompt structure
  • Multi-image consistency — Design tokens and character DNA templates maintain visual coherence across slide decks, social media series, and brand assets

The image generation models already have these capabilities. The Skill is what makes them accessible without specialized knowledge.

MCP server vs Skill

| | MCP server | Skill |
| --- | --- | --- |
| What it does | Exposes image tools to AI agents | Guided prompt for using the tools |
| Works with | Any MCP-compatible tool | Claude Code, Claude Desktop |
| Install | Add to .mcp.json | Copy skill files to project |
| Team sharing | Commit .mcp.json to repo | Commit .claude/skills/ to repo |

Recommended: Set up the MCP server (see Quick start), and also install the skill if you use Claude Code.

MCP tools

| Tool | Description |
| --- | --- |
| generate_image | Generate an image from a text prompt |
| edit_image | Edit an existing image with text instructions |
| edit_last | Edit the last generated/edited image (no input path needed) |
| undo_edit | Undo the last edit, reverting to the previous image in the session |
| redo_edit | Redo a previously undone edit |
| edit_history | Show all sessions and their edit history with metadata |
| switch_session | Switch to a different editing session |
| clear_history | Clear project history (optionally delete image files) |
| set_output_dir | Change the default output directory (optionally move existing files) |
| list_providers | List available providers and capabilities |

The .imgx/ directory holds both edit history and default image output. Its location depends on project root detection:

| Project root | .imgx/ location | History |
| --- | --- | --- |
| Detected | <project-root>/.imgx/ | <project-root>/.imgx/output-history.json |
| Not detected | ~/Pictures/imgx/ (images only) | ~/.config/imgx/output-history.json (global) |

All clients that resolve to the same project root share the same history. Each session gets its own subdirectory. File paths are returned in the response. Inline image preview is included in MCP responses (base64).
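The two rows of the location table can be read as a single lookup. Below is a small Python sketch of that lookup, assuming Linux/macOS-style paths only. The function storage_paths and the StoragePaths tuple are illustrative names, not part of the imgx code base.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional


class StoragePaths(NamedTuple):
    image_dir: PurePosixPath      # where images land by default
    history_file: PurePosixPath   # where output-history.json lives


def storage_paths(project_root: Optional[str], home: str) -> StoragePaths:
    """Mirror the table above for Linux/macOS-style paths."""
    if project_root:
        # Detected: images and history both live under <project-root>/.imgx/
        base = PurePosixPath(project_root) / ".imgx"
        return StoragePaths(base, base / "output-history.json")
    # Not detected: images go to ~/Pictures/imgx/, history to the global config dir
    home_dir = PurePosixPath(home)
    return StoragePaths(
        home_dir / "Pictures" / "imgx",
        home_dir / ".config" / "imgx" / "output-history.json",
    )


print(storage_paths("/work/site", "/home/you"))
print(storage_paths(None, "/home/you"))
```

Per-session subdirectories are left out of this sketch, since the README does not state how session IDs are formed.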

Iterative editing

The edit_last tool uses the output of the previous generate_image or edit_image call as input. This enables a conversational workflow:

code
"Generate a coffee shop interior" → generate_image
"Make the lighting warmer"        → edit_last
"Add a person reading a book"     → edit_last

No need to specify file paths between steps.

Session management

Each generate_image call starts a new session. Subsequent edit_last calls are added to the same session, forming an edit chain. Each session has its own output directory.

Undo / Redo — Step backward and forward through the edit chain:

code
generate → edit_last → edit_last → edit_last
                                    ↑ current
                       ← undo_edit
                       ↑ current
                            redo_edit →
                                    ↑ current

After an undo, calling edit_last branches from the current position. The entries that were undone are abandoned, and their image files are deleted from disk.
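One way to picture this behavior is a list of entries plus a cursor. The Python sketch below is a simplified model under that assumption, not the project's implementation: the Session class, its method names, and the entry labels are hypothetical, and instead of deleting files it only returns the entries that a branch would abandon.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Session:
    """Linear edit chain with a cursor pointing at the current image."""
    entries: List[str] = field(default_factory=list)
    cursor: int = -1  # -1 means the session is still empty

    def add(self, entry: str) -> List[str]:
        """Append after the cursor; return the entries abandoned by branching."""
        abandoned = self.entries[self.cursor + 1:]
        self.entries = self.entries[: self.cursor + 1] + [entry]
        self.cursor += 1
        return abandoned

    def undo(self) -> str:
        if self.cursor <= 0:  # assumption: the first image cannot be undone
            raise IndexError("nothing to undo")
        self.cursor -= 1
        return self.entries[self.cursor]

    def redo(self) -> str:
        if self.cursor >= len(self.entries) - 1:
            raise IndexError("nothing to redo")
        self.cursor += 1
        return self.entries[self.cursor]

    @property
    def current(self) -> str:
        return self.entries[self.cursor]


session = Session()
for step in ["gen", "edit-a", "edit-b", "edit-c"]:
    session.add(step)

session.undo()                     # current is now "edit-b"
session.undo()                     # current is now "edit-a"
dropped = session.add("edit-d")    # branch: "edit-b" and "edit-c" are abandoned
print(session.entries, dropped)
```

In this model a caller would delete the files behind the returned entries, which corresponds to the cleanup described above.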

File naming: edit_last generates sequential filenames based on the origin file:

code
generate_image             → cover.png
edit_last                  → cover-1.png
edit_last                  → cover-2.png

generate_image (no output) → imgx-a1b2c3d4.png
edit_last                  → imgx-a1b2c3d4-1.png
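The sequential pattern in these examples amounts to inserting a step number before the extension. Here is a minimal Python sketch of that rule; edit_file_name is a hypothetical helper, and numbering after an undo or branch is not covered because the README does not specify it.

```python
from pathlib import PurePosixPath


def edit_file_name(origin: str, step: int) -> str:
    """Build the name of the step-th edit from the session's origin file name."""
    path = PurePosixPath(origin)
    # Only the file name is returned; the directory comes from the session.
    return f"{path.stem}-{step}{path.suffix}"


print(edit_file_name("cover.png", 1))
print(edit_file_name("cover.png", 2))
print(edit_file_name("imgx-a1b2c3d4.png", 1))
```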

Session switching — Use edit_history to see all sessions, then switch_session to resume a previous session. The edit_last tool will use the current position in the switched session.

Output directory: edit_last inherits the output directory from the session. If generate_image was called with output_dir, all subsequent edit_last calls in that session write to the same directory. The output_dir path is recorded as session metadata in output-history.json. This only affects where image files are saved; history always stays in .imgx/ (or the global config directory).

API key setup

Set up at least one provider:

Gemini — get a key from Google AI Studio (free tier available for gemini-2.5-flash-image):

bash
imgx config set api-key YOUR_GEMINI_API_KEY --provider gemini

OpenAI — get a key from OpenAI Platform:

bash
imgx config set api-key YOUR_OPENAI_API_KEY --provider openai

Keys are stored in ~/.config/imgx/config.json (Linux/macOS) or %APPDATA%\imgx\config.json (Windows). Alternatively, pass keys via the env section in your MCP config, or set environment variables:

bash
export GEMINI_API_KEY="your-api-key"
export OPENAI_API_KEY="your-api-key"

Only include the API keys for providers you want to use. At least one is required.

MCP configuration by tool

Claude Code

.mcp.json in your project root:

json
{
  "mcpServers": {
    "imgx": {
      "command": "npx",
      "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
      "env": { "GEMINI_API_KEY": "your-key", "OPENAI_API_KEY": "your-key" }
    }
  }
}

Gemini CLI

~/.gemini/settings.json:

json
{
  "mcpServers": {
    "imgx": {
      "command": "npx",
      "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
      "env": { "GEMINI_API_KEY": "your-key", "OPENAI_API_KEY": "your-key" }
    }
  }
}

Claude Desktop

claude_desktop_config.json:

macOS / Linux:

json
{
  "mcpServers": {
    "imgx": {
      "command": "npx",
      "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-key",
        "OPENAI_API_KEY": "your-key",
        "IMGX_PROJECT_ROOT": ""
      }
    }
  }
}

Windows:

json
{
  "mcpServers": {
    "imgx": {
      "command": "cmd",
      "args": ["/c", "npx", "--package=imgx-mcp", "-y", "imgx-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-key",
        "OPENAI_API_KEY": "your-key",
        "IMGX_PROJECT_ROOT": ""
      }
    }
  }
}

IMGX_PROJECT_ROOT — Set to your project path to save images inside the project (e.g. "C:\\Users\\you\\my-project"). Leave empty to use the global default (~/Pictures/imgx).

Config file location: %APPDATA%\Claude\claude_desktop_config.json (Windows) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS). After editing, restart Claude Desktop.

Note: Claude Desktop does not support auto-detection (MCP roots / CWD-based .imgxrc search). Use IMGX_PROJECT_ROOT in the config above (per-client), or run imgx config set project-root /path/to/project (shared across all clients).

Codex CLI

.codex/config.toml:

toml
[mcp_servers.imgx]
command = "npx"
args = ["--package=imgx-mcp", "-y", "imgx-mcp"]
env = { GEMINI_API_KEY = "your-key", OPENAI_API_KEY = "your-key" }

Other tools

The same npx pattern works with Cursor, Windsurf, Continue.dev, Cline, Zed, and other MCP-compatible tools. On Windows, use cmd /c npx instead of npx directly.

Providers

| Provider | Models | Capabilities |
| --- | --- | --- |
| Gemini | gemini-2.5-flash-image (Nano Banana; free tier, default), gemini-3-pro-image-preview (Nano Banana Pro), gemini-3.1-flash-image-preview (Nano Banana 2) | Generate, edit, aspect ratio (up to 14 ratios), resolution (up to 4K), reference images, person control |
| OpenAI | gpt-image-1, gpt-image-1.5 (faster, 20% cheaper), gpt-image-1-mini (budget) | Generate, edit, aspect ratio, multi-output, output format (PNG/JPEG/WebP), background transparency |

Architecture

imgx separates model-independent and model-dependent concerns:

code
MCP server (tool definitions, stdio transport)    CLI (argument parsing, output formatting)
 ↓                                                 ↓
Core (Capability enum, ImageProvider interface, provider registry, file I/O, history)
 ↓
Provider (model-specific API calls, capability declarations)

MCP server and CLI are two entry points into the same core. Both call the same provider functions.

Each provider declares its supported capabilities. Adding a new provider means implementing the ImageProvider interface and registering it — no changes to the MCP or CLI layer.
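The registration step might look roughly like the following. This is a Python sketch of the pattern only: the project itself is written in TypeScript, and the method signature, the registry functions, and PlaceholderProvider are assumptions made for illustration rather than the actual ImageProvider interface.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Dict, FrozenSet


class Capability(Enum):
    TEXT_TO_IMAGE = "TEXT_TO_IMAGE"
    IMAGE_EDITING = "IMAGE_EDITING"


class ImageProvider(ABC):
    """Model-dependent layer: one subclass per backend."""

    name: str
    capabilities: FrozenSet[Capability]

    @abstractmethod
    def generate(self, prompt: str) -> bytes:
        """Return encoded image bytes for the prompt."""


_registry: Dict[str, ImageProvider] = {}


def register_provider(provider: ImageProvider) -> None:
    _registry[provider.name] = provider


def get_provider(name: str) -> ImageProvider:
    try:
        return _registry[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None


class PlaceholderProvider(ImageProvider):
    """Hypothetical provider used only to show the registration step."""

    name = "placeholder"
    capabilities = frozenset({Capability.TEXT_TO_IMAGE})

    def generate(self, prompt: str) -> bytes:
        return f"image for: {prompt}".encode("utf-8")


register_provider(PlaceholderProvider())
provider = get_provider("placeholder")
print(provider.name, sorted(c.name for c in provider.capabilities))
```

Both entry points would call get_provider and never name a concrete class, which is why adding a provider leaves the MCP and CLI layers untouched.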

Capability system

| Capability | Description |
| --- | --- |
| TEXT_TO_IMAGE | Generate images from text prompts |
| IMAGE_EDITING | Edit images with text instructions |
| ASPECT_RATIO | Control output aspect ratio |
| RESOLUTION_CONTROL | Control output resolution |
| MULTIPLE_OUTPUTS | Generate multiple images per request |
| REFERENCE_IMAGES | Use reference images for guidance |
| PERSON_CONTROL | Control person generation in output |
| OUTPUT_FORMAT | Choose output format (PNG, JPEG, WebP) |
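Declared capabilities make it possible to reject an option before any API call is made. The Python sketch below shows one way such a check could work. The option-to-capability mapping and the two capability sets are an interpretation of the Providers table, not the project's actual declarations, and unsupported_options is a hypothetical helper.

```python
from enum import Enum
from typing import AbstractSet, List, Mapping


class Capability(Enum):
    TEXT_TO_IMAGE = "TEXT_TO_IMAGE"
    IMAGE_EDITING = "IMAGE_EDITING"
    ASPECT_RATIO = "ASPECT_RATIO"
    RESOLUTION_CONTROL = "RESOLUTION_CONTROL"
    MULTIPLE_OUTPUTS = "MULTIPLE_OUTPUTS"
    REFERENCE_IMAGES = "REFERENCE_IMAGES"
    PERSON_CONTROL = "PERSON_CONTROL"
    OUTPUT_FORMAT = "OUTPUT_FORMAT"


# Assumed mapping from request options to the capability each one needs.
OPTION_REQUIRES = {
    "aspect_ratio": Capability.ASPECT_RATIO,
    "resolution": Capability.RESOLUTION_CONTROL,
    "count": Capability.MULTIPLE_OUTPUTS,
    "format": Capability.OUTPUT_FORMAT,
}

# Capability sets as read from the Providers table (an interpretation).
C = Capability
GEMINI = frozenset({C.TEXT_TO_IMAGE, C.IMAGE_EDITING, C.ASPECT_RATIO,
                    C.RESOLUTION_CONTROL, C.REFERENCE_IMAGES, C.PERSON_CONTROL})
OPENAI = frozenset({C.TEXT_TO_IMAGE, C.IMAGE_EDITING, C.ASPECT_RATIO,
                    C.MULTIPLE_OUTPUTS, C.OUTPUT_FORMAT})


def unsupported_options(options: Mapping[str, object],
                        capabilities: AbstractSet[Capability]) -> List[str]:
    """Return the names of set options that the provider cannot honor."""
    return sorted(
        name for name, value in options.items()
        if value is not None
        and name in OPTION_REQUIRES
        and OPTION_REQUIRES[name] not in capabilities
    )


print(unsupported_options({"aspect_ratio": "16:9", "format": "webp"}, GEMINI))
print(unsupported_options({"resolution": "2K", "count": 2}, OPENAI))
```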

CLI

imgx-mcp also works as a standalone command-line tool.

Install

bash
npm install -g imgx-mcp

Requires Node.js 18+.

Usage

bash
# Generate
imgx generate -p "A coffee cup on a wooden table, morning light" -o output.png

# Edit
imgx edit -i photo.png -p "Change the background to sunset" -o edited.png

# Iterative editing
imgx edit -i photo.png -p "Make the background darker"
imgx edit --last -p "Add warm lighting"
imgx edit --last -p "Crop to 16:9" -o final.png

# Undo / redo
imgx undo               # Revert to previous image in session
imgx redo               # Re-apply an undone edit

# History
imgx history            # Show all sessions and entries
imgx history switch <session-id>  # Switch to a different session
imgx history clear      # Clear project history (interactive)
imgx history clear --yes          # Clear without confirmation
imgx history clear --keep-files   # Clear history but keep image files
imgx history clear --all          # Clear ALL history across all projects

# Provider management
imgx providers          # List providers and capabilities
imgx capabilities       # Detailed capabilities of current provider

CLI options

| Flag | Short | Description |
| --- | --- | --- |
| --prompt | -p | Image description or edit instruction (required) |
| --output | -o | Output file path (auto-generated if omitted) |
| --input | -i | Input image to edit (edit command only) |
| --last | -l | Use last output as input (edit command only) |
| --aspect-ratio | -a | 1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2; Gemini 3.x adds 1:4, 1:8, 4:1, 4:5, 5:4, 8:1, 21:9 |
| --resolution | -r | 1K, 2K, 4K |
| --count | -n | Number of images to generate |
| --format | -f | Output format: png, jpeg, webp (OpenAI only) |
| --background | -b | Background: transparent, opaque, auto (OpenAI only) |
| --quality | -q | Quality: low, medium, high, auto (OpenAI only) |
| --model | -m | Model name |
| --provider | | Provider name (default: gemini) |
| --output-dir | -d | Output directory |
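The --aspect-ratio row lists seven base ratios plus seven that apply to Gemini 3.x models. A wrapper script could validate a ratio before calling the CLI, as in this Python sketch. It assumes that "Gemini 3.x" can be detected from a model name starting with gemini-3, and that every other model accepts the seven base ratios; neither rule is confirmed by the README.

```python
from typing import Tuple

BASE_RATIOS: Tuple[str, ...] = ("1:1", "16:9", "9:16", "4:3", "3:4", "2:3", "3:2")
GEMINI_3_EXTRA: Tuple[str, ...] = ("1:4", "1:8", "4:1", "4:5", "5:4", "8:1", "21:9")


def allowed_aspect_ratios(model: str) -> Tuple[str, ...]:
    """Return the ratios accepted for a model, per the options table."""
    if model.startswith("gemini-3"):  # assumption: name prefix marks Gemini 3.x
        return BASE_RATIOS + GEMINI_3_EXTRA
    return BASE_RATIOS


def is_valid_ratio(model: str, ratio: str) -> bool:
    return ratio in allowed_aspect_ratios(model)


print(is_valid_ratio("gemini-2.5-flash-image", "21:9"))
print(is_valid_ratio("gemini-3-pro-image-preview", "21:9"))
```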

Configuration

bash
imgx config set api-key <key> --provider gemini   # Save Gemini API key
imgx config set api-key <key> --provider openai   # Save OpenAI API key
imgx config set model <name>      # Set default model
imgx config set output-dir <dir>  # Set default output directory
imgx config set aspect-ratio 16:9 # Set default aspect ratio
imgx config set resolution 2K     # Set default resolution
imgx config list                  # Show all settings
imgx config get api-key           # Show a specific setting (API key is masked)
imgx config path                  # Show config file location

Project config (.imgxrc)

Generate a template with imgx init:

bash
imgx init
# → creates .imgxrc in current directory

Or create manually:

json
{
  "defaults": {
    "model": "gemini-2.5-flash-image",
    "outputDir": "./assets/images",
    "aspectRatio": "16:9"
  }
}

Project config is shared via Git. Do not put API keys in .imgxrc.

Project root configuration (3 tiers)

| Method | Scope | How to set |
| --- | --- | --- |
| IMGX_PROJECT_ROOT env var in client config | Per-client (highest priority) | Add to env in claude_desktop_config.json, .mcp.json, etc. |
| Auto-detection (MCP roots / .imgxrc search) | Automatic | Works on CLI agents (Claude Code, Gemini CLI). Not available on Claude Desktop |
| imgx config set project-root | All clients on the machine | Stored in user config (~/.config/imgx/config.json or %APPDATA%\imgx\config.json) |

Detection priority: env var → MCP roots → .imgxrc upward search → user config projectRoot.
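The priority chain can be sketched as a sequence of early returns. The Python below is an illustration under stated assumptions, not the project's code: an empty IMGX_PROJECT_ROOT is treated as unset, the first MCP root is taken when several are offered, and the function names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Mapping, Optional, Sequence


def find_imgxrc_root(start: Path) -> Optional[Path]:
    """Walk from start toward the filesystem root, looking for .imgxrc."""
    for directory in (start, *start.parents):
        if (directory / ".imgxrc").is_file():
            return directory
    return None


def resolve_project_root(
    env: Mapping[str, str],
    mcp_roots: Sequence[str],
    cwd: Path,
    user_config: Mapping[str, str],
) -> Optional[Path]:
    """Apply the documented order: env var, MCP roots, .imgxrc search, user config."""
    if env.get("IMGX_PROJECT_ROOT"):   # empty string counts as unset
        return Path(env["IMGX_PROJECT_ROOT"])
    if mcp_roots:                      # assumption: the first root wins
        return Path(mcp_roots[0])
    rc_root = find_imgxrc_root(cwd)
    if rc_root is not None:
        return rc_root
    if user_config.get("projectRoot"):
        return Path(user_config["projectRoot"])
    return None                        # caller falls back to the global default


# Demo: a temporary project with .imgxrc at its top level.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / ".imgxrc").write_text("{}")
    nested = project / "src" / "assets"
    nested.mkdir(parents=True)
    found_by_search = resolve_project_root({}, [], nested, {}) == project

print(found_by_search)
```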

History is saved to <project-root>/.imgx/output-history.json (project-scoped, not shared with other projects). Default image output goes to <project-root>/.imgx/<session-id>/. Relative paths in output and output_dir are resolved against the project root instead of the MCP server's working directory.

Settings resolution

  1. CLI flags (--model, --output-dir, etc.)
  2. Environment variables (IMGX_MODEL, IMGX_OUTPUT_DIR, etc.)
  3. Project config (.imgxrc — searched from current directory upward)
  4. User config (~/.config/imgx/config.json or %APPDATA%\imgx\config.json)
  5. Provider defaults
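The five-level order above is a layered lookup in which the first layer that defines a setting wins. A compact Python sketch using collections.ChainMap follows. It assumes every layer has already been normalized to the same key names, which the real sources do not share (for example IMGX_MODEL in the environment and outputDir in .imgxrc); effective_settings is a hypothetical helper.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping


def effective_settings(
    cli: Mapping[str, Any],
    env: Mapping[str, Any],
    project: Mapping[str, Any],
    user: Mapping[str, Any],
    provider_defaults: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge the layers, highest priority first."""
    layers = [cli, env, project, user, provider_defaults]
    # Drop unset (None) values so a missing flag does not mask lower layers.
    cleaned = [{k: v for k, v in layer.items() if v is not None} for layer in layers]
    return dict(ChainMap(*cleaned))


settings = effective_settings(
    cli={"model": None, "output_dir": "./out"},
    env={"model": "gpt-image-1"},
    project={"model": "gemini-2.5-flash-image", "aspect_ratio": "16:9"},
    user={"resolution": "2K"},
    provider_defaults={"model": "gemini-2.5-flash-image", "resolution": "1K"},
)
print(settings)
```

Here the model comes from the environment layer because the CLI flag is unset, while the resolution comes from the user config because no higher layer defines it.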

Output format

All CLI commands output JSON:

json
{"success": true, "filePaths": ["./output.png"]}
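Because the output is JSON, a calling script can parse it directly. The Python sketch below handles the success shape shown above. The shape of a failure response is not documented here, so the error branch is an assumption, and parse_imgx_output is a hypothetical helper.

```python
import json
from typing import List


def parse_imgx_output(stdout: str) -> List[str]:
    """Return the generated file paths from one line of imgx CLI output."""
    result = json.loads(stdout)
    if not result.get("success"):
        # Assumption: failures carry "success": false; other fields are unknown.
        raise RuntimeError(f"imgx reported failure: {result}")
    return list(result["filePaths"])


sample = '{"success": true, "filePaths": ["./output.png"]}'
paths = parse_imgx_output(sample)
print(paths)
```

In a real script the input string would come from running the CLI, for example with subprocess.run(["imgx", "generate", "-p", "...", "-o", "output.png"], capture_output=True, text=True) and reading its stdout.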

Claude Code plugin

The plugin installs the MCP server and the skill in one step. Use it if you prefer not to configure .mcp.json and the skill files manually:

code
/plugin marketplace add somacoffeekyoto/imgx-mcp
/plugin install imgx-mcp@somacoffeekyoto-imgx-mcp

Update: /plugin → installed → imgx-mcp → update. If the update shows no changes, uninstall and reinstall.

Uninstall: /plugin uninstall imgx-mcp@somacoffeekyoto-imgx-mcp then /plugin marketplace remove somacoffeekyoto-imgx-mcp.

Development

bash
git clone https://github.com/somacoffeekyoto/imgx-mcp.git
cd imgx-mcp
npm install
npm run bundle    # TypeScript compile + esbuild bundle

The build produces two bundles:

  • dist/mcp.bundle.js — MCP server entry point
  • dist/cli.bundle.js — CLI entry point

Uninstall

MCP server

Remove the imgx entry from your tool's MCP configuration file.

Skill

Delete the image-generation/ directory from .claude/skills/ or ~/.claude/skills/.

CLI

bash
npm uninstall -g imgx-mcp

npm uninstall removes the package but does not delete configuration or generated files. Remove them manually if needed:

Global configuration:

bash
# Linux / macOS
rm -rf ~/.config/imgx/

# Windows (PowerShell)
Remove-Item -Recurse -Force "$env:APPDATA\imgx"

Project history and images: Each project may have a .imgx/ directory containing edit history and generated images. Remove it from each project as needed.

bash
rm -rf <project-root>/.imgx/

License

MIT — SOMA COFFEE KYOTO
