skill-review

by antaisecuritylab

>-

View Chinese version with editor review

安装

claude skill add --url https://github.com/openclaw/skills

文档

skill-review

A multi-agent security scanner CLI for Claude Code Skill packages. It combines deterministic static pre-scanning with LLM-driven deep analysis to surface security risks across 7 layers before you install a Skill.

When to use

  • Auditing a third-party Skill before installation
  • Checking a skill directory for prompt injection, credential theft, data exfiltration, or hidden backdoors
  • Evaluating supply chain risk of a Skill's npm/PyPI dependencies
  • CI/CD integration to block high-risk Skills automatically

How it works

The scanner runs in two phases:

  1. Pre-scan (deterministic, no LLM) — walks all files and flags: symlinks, suspicious filenames (Unicode confusables, shell metacharacters), large files, binary executables, invisible characters, ANSI escape sequences, JS obfuscation patterns, and hardcoded URLs.

  2. LLM Analysis — an Explore Agent reads each file and performs 7-layer analysis:

    • Layer 1: Prompt Injection (direct injection, jailbreak, remote prompt loading)
    • Layer 2: Malicious Behavior (credential theft, data exfiltration, sandbox escape)
    • Layer 3: Dynamic Code Loading (remote execution via fetch+eval, curl|sh, etc.)
    • Layer 4: Obfuscation & Binary (obfuscated scripts, compiled binaries)
    • Layer 5: Dependencies & Supply Chain (npm/PyPI/CLI tool inventory, typosquat detection)
    • Layer 6: System Modification (global installs, profile changes, cron jobs)
    • Layer 7: Code Quality (hardcoded secrets, insecure configs, vulnerable code patterns)

    An optional Deep Analysis Agent then verifies URLs, checks dependency metadata on registries, and inspects binaries.

  3. Deterministic Scoring — each finding is scored based on its type and severity. The overall risk level (safe/low/medium/high/critical) and recommendation (install/caution/do_not_install) are computed deterministically, not by the LLM.

Installation

bash
cd <skill-review-dir>
npm install

Configuration

Create .env and fill in your LLM provider details:

VariableDescriptionDefault
OPENAI_API_BASELLM API base URL (OpenAI-compatible)required
OPENAI_API_KEYAPI keyrequired
OPENAI_API_MODELModel namegpt-4o
NPM_REGISTRY_URLnpm registry for dependency checkshttps://registry.npmjs.org
PYPI_INDEX_URLPyPI index for dependency checkshttps://pypi.org

Alternatively, pass a JSON config file via --config.

Usage

bash
# Standard scan (pre-scan + LLM explore)
node index.mjs <skill-dir>

# Pre-scan only (no LLM, fast)
node index.mjs --pre <skill-dir>

# Deep analysis (pre-scan + explore + deep verification of URLs/deps/binaries)
node index.mjs --deep <skill-dir>

# JSON output, save to file
node index.mjs --json -o report.json <skill-dir>

# Chinese language report
node index.mjs --lang zh <skill-dir>

# Verbose logs to stderr + log file
node index.mjs -v --log scan.log <skill-dir>

Options

OptionDescription
<skill-dir>Path to the skill directory to scan (required, positional)
--config <file>Path to JSON config file
--preRun pre-scan only (no LLM calls)
--deepEnable deep analysis phase
--lang <lang>Report language (default: English)
--jsonOutput raw JSON instead of text report
-o, --output <file>Save report to file (default: stdout)
--log <file>Save detailed logs to file
-v, --verboseStream detailed logs to stderr
-h, --helpShow help

Output

The text report shows each layer with a risk score (0-10), star rating, and up to 5 findings per layer. The JSON output contains the full structured result with all findings, layer scores, overall risk, and recommendation.

Risk levels: safe (0) / low (1-3) / medium (4-6) / high (7-8) / critical (9-10)

Recommendations: install (safe/low) / caution (medium) / do_not_install (high/critical)