Agent Toolkit
agent-toolkit
by BytesAgain
Configure and benchmark agent tools and integration patterns. Use when setting up agent workflows, comparing tools, or evaluating agents.
Installation
claude skill add --url github.com/openclaw/skills/tree/main/skills/bytesagain/ba-agent-toolkit
Documentation
Agent Toolkit
A comprehensive AI toolkit for configuring, benchmarking, comparing, and optimizing agent tools and integration patterns. Agent Toolkit provides persistent, file-based logging for each command category with timestamped entries, summary statistics, multi-format export, and full-text search across all records.
Commands
| Command | Description |
|---|---|
| `configure` | Configure agent tools: log configuration entries or view recent ones |
| `benchmark` | Benchmark tool performance: log benchmark results or view history |
| `compare` | Compare tool outputs: log comparison data or view recent comparisons |
| `prompt` | Prompt management: log prompt variations or view recent prompts |
| `evaluate` | Evaluate tool results: log evaluation data or view history |
| `fine-tune` | Fine-tune parameters: log fine-tuning sessions or view recent ones |
| `analyze` | Analyze tool behavior: log analysis entries or view recent analyses |
| `cost` | Cost tracking: log cost data or view recent cost entries |
| `usage` | Usage monitoring: log usage metrics or view recent usage data |
| `optimize` | Optimize configurations: log optimization runs or view history |
| `test` | Test tool behavior: log test results or view recent tests |
| `report` | Report generation: log report entries or view recent reports |
| `stats` | Show summary statistics across all log categories (entry counts, data size, first entry date) |
| `export <fmt>` | Export all data in json, csv, or txt format to the data directory |
| `search <term>` | Full-text search across all log files (case-insensitive) |
| `recent` | Show the 20 most recent entries from the activity history log |
| `status` | Health check: show version, data directory, total entries, disk usage, and last activity |
| `help` | Show the full help message with all available commands |
| `version` | Print the current version string |
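The `search` command is described above as a case-insensitive full-text search across all log files. A rough equivalent with plain `grep` might look like the sketch below. The function name `search_logs`, its optional directory argument, and the choice to treat the term as a literal string are assumptions for illustration, not the toolkit's actual implementation.

```shell
# Sketch only: search_logs is a hypothetical stand-in for the search command.
search_logs() {
  local term="$1"
  local dir="${2:-${DATA_DIR:-$HOME/.local/share/agent-toolkit}}"
  # -i: ignore case, -H: prefix each match with its file name,
  # -F: treat the term as a literal string rather than a regex.
  # Errors (for example, no .log files yet) are silenced, and "no match"
  # is not treated as a failure.
  grep -iHF -- "$term" "$dir"/*.log 2>/dev/null || true
}
```

Calling `search_logs langchain` would then list every matching entry together with the log file it came from.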
Each data command (configure, benchmark, compare, etc.) works in two modes:
- Without arguments: displays the 20 most recent entries from that category
- With arguments: saves the input as a new timestamped entry and reports the total count
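The two modes can be pictured with a small Bash function. This is a minimal sketch under stated assumptions: `log_or_show` is a hypothetical name, the timestamp format and the wording of the confirmation message are guesses, and the real script may differ in all of these.

```shell
#!/usr/bin/env bash
# Sketch only: log_or_show is a hypothetical helper, not the toolkit's code.
set -euo pipefail

DATA_DIR="${DATA_DIR:-$HOME/.local/share/agent-toolkit}"

log_or_show() {
  local category="$1"
  shift
  local file="$DATA_DIR/$category.log"
  local count
  mkdir -p "$DATA_DIR"
  if [ "$#" -eq 0 ]; then
    # Without arguments: show the 20 most recent entries, if any exist.
    if [ -f "$file" ]; then
      tail -n 20 "$file"
    fi
  else
    # With arguments: append one timestamp|value entry, then report the total.
    # The timestamp format here is an assumption.
    printf '%s|%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$file"
    count=$(wc -l < "$file")
    echo "Saved. Total $category entries: $((count))"
  fi
}
```

With this sketch, `log_or_show benchmark "3.4s avg response time"` appends an entry, while `log_or_show benchmark` prints the latest ones.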
Data Storage
All data is stored in plain text files under the data directory:
- Category logs: `$DATA_DIR/<command>.log`, one file per command (e.g., `configure.log`, `benchmark.log`, `prompt.log`); each entry is `timestamp|value`
- History log: `$DATA_DIR/history.log`, an audit trail of every command executed, with timestamps
- Export files: `$DATA_DIR/export.<fmt>`, generated by the `export` command in json, csv, or txt format
Default data directory: ~/.local/share/agent-toolkit/
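Because each category entry is a single `timestamp|value` line, the logs can be read back with plain Bash. The sketch below assumes that layout exactly as described; `print_entries` is a hypothetical helper name, and the two-space output format is only for illustration.

```shell
# Sketch only: print_entries is a hypothetical helper for reading a category log.
print_entries() {
  local file="$1" ts value
  # Split on the first "|": ts receives the timestamp, and value receives
  # the rest of the line, including any further "|" characters.
  while IFS='|' read -r ts value; do
    printf '%s  %s\n' "$ts" "$value"
  done < "$file"
}
```

For example, `print_entries ~/.local/share/agent-toolkit/benchmark.log` would print each benchmark entry with its timestamp in the first column.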
Requirements
- Bash (with `set -euo pipefail` support)
- Standard Unix utilities: `grep`, `cat`, `date`, `echo`, `wc`, `du`, `head`, `tail`, `basename`
- No external dependencies or API keys required
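To confirm that a machine meets these requirements, a quick check with `command -v` is usually enough. The function below is a sketch; `check_requirements` is a hypothetical name and is not part of the toolkit.

```shell
# Sketch only: check_requirements is a hypothetical helper.
check_requirements() {
  local tools=("$@") missing=0 tool
  # Default to the utilities listed in the requirements.
  if [ "${#tools[@]}" -eq 0 ]; then
    tools=(grep cat date echo wc du head tail basename)
  fi
  for tool in "${tools[@]}"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  # Exit status 0 when everything was found, 1 otherwise.
  return "$missing"
}
```

Running `check_requirements` with no arguments checks the default list and reports any missing tool on stderr.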
When to Use
- Setting up agent workflows — When you need to configure and log settings for agent tool integrations, API connections, or pipeline configurations
- Benchmarking and comparing tools — When you're evaluating different AI tools or agent frameworks and want to log performance metrics for comparison
- Cost and usage optimization — When you need to track API costs, token usage, and resource consumption across different tools to optimize spending
- Fine-tuning and testing — When running fine-tuning experiments or test suites and you want to log parameters, results, and observations
- Cross-tool analysis and reporting — When you need to search across all logged data, generate reports, or export results for stakeholder review
Examples
# Check toolkit status
agent-toolkit status
# Configure a new tool integration
agent-toolkit configure "OpenAI API key rotated, new model endpoint: gpt-4o-2024-08"
# Benchmark a tool
agent-toolkit benchmark "LangChain ReAct agent: 94% task completion, 3.4s avg response time"
# Compare two tools
agent-toolkit compare "LangChain vs CrewAI: LangChain 20% faster setup, CrewAI better multi-agent coordination"
# Log a prompt template
agent-toolkit prompt "Tool-use system prompt v3: Added structured output format and error handling instructions"
# Track costs
agent-toolkit cost 'Weekly API spend: OpenAI $12.30, Anthropic $8.50, total $20.80'
# View recent benchmarks
agent-toolkit benchmark
# Search across all logs
agent-toolkit search "LangChain"
# Export all data as CSV
agent-toolkit export csv
# View summary statistics
agent-toolkit stats
# Show recent activity
agent-toolkit recent
Output
All commands write their output to stdout. Export files are written to the data directory:
agent-toolkit export json # → ~/.local/share/agent-toolkit/export.json
agent-toolkit export csv # → ~/.local/share/agent-toolkit/export.csv
agent-toolkit export txt # → ~/.local/share/agent-toolkit/export.txt
Every command execution is logged to $DATA_DIR/history.log for auditing purposes.
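The exact layout of the export files is not documented here. As an illustration of what a CSV export over `timestamp|value` logs could involve, the sketch below flattens every category log into one CSV stream. The function name `export_csv`, the `category,timestamp,value` column order, and the decision to skip `history.log` are all assumptions, so the real `export csv` output may look different.

```shell
# Sketch only: export_csv is hypothetical; the real export layout may differ.
export_csv() {
  local dir="$1" file category ts value
  local q='"'
  echo 'category,timestamp,value'
  for file in "$dir"/*.log; do
    # Skip the unexpanded glob when the directory has no .log files.
    if [ ! -f "$file" ]; then
      continue
    fi
    category="$(basename "$file" .log)"
    # Assumption: the audit trail is not part of the exported data.
    if [ "$category" = "history" ]; then
      continue
    fi
    while IFS='|' read -r ts value; do
      # Double any embedded quotes so the quoted field stays valid CSV.
      printf '%s,"%s","%s"\n' "$category" "$ts" "${value//$q/$q$q}"
    done < "$file"
  done
}
```

Quoting the timestamp and value fields keeps commas inside an entry, such as those in the cost example, from being read as column separators.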
Powered by BytesAgain | bytesagain.com | hello@bytesagain.com