llm
by BytesAgain
Build and evaluate LLM prompts. Use when crafting system prompts, comparing variants, estimating tokens, or managing prompt templates.
Installation
claude skill add --url github.com/openclaw/skills/tree/main/skills/bytesagain/llm
Documentation
llm
LLM Prompt Engineering Toolkit. Build structured prompts from role/context/task components, compare prompt variations side by side, estimate token counts, manage reusable prompt templates, chain multi-step prompts, and evaluate prompt quality with a scored breakdown. All commands run locally in bash with no API keys or network access required.
Commands
prompt — Build a Structured Prompt
Assembles a prompt from modular components: role, context, task, constraints, and output format. The --task flag is required; all others are optional.
Flags:
- `--role <text>` — Define the AI's persona (e.g., "senior developer")
- `--context <text>` — Provide background information
- `--task <text>` — (required) The main instruction
- `--constraints <text>` — Rules or limitations
- `--format <text>` — Desired output format
bash scripts/script.sh prompt --role "senior developer" --context "Python Flask app" --task "write unit tests"
bash scripts/script.sh prompt --task "summarize this article" --constraints "max 3 sentences" --json
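The toolkit's internals aren't shown here, but a prompt builder of this shape can be sketched in a few lines of plain bash. This is an illustrative sketch, not the toolkit's actual code; the section labels and function name are assumptions (constraints/format sections omitted for brevity):

```shell
#!/usr/bin/env bash
# Sketch: assemble a prompt from optional labeled sections.
# Section labels ("# Role" etc.) are illustrative, not the toolkit's real output.
build_prompt() {
  local role="$1" context="$2" task="$3"
  local out=""
  [[ -n "$role" ]]    && out+="# Role"$'\n'"$role"$'\n\n'
  [[ -n "$context" ]] && out+="# Context"$'\n'"$context"$'\n\n'
  out+="# Task"$'\n'"$task"
  printf '%s\n' "$out"
}

build_prompt "senior developer" "Python Flask app" "write unit tests"
```

Keeping each component in its own labeled block is what makes the pieces reusable and easy to diff between variants.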
compare — Compare Prompt Variations
Compare two or more prompt files side by side. Shows each variant with word/line/char/token stats, then a side-by-side diff (`diff --side-by-side`) of the first two variants, plus a summary table.
Flags:
- `--prompts <file1> <file2> [file3...]` — Two or more prompt text files to compare
bash scripts/script.sh compare --prompts prompt_a.txt prompt_b.txt
bash scripts/script.sh compare --prompts v1.txt v2.txt v3.txt
tokenize — Estimate Token Count
Estimate the token count for a given text using a cl100k_base-compatible heuristic. Reports characters, words, lines, and estimated tokens.
Input methods:
- `--input <text>` — Inline text string
- `--file <path>` — Read from a file
- Pipe via stdin
bash scripts/script.sh tokenize --input "Your prompt text here"
bash scripts/script.sh tokenize --file prompt.txt
echo "some text" | bash scripts/script.sh tokenize
bash scripts/script.sh tokenize --file prompt.txt --json
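A cl100k_base-compatible heuristic usually comes down to the common rule of thumb that English text averages roughly 4 characters per token. A minimal sketch of such an estimate (the function name and the ~4 chars/token assumption are illustrative; real tokenizers vary by language and content):

```shell
# Sketch: rough token estimate via the ~4-chars-per-token rule of thumb.
# Real cl100k_base tokenization differs, especially for code and non-English text.
estimate_tokens() {
  local chars
  chars=$(printf '%s' "$1" | wc -c)   # byte count of the input
  echo $(( (chars + 3) / 4 ))         # ceiling division by 4
}

estimate_tokens "Your prompt text here"   # prints 6 (21 chars)
```

This is good enough for ballpark budgeting; for exact counts you would need the actual tokenizer.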
template — Manage Prompt Templates
Save, list, load, and delete reusable prompt templates. Templates are stored as .txt files in ~/.llm-skill/templates/.
Actions:
- `--save <name> --file <path>` — Save a template from a file (or pipe via stdin)
- `--list` — List all saved templates with sizes
- `--load <name>` — Output the contents of a saved template
- `--delete <name>` — Remove a saved template
bash scripts/script.sh template --save my_template --file prompt.txt
bash scripts/script.sh template --list
bash scripts/script.sh template --list --json
bash scripts/script.sh template --load my_template
bash scripts/script.sh template --delete my_template
echo "Write a haiku about {{topic}}" | bash scripts/script.sh template --save haiku
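Since templates are just `.txt` files in a directory, the save/load mechanics can be sketched directly. The directory layout mirrors what the docs state; the function names and the demo path are illustrative (the real tool uses `~/.llm-skill/templates/`):

```shell
# Sketch: templates as plain files in a directory.
# Demo path used here to avoid touching the real ~/.llm-skill/templates/.
TPL_DIR="${TMPDIR:-/tmp}/llm-skill-demo/templates"
mkdir -p "$TPL_DIR"

save_template() { cat > "$TPL_DIR/$1.txt"; }   # reads stdin into <name>.txt
load_template() { cat "$TPL_DIR/$1.txt"; }

echo "Write a haiku about {{topic}}" | save_template haiku
load_template haiku
```

Plain files keep the store transparent: templates can be edited, versioned, or synced with ordinary tools.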
chain — Multi-Step Prompt Chains
Run a sequence of prompt steps where each step's output feeds into the next via the {{previous_output}} placeholder. Steps can be specified as individual files or loaded from a JSON config.
Flags:
- `--steps <file1> <file2> [...]` — Ordered list of step files
- `--from <config.json>` — Load steps from a JSON configuration file
bash scripts/script.sh chain --steps step1.txt step2.txt step3.txt
bash scripts/script.sh chain --from chain_config.json
bash scripts/script.sh chain --steps brainstorm.txt refine.txt format.txt --json
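The `{{previous_output}}` mechanic can be sketched as a simple loop that splices each step's result into the next step's text. This is an illustrative sketch, not the toolkit's implementation; in the real workflow each rendered step would be sent to an LLM and its response would become the next `{{previous_output}}`:

```shell
# Sketch: naive chain runner. Each step file may contain the literal
# placeholder {{previous_output}}, replaced with the prior step's text.
run_chain() {
  local prev="" step rendered
  for step in "$@"; do
    rendered=$(cat "$step")
    rendered=${rendered//'{{previous_output}}'/"$prev"}  # bash substitution
    prev="$rendered"   # real tool: prev would be the LLM's response to this step
    printf '%s\n' "$rendered"
  done
}
```

Usage: `run_chain step1.txt step2.txt` prints each rendered step in order.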
evaluate — Score Prompt Quality
Score a prompt on four dimensions (0–100 each): Clarity, Specificity, Structure, and Completeness. Returns an overall score (0–100) and letter grade (A–F) with actionable suggestions.
Scoring heuristics:
- Clarity — Penalizes vague words ("something", "stuff"), rewards action verbs ("write", "create", "analyze") and structural markers
- Specificity — Rewards concrete numbers, quoted examples, and sufficient length
- Structure — Rewards headers, bullet lists, numbered steps, and paragraph breaks
- Completeness — Checks for role definition, output format spec, constraints, and examples
bash scripts/script.sh evaluate --input "Explain quantum computing"
bash scripts/script.sh evaluate --file my_prompt.txt
bash scripts/script.sh evaluate --file my_prompt.txt --json
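Heuristics like these boil down to pattern counts over the prompt text. A toy sketch of the clarity dimension only; the word lists, weights, and baseline are made up for illustration and are not the toolkit's actual values:

```shell
# Sketch: toy clarity score. Penalize vague words, reward action verbs.
# Word lists and weights are illustrative, not the toolkit's real heuristics.
clarity_score() {
  local text="$1" vague verbs score
  vague=$(grep -oiwE 'something|stuff|things' <<<"$text" | wc -l)
  verbs=$(grep -oiwE 'write|create|analyze|summarize' <<<"$text" | wc -l)
  score=$(( 50 + verbs * 10 - vague * 15 ))   # arbitrary baseline and weights
  (( score < 0 ))   && score=0                # clamp to the 0-100 range
  (( score > 100 )) && score=100
  echo "$score"
}

clarity_score "Write something about stuff"   # prints 30
```

Swapping in word lists for specificity, structure markers, or role/format checks gives the other dimensions the same way.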
help — Show Help
bash scripts/script.sh help
Global Flags
- `--json` — Output in JSON format (supported by `prompt`, `tokenize`, `template --list`, `chain`, and `evaluate`)
Data Storage
- Templates: ~/.llm-skill/templates/*.txt
- No other persistent state. All commands are stateless except `template`, which manages saved files.
Requirements
- Bash 4+ (uses arrays, `[[ ]]`, process substitution)
- Standard Unix utilities: `wc`, `grep`, `diff`, `cat`, `basename`, `tr`, `sed`, `rm`, `mkdir`
- No external dependencies, API keys, or network access required
When to Use
- Crafting system prompts — Use `prompt` to build well-structured prompts from role/context/task components instead of writing them freehand.
- A/B testing prompt variants — Use `compare` to see side-by-side diffs and token counts for two or more prompt versions before committing to one.
- Estimating API costs — Use `tokenize` to get token estimates before sending prompts to paid LLM APIs, helping you stay within budget.
- Building reusable prompt libraries — Use `template` to save, organize, and reuse your best prompts across projects.
- Quality-checking prompts before use — Use `evaluate` to score your prompts on clarity, specificity, structure, and completeness, with actionable improvement suggestions.
Examples
# Build a structured prompt for code review
bash scripts/script.sh prompt \
--role "senior code reviewer" \
--context "React TypeScript project" \
--task "review this pull request for bugs and performance issues" \
--constraints "focus on security vulnerabilities" \
--format "numbered list of findings"
# Estimate tokens for a long prompt
bash scripts/script.sh tokenize --file system_prompt.txt
# Save a template and reuse it
echo "You are a {{role}}. Your task: {{task}}" | bash scripts/script.sh template --save generic
bash scripts/script.sh template --load generic
# Evaluate prompt quality
bash scripts/script.sh evaluate --input "You are an expert Python developer. Write a function that sorts a list of dictionaries by a given key. Include type hints, docstring, and 3 unit tests."
Powered by BytesAgain | bytesagain.com | hello@bytesagain.com