Langflow
by bytesagain
Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
Tags: llm-flow, python, agents, chatgpt, generative-ai
Installation
claude skill add --url github.com/openclaw/skills/tree/main/skills/bytesagain/llm-flow
Documentation
LLM Flow
An AI toolkit for configuring, benchmarking, comparing, prompting, evaluating, fine-tuning, analyzing, and optimizing LLM workflows. Each command logs timestamped entries to local files with full export, search, and statistics support.
Commands
Core AI Operations
| Command | Description |
|---|---|
| `llm-flow configure <input>` | Record a configuration change (or view recent configs with no args) |
| `llm-flow benchmark <input>` | Log a benchmark run and its results |
| `llm-flow compare <input>` | Record a model or output comparison |
| `llm-flow prompt <input>` | Log a prompt template or prompt engineering note |
| `llm-flow evaluate <input>` | Record an evaluation result or metric |
| `llm-flow fine-tune <input>` | Log a fine-tuning session or parameters |
| `llm-flow analyze <input>` | Record an analysis observation |
| `llm-flow cost <input>` | Log cost tracking data (tokens, dollars, etc.) |
| `llm-flow usage <input>` | Record API usage metrics |
| `llm-flow optimize <input>` | Log an optimization attempt and outcome |
| `llm-flow test <input>` | Record a test case or test result |
| `llm-flow report <input>` | Log a report entry or summary |
Utility Commands
| Command | Description |
|---|---|
| `llm-flow stats` | Show summary statistics across all log files |
| `llm-flow export <fmt>` | Export all data in json, csv, or txt format |
| `llm-flow search <term>` | Search all entries for a keyword (case-insensitive) |
| `llm-flow recent` | Show the 20 most recent activity log entries |
| `llm-flow status` | Health check: version, entry count, disk usage, last activity |
| `llm-flow help` | Display full command reference |
| `llm-flow version` | Print current version (v2.0.0) |
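Because the entries live in local log files, a search like the one `llm-flow search` performs can be approximated with `grep` alone. The sketch below is not the tool's implementation: the function name `search_logs` is hypothetical, and matching the term as a literal string (`-F`) rather than as a regular expression is an assumption.

```bash
# search_logs <dir> <term>: case-insensitive, literal-string search of every *.log file in <dir>.
search_logs() {
  local dir="$1" term="$2"
  # -i ignores case, -F treats the term literally, -H prefixes each match with its file name.
  # grep exits non-zero when nothing matches; "|| true" keeps that from aborting a "set -e" script.
  grep -i -F -H -- "$term" "$dir"/*.log || true
}
```

For example, `search_logs ~/.local/share/llm-flow claude` would list every matching line together with the log file it came from.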
How It Works
Every core command accepts free-text input. When called with arguments, LLM Flow:
- Timestamps the entry (`YYYY-MM-DD HH:MM`)
- Appends it to the command-specific log file (e.g. `benchmark.log`, `cost.log`)
- Records the action in a central `history.log`
- Reports the saved entry and running total
When called with no arguments, each command displays the 20 most recent entries from its log file.
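The behaviour described in this section can be sketched in Bash roughly as follows. This is an illustration under stated assumptions, not the tool's actual source: the helper names `log_entry` and `show_recent`, the layout of the `history.log` line, and the wording of the confirmation message are all hypothetical. Only the timestamp layout, the per-command log file, and the 20-entry limit come from the description above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumed default location; the real script may define DATA_DIR differently.
DATA_DIR="${DATA_DIR:-$HOME/.local/share/llm-flow}"

# log_entry <command> <text...>  (hypothetical helper)
log_entry() {
  local cmd="$1"; shift
  local value="$*"
  local ts total
  ts="$(date '+%Y-%m-%d %H:%M')"
  mkdir -p "$DATA_DIR"
  # Steps 1-2: timestamp the entry and append it to the command-specific log.
  printf '%s|%s\n' "$ts" "$value" >> "$DATA_DIR/$cmd.log"
  # Step 3: record the action in the central history log (line layout is assumed).
  printf '%s|%s: %s\n' "$ts" "$cmd" "$value" >> "$DATA_DIR/history.log"
  # Step 4: report the saved entry and the running total for this command.
  total=$(( $(wc -l < "$DATA_DIR/$cmd.log") ))
  printf 'Saved to %s.log (total: %d)\n' "$cmd" "$total"
}

# show_recent <command>: the no-argument case, printing the last 20 entries if the log exists.
show_recent() {
  local file="$DATA_DIR/$1.log"
  if [ -f "$file" ]; then
    tail -n 20 "$file"
  fi
}
```

Appending with `>>` keeps each log append-only, which is what makes the running total a simple line count.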
Data Storage
All data is stored locally in plain-text log files:
~/.local/share/llm-flow/
├── configure.log # Configuration changes
├── benchmark.log # Benchmark results
├── compare.log # Model comparisons
├── prompt.log # Prompt templates & notes
├── evaluate.log # Evaluation metrics
├── fine-tune.log # Fine-tuning sessions
├── analyze.log # Analysis observations
├── cost.log # Cost tracking
├── usage.log # API usage metrics
├── optimize.log # Optimization attempts
├── test.log # Test cases & results
├── report.log # Report entries
├── history.log # Central activity log
└── export.{json,csv,txt} # Exported snapshots
Each log uses pipe-delimited format: timestamp|value.
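Since each line is `timestamp|value`, a log can be read back with plain shell parameter expansion. A minimal sketch, assuming the value may itself contain `|` characters, so only the first pipe is treated as the delimiter; the function name `print_entries` is hypothetical.

```bash
# print_entries <logfile>: split each line at the first '|' and print "[timestamp] value".
print_entries() {
  local line ts value
  # "|| [ -n "$line" ]" also handles a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    ts="${line%%|*}"     # everything before the first pipe
    value="${line#*|}"   # everything after the first pipe
    printf '[%s] %s\n' "$ts" "$value"
  done < "$1"
}
```

For example, `print_entries ~/.local/share/llm-flow/benchmark.log` would print one bracketed timestamp and value per logged benchmark.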
Requirements
- Bash 4.0+ with `set -euo pipefail`
- Standard Unix utilities: `wc`, `du`, `grep`, `tail`, `date`, `sed`
- No external dependencies (pure Bash)
When to Use
- Building AI agent workflows — log each step of your agent pipeline (configure → prompt → evaluate → optimize) with full traceability
- Tracking LLM costs and usage — record per-request costs, token counts, and API usage to monitor spending across providers
- Benchmarking and comparing models — log benchmark metrics side-by-side to make data-driven model selection decisions
- Fine-tuning experiment tracking — capture hyperparameters, dataset details, and evaluation scores for every fine-tuning run
- Generating compliance reports — export all logged activity to JSON/CSV for audits, SOC reviews, or stakeholder reporting
Examples
# Configure a new workflow
llm-flow configure "workflow: summarize → classify → respond, model=claude-3.5"
# Benchmark a model
llm-flow benchmark "claude-3.5-sonnet: 94% accuracy, 0.8s p50 latency, $0.003/req"
# Log a prompt template
llm-flow prompt "system: You are a helpful assistant. Always cite sources."
# Track API costs
llm-flow cost "March week 3: 890k tokens in, 210k tokens out, $12.40 total"
# Evaluate output quality
llm-flow evaluate "human eval score: 4.2/5.0 across 50 samples"
# Search across all logs
llm-flow search "claude"
# Export to CSV for analysis
llm-flow export csv
# Quick health check
llm-flow status
Configuration
To change the storage location, set the `DATA_DIR` variable in the script or edit its default path. Default: `~/.local/share/llm-flow/`
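A common Bash idiom for a setting like this is a parameter-expansion default, which lets the environment override the path without further edits to the script. The sketch below shows that idiom only. Whether llm-flow already honours a `DATA_DIR` environment variable is an assumption, and `resolve_data_dir` is a hypothetical helper name.

```bash
# resolve_data_dir: print DATA_DIR if it is set and non-empty, otherwise the documented default.
resolve_data_dir() {
  printf '%s\n' "${DATA_DIR:-$HOME/.local/share/llm-flow}"
}
```

With a line such as `DATA_DIR="$(resolve_data_dir)"` near the top of the script, running `DATA_DIR=/tmp/llm-flow-test llm-flow status` would keep experimental entries separate from the main logs.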
Powered by BytesAgain | bytesagain.com | hello@bytesagain.com