io.github.smigolsmigol/llmkit

AI & Agents

by smigolsmigol

Tracks AI API costs across 11 LLM providers. Query spend by model, session, or time range.

README

<p align="center"> <img src=".github/logo-wordmark-animated.svg" width="280" alt="LLMKit" /> </p> <h3 align="center">Know what your AI agents cost.</h3> <p align="center"> <a href="https://github.com/smigolsmigol/llmkit/actions/workflows/ci.yml"><img src="https://github.com/smigolsmigol/llmkit/actions/workflows/ci.yml/badge.svg" alt="CI" /></a> <a href="https://scorecard.dev/viewer/?uri=github.com/smigolsmigol/llmkit"><img src="https://api.scorecard.dev/projects/github.com/smigolsmigol/llmkit/badge" alt="OpenSSF Scorecard" /></a> <a href="https://www.bestpractices.dev/projects/12288"><img src="https://www.bestpractices.dev/projects/12288/badge" alt="OpenSSF Best Practices" /></a> <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT License" /></a> <a href="https://pypi.org/project/llmkit-sdk/"><img src="https://img.shields.io/pypi/v/llmkit-sdk?label=PyPI&color=blue" alt="PyPI" /></a> <a href="https://www.npmjs.com/package/@f3d1/llmkit-sdk"><img src="https://img.shields.io/npm/v/%40f3d1/llmkit-sdk?label=npm&color=blue" alt="npm" /></a> <a href="https://github.com/smigolsmigol/llmkit/tree/main/packages/mcp-server"><img src="https://img.shields.io/badge/MCP-Registry-blue" alt="MCP" /></a> <a href="https://lobehub.com/mcp/smigolsmigol-llmkit"><img src="https://img.shields.io/badge/LobeHub-A_Grade-green" alt="LobeHub MCP" /></a> <a href="https://www.npmjs.com/package/@f3d1/llmkit-mcp-server"><img src="https://img.shields.io/npm/dw/@f3d1/llmkit-mcp-server?label=npm%20downloads" alt="npm downloads" /></a> <a href="https://pypi.org/project/llmkit-sdk/"><img src="https://img.shields.io/pypi/dm/llmkit-sdk?label=PyPI%20downloads" alt="PyPI downloads" /></a> </p> <p align="center"> Open-source API gateway for AI providers. Logs every request with token counts and dollar costs.<br> Budget limits reject requests before they reach the provider, not after. </p>
code
$ npx @f3d1/llmkit-cli -- python my_agent.py

  $0.0215 total  3 requests  4.2s  ~$18.43/hr

  claude-sonnet-4-20250514  1 req    $0.0156  ████████████████████
  gpt-4o                    2 reqs   $0.0059  ███████░░░░░░░░░░░░░

Works with Python, Ruby, Go, Rust - anything that calls the OpenAI or Anthropic API. One command, no code changes.
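
The hourly figure in the sample summary is consistent with a plain linear extrapolation of total cost over elapsed time. A minimal sketch of that arithmetic, assuming this is how the rate is derived (the CLI's actual formula is not shown here, and `hourly_burn_rate` is a hypothetical helper name):

```python
def hourly_burn_rate(total_cost_usd: float, elapsed_seconds: float) -> float:
    """Extrapolate the cost of a finished run to dollars per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_cost_usd / elapsed_seconds * 3600


# Figures from the sample summary: $0.0215 total over 4.2 seconds.
rate = hourly_burn_rate(0.0215, 4.2)
print(f"~${rate:.2f}/hr")  # prints "~$18.43/hr"
```

The per-model lines add up the same way: $0.0156 + $0.0059 = $0.0215.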

Get started

  1. Create an account at llmkit.sh (free while in beta)
  2. Create an API key in the Keys tab
  3. Pick a method below

CLI

Wrap any command. The CLI intercepts API calls, forwards them through the proxy, and prints a cost summary when the process exits.

bash
npx @f3d1/llmkit-cli -- python my_agent.py

Use -v for per-request costs as they happen, --json for machine-readable output.
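
One way a wrapper can route a child process through a proxy without code changes is to set the environment variables the provider SDKs already read. The sketch below is an assumption about the general technique, not a description of how llmkit-cli is implemented: `proxied_env` and `run_wrapped` are hypothetical names, and only the OpenAI SDK's `OPENAI_BASE_URL` and `OPENAI_API_KEY` variables are covered.

```python
import os
import subprocess
from typing import Dict, List, Optional

# Proxy URL as documented in the Python section of this README.
PROXY_BASE_URL = "https://api.llmkit.sh/v1"


def proxied_env(llmkit_key: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment that points the OpenAI SDK at the proxy."""
    env = dict(os.environ if base is None else base)
    env["OPENAI_BASE_URL"] = PROXY_BASE_URL  # read by the OpenAI SDK at client creation
    env["OPENAI_API_KEY"] = llmkit_key       # the proxy key stands in for the provider key
    return env


def run_wrapped(command: List[str], llmkit_key: str) -> int:
    """Run a command with the proxied environment and return its exit code."""
    return subprocess.run(command, env=proxied_env(llmkit_key)).returncode


# Example (not executed here):
# run_wrapped(["python", "my_agent.py"], "llmk_your_key_here")
```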

Python

bash
pip install llmkit-sdk

With the proxy (budget enforcement, logging, dashboard):

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.llmkit.sh/v1",
    api_key="llmk_your_key_here",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)

Without the proxy (local cost estimation, zero setup):

python
from llmkit import tracked
from openai import OpenAI

client = OpenAI(http_client=tracked())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
# costs estimated locally from bundled pricing table

tracked() wraps your HTTP client and estimates costs from token usage. No proxy needed. Works with any SDK that accepts http_client.
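
To make "estimates costs from token usage" concrete, here is a minimal sketch of the arithmetic. It does not use llmkit's internals: `PRICES_PER_MTOK`, `Cost`, and `estimate_cost` are hypothetical names, and the prices are illustrative values rather than the bundled pricing table.

```python
from dataclasses import dataclass

# Illustrative prices in USD per million tokens (assumed, not LLMKit's table).
PRICES_PER_MTOK = {
    "claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}


@dataclass
class Cost:
    input_cost: float
    output_cost: float

    @property
    def total_cost(self) -> float:
        return self.input_cost + self.output_cost


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Cost:
    """Multiply the token counts reported by the provider by per-token prices."""
    if model not in PRICES_PER_MTOK:
        raise KeyError(f"no price entry for model {model!r}")
    price = PRICES_PER_MTOK[model]
    return Cost(
        input_cost=input_tokens * price["input"] / 1_000_000,
        output_cost=output_tokens * price["output"] / 1_000_000,
    )


cost = estimate_cost("claude-sonnet-4-20250514", input_tokens=1000, output_tokens=1000)
print(f"${cost.total_cost:.4f}")  # prints "$0.0180"
```

The token counts would come from the `usage` field of the provider response. Because the table ships with the package, estimates are only as current as the installed version.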

Framework integrations (LangChain, LlamaIndex, Pydantic AI):

python
from llmkit.integrations.langchain import LLMKitCallbackHandler
handler = LLMKitCallbackHandler()
chain.invoke("...", config={"callbacks": [handler]})
print(f"${handler.total_cost:.4f}")

TypeScript

bash
npm install @f3d1/llmkit-sdk

typescript
import { LLMKit } from '@f3d1/llmkit-sdk'

const kit = new LLMKit({ apiKey: process.env.LLMKIT_KEY })
const agent = kit.session()

const res = await agent.chat({
  provider: 'anthropic',
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'summarize this document' }],
})

console.log(res.content)
console.log(res.cost)   // { inputCost: 0.003, outputCost: 0.015, totalCost: 0.018, currency: 'USD' }

Streaming, CostTracker, and a Vercel AI SDK provider are also available.

MCP Server

<a href="https://glama.ai/mcp/servers/smigolsmigol/llmkit-mcp-server"> <img width="380" height="200" src="https://glama.ai/mcp/servers/smigolsmigol/llmkit-mcp-server/badge" alt="llmkit-mcp-server MCP server" /> </a>

Query AI costs from Claude Code, Cline, or Cursor:

json
{
  "mcpServers": {
    "llmkit": {
      "command": "npx",
      "args": ["@f3d1/llmkit-mcp-server"],
      "env": { "LLMKIT_API_KEY": "llmk_your_key_here" }
    }
  }
}

11 tools: 6 proxy (need API key), 5 local (no key, auto-detect Claude Code + Cline + Cursor).

Proxy: llmkit_usage_stats, llmkit_cost_query, llmkit_budget_status, llmkit_session_summary, llmkit_list_keys, llmkit_health

Local: llmkit_local_session, llmkit_local_projects, llmkit_local_cache, llmkit_local_forecast, llmkit_local_agents

SessionEnd hook - auto-log session costs when Claude Code exits. Add to settings.json:

json
{
  "hooks": {
    "SessionEnd": [
      {
        "type": "command",
        "command": "npx @f3d1/llmkit-mcp-server --hook"
      }
    ]
  }
}

The hook parses the session transcript and prints a cost summary. No API key needed.

GitHub Action

Cap AI spend in CI. The action runs your command through the CLI, tracks cost, and fails the job if it exceeds the budget.

yaml
- uses: smigolsmigol/llmkit/.github/actions/llmkit-budget@main
  with:
    command: python agent.py
    budget-usd: '5.00'
    post-comment: 'true'

Posts a cost report as a PR comment. Outputs total-cost, total-requests, budget-exceeded, and summary-json for downstream steps.

Why LLMKit

Most cost tracking tools give you "soft limits" that agents blow past in the first hour. LLMKit runs cost estimation before every request. If it would exceed the budget, the request gets rejected before reaching the provider. Per-key or per-session scope.
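
The difference between a soft limit and a pre-request check can be sketched in a few lines. This is an illustration of the idea under stated assumptions, not the proxy's code: `BudgetGuard`, `reserve`, and `settle` are hypothetical names, and the estimate passed to `reserve` would come from a cost estimator run before the call.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push spend past the limit."""


class BudgetGuard:
    def __init__(self, limit_usd: float) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = 0.0     # settled cost of finished requests
        self.reserved_usd = 0.0  # estimated cost of requests still in flight

    def reserve(self, estimated_cost_usd: float) -> float:
        """Check the budget BEFORE the request is sent; reject if it would not fit."""
        projected = self.spent_usd + self.reserved_usd + estimated_cost_usd
        if projected > self.limit_usd:
            raise BudgetExceeded(
                f"projected ${projected:.4f} exceeds limit ${self.limit_usd:.4f}"
            )
        self.reserved_usd += estimated_cost_usd
        return estimated_cost_usd

    def settle(self, reserved_usd: float, actual_cost_usd: float) -> None:
        """Replace the reservation with the actual cost once the response arrives."""
        self.reserved_usd -= reserved_usd
        self.spent_usd += actual_cost_usd


guard = BudgetGuard(limit_usd=0.05)
hold = guard.reserve(0.03)  # fits: the request may go out
guard.settle(hold, 0.02)    # the actual cost came in lower than the estimate
try:
    guard.reserve(0.04)     # 0.02 spent + 0.04 estimated > 0.05: rejected up front
except BudgetExceeded as exc:
    print(f"rejected: {exc}")
```

Counting in-flight reservations matters for agents that issue requests in parallel; without them, several concurrent calls could each pass the check and together exceed the limit.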

Tag requests with a session ID or end-user ID to track costs per agent, per conversation, per user. The dashboard and MCP server surface this data in real time. Cost anomaly detection alerts when a single request costs 3x the recent median.
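
The 3x-median rule can be expressed directly. A minimal sketch, assuming a strict "greater than" comparison and a minimum sample count before alerting (both are assumptions; `is_cost_anomaly` and `min_samples` are hypothetical names, not part of LLMKit):

```python
from statistics import median
from typing import Sequence


def is_cost_anomaly(
    cost_usd: float,
    recent_costs_usd: Sequence[float],
    factor: float = 3.0,
    min_samples: int = 5,
) -> bool:
    """Flag a request whose cost exceeds `factor` times the recent median."""
    if len(recent_costs_usd) < min_samples:
        return False  # too little history to judge
    return cost_usd > factor * median(recent_costs_usd)


recent = [0.010, 0.012, 0.009, 0.011, 0.010]  # median is 0.010
print(is_cost_anomaly(0.05, recent))  # True: 0.05 > 3 * 0.010
print(is_cost_anomaly(0.02, recent))  # False
```

A median is less sensitive to a single expensive request than a mean, so one outlier does not raise the threshold for the next one.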

11 providers through one interface: Anthropic, OpenAI, Google Gemini, Groq, Together, Fireworks, DeepSeek, Mistral, xAI, Ollama, OpenRouter. Fallback chains with one header (x-llmkit-fallback: anthropic,openai,gemini).
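
A sketch of a request that carries the fallback header, built with the standard library so the headers are visible. The `/chat/completions` path and Bearer authorization mirror what the OpenAI client sends to the `base_url` shown earlier; `build_chat_request` is a hypothetical helper, and how the proxy interprets the order of the providers is not specified here. The request is constructed but not sent.

```python
import json
import urllib.request
from typing import Dict, List, Sequence

BASE_URL = "https://api.llmkit.sh/v1"


def build_chat_request(
    api_key: str,
    model: str,
    messages: List[Dict[str, str]],
    fallback: Sequence[str] = (),
) -> urllib.request.Request:
    """Build an OpenAI-style chat request with an optional fallback chain."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if fallback:
        # Comma-separated provider list, e.g. "anthropic,openai,gemini"
        headers["x-llmkit-fallback"] = ",".join(fallback)
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions", data=body, headers=headers, method="POST"
    )


req = build_chat_request(
    "llmk_your_key_here",
    "claude-sonnet-4-20250514",
    [{"role": "user", "content": "hello"}],
    fallback=["anthropic", "openai", "gemini"],
)
sent = {name.lower(): value for name, value in req.header_items()}
print(sent["x-llmkit-fallback"])  # prints "anthropic,openai,gemini"
```

With the OpenAI Python client, the same header can be set once through `default_headers=` on the client, or per call through `extra_headers=`.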

Runs on Cloudflare Workers at the edge. Cache-aware pricing across 7 providers with prompt caching. 730+ models priced across all providers.

Automatic prompt caching for Anthropic: the proxy injects cache breakpoints on system prompts and conversation history. Second request with the same system prompt costs 90% less. Zero config, zero code changes.
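
The 90% figure follows from how cached input tokens are billed. A worked example, assuming Anthropic's published multipliers for the default 5-minute cache (cache writes at 1.25x the base input price, cache reads at 0.1x) and an illustrative base price of $3 per million input tokens; all constant names are hypothetical:

```python
BASE_INPUT_USD_PER_MTOK = 3.00  # illustrative base input price (assumed)
CACHE_WRITE_MULTIPLIER = 1.25   # first request writes the prefix to the cache
CACHE_READ_MULTIPLIER = 0.10    # later requests read it back


def prompt_cost(tokens: int, multiplier: float = 1.0) -> float:
    """Cost in USD of `tokens` input tokens at the given price multiplier."""
    return tokens * BASE_INPUT_USD_PER_MTOK * multiplier / 1_000_000


system_tokens = 10_000
uncached = prompt_cost(system_tokens)                          # no caching at all
first = prompt_cost(system_tokens, CACHE_WRITE_MULTIPLIER)     # cache write
second = prompt_cost(system_tokens, CACHE_READ_MULTIPLIER)     # cache read

print(f"uncached ${uncached:.4f}, first ${first:.4f}, second ${second:.4f}")
print(f"saving on the cached prefix: {1 - second / uncached:.0%}")  # prints "... 90%"
```

The saving applies to the cached portion of the input. Output tokens and any uncached input are billed at the normal rate, and the first request pays a small premium for the cache write.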

Framework integrations: drop-in cost tracking for LangChain, LlamaIndex, and Pydantic AI via callback handlers. Works alongside the httpx transport for direct SDK use.

470+ tests, ClusterFuzzLite fuzzing, 6-stage security pipeline (gitleaks, semgrep, CodeQL, bandit, pip-audit, pnpm audit). OpenSSF Scorecard 8.3 - higher than React, Django, Kubernetes, and every AI gateway competitor.

Public API endpoints (no auth required):

Security

LLMKit handles your API keys. We take that seriously.

| Layer | What |
| --- | --- |
| Encryption | Provider keys: AES-256-GCM, random IV, context-bound AAD |
| Hashing | User API keys: SHA-256, never stored in plaintext |
| Runtime | Cloudflare Workers: no filesystem, no .env, nothing to exfiltrate |
| Supply chain | All CI actions pinned to commit SHAs, explicit least-privilege permissions |
| Provenance | npm packages published with Sigstore provenance via GitHub Actions OIDC |
| Pre-commit | 19 secret patterns + credential file blocking + gitleaks |
| CI pipeline | gitleaks, semgrep, pnpm audit, pip-audit, bandit, KeyGuard |
| AI exclusion | .cursorignore + .claudeignore block AI tools from reading secrets |

Full details in SECURITY.md.
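
The hashing row describes a common pattern: store only a digest of each user API key and compare digests on every request. A minimal sketch of that pattern using the standard library. It is not LLMKit's implementation; the function names are hypothetical, and the `llmk_` prefix is taken from the placeholder keys in this README.

```python
import hashlib
import hmac
import secrets


def generate_api_key(prefix: str = "llmk_") -> str:
    """Create a random key; it is shown to the user once and never stored."""
    return prefix + secrets.token_urlsafe(32)


def hash_api_key(key: str) -> str:
    """SHA-256 hex digest of the key; this is the only value persisted."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


key = generate_api_key()
stored = hash_api_key(key)
print(verify_api_key(key, stored))           # True
print(verify_api_key("llmk_wrong", stored))  # False
```

A plain SHA-256 digest is reasonable here because the keys are long random strings rather than user-chosen passwords, so there is no low-entropy input to brute-force.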

<details> <summary><strong>Packages</strong></summary>

| Package | Description |
| --- | --- |
| llmkit-sdk (PyPI) | Python SDK: tracked() transport, cost estimation, streaming, sessions |
| @f3d1/llmkit-sdk (npm) | TypeScript client, CostTracker, streaming |
| @f3d1/llmkit-cli | `npx @f3d1/llmkit-cli -- <cmd>`: zero-code cost tracking for any language |
| @f3d1/llmkit-proxy | Hono-based CF Workers proxy: auth, budgets, routing, logging |
| @f3d1/llmkit-ai-sdk-provider | Vercel AI SDK v6 custom provider |
| @f3d1/llmkit-mcp-server | 11 tools: proxy analytics, local costs (Claude Code + Cline + Cursor) |
| @f3d1/llmkit-shared | Types, pricing table (11 providers, 730+ models), cost calculation |

</details> <details> <summary><strong>Self-host</strong></summary>
bash
git clone https://github.com/smigolsmigol/llmkit
cd llmkit && pnpm install && pnpm build

cd packages/proxy
echo 'DEV_MODE=true' > .dev.vars
pnpm dev
# proxy running at http://localhost:8787

Deploy to Cloudflare Workers:

bash
npx wrangler login
npx wrangler secret put SUPABASE_URL
npx wrangler secret put SUPABASE_KEY
npx wrangler secret put ENCRYPTION_KEY
npx wrangler deploy
</details> <details> <summary><strong>Testing</strong></summary>

470+ tests across TypeScript and Python: cost calculation, budget enforcement, crypto, reservations, pricing accuracy, streaming, transport hooks, contract tests, and integration tests. CI runs on every push with a 6-stage security pipeline.

</details> <details> <summary><strong>Audit logging</strong></summary>

Per-request logging with timestamps, model attribution, cost tracking, per-end-user attribution (x-llmkit-user-id), tool invocation logging, CSV export with sha256 integrity hash. This data can support record-keeping requirements but does not constitute regulatory compliance.
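
A sketch of what a CSV export with an integrity hash can look like, and how a recipient would check it. The column names and helper functions are hypothetical, and the actual export format is not documented here; the only assumption carried over from the text is that the hash is a SHA-256 digest of the exported bytes.

```python
import csv
import hashlib
import hmac
import io
from typing import Dict, List, Sequence, Tuple


def export_csv_with_hash(
    rows: List[Dict[str, str]], fieldnames: Sequence[str]
) -> Tuple[bytes, str]:
    """Serialize rows to CSV bytes and return them with their SHA-256 digest."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    data = buffer.getvalue().encode("utf-8")
    return data, hashlib.sha256(data).hexdigest()


def verify_export(data: bytes, expected_sha256: str) -> bool:
    """Recompute the digest over the received bytes and compare."""
    return hmac.compare_digest(hashlib.sha256(data).hexdigest(), expected_sha256)


rows = [
    {"timestamp": "2025-01-01T00:00:00Z", "model": "gpt-4o", "user_id": "u1", "cost_usd": "0.0059"},
]
data, digest = export_csv_with_hash(rows, ["timestamp", "model", "user_id", "cost_usd"])
print(verify_export(data, digest))               # True
print(verify_export(data + b"tamper", digest))   # False
```

A bare hash detects accidental corruption or edits made after export; it does not prove who produced the file, since anyone who changes the data can also recompute the hash.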

</details> <details> <summary><strong>Listed on</strong></summary> </details> <p align="center"> <a href="https://github.com/smigolsmigol/llmkit">Star this repo</a> if you find it useful. </p>
