Model Integration
pydantic-ai-model-integration
by anderskev
Configure LLM providers, use fallback models, handle streaming, and manage model settings in PydanticAI. Use when selecting models, implementing resilience, or optimizing API calls.
Installation

```
claude skill add --url github.com/openclaw/skills/tree/main/skills/anderskev/pydantic-ai-model-integration
```

Documentation
PydanticAI Model Integration
Provider Model Strings
Format: `provider:model-name`

```python
from pydantic_ai import Agent

# OpenAI
Agent('openai:gpt-4o')
Agent('openai:gpt-4o-mini')
Agent('openai:o1-preview')

# Anthropic
Agent('anthropic:claude-sonnet-4-5')
Agent('anthropic:claude-haiku-4-5')

# Google (API key)
Agent('google-gla:gemini-2.0-flash')
Agent('google-gla:gemini-2.0-pro')

# Google (Vertex AI)
Agent('google-vertex:gemini-2.0-flash')

# Groq
Agent('groq:llama-3.3-70b-versatile')
Agent('groq:mixtral-8x7b-32768')

# Mistral
Agent('mistral:mistral-large-latest')

# Other providers
Agent('cohere:command-r-plus')
Agent('bedrock:anthropic.claude-3-sonnet')
```
Model Settings
```python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

agent = Agent(
    'openai:gpt-4o',
    model_settings=ModelSettings(
        temperature=0.7,
        max_tokens=1000,
        top_p=0.9,
        timeout=30.0,  # request timeout in seconds
    ),
)

# Override per run
result = await agent.run(
    'Generate creative text',
    model_settings=ModelSettings(temperature=1.0),
)
```
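Run-level settings are merged with the agent's defaults rather than replacing them, so in this example `max_tokens=1000` should still apply and only `temperature` changes (this reflects how PydanticAI merges `ModelSettings`; check the behavior on your installed version).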
Fallback Models
Chain models for resilience:
```python
from pydantic_ai import Agent
from pydantic_ai.models.fallback import FallbackModel

# Try models in order until one succeeds
fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    'google-gla:gemini-2.0-flash',
)
agent = Agent(fallback)
result = await agent.run('Hello')
```
```python
# Custom fallback conditions
from pydantic_ai.exceptions import ModelHTTPError

def should_fallback(error: Exception) -> bool:
    """Only fall back on rate limits or server errors."""
    if isinstance(error, ModelHTTPError):
        return error.status_code in (429, 500, 502, 503)
    return False

fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    fallback_on=should_fallback,
)
```
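If every model in the chain fails, the run raises a `FallbackExceptionGroup` collecting each model's error. A minimal sketch of catching it, reusing `fallback` from above:

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import FallbackExceptionGroup

agent = Agent(fallback)

try:
    result = await agent.run('Hello')
except FallbackExceptionGroup as group:
    # One sub-exception per model that was tried and failed
    for exc in group.exceptions:
        print(f'Attempt failed: {exc!r}')
```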
Streaming Responses
```python
async def stream_response():
    async with agent.run_stream('Tell me a story') as response:
        # Stream text output
        async for chunk in response.stream_output():
            print(chunk, end='', flush=True)

        # Access usage after streaming completes
        print(f"\nTokens used: {response.usage().total_tokens}")
```
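For plain-text runs there is also `stream_text()`, which can yield either the accumulated text so far or just each new delta; a short sketch reusing the `agent` above:

```python
async with agent.run_stream('Tell me a story') as response:
    # delta=True yields only the new text of each chunk
    async for delta in response.stream_text(delta=True):
        print(delta, end='', flush=True)
```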
Streaming with Structured Output
```python
from pydantic import BaseModel
from pydantic_ai import Agent

class Story(BaseModel):
    title: str
    content: str
    moral: str

agent = Agent('openai:gpt-4o', output_type=Story)

async with agent.run_stream('Write a fable') as response:
    # For structured output, stream_output yields partially-validated objects
    async for partial in response.stream_output():
        print(partial)  # partial Story object as it is parsed

    # Final validated result
    story = await response.get_output()
```
Dynamic Model Selection
```python
import os

from pydantic_ai import Agent

# Environment-based selection
model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-4o')
agent = Agent(model)

# Runtime model override
result = await agent.run(
    'Hello',
    model='anthropic:claude-sonnet-4-5',  # overrides the agent's default
)

# Context-manager override
with agent.override(model='google-gla:gemini-2.0-flash'):
    result = agent.run_sync('Hello')
```
Deferred Model Checking
Delay model validation for testing:
```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

# Default: validates the model name immediately (checks env vars)
agent = Agent('openai:gpt-4o')

# Deferred: validates only on first run
agent = Agent('openai:gpt-4o', defer_model_check=True)

# Useful for testing with override
with agent.override(model=TestModel()):
    result = agent.run_sync('Test')  # no OpenAI key needed
```
Usage Tracking
```python
result = await agent.run('Hello')

# Cumulative usage across every request in the run
usage = result.usage()
print(f"Input tokens: {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.total_tokens}")
print(f"Requests: {usage.requests}")
```
Usage Limits
```python
from pydantic_ai.usage import UsageLimits

# Cap token usage for a run
result = await agent.run(
    'Generate content',
    usage_limits=UsageLimits(
        total_tokens_limit=1000,
        input_tokens_limit=500,
        output_tokens_limit=500,
    ),
)
```
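When a limit is hit, the run raises `UsageLimitExceeded` from `pydantic_ai.exceptions`; a minimal sketch of catching it:

```python
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

try:
    result = await agent.run(
        'Generate content',
        usage_limits=UsageLimits(total_tokens_limit=1000),
    )
except UsageLimitExceeded as exc:
    print(f'Run aborted: {exc}')
```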
Provider-Specific Features
OpenAI
```python
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIModel(
    'gpt-4o',
    provider=OpenAIProvider(
        api_key='your-key',  # or use the OPENAI_API_KEY env var
        base_url='https://custom-endpoint.com',  # for Azure, proxies, gateways
    ),
)
```
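The same `base_url` hook works for any OpenAI-compatible server; a sketch assuming a local Ollama instance on its default port (the URL and model name here are illustrative):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

# Hypothetical local endpoint; adjust to your server
local_model = OpenAIModel(
    'llama3.2',
    provider=OpenAIProvider(base_url='http://localhost:11434/v1', api_key='unused'),
)
agent = Agent(local_model)
```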
Anthropic
```python
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

model = AnthropicModel(
    'claude-sonnet-4-5',
    provider=AnthropicProvider(api_key='your-key'),  # or ANTHROPIC_API_KEY
)
```
Common Model Patterns
| Use Case | Recommendation |
|---|---|
| General purpose | `openai:gpt-4o` or `anthropic:claude-sonnet-4-5` |
| Fast/cheap | `openai:gpt-4o-mini` or `anthropic:claude-haiku-4-5` |
| Long context | `anthropic:claude-sonnet-4-5` (200k) or `google-gla:gemini-2.0-flash` |
| Reasoning | `openai:o1-preview` |
| Cost-sensitive production | `FallbackModel` with a fast model first (see the sketch below) |
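A minimal sketch of the last row's recommendation: route requests to the cheap model and fall back to a stronger one only when it errors (e.g. rate limits or server errors):

```python
from pydantic_ai import Agent
from pydantic_ai.models.fallback import FallbackModel

# Cheap model first; the stronger model is used only as a fallback
cost_aware = FallbackModel(
    'openai:gpt-4o-mini',
    'openai:gpt-4o',
)
agent = Agent(cost_aware)
```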
Related Skills

Claude API
by anthropics
For projects integrating the Claude API, Anthropic SDK, or Agent SDK: detects the project language and provides matching examples and default configuration to get an LLM application running quickly.
✎ To wire Claude into an app or agent, claude-api gets you started fast, covers the Anthropic and Agent SDKs, and keeps the integration path clear.

Prompt Engineering Expert
by alirezarezvani
Covers prompt optimization, few-shot design, structured output, RAG evaluation, and agent workflow orchestration; suited to analyzing token costs, assessing LLM output quality, and building production-ready AI agent systems.
✎ Ties prompt optimization, LLM evaluation, RAG, and agent design into one coherent method; good for anyone who wants to systematically raise their AI development efficiency.

Agent Workflow Design
by alirezarezvani
For production-grade multi-agent orchestration: lays out five workflow designs (sequential, parallel, hierarchical, event-driven, consensus) and covers handoffs, state management, fault-tolerant retries, context budgeting, and cost optimization for building complex AI collaboration systems.
✎ Unifies multi-agent workflow design, orchestration, and automation so complex workflows land more reliably; suited to teams that want tight control.
Related MCP Servers

Sequential Thinking
Editor's pick · by Anthropic
Sequential Thinking is a reference server that lets an AI work through complex problems via a dynamic chain of thought.
✎ This server shows how to make Claude reason step by step the way a person would; useful for developers studying chain-of-thought implementations in MCP. Note it is only a reference example, not something to rely on in production.

Knowledge Graph Memory
Editor's pick · by Anthropic
Memory is a persistent memory system built on a local knowledge graph, letting an AI retain long-term context.
✎ Fills the "can't remember" gap for AI and agents: long-term context accumulates in a local knowledge graph, making multi-session conversations smarter while keeping the data under your control.

PraisonAI
Editor's pick · by mervinpraison
PraisonAI is a low-code AI agent framework supporting self-reflection and multiple LLMs.
✎ If you need to quickly stand up an AI agent team that runs 24/7 on complex tasks (such as automated research or code generation), PraisonAI's low-code design and multi-platform integrations (e.g. Telegram) make it very fast to adopt. As an unofficial project, though, its ecosystem is less mature than mainstream frameworks like LangChain; best for developers willing to experiment.