io.github.theoddden/terradev

Coding & Debugging

by theoddden

An MCP server providing complete GPU infrastructure for Claude Code, with 192 tools for provisioning, training, and inference.


README

Terradev MCP Server v2.0.5

Complete Agentic GPU Infrastructure for Claude Code — 192 MCP tools: GPU provisioning, vLLM/SGLang/Ollama inference, Arize Phoenix observability, NeMo Guardrails safety, Qdrant vector DB, Ray cluster management, Datadog monitoring, and Terraform-powered parallel provisioning across 20 cloud providers.

<p align="center"> <img src="https://raw.githubusercontent.com/theoddden/terradev-mcp/main/demo/terradev-mcp-demo.gif" alt="Terradev MCP Demo" width="800"> </p>

What's New in v2.0.5

  • 192 MCP Tools: Massively expanded from 69 → 192 tools
  • Arize Phoenix: LLM trace observability — projects, spans, traces, OTEL env, K8s deployment (7 tools)
  • NeMo Guardrails: Output safety — test, chat, config generation, K8s deployment (4 tools)
  • Qdrant Vector DB: RAG infrastructure — collections, create, info, count, K8s deployment (6 tools)
  • Datadog Monitoring: Metrics, monitors, dashboards, Terraform export (10 tools)
  • HuggingFace Hub: Models, datasets, endpoints, smart templates, hardware recommendations (11 tools)
  • LangChain/LangGraph: Workflow creation, orchestrator-worker, evaluation (9 tools)
  • 20 Cloud Providers: Alibaba Cloud, OVHcloud, FluidStack, Hetzner, SiliconFlow + 15 more
  • vLLM Cost Optimizations: LMCache, KV Cache Offloading, MTP Speculative Decoding, Sleep Mode, Multi-LoRA
  • Data Governance: Consent management, OPA policy evaluation, compliance reports (6 tools)
  • Cost Intelligence: Deep analysis, simulation, budget optimization (4 tools)

Previous Releases

  • Claude.ai Connector: OAuth 2.0 PKCE flow for remote access
  • MoE Cluster Templates: Production-ready infrastructure for Mixture-of-Experts models
  • NVLink Topology Enforcement: Automatic single-node TP with NUMA-aligned GPU placement
  • Terraform Core Engine: All GPU provisioning uses Terraform for optimal parallel efficiency

Architecture

Terraform is the fundamental engine, not just a feature. This provides:

  • True Parallel Provisioning across multiple providers simultaneously
  • State Management for infrastructure tracking
  • Infrastructure as Code with reproducible deployments
  • Cost Optimization through provider arbitrage
  • Bug-Free Operation with all known issues resolved

Installation

Prerequisites

  1. Install the Terradev CLI (v3.7.0+):

     ```bash
     pip install terradev-cli
     # For all providers + HF Spaces:
     pip install "terradev-cli[all]"
     ```

  2. Set up the minimum credentials (RunPod only):

     ```bash
     export TERRADEV_RUNPOD_KEY=your_runpod_api_key
     ```

  3. Install the MCP server:

     ```bash
     npm install -g terradev-mcp
     ```

Claude Code Setup (Local — stdio)

Add to your Claude Code MCP configuration:

```json
{
  "mcpServers": {
    "terradev": {
      "command": "terradev-mcp"
    }
  }
}
```
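If your Claude Code build ships the `claude mcp` subcommand, the same stdio registration can also be done from the terminal. This is a sketch of the standard Claude Code CLI flow, not something this README documents:

```shell
# Register the stdio server with Claude Code
claude mcp add terradev -- terradev-mcp

# Verify the registration
claude mcp list
```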

Claude.ai Connector Setup (Remote — SSE)

Use Terradev from Claude.ai on any device — no local install required.

  1. Go to Claude.ai → Settings → Connectors
  2. Add a new connector with URL:

     ```
     https://terradev-mcp.terradev.cloud/sse
     ```
  3. Enter the Bearer token provided by your admin.

That's it — GPU provisioning tools are now available in every Claude.ai conversation.

Self-Hosting the Remote Server

To host your own instance:

```bash
# Set required env vars
export TERRADEV_MCP_BEARER_TOKEN=your-secret-token
export TERRADEV_RUNPOD_KEY=your-runpod-key

# Option 1: Run directly
pip install -r requirements.txt
python3 terradev_mcp.py --transport sse --port 8080

# Option 2: Docker
docker-compose up -d
```

The server exposes:

  • GET /sse — SSE stream endpoint (Claude.ai connects here)
  • POST /messages — MCP message endpoint
  • GET /health — Health check (unauthenticated)

See nginx-mcp.conf for reverse proxy configuration with SSL.
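For production, the bearer token should be a random secret rather than a hand-typed string. A minimal sketch using standard tools; the port matches the `--port 8080` run command above, and the `/health` route comes from the endpoint list:

```shell
# Generate a 64-character hex token (32 random bytes)
TERRADEV_MCP_BEARER_TOKEN=$(openssl rand -hex 32)
export TERRADEV_MCP_BEARER_TOKEN
echo "token length: ${#TERRADEV_MCP_BEARER_TOKEN}"  # token length: 64

# Smoke-test the running server via the unauthenticated health endpoint
curl -sf http://localhost:8080/health
```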

Available MCP Tools

The Terradev MCP server provides 192 tools for complete GPU cloud management:

GPU Operations

  • local_scan - Discover local GPU devices and total VRAM pool (NEW in v1.2.2)
  • quote_gpu - Get real-time GPU prices across all cloud providers
  • provision_gpu - Terraform-powered GPU provisioning with parallel efficiency

Terraform Infrastructure Management

  • terraform_plan - Generate Terraform execution plans
  • terraform_apply - Apply Terraform configurations
  • terraform_destroy - Destroy Terraform-managed infrastructure

Kubernetes Management

  • k8s_create - Create Kubernetes clusters with GPU nodes
  • k8s_list - List all Kubernetes clusters
  • k8s_info - Get detailed cluster information
  • k8s_destroy - Destroy Kubernetes clusters

Inference & Model Deployment

  • inferx_deploy - Deploy models to InferX serverless platform
  • inferx_status - Check inference endpoint status
  • inferx_list - List deployed inference models
  • inferx_optimize - Get cost analysis for inference endpoints
  • hf_space_deploy - Deploy models to HuggingFace Spaces

MoE Expert Parallelism (NEW in v1.5)

  • deploy_wide_ep - Deploy MoE model with Wide-EP across multiple GPUs via Ray Serve LLM
  • deploy_pd - Deploy disaggregated Prefill/Decode serving with NIXL KV transfer
  • ep_group_status - Health check EP groups (all ranks must be healthy for all-to-all)
  • sglang_start - Start SGLang server with EP/EPLB/DBO flags via SSH/systemd
  • sglang_stop - Stop SGLang server on remote instance

Instance & Cost Management

  • status - View all instances and costs
  • manage_instance - Stop/start/terminate GPU instances
  • analytics - Get cost analytics and spending trends
  • optimize - Find cheaper alternatives for running instances

Provider Configuration

  • setup_provider - Get setup instructions for any cloud provider
  • configure_provider - Configure provider credentials locally

Arize Phoenix — LLM Trace Observability

  • phoenix_test - Test connection to Phoenix server
  • phoenix_projects - List Phoenix projects
  • phoenix_spans - List spans with SpanQuery DSL filters
  • phoenix_trace - View full execution tree for a trace
  • phoenix_otel_env - Generate OTEL env vars for serving pods
  • phoenix_snippet - Generate Python instrumentation snippet
  • phoenix_k8s - Generate K8s deployment manifest

NeMo Guardrails — Output Safety

  • guardrails_test - Test connection to Guardrails server
  • guardrails_chat - Send message through safety rails
  • guardrails_generate_config - Generate Colang 2.x config
  • guardrails_k8s - Generate K8s deployment manifest

Qdrant — Vector Database for RAG

  • qdrant_test - Test connection to Qdrant
  • qdrant_collections - List vector collections
  • qdrant_create_collection - Create collection (auto-configures from embedding model)
  • qdrant_info - Get collection stats
  • qdrant_count - Count vectors in collection
  • qdrant_k8s - Generate K8s StatefulSet manifest
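The Qdrant tools above map onto Qdrant's standard REST API. For orientation, the raw equivalents of `qdrant_create_collection`, `qdrant_info`, and `qdrant_count` look roughly like this; the collection name, local port 6333, and the 384-dim vector size (matching an embedding model such as `sentence-transformers/all-MiniLM-L6-v2`) are illustrative assumptions:

```shell
# Create a collection for 384-dim embeddings
curl -s -X PUT http://localhost:6333/collections/docs \
  -H 'Content-Type: application/json' \
  -d '{"vectors": {"size": 384, "distance": "Cosine"}}'

# Collection stats (what qdrant_info reports)
curl -s http://localhost:6333/collections/docs

# Exact vector count (what qdrant_count reports)
curl -s -X POST http://localhost:6333/collections/docs/points/count \
  -H 'Content-Type: application/json' \
  -d '{"exact": true}'
```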

Complete Command Reference

Local GPU Discovery (NEW!)

```bash
# Scan for local GPUs
terradev local scan

# Example output:
# ✅ Found 2 local GPU(s)
# 📊 Total VRAM Pool: 48 GB
#
# Devices:
# • NVIDIA GeForce RTX 4090
#   - Type: CUDA
#   - VRAM: 24 GB
#   - Compute: 8.9
#
# • Apple Metal
#   - Type: MPS
#   - VRAM: 24 GB
#   - Platform: arm64
```

Hybrid Use Case: Mac Mini (24GB) + Gaming PC with RTX 4090 (24GB) = 48GB local pool for Qwen2.5-72B!
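On an all-NVIDIA machine, the VRAM pool figure that `local scan` reports can be approximated with `nvidia-smi` alone; this one-liner is an illustrative sketch, not part of the Terradev CLI:

```shell
# nvidia-smi prints one memory.total value (in MiB) per GPU; sum and convert to GB
nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits |
  awk '{ total += $1 } END { printf "Total VRAM Pool: %d GB\n", total / 1024 }'
```

For two 24 GB cards (24576 MiB each) this prints `Total VRAM Pool: 48 GB`, matching the example output above.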

GPU Price Quotes

```bash
# Get prices for a specific GPU type
terradev quote -g H100

# Filter by specific providers
terradev quote -g A100 -p runpod,vastai,lambda

# Quick-provision the cheapest option
terradev quote -g H100 --quick
```

GPU Provisioning (Terraform-Powered)

```bash
# Provision a single GPU via Terraform
terradev provision -g A100

# Provision multiple GPUs in parallel across providers
terradev provision -g H100 -n 4 --providers runpod,vastai,lambda,aws

# Plan without applying
terradev provision -g A100 -n 2 --plan-only

# Set a maximum price ceiling
terradev provision -g A100 --max-price 2.50

# Terraform state is managed automatically
```

Terraform Infrastructure Management

```bash
# Generate an execution plan
terraform -chdir=./my-gpu-infrastructure plan

# Apply the infrastructure
terraform -chdir=./my-gpu-infrastructure apply -auto-approve

# Destroy the infrastructure
terraform -chdir=./my-gpu-infrastructure destroy -auto-approve
```

Kubernetes Clusters

```bash
# Create a multi-cloud K8s cluster
terradev k8s create my-cluster --gpu H100 --count 4 --multi-cloud --prefer-spot

# List all clusters
terradev k8s list

# Get cluster details
terradev k8s info my-cluster

# Destroy a cluster
terradev k8s destroy my-cluster
```

Inference Deployment

```bash
# Deploy a model to InferX
terradev inferx deploy --model meta-llama/Llama-2-7b-hf --gpu-type a10g

# Check endpoint status
terradev inferx status

# List deployed models
terradev inferx list

# Get cost analysis
terradev inferx optimize
```

HuggingFace Spaces

```bash
# Deploy the LLM template
terradev hf-space my-llama --model-id meta-llama/Llama-2-7b-hf --template llm

# Deploy with custom hardware
terradev hf-space my-model --model-id microsoft/DialoGPT-medium --hardware a10g-large --sdk gradio

# Deploy an embedding model
terradev hf-space my-embeddings --model-id sentence-transformers/all-MiniLM-L6-v2 --template embedding
```

Instance Management

```bash
# View all running instances and costs
terradev status --live

# Stop an instance
terradev manage -i <instance-id> -a stop

# Start an instance
terradev manage -i <instance-id> -a start

# Terminate an instance
terradev manage -i <instance-id> -a terminate
```

Analytics & Optimization

```bash
# Get 30-day cost analytics
terradev analytics --days 30

# Find cheaper alternatives
terradev optimize
```

Provider Setup

```bash
# Get quick setup instructions
terradev setup runpod --quick
terradev setup aws --quick
terradev setup vastai --quick

# Configure credentials (stored locally)
terradev configure --provider runpod
terradev configure --provider aws
terradev configure --provider vastai
```

Supported GPU Types

  • H100 - NVIDIA H100 80GB (premium training)
  • A100 - NVIDIA A100 80GB (training/inference)
  • A10G - NVIDIA A10G 24GB (inference)
  • L40S - NVIDIA L40S 48GB (rendering/inference)
  • L4 - NVIDIA L4 24GB (inference)
  • T4 - NVIDIA T4 16GB (light inference)
  • RTX4090 - NVIDIA RTX 4090 24GB (consumer)
  • RTX3090 - NVIDIA RTX 3090 24GB (consumer)
  • V100 - NVIDIA V100 32GB (legacy)

Bug Fixes Applied

This release includes fixes for all known production issues:

| Bug | Fix | Impact |
| --- | --- | --- |
| Wrong import path (`terradev_cli.providers`) | Changed to `providers.provider_factory` | ✅ API calls now work |
| `list` builtin shadowed by Click command | Used `type([])` instead of `isinstance(r, list)` | ✅ No more crashes |
| `aiohttp.ClientSession(trust_env=False)` | Set `trust_env=True` for proxy support | ✅ Proxy environments work |
| `boto3` not in dependencies | Added `boto3>=1.26.0` to requirements | ✅ AWS provider functional |
| Vast.ai GPU name filter exact match | Switched to client-side filtering with `in` | ✅ Vast.ai provider works |

All of the above bugs are resolved as of v1.2.0.

Terraform Integration

The MCP now includes a terraform.tf template for custom infrastructure:

```hcl
terraform {
  required_providers {
    terradev = {
      source  = "theoddden/terradev"
      version = "~> 3.0"
    }
  }
}

resource "terradev_instance" "gpu" {
  gpu_type = var.gpu_type
  spot     = true
  count    = var.gpu_count

  tags = {
    Name        = "terradev-mcp-gpu"
    Provisioned = "terraform"
    GPU_Type    = var.gpu_type
  }
}
```

MoE Serving Architecture (v1.5)

Terradev v1.5 integrates the full MoE serving stack:

| Component | What it does | Terradev integration |
| --- | --- | --- |
| Ray Serve LLM | Orchestrates Wide-EP and P/D deployments | `build_dp_deployment`, `build_pd_openai_app` |
| Expert Parallelism | Distributes experts across GPUs | EP/DP flags in task.yaml, K8s, Helm, Terraform |
| EPLB | Rebalances experts at runtime | `--enable-eplb` in vLLM/SGLang serving |
| Dual-Batch Overlap | Overlaps compute with all-to-all | `--enable-dbo` flag |
| DeepEP kernels | Optimized all-to-all for MoE | `VLLM_ALL2ALL_BACKEND=deepep_low_latency` |
| DeepGEMM | FP8 GEMM for MoE experts | `VLLM_USE_DEEP_GEMM=1` |
| NIXL | Zero-copy KV cache transfer | `NixlConnector` in P/D tracker |
| EP Group Router | Routes to the rank hosting the target experts | Expert range tracking per endpoint |
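Put together, a vLLM launch combining the flags and environment variables from the table might look like the following sketch. The model name and parallel sizes are placeholders, and availability of `--enable-eplb`/`--enable-dbo` depends on your vLLM version:

```shell
# MoE serving environment (from the table above)
export VLLM_ALL2ALL_BACKEND=deepep_low_latency  # DeepEP all-to-all kernels
export VLLM_USE_DEEP_GEMM=1                     # FP8 GEMM for MoE experts

# Wide-EP style launch: data parallel across GPUs with expert parallelism
vllm serve deepseek-ai/DeepSeek-V3 \
  --data-parallel-size 8 \
  --enable-expert-parallel \
  --enable-eplb \
  --enable-dbo
```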

Supported Cloud Providers

RunPod, Vast.ai, AWS, GCP, Azure, Lambda Labs, CoreWeave, TensorDock, Oracle Cloud, Crusoe Cloud, DigitalOcean, HyperStack, Alibaba Cloud, OVHcloud, FluidStack, Hetzner, SiliconFlow, Baseten, HuggingFace, Paperspace

Environment Variables

Minimum setup:

  • TERRADEV_RUNPOD_KEY: RunPod API key

Remote SSE mode:

  • TERRADEV_MCP_BEARER_TOKEN: Bearer token for authenticating Claude.ai Connector requests (required in production)

Full multi-cloud setup:

  • TERRADEV_AWS_ACCESS_KEY_ID, TERRADEV_AWS_SECRET_ACCESS_KEY, TERRADEV_AWS_DEFAULT_REGION
  • TERRADEV_GCP_PROJECT_ID, TERRADEV_GCP_CREDENTIALS_PATH
  • TERRADEV_AZURE_SUBSCRIPTION_ID, TERRADEV_AZURE_CLIENT_ID, TERRADEV_AZURE_CLIENT_SECRET, TERRADEV_AZURE_TENANT_ID
  • Additional provider keys (VastAI, Oracle, Lambda, CoreWeave, Crusoe, TensorDock)
  • HF_TOKEN: For HuggingFace Spaces deployment
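A multi-cloud shell profile built from the variable names above might look like this; every value is a placeholder:

```shell
# ~/.terradev.env — source this before running terradev (placeholder values)
export TERRADEV_RUNPOD_KEY=rp_xxxxxxxxxxxx
export TERRADEV_AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXX
export TERRADEV_AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxx
export TERRADEV_AWS_DEFAULT_REGION=us-east-1
export TERRADEV_GCP_PROJECT_ID=my-gpu-project
export TERRADEV_GCP_CREDENTIALS_PATH="$HOME/.config/gcloud/service-account.json"
export HF_TOKEN=hf_xxxxxxxxxxxx
```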

Pricing Tiers

| Tier | Price | Instances | Seats |
| --- | --- | --- | --- |
| Research (Free) | $0 | 1 | 1 |
| Research+ | $49.99/mo | 8 | 1 |
| Enterprise | $299.99/mo | 32 | 5 |
| Enterprise+ | $0.09/GPU-hr (32 GPU min) | Unlimited | Unlimited |

Enterprise+: Metered billing at $0.09 per GPU-hour with a minimum of 32 GPUs. Unlimited provisions, servers, seats, dedicated support, fleet management, and GPU-hour metering. Run terradev upgrade -t enterprise_plus.
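As a quick sanity check on the metered tier, the daily floor at the 32-GPU minimum works out like this:

```shell
# Enterprise+ daily floor: 32 GPUs x $0.09/GPU-hr x 24 hr
awk -v rate=0.09 -v gpus=32 -v hours=24 \
  'BEGIN { printf "$%.2f/day\n", rate * gpus * hours }'
# prints $69.12/day
```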

Security

BYOAPI: All API keys stay on your machine. Terradev never proxies credentials through third parties.
