CF Troubleshooting
cloudflare-troubleshooting
by daymade
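When you hit Cloudflare ERR_TOO_MANY_REDIRECTS, SSL, or DNS problems, call the Cloudflare API directly to check the real configuration and systematically locate the root cause of redirect, certificate, and resolution issues.
Checking the real Cloudflare configuration through the API and systematically pinpointing redirect, SSL, and DNS anomalies is faster and more accurate than guessing from experience.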
Installation
claude skill add --url github.com/daymade/claude-code-skills/tree/main/cloudflare-troubleshooting
Documentation
Cloudflare Troubleshooting
Core Principle
Investigate with evidence, not assumptions. Always query the Cloudflare API to examine the actual configuration before diagnosing an issue. The skill's value is the systematic investigation methodology, not predetermined solutions.
Investigation Methodology
1. Gather Credentials
Request from user:
- Domain name
- Cloudflare account email
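- Cloudflare Global API Key (or an API Token, which is sent as an "Authorization: Bearer <token>" header instead of the X-Auth-Email / X-Auth-Key pair)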
Global API Key location: Cloudflare Dashboard → My Profile → API Tokens → View Global API Key
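The two credential types map to different request headers. The following is a minimal sketch of that choice, assuming a hypothetical helper named build_auth_headers (it is not part of the skill or of any Cloudflare library):

```python
from typing import Dict, Optional


def build_auth_headers(
    email: Optional[str] = None,
    api_key: Optional[str] = None,
    api_token: Optional[str] = None,
) -> Dict[str, str]:
    """Return request headers for either credential type (hypothetical helper)."""
    if api_token:
        # Scoped API Tokens are sent as a bearer token; no email is needed.
        return {"Authorization": f"Bearer {api_token}"}
    if email and api_key:
        # The Global API Key is always paired with the account email.
        return {"X-Auth-Email": email, "X-Auth-Key": api_key}
    raise ValueError("Provide either api_token, or both email and api_key")
```

Preferring the token when both are supplied matches the Security recommendation to use scoped API Tokens over the Global API Key.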
2. Get Zone Information
First step for any Cloudflare troubleshooting - obtain the zone ID:
curl -s -X GET "https://api.cloudflare.com/client/v4/zones?name=<domain>" \
-H "X-Auth-Email: <email>" \
-H "X-Auth-Key: <api_key>" | jq '.'
Extract zone_id from result[0].id for subsequent API calls.
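When the response is parsed in Python rather than with jq, the same extraction might look like the sketch below. The function name extract_zone_id and the sample zone ID are illustrative assumptions; only the result[0].id path comes from the step above.

```python
from typing import Any, Dict


def extract_zone_id(zones_response: Dict[str, Any]) -> str:
    """Return result[0].id from a parsed zones response (hypothetical helper)."""
    results = zones_response.get("result") or []
    if not results:
        # An empty result usually means the domain is not in this account.
        raise LookupError("No zone matched the domain; check the name and the account")
    return results[0]["id"]


# Made-up sample shaped like the zones response described above.
sample = {"success": True, "result": [{"id": "zone-id-placeholder", "name": "example.com"}]}
zone_id = extract_zone_id(sample)
```

Raising on an empty result keeps a wrong domain or wrong account from silently producing requests against a missing zone.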
3. Investigate Systematically
For each issue, gather evidence before making conclusions. Use Cloudflare API to inspect:
- Current configuration state
- Recent changes (if audit log available)
- Related settings that might interact
Common Investigation Patterns
Redirect Loops (ERR_TOO_MANY_REDIRECTS)
Evidence gathering sequence:
- Check SSL/TLS mode:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/ssl" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
Look for: result.value, which reports the current SSL mode.
- Check Always Use HTTPS setting:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/always_use_https" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
- Check Page Rules for redirects:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/pagerules" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
Look for: forwarding_url or always_use_https actions.
- Test origin server directly (if possible), using --resolve so that SNI and certificate checks still use the real hostname:
curl -I --resolve "<domain>:443:<origin_ip>" "https://<domain>"
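Scanning the Page Rules response for redirect-style actions can be sketched as follows. This assumes each rule in result carries a status field and an actions list whose entries have an id, which is how the Page Rules API is understood to respond; find_redirect_rules is a hypothetical name.

```python
from typing import Any, Dict, List

# Action ids named in the evidence-gathering step above.
REDIRECT_ACTIONS = {"forwarding_url", "always_use_https"}


def find_redirect_rules(pagerules_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return active Page Rules that contain a redirect-style action."""
    matches = []
    for rule in pagerules_response.get("result") or []:
        if rule.get("status") != "active":
            continue  # Disabled rules cannot contribute to a loop.
        action_ids = {action.get("id") for action in rule.get("actions", [])}
        hits = sorted(action_ids & REDIRECT_ACTIONS)
        if hits:
            matches.append({"rule_id": rule.get("id"), "actions": hits})
    return matches
```

More than one entry in the returned list is a prompt to compare the rules for conflicts before changing anything.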
Diagnosis logic:
- SSL mode "flexible" + origin enforces HTTPS = redirect loop
- Multiple redirect rules can conflict
- Check browser vs curl behavior differences
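The diagnosis logic above can be written down as a small decision function. This is a sketch under stated assumptions: the function name and its three inputs are hypothetical, and the inputs are expected to come from the evidence gathered earlier (the ssl setting, a direct origin test, and the Page Rules listing).

```python
from typing import List


def diagnose_redirect_loop(
    ssl_mode: str,
    origin_redirects_http_to_https: bool,
    redirect_rule_count: int,
) -> List[str]:
    """Turn gathered evidence into findings, following the diagnosis logic above."""
    findings = []
    if ssl_mode == "flexible" and origin_redirects_http_to_https:
        # Cloudflare talks HTTP to the origin, the origin redirects to HTTPS, repeat.
        findings.append("Flexible SSL plus an origin that forces HTTPS: likely redirect loop")
    if redirect_rule_count > 1:
        findings.append("Multiple redirect rules are active: check them for conflicts")
    if not findings:
        findings.append("No known loop pattern found: compare browser and curl behavior")
    return findings
```

The function only reports what the evidence supports; it does not apply a fix, which keeps the investigation read-only.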
Fix:
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/ssl" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key" \
-H "Content-Type: application/json" \
--data '{"value":"full"}'
Purge cache after fix:
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key" \
-H "Content-Type: application/json" \
-d '{"purge_everything":true}'
DNS Issues
Evidence gathering:
- List DNS records:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
- Check external DNS resolution:
dig <domain>
dig @8.8.8.8 <domain>
- Check DNSSEC status:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/dnssec" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
Look for:
- Missing A/AAAA/CNAME records
- Incorrect proxy status (proxied vs DNS-only)
- TTL values
- Conflicting records
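Several of the checks in this list can be run mechanically over the result array of the dns_records response. The sketch below assumes each record exposes type, name, and proxied fields; audit_dns_records and the expected_names parameter are hypothetical, and the sample addresses are documentation-range placeholders.

```python
from collections import defaultdict
from typing import Any, Dict, List

ADDRESS_TYPES = {"A", "AAAA", "CNAME"}


def audit_dns_records(records: List[Dict[str, Any]], expected_names: List[str]) -> List[str]:
    """Flag missing, conflicting, and inconsistently proxied records."""
    findings = []
    by_name = defaultdict(list)
    for record in records:
        if record.get("type") in ADDRESS_TYPES:
            by_name[record["name"]].append(record)
    for name in expected_names:
        if name not in by_name:
            findings.append(f"{name}: no A/AAAA/CNAME record")
    for name, recs in by_name.items():
        types = {r["type"] for r in recs}
        if "CNAME" in types and len(recs) > 1:
            # A CNAME should be the only address-type record for its name.
            findings.append(f"{name}: CNAME coexists with another record")
        if len({bool(r.get("proxied")) for r in recs}) > 1:
            findings.append(f"{name}: mixed proxied and DNS-only records")
    return findings
```

TTL values are left out of the sketch because what counts as a wrong TTL depends on the site.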
SSL Certificate Errors
Evidence gathering:
- Check SSL certificate status:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/ssl/certificate_packs" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
- Check origin certificate (if using Full Strict):
openssl s_client -connect <origin_ip>:443 -servername <domain>
- Check SSL settings:
- Minimum TLS version
- TLS 1.3 status
- Opportunistic Encryption
Common issues:
- Error 526: SSL mode is "strict" but origin cert invalid
- Error 525: SSL handshake failure at origin
- Provisioning delay: Universal SSL usually needs 15-30 minutes; Cloudflare's documentation notes that issuance can take up to 24 hours
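The two error codes above can be paired with the current SSL mode to choose the next investigative step. This is a minimal sketch; ssl_error_hint is a hypothetical helper and its messages simply restate the list above.

```python
def ssl_error_hint(status_code: int, ssl_mode: str) -> str:
    """Map a Cloudflare SSL error code and the current SSL mode to a next step."""
    if status_code == 525:
        return "525: SSL handshake failed at the origin; inspect it with openssl s_client"
    if status_code == 526:
        if ssl_mode == "strict":
            return "526: origin certificate is invalid under strict mode; repair or replace it"
        # 526 with another mode suggests the setting changed; gather fresh evidence.
        return "526 is expected only under strict mode; re-query the ssl setting"
    return "Not one of the SSL error codes listed here; check certificate pack status"
```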
Origin Server Errors (502/503/504)
Evidence gathering:
- Check if origin is reachable, using --resolve so that SNI and certificate checks still use the real hostname:
curl -I --resolve "<domain>:443:<origin_ip>" "https://<domain>"
- Check DNS records point to correct origin:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
- Review load balancer config (if applicable):
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/load_balancers" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
- Check firewall rules:
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/firewall/rules" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
Learning New APIs
When encountering issues not covered above, consult Cloudflare API documentation:
- Browse API reference: https://developers.cloudflare.com/api/
- Search for relevant endpoints using issue keywords
- Check API schema to understand available operations
- Test with GET requests first to understand data structure
- Make changes with PATCH/POST after confirming approach
Pattern for exploring new APIs:
# List available settings for a zone
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings" \
-H "X-Auth-Email: email" \
-H "X-Auth-Key: key"
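The settings listing is easier to explore once it is flattened into a lookup table. The sketch below assumes each entry in result has an id and a value, as the zone settings endpoint is understood to return; settings_to_map is a hypothetical name and the sample values are made up.

```python
from typing import Any, Dict


def settings_to_map(settings_response: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten the zone settings list into an {id: value} mapping."""
    return {
        item["id"]: item.get("value")
        for item in settings_response.get("result") or []
    }


# Made-up sample shaped like the settings listing.
sample = {"success": True, "result": [
    {"id": "ssl", "value": "flexible"},
    {"id": "always_use_https", "value": "on"},
]}
settings = settings_to_map(sample)
```

With the mapping in hand, related settings such as ssl and always_use_https can be compared side by side instead of queried one at a time.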
API Reference Overview
Consult references/api_overview.md for:
- Common endpoints organized by category
- Request/response schemas
- Authentication patterns
- Rate limits and error handling
Consult references/ssl_modes.md for:
- Detailed SSL/TLS mode explanations
- Platform compatibility
- Security implications
Consult references/common_issues.md for:
- Issue patterns and symptoms
- Investigation checklists
- Platform-specific notes
Best Practices
Evidence-Based Investigation
- Query before assuming - Use API to check actual state
- Gather multiple data points - Cross-reference settings
- Check related configurations - Settings often interact
- Verify externally - Use dig/curl to confirm
- Test incrementally - One change at a time
API Usage
- Parse JSON responses - Use jq or python for readability
- Check success field - "success": true/false in responses
- Handle errors gracefully - Read errors array in responses
- Respect rate limits - Cloudflare API has limits (by default 1,200 requests per five minutes per user)
- Use appropriate methods:
- GET: Retrieve information
- PATCH: Update settings
- POST: Create resources / trigger actions
- DELETE: Remove resources
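Checking the success field and reading the errors array can be combined into one step. This is a sketch, assuming each entry in errors carries code and message fields; CloudflareAPIError and unwrap are hypothetical names.

```python
from typing import Any, Dict


class CloudflareAPIError(Exception):
    """Raised when a response reports success: false (hypothetical class)."""


def unwrap(response: Dict[str, Any]) -> Any:
    """Return result on success, otherwise raise with the errors array summarized."""
    if response.get("success"):
        return response.get("result")
    errors = response.get("errors") or []
    details = "; ".join(f"{e.get('code')}: {e.get('message')}" for e in errors)
    raise CloudflareAPIError(details or "request failed without error details")
```

Calling unwrap on every parsed response keeps a failed request from being mistaken for an empty configuration.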
Making Changes
- Gather evidence first - Understand current state
- Identify root cause - Don't guess
- Apply targeted fix - Change only what's needed
- Purge cache if needed - Especially for SSL/redirect changes
- Verify fix - Re-query API to confirm
- Inform user of wait times:
- Edge server propagation: 30-60 seconds
- DNS propagation: Up to 48 hours
- Browser cache: Requires manual clear
Security
- Never log API keys in output
- Warn if user shares credentials in public context
- Recommend API Tokens with scoped permissions over Global API Key
- Use read-only operations for investigation
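One way to keep API keys out of output is to mask credential headers before echoing a request. The sketch below is an illustration only; redact_headers and the placeholder text are assumptions.

```python
from typing import Dict

# Header names that carry credentials in the requests shown in this document.
SENSITIVE_HEADERS = {"x-auth-key", "authorization"}


def redact_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the headers that is safe to print."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

The comparison is case-insensitive because HTTP header names are, so "x-auth-key" and "X-Auth-Key" are both masked.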
Workflow Template
1. Gather: domain, email, API key
2. Get zone_id via zones API
3. Investigate:
- Query relevant APIs for evidence
- Check multiple related settings
- Verify with external tools (dig, curl)
4. Analyze evidence to determine root cause
5. Apply fix via appropriate API endpoint
6. Purge cache if configuration change affects delivery
7. Verify fix via API query and external testing
8. Inform user of resolution and any required actions
Example: Complete Investigation
When user reports "site shows ERR_TOO_MANY_REDIRECTS":
# 1. Get zone ID
curl -s -X GET "https://api.cloudflare.com/client/v4/zones?name=example.com" \
-H "X-Auth-Email: user@example.com" \
-H "X-Auth-Key: abc123" | jq '.result[0].id'
# 2. Check SSL mode (primary suspect for redirect loops)
curl -s -X GET "https://api.cloudflare.com/client/v4/zones/ZONE_ID/settings/ssl" \
-H "X-Auth-Email: user@example.com" \
-H "X-Auth-Key: abc123" | jq '.result.value'
# If returns "flexible" and origin is GitHub Pages/Netlify/Vercel:
# 3. Fix by changing to "full"
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/ZONE_ID/settings/ssl" \
-H "X-Auth-Email: user@example.com" \
-H "X-Auth-Key: abc123" \
-H "Content-Type: application/json" \
--data '{"value":"full"}'
# 4. Purge cache
curl -X POST "https://api.cloudflare.com/client/v4/zones/ZONE_ID/purge_cache" \
-H "X-Auth-Email: user@example.com" \
-H "X-Auth-Key: abc123" \
-H "Content-Type: application/json" \
-d '{"purge_everything":true}'
# 5. Inform user: Wait 60 seconds, clear browser cache, retry
When Scripts Are Useful
The bundled scripts (scripts/check_cloudflare_config.py, scripts/fix_ssl_mode.py) serve as:
- Reference implementations of investigation patterns
- Quick diagnostic tools when Python is available
- Examples of programmatic API usage
However, prefer direct API calls via Bash/curl for flexibility and transparency. Scripts should not limit capability - use them when convenient, but use raw API calls when needed for:
- Unfamiliar scenarios
- Edge cases
- Learning/debugging
- Operations not covered by scripts
The investigation methodology and API knowledge are the core skill, not the scripts.