CF Troubleshooting

Universal

cloudflare-troubleshooting

by daymade

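When you hit Cloudflare ERR_TOO_MANY_REDIRECTS, SSL, or DNS errors, call the Cloudflare API directly to check the actual configuration and systematically locate the root cause of redirect, certificate, and resolution problems.

Checks the real Cloudflare configuration through the API and systematically pinpoints redirect, SSL, and DNS anomalies, which is faster and more accurate than guessing from experience.

884 | DevOps | Not scanned | March 5, 2026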

Installation

claude skill add --url github.com/daymade/claude-code-skills/tree/main/cloudflare-troubleshooting

Documentation

Cloudflare Troubleshooting

Core Principle

Investigate with evidence, not assumptions. Always query Cloudflare API to examine actual configuration before diagnosing issues. The skill's value is the systematic investigation methodology, not predetermined solutions.

Investigation Methodology

1. Gather Credentials

Request from user:

  • Domain name
  • Cloudflare account email
  • Cloudflare Global API Key (or API Token)

Global API Key location: Cloudflare Dashboard → My Profile → API Tokens → View Global API Key

2. Get Zone Information

First step for any Cloudflare troubleshooting - obtain the zone ID:

bash
curl -s -X GET "https://api.cloudflare.com/client/v4/zones?name=<domain>" \
  -H "X-Auth-Email: <email>" \
  -H "X-Auth-Key: <api_key>" | jq '.'

Extract zone_id from result[0].id for subsequent API calls.
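
The extraction step can be made to fail loudly when the lookup did not succeed. The following is a minimal sketch, assuming `jq` is installed; the function name `zone_id_from_response` is hypothetical and not part of the skill or of any Cloudflare tool.

```shell
# Reads a /zones?name=<domain> response on stdin and prints result[0].id.
# Exits non-zero when "success" is not true or when no zone matched the name.
# zone_id_from_response is an illustrative name, not a Cloudflare command.
zone_id_from_response() {
  jq -er 'if .success == true and (.result | length) > 0
          then .result[0].id
          else empty end'
}
```

One way to use it is to pipe the curl command above into the function, for example `zone_id=$(curl -s ... | zone_id_from_response) || echo "zone lookup failed" >&2`, so that an empty or failed lookup is caught before any later call uses the ID.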

3. Investigate Systematically

For each issue, gather evidence before making conclusions. Use Cloudflare API to inspect:

  • Current configuration state
  • Recent changes (if audit log available)
  • Related settings that might interact

Common Investigation Patterns

Redirect Loops (ERR_TOO_MANY_REDIRECTS)

Evidence gathering sequence:

  1. Check SSL/TLS mode:

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/ssl" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    

    Look for: result.value, the current SSL mode (off, flexible, full, or strict)

  2. Check Always Use HTTPS setting:

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/always_use_https" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    
  3. Check Page Rules for redirects:

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/pagerules" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    

    Look for: forwarding_url or always_use_https actions

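  4. Test origin server directly (if possible), over plain HTTP and over HTTPS with the correct SNI:

    bash
    curl -I -H "Host: <domain>" http://<origin_ip>
    curl -I --resolve <domain>:443:<origin_ip> https://<domain>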
    

Diagnosis logic:

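  • SSL mode "flexible" + origin redirects HTTP to HTTPS = redirect loop (in Flexible mode Cloudflare connects to the origin over HTTP, so every request is redirected again)
  • Multiple redirect rules (Page Rules, Always Use HTTPS, origin redirects) can conflict
  • Compare browser and curl behavior: browsers cache 301 redirects, so a loop can persist in the browser after the configuration is fixed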
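
The first rule can be written down as a small decision helper. This is a sketch under assumptions: the function name and its two arguments are illustrative, and the messages are suggestions rather than anything Cloudflare prints. It encodes only the Flexible-mode rule stated above.

```shell
# diagnose_redirect SSL_MODE ORIGIN_REDIRECTS_HTTP_TO_HTTPS
#   SSL_MODE: result.value from the /settings/ssl query
#   ORIGIN_REDIRECTS_HTTP_TO_HTTPS: "yes" or "no", taken from the direct origin test
# Hypothetical helper; it only applies the Flexible-mode rule from the list above.
diagnose_redirect() {
  case "$1:$2" in
    flexible:yes)
      echo "likely loop: flexible mode reaches the origin over HTTP, and the origin redirects to HTTPS" ;;
    *:yes|*:no)
      echo "not explained by the flexible-mode rule: review Page Rules and other redirects" ;;
    *)
      echo "unrecognized input: $1 $2" >&2
      return 1 ;;
  esac
}
```

The second argument comes from the origin test: a 301 or 302 response to the plain-HTTP request with a `Location: https://...` header counts as "yes".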

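Fix (when the evidence shows Flexible mode against an origin that enforces HTTPS; "full" requires the origin to accept HTTPS on port 443):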

bash
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/ssl" \
  -H "X-Auth-Email: email" \
  -H "X-Auth-Key: key" \
  -H "Content-Type: application/json" \
  --data '{"value":"full"}'

Purge cache after fix:

bash
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "X-Auth-Email: email" \
  -H "X-Auth-Key: key" \
  -H "Content-Type: application/json" \
  -d '{"purge_everything":true}'

DNS Issues

Evidence gathering:

  1. List DNS records:

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    
  2. Check external DNS resolution:

    bash
    dig <domain>
    dig @8.8.8.8 <domain>
    
  3. Check DNSSEC status:

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/dnssec" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    

Look for:

  • Missing A/AAAA/CNAME records
  • Incorrect proxy status (proxied vs DNS-only)
  • TTL values
  • Conflicting records
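
To scan for these points quickly, the record list can be flattened to one line per record. This is a sketch assuming `jq` is available; `dns_summary` is a hypothetical helper name. In Cloudflare's API a `ttl` of 1 means "automatic".

```shell
# Reads a /dns_records response on stdin and prints one tab-separated line
# per record: type, name, content, proxy status, ttl.
# dns_summary is an illustrative name, not part of the skill's scripts.
dns_summary() {
  jq -r '.result[]
         | [ .type,
             .name,
             .content,
             (if .proxied then "proxied" else "dns-only" end),
             (.ttl | tostring) ]
         | @tsv'
}
```

Piping the "List DNS records" call into `dns_summary` makes a missing record, an unexpected dns-only entry, or several records sharing one name easy to spot by eye or with `sort` and `grep`.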

SSL Certificate Errors

Evidence gathering:

  1. Check SSL certificate status:

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/ssl/certificate_packs" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    
  2. Check origin certificate (if using Full (strict)):

    bash
    openssl s_client -connect <origin_ip>:443 -servername <domain> </dev/null
    
  3. Check SSL settings:

    • Minimum TLS version
    • TLS 1.3 status
    • Opportunistic Encryption
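
The three settings above can be read the same way as the `ssl` setting. The sketch below assumes the zone setting IDs `min_tls_version`, `tls_1_3`, and `opportunistic_encryption`, and assumes the credentials are held in hypothetical environment variables `CF_EMAIL` and `CF_API_KEY`; the function name is illustrative.

```shell
# check_tls_settings ZONE_ID
# Queries the three TLS-related zone settings one after another and prints
# each JSON response on its own line. Read-only: it issues GET requests only.
# CF_EMAIL / CF_API_KEY are assumed variable names, not required by Cloudflare.
check_tls_settings() {
  if [ -z "${1:-}" ]; then
    echo "usage: check_tls_settings ZONE_ID" >&2
    return 1
  fi
  for cf_setting in min_tls_version tls_1_3 opportunistic_encryption; do
    curl -s "https://api.cloudflare.com/client/v4/zones/$1/settings/$cf_setting" \
      -H "X-Auth-Email: ${CF_EMAIL:-}" \
      -H "X-Auth-Key: ${CF_API_KEY:-}"
    echo
  done
}
```

Each response carries the current value in `result.value`, so `check_tls_settings "$zone_id" | jq '.result | {id, value}'` gives a compact view.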

Common issues:

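  • Error 526: SSL mode is "strict" (Full (strict)) but the origin certificate is invalid, for example expired or not matching the hostname
  • Error 525: SSL handshake between Cloudflare and the origin failed
  • Provisioning delay: Universal SSL usually issues within 15-30 minutes but can take up to 24 hours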

Origin Server Errors (502/503/504)

Evidence gathering:

  1. Check if origin is reachable:

    bash
    curl -I --resolve <domain>:443:<origin_ip> https://<domain>
    
  2. Check DNS records point to correct origin:

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    
  3. Review load balancer config (if applicable):

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/load_balancers" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    
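  4. Check firewall rules (this is the legacy Firewall Rules API; zones migrated to WAF custom rules expose them under /zones/{zone_id}/rulesets instead):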

    bash
    curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/firewall/rules" \
      -H "X-Auth-Email: email" \
      -H "X-Auth-Key: key"
    

Learning New APIs

When encountering issues not covered above, consult Cloudflare API documentation:

  1. Browse API reference: https://developers.cloudflare.com/api/
  2. Search for relevant endpoints using issue keywords
  3. Check API schema to understand available operations
  4. Test with GET requests first to understand data structure
  5. Make changes with PATCH/POST after confirming approach

Pattern for exploring new APIs:

bash
# List available settings for a zone
curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/settings" \
  -H "X-Auth-Email: email" \
  -H "X-Auth-Key: key"

API Reference Overview

Consult references/api_overview.md for:

  • Common endpoints organized by category
  • Request/response schemas
  • Authentication patterns
  • Rate limits and error handling

Consult references/ssl_modes.md for:

  • Detailed SSL/TLS mode explanations
  • Platform compatibility
  • Security implications

Consult references/common_issues.md for:

  • Issue patterns and symptoms
  • Investigation checklists
  • Platform-specific notes

Best Practices

Evidence-Based Investigation

  1. Query before assuming - Use API to check actual state
  2. Gather multiple data points - Cross-reference settings
  3. Check related configurations - Settings often interact
  4. Verify externally - Use dig/curl to confirm
  5. Test incrementally - One change at a time

API Usage

  1. Parse JSON responses - Use jq or python for readability
  2. Check success field - "success": true/false in responses
  3. Handle errors gracefully - Read errors array in responses
  4. Respect rate limits - the Cloudflare API allows 1,200 requests per five minutes per user and returns HTTP 429 beyond that
  5. Use appropriate methods:
    • GET: Retrieve information
    • PATCH: Update settings
    • POST: Create resources / trigger actions
    • DELETE: Remove resources
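
Points 2 and 3 can be combined into one filter that every response passes through. This is a sketch assuming `jq` is installed; `cf_check` is a hypothetical name, and the error message format is a choice made here, not Cloudflare's.

```shell
# Reads an API response on stdin.
# On "success": true it passes the JSON through unchanged on stdout.
# Otherwise it prints each entry of the errors array to stderr and returns 1.
# cf_check is an illustrative helper, not part of the bundled scripts.
cf_check() {
  cf_body=$(cat)
  if printf '%s' "$cf_body" | jq -e '.success == true' >/dev/null 2>&1; then
    printf '%s\n' "$cf_body"
  else
    printf '%s' "$cf_body" \
      | jq -r '.errors[]? | "Cloudflare API error \(.code): \(.message)"' >&2
    return 1
  fi
}
```

Placed between curl and jq, for example `curl -s ... | cf_check | jq '.result.value'`, it stops a failed call from being read as an empty result.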

Making Changes

  1. Gather evidence first - Understand current state
  2. Identify root cause - Don't guess
  3. Apply targeted fix - Change only what's needed
  4. Purge cache if needed - Especially for SSL/redirect changes
  5. Verify fix - Re-query API to confirm
  6. Inform user of wait times:
    • Edge server propagation: 30-60 seconds
    • DNS propagation: Depends on record TTL; nameserver changes can take up to 48 hours
    • Browser cache: Requires manual clear

Security

  • Never log API keys in output
  • Warn if user shares credentials in public context
  • Recommend API Tokens with scoped permissions over Global API Key
  • Use read-only operations for investigation
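
The token recommendation changes only the authentication header: an API Token is sent as `Authorization: Bearer <token>` instead of the `X-Auth-Email` / `X-Auth-Key` pair. A minimal sketch of a read-only wrapper follows; the function name `cf_get` and the environment variable names `CF_API_TOKEN`, `CF_EMAIL`, and `CF_API_KEY` are assumptions made for illustration.

```shell
# cf_get PATH
# Issues a GET against the Cloudflare v4 API. Prefers a scoped API Token
# (CF_API_TOKEN) and falls back to the Global API Key pair only if no token
# is set. Credentials are read from the environment and never echoed.
# All names here are hypothetical.
cf_get() {
  cf_url="https://api.cloudflare.com/client/v4$1"
  if [ -n "${CF_API_TOKEN:-}" ]; then
    curl -sS "$cf_url" -H "Authorization: Bearer $CF_API_TOKEN"
  elif [ -n "${CF_EMAIL:-}" ] && [ -n "${CF_API_KEY:-}" ]; then
    curl -sS "$cf_url" -H "X-Auth-Email: $CF_EMAIL" -H "X-Auth-Key: $CF_API_KEY"
  else
    echo "set CF_API_TOKEN, or CF_EMAIL and CF_API_KEY" >&2
    return 1
  fi
}
```

With a token exported, `cf_get /user/tokens/verify` confirms the token is active, and `cf_get "/zones?name=example.com"` replaces the zone lookup shown earlier. Because the wrapper only sends GET requests, it also keeps investigation read-only.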

Workflow Template

code
1. Gather: domain, email, API key
2. Get zone_id via zones API
3. Investigate:
   - Query relevant APIs for evidence
   - Check multiple related settings
   - Verify with external tools (dig, curl)
4. Analyze evidence to determine root cause
5. Apply fix via appropriate API endpoint
6. Purge cache if configuration change affects delivery
7. Verify fix via API query and external testing
8. Inform user of resolution and any required actions

Example: Complete Investigation

When a user reports "site shows ERR_TOO_MANY_REDIRECTS":

bash
# 1. Get zone ID
curl -s -X GET "https://api.cloudflare.com/client/v4/zones?name=example.com" \
  -H "X-Auth-Email: user@example.com" \
  -H "X-Auth-Key: abc123" | jq '.result[0].id'

# 2. Check SSL mode (primary suspect for redirect loops)
curl -s -X GET "https://api.cloudflare.com/client/v4/zones/ZONE_ID/settings/ssl" \
  -H "X-Auth-Email: user@example.com" \
  -H "X-Auth-Key: abc123" | jq '.result.value'

# If it returns "flexible" and the origin enforces HTTPS (e.g. GitHub Pages, Netlify, Vercel):

# 3. Fix by changing to "full"
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/ZONE_ID/settings/ssl" \
  -H "X-Auth-Email: user@example.com" \
  -H "X-Auth-Key: abc123" \
  -H "Content-Type: application/json" \
  --data '{"value":"full"}'

# 4. Purge cache
curl -X POST "https://api.cloudflare.com/client/v4/zones/ZONE_ID/purge_cache" \
  -H "X-Auth-Email: user@example.com" \
  -H "X-Auth-Key: abc123" \
  -H "Content-Type: application/json" \
  -d '{"purge_everything":true}'

# 5. Inform user: Wait 60 seconds, clear browser cache, retry

When Scripts Are Useful

The bundled scripts (scripts/check_cloudflare_config.py, scripts/fix_ssl_mode.py) serve as:

  • Reference implementations of investigation patterns
  • Quick diagnostic tools when Python is available
  • Examples of programmatic API usage

However, prefer direct API calls via Bash/curl for flexibility and transparency. Scripts should not limit capability - use them when convenient, but use raw API calls when needed for:

  • Unfamiliar scenarios
  • Edge cases
  • Learning/debugging
  • Operations not covered by scripts

The investigation methodology and API knowledge are the core skill, not the scripts.
