Transcript Correction
transcript-fixer
by daymade
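Targets ASR/STT misrecognitions, homophone errors, and mixed Chinese-English content in meeting notes, lecture recordings, and interview transcripts. It combines dictionary rules with AI auto-correction and keeps learning, building up a personal library of terminology and correction rules.
Built for ASR typos and homophone mix-ups in meeting, lecture, and interview transcripts. The rule dictionary combined with AI gets better at your mixed Chinese-English material the more you use it.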
Installation
claude skill add --url github.com/daymade/claude-code-skills/tree/main/transcript-fixer
Documentation
Transcript Fixer
Correct speech-to-text transcription errors through dictionary-based rules, AI-powered corrections, and automatic pattern detection. Build a personalized knowledge base that learns from each correction.
When to Use This Skill
- Correcting ASR/STT errors in meeting notes, lectures, or interviews
- Building domain-specific correction dictionaries
- Fixing Chinese/English homophone errors or technical terminology
- Collaborating on shared correction knowledge bases
Prerequisites
Python execution must use uv - never use system Python directly.
If uv is not installed:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows PowerShell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Quick Start
Recommended: Use Enhanced Wrapper (auto-detects API key, opens HTML diff):
# First time: Initialize database
uv run scripts/fix_transcription.py --init
# Process transcript with enhanced UX
uv run scripts/fix_transcript_enhanced.py input.md --output ./corrected
The enhanced wrapper automatically:
- Detects GLM API key from shell configs (checks lines near ANTHROPIC_BASE_URL)
- Moves output files to specified directory
- Opens HTML visual diff in browser for immediate feedback
Alternative: Use Core Script Directly:
# 1. Set API key (if not auto-detected)
export GLM_API_KEY="<api-key>" # From https://open.bigmodel.cn/
# 2. Add common corrections (5-10 terms)
uv run scripts/fix_transcription.py --add "错误词" "正确词" --domain general
# 3. Run full correction pipeline
uv run scripts/fix_transcription.py --input meeting.md --stage 3
# 4. Review learned patterns after 3-5 runs
uv run scripts/fix_transcription.py --review-learned
Output files:
- *_stage1.md: Dictionary corrections applied
- *_stage2.md: AI corrections applied (final version)
- *_对比.html: Visual diff (open in browser for best experience)
Generate word-level diff (recommended for reviewing corrections):
uv run scripts/generate_word_diff.py original.md corrected.md output.html
This creates an HTML file showing word-by-word differences with clear highlighting:
- 🔴 japanese 3 pro → 🟢 Gemini 3 Pro (complete word replacements)
- Easy to spot exactly what changed without character-level noise
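The idea of a word-level diff can be illustrated with Python's standard difflib. This is only a minimal sketch, not the logic of generate_word_diff.py: the function name word_diff is hypothetical, and it assumes words are separated by whitespace, which does not hold for unspaced Chinese text (a real tool would need a tokenizer there).

```python
import difflib


def word_diff(original: str, corrected: str) -> list[tuple[str, str]]:
    """Return (old, new) pairs for every run of words that differs."""
    # Assumption: whitespace tokenization; unspaced Chinese needs a tokenizer.
    old_words = original.split()
    new_words = corrected.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join each differing run back into a readable phrase.
            changes.append((" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


changes = word_diff("we tested japanese 3 pro today", "we tested Gemini 3 Pro today")
for old, new in changes:
    print(f"{old!r} -> {new!r}")
```

Comparing lists of words rather than characters is what keeps the result free of character-level noise: each reported change is a whole word or phrase.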
Example Session
Input transcript (meeting.md):
今天我们讨论了巨升智能的最新进展。
股价系统需要优化,目前性能不够好。
After Stage 1 (meeting_stage1.md):
今天我们讨论了具身智能的最新进展。 ← "巨升"→"具身" corrected
股价系统需要优化,目前性能不够好。 ← Unchanged (not in dictionary)
After Stage 2 (meeting_stage2.md):
今天我们讨论了具身智能的最新进展。
框架系统需要优化,目前性能不够好。 ← "股价"→"框架" corrected by AI
Learned pattern detected:
✓ Detected: "股价" → "框架" (confidence: 85%, count: 1)
Run --review-learned after 2 more occurrences to approve
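Stage 1 in the session above behaves like a plain lookup-and-replace pass. The sketch below shows that idea under stated assumptions: apply_dictionary is a hypothetical name, and the real script reads its rules from the database rather than from an in-memory dict.

```python
def apply_dictionary(text: str, corrections: dict[str, str]) -> str:
    """Apply every known wrong -> right replacement to the text."""
    # Replace longer keys first so a short key cannot clobber part of a longer one.
    for wrong in sorted(corrections, key=len, reverse=True):
        text = text.replace(wrong, corrections[wrong])
    return text


# Only "巨升" is in the dictionary, so the "股价" line passes through unchanged.
corrections = {"巨升": "具身"}
transcript = [
    "今天我们讨论了巨升智能的最新进展。",
    "股价系统需要优化,目前性能不够好。",
]
stage1 = [apply_dictionary(line, corrections) for line in transcript]
print("\n".join(stage1))
```

Because the second line contains no dictionary entry, it is left for the AI stage, which is where the "股价" → "框架" correction comes from.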
Core Workflow
Three-stage pipeline stores corrections in ~/.transcript-fixer/corrections.db:
- Initialize (first time): uv run scripts/fix_transcription.py --init
- Add domain corrections: --add "错误词" "正确词" --domain <domain>
- Process transcript: --input file.md --stage 3
- Review learned patterns: --review-learned and --approve high-confidence suggestions
Stages: Dictionary (instant, free) → AI via GLM API (parallel) → Full pipeline
Domains: general, embodied_ai, finance, medical, or custom names including Chinese (e.g., 火星加速器, 具身智能)
Learning: Patterns appearing ≥3 times at ≥80% confidence move from AI to dictionary
See references/workflow_guide.md for detailed workflows, references/script_parameters.md for complete CLI reference, and references/team_collaboration.md for collaboration patterns.
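The learning rule (at least 3 occurrences at 80% confidence or higher) can be written as a simple predicate. This is a sketch only: the LearnedPattern class, its field names, and ready_for_dictionary are hypothetical and are not taken from the skill's code or database schema.

```python
from dataclasses import dataclass

# Thresholds as stated in the workflow description.
MIN_COUNT = 3
MIN_CONFIDENCE = 0.80


@dataclass
class LearnedPattern:
    wrong: str
    right: str
    count: int         # how many times the AI stage made this correction
    confidence: float  # 0.0 to 1.0


def ready_for_dictionary(pattern: LearnedPattern) -> bool:
    """True when a learned pattern meets both promotion thresholds."""
    return pattern.count >= MIN_COUNT and pattern.confidence >= MIN_CONFIDENCE


first_seen = LearnedPattern("股价", "框架", count=1, confidence=0.85)
seen_again = LearnedPattern("股价", "框架", count=3, confidence=0.85)
print(ready_for_dictionary(first_seen), ready_for_dictionary(seen_again))
```

A pattern seen once at 85% confidence is not yet eligible, which matches the example session's note that two more occurrences are needed before review.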
Critical Workflow: Dictionary Iteration
MUST save corrections after each fix. This is the skill's core value.
After fixing errors manually, immediately save to dictionary:
uv run scripts/fix_transcription.py --add "错误词" "正确词" --domain general
See references/iteration_workflow.md for complete iteration guide with checklist.
AI Fallback Strategy
When the GLM API is unavailable (503, network issues), the script outputs a [CLAUDE_FALLBACK] marker.
Claude Code should then:
- Analyze the text directly for ASR errors
- Fix using Edit tool
- MUST save corrections to dictionary with --add
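If the script is driven from automation, the fallback condition can be detected by looking for the marker in the captured output. A minimal sketch, assuming the marker appears verbatim in the text the script prints; needs_manual_fix is a hypothetical helper name.

```python
FALLBACK_MARKER = "[CLAUDE_FALLBACK]"


def needs_manual_fix(script_output: str) -> bool:
    """True when the output signals that the AI stage could not run."""
    return FALLBACK_MARKER in script_output


# Illustrative strings only; the script's real log lines may look different.
print(needs_manual_fix("[CLAUDE_FALLBACK] GLM API unavailable"))
print(needs_manual_fix("Stage 2 complete"))
```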
Database Operations
MUST read references/database_schema.md before any database operations.
Quick reference:
# View all corrections
sqlite3 ~/.transcript-fixer/corrections.db "SELECT * FROM active_corrections;"
# Check schema version
sqlite3 ~/.transcript-fixer/corrections.db "SELECT value FROM system_config WHERE key='schema_version';"
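The same read-only lookup can be done from Python with the standard sqlite3 module. This is a sketch under assumptions: list_active_corrections is a hypothetical helper, and it uses SELECT * because the column layout of active_corrections is documented in references/database_schema.md rather than here.

```python
import sqlite3
from pathlib import Path

DB_PATH = Path.home() / ".transcript-fixer" / "corrections.db"


def list_active_corrections(db_path: Path = DB_PATH) -> list[tuple]:
    """Return every row of active_corrections without modifying the database."""
    # mode=ro opens the file read-only, so a stray statement cannot change it.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute("SELECT * FROM active_corrections").fetchall()
    finally:
        conn.close()
```

Calling list_active_corrections() with no arguments reads the default database location; pass another path to inspect a copy or a backup.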
Stages
| Stage | Description | Speed | Cost |
|---|---|---|---|
| 1 | Dictionary only | Instant | Free |
| 2 | AI only | ~10s | API calls |
| 3 | Full pipeline | ~10s | API calls |
Bundled Resources
Scripts:
- ensure_deps.py: Initialize shared virtual environment (run once, optional)
- fix_transcript_enhanced.py: Enhanced wrapper (recommended for interactive use)
- fix_transcription.py: Core CLI (for automation)
- generate_word_diff.py: Generate word-level diff HTML for reviewing corrections
- examples/bulk_import.py: Bulk import example
References (load as needed):
- Critical: database_schema.md (read before DB operations), iteration_workflow.md (dictionary iteration best practices)
- Getting started: installation_setup.md, glm_api_setup.md, workflow_guide.md
- Daily use: quick_reference.md, script_parameters.md, dictionary_guide.md
- Advanced: sql_queries.md, file_formats.md, architecture.md, best_practices.md
- Operations: troubleshooting.md, team_collaboration.md
Troubleshooting
Verify setup health with uv run scripts/fix_transcription.py --validate. Common issues:
- Missing database → Run --init
- Missing API key → export GLM_API_KEY="<key>" (obtain from https://open.bigmodel.cn/)
- Permission errors → Check ~/.transcript-fixer/ ownership
See references/troubleshooting.md for detailed error resolution and references/glm_api_setup.md for API configuration.