Intelligent Audio Creation
audio-cog
by CellCog
AI audio generation powered by CellCog. Text-to-speech, voice synthesis, voiceovers, podcast audio, narration, music generation, background music, sound design. Professional audio creation with AI.
Installation
claude skill add --url github.com/openclaw/skills/tree/main/skills/chenghaifeng08-creator/audio-cog-automaton
Documentation
Audio Cog - AI Audio Generation Powered by CellCog
Create professional audio with AI - from voiceovers and narration to background music and sound design.
Prerequisites
This skill requires the cellcog skill for SDK setup and API calls.
clawhub install cellcog
Read the cellcog skill first for SDK setup. This skill shows you what's possible.
Quick pattern (v1.0+):
# Fire-and-forget - returns immediately
result = client.create_chat(
    prompt="[your audio request]",
    notify_session_key="agent:main:main",
    task_label="audio-task",
    chat_mode="agent",  # Agent mode is optimal for all audio tasks
)
# Daemon notifies you when complete - do NOT poll
What Audio You Can Create
Text-to-Speech / Voiceover
Convert text to natural-sounding speech:
- Narration: "Generate a professional male voiceover for this product video script"
- Audiobook Style: "Create an engaging narration of this short story with emotional delivery"
- Podcast Intros: "Generate a warm, friendly podcast intro: 'Welcome to The Daily Tech...'"
- E-Learning: "Create clear, instructional voiceover for this training module"
- IVR/Phone Systems: "Generate professional phone menu prompts"
Available Voices
CellCog provides 8 high-quality voices with distinct characteristics:
| Voice | Gender | Best For | Characteristics |
|---|---|---|---|
| cedar | Male | Product videos, announcements | Warm, resonant, authoritative, trustworthy |
| marin | Female | Professional content, tutorials | Bright, articulate, emotionally agile |
| ballad | Male | Storytelling, flowing narratives | Smooth, melodic, musical quality |
| coral | Female | Energetic content, ads | Vibrant, lively, dynamic, spirited |
| echo | Male | Thoughtful content, documentaries | Calm, measured, deliberate |
| sage | Female | Educational, knowledge content | Wise, contemplative, reflective |
| shimmer | Female | Gentle content, wellness | Soft, gentle, soothing, approachable |
| verse | Male | Creative, artistic content | Poetic, rhythmic, expressive |
Voice Recommendations by Use Case
For product videos and announcements:
Use cedar (male) or marin (female) - both project confidence and professionalism.
For storytelling and audiobooks:
Use ballad (male) or sage (female) - designed for engaging, flowing narratives.
For high-energy content:
Use coral (female) - vibrant and dynamic, perfect for ads and exciting announcements.
For calm, educational content:
Use echo (male) or shimmer (female) - measured pacing ideal for learning.
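The recommendations above can be captured in a small lookup table. This is a minimal sketch: `RECOMMENDED_VOICES` and `recommend_voice` are hypothetical names chosen for illustration and are not part of the CellCog SDK. Only the voice names and use cases come from this document.

```python
# Hypothetical helper, not part of the CellCog SDK.
# Voice names and pairings are taken from the recommendations in this document.
RECOMMENDED_VOICES = {
    "product": {"male": "cedar", "female": "marin"},
    "storytelling": {"male": "ballad", "female": "sage"},
    "high_energy": {"female": "coral"},
    "educational": {"male": "echo", "female": "shimmer"},
}


def recommend_voice(use_case: str, gender: str = "female") -> str:
    """Return the recommended voice for a use case.

    Raises KeyError for an unknown use case. If no voice of the requested
    gender is recommended for that use case, the first listed voice is used.
    """
    options = RECOMMENDED_VOICES[use_case]
    if gender in options:
        return options[gender]
    return next(iter(options.values()))
```

For example, `recommend_voice("product", "male")` picks cedar, while `recommend_voice("high_energy", "male")` falls back to coral because it is the only voice recommended for that category.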
Voice Style Customization
Beyond selecting a voice, you can fine-tune delivery with style instructions:
- Accent & dialect: American, British, Australian, Indian, etc.
- Emotional range: Excited, serious, warm, mysterious, dramatic
- Pacing: Slow and deliberate, conversational, fast and energetic
- Special effects: Whispering, character impressions
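If you assemble prompts in code, the voice and the style options above can be folded into a single prompt string. The sketch below is one possible way to do that. `build_voiceover_prompt` and its parameters are hypothetical and not part of the CellCog SDK, and the exact wording of the prompt is an assumption, since CellCog accepts free-form text.

```python
# Hypothetical prompt builder, not part of the CellCog SDK.
# The eight voice names come from the "Available Voices" table.
VOICES = {"cedar", "marin", "ballad", "coral", "echo", "sage", "shimmer", "verse"}


def build_voiceover_prompt(script, voice, tone=None, pace=None, accent=None):
    """Compose a free-form voiceover prompt from a script and optional style hints."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")

    lines = [
        f"Generate a voiceover using the {voice} voice for this script:",
        f"'{script}'",  # the script is passed through verbatim
    ]

    # Only mention the style attributes the caller actually supplied.
    style = []
    if tone:
        style.append(f"{tone} tone")
    if pace:
        style.append(f"{pace} pace")
    if accent:
        style.append(f"{accent} accent")
    if style:
        lines.append("Style: " + ", ".join(style) + ".")

    return "\n".join(lines)
```

The resulting string can be passed as the `prompt` argument of `client.create_chat`.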
Example with style instructions:
"Generate voiceover using cedar voice with a warm, conversational tone. Speak at medium pace with slight enthusiasm when mentioning features. American accent."
Music Generation
Create original background music and soundtracks:
- Background Music: "Create calm lo-fi background music for a study video, 2 minutes"
- Podcast Music: "Generate an upbeat intro jingle for a tech podcast, 15 seconds"
- Video Soundtracks: "Create cinematic orchestral music for a product launch video"
- Ambient/Mood: "Generate peaceful ambient sounds for a meditation app"
- Genre-Specific: "Create energetic electronic music for a fitness video"
Music Specifications
| Parameter | Options |
|---|---|
| Duration | 15 seconds to 5+ minutes |
| Genre | Electronic, rock, classical, jazz, ambient, lo-fi, cinematic, pop, hip-hop |
| Tempo | 60 BPM (slow) to 180+ BPM (fast) |
| Mood | Upbeat, calm, dramatic, mysterious, inspiring, melancholic |
| Instruments | Piano, guitar, synth, strings, drums, brass, etc. |
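The parameters in the table map naturally onto a small prompt builder. This is a sketch under stated assumptions: `format_duration` and `build_music_prompt` are hypothetical helpers, not CellCog APIs, and the only validation applied is that the duration is positive, because the ranges in the table are open-ended.

```python
# Hypothetical helpers, not part of the CellCog SDK.
def format_duration(seconds: int) -> str:
    """Render a duration as whole minutes when possible, otherwise as seconds."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    if seconds % 60 == 0:
        minutes = seconds // 60
        return f"{minutes} minute" + ("" if minutes == 1 else "s")
    return f"{seconds} second" + ("" if seconds == 1 else "s")


def build_music_prompt(genre, mood, duration_seconds, tempo_bpm=None, instruments=()):
    """Compose a free-form music prompt from the parameters in the table."""
    prompt = f"Generate {format_duration(duration_seconds)} of {mood} {genre} music."
    if instruments:
        prompt += " Include " + ", ".join(instruments) + "."
    if tempo_bpm is not None:
        prompt += f" {tempo_bpm} BPM."
    return prompt


print(build_music_prompt("lo-fi", "calm", 120, tempo_bpm=75,
                         instruments=["soft piano", "mellow beats"]))
# prints: Generate 2 minutes of calm lo-fi music. Include soft piano, mellow beats. 75 BPM.
```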
Music Licensing
All AI-generated music from CellCog is royalty-free and fully yours to use commercially.
You have complete rights to use the generated music for:
- YouTube videos (including monetized content)
- Commercial projects and advertisements
- Podcasts and streaming
- Apps and games
- Any other commercial or personal use
No attribution required. No licensing fees. The music is generated uniquely for you.
Audio Output Formats
| Format | Best For |
|---|---|
| MP3 | Standard audio delivery, voiceovers, music |
| Combined with video | Background music for video-cog outputs |
Chat Mode for Audio
Use chat_mode="agent" for all audio generation tasks.
Audio generation, whether voiceovers, music, or sound design, executes efficiently in agent mode. CellCog's audio capabilities call for precise execution rather than multi-angle deliberation, and precise execution is what agent mode excels at.
There's no scenario where agent team mode provides meaningfully better audio output. Save agent team for research and complex creative work that benefits from multiple reasoning passes.
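One way to make this rule hard to forget is to wrap the call so that `chat_mode="agent"` is always set. The sketch below assumes only the `create_chat` keyword arguments shown in the quick pattern. `submit_audio_task` is a hypothetical wrapper, and `RecordingClient` is a stand-in that exists purely to make the example runnable without the real SDK; its return value is a placeholder, not the documented response.

```python
# Hypothetical wrapper around the create_chat call from the quick pattern.
def submit_audio_task(client, prompt, task_label="audio-task",
                      notify_session_key="agent:main:main"):
    """Submit an audio request in agent mode and return immediately (no polling)."""
    return client.create_chat(
        prompt=prompt,
        notify_session_key=notify_session_key,
        task_label=task_label,
        chat_mode="agent",  # fixed: agent mode is recommended for all audio tasks
    )


class RecordingClient:
    """Stand-in for the real CellCog client, used only to exercise the wrapper."""

    def __init__(self):
        self.calls = []

    def create_chat(self, **kwargs):
        self.calls.append(kwargs)
        return {"status": "submitted"}  # placeholder, not the real response shape


fake = RecordingClient()
submit_audio_task(fake, "Generate a 15-second upbeat intro jingle for a tech podcast")
print(fake.calls[0]["chat_mode"])  # prints: agent
```

In real use, pass the client obtained through the cellcog skill's SDK setup instead of `RecordingClient`.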
Example Audio Prompts
Professional voiceover with specific voice:
"Generate a professional voiceover using the marin voice for this script:
'Introducing TaskFlow - the project management tool that actually works. With intelligent automation, seamless collaboration, and powerful analytics, TaskFlow helps teams do their best work.'
Style: Confident and friendly, medium pace. Suitable for a product launch video."
Podcast intro with voice selection:
"Create a podcast intro voiceover using cedar voice:
'Welcome to Future Forward, the podcast where we explore the technologies shaping tomorrow. I'm your host, and today we're diving into...'
Style: Warm and engaging, conversational tone. Also generate a 10-second upbeat intro music bed to go underneath."
Background music:
"Generate 2 minutes of calm, lo-fi hip-hop style background music. Should be chill and unobtrusive, good for studying or working. Include soft piano, mellow beats, and gentle vinyl crackle. 75 BPM."
Audiobook narration:
"Create an audiobook-style narration using ballad voice for this passage:
[passage text]
Style: Warm storytelling quality, measured pace with appropriate pauses for drama."
Cinematic music:
"Generate 90 seconds of cinematic orchestral music for a tech company's 'About Us' video. Start soft and inspiring, build to a confident crescendo, then resolve to a hopeful ending."
Multi-Language Support
CellCog can generate speech in 50+ languages:
- English (multiple accents)
- Spanish, French, German, Italian, Portuguese
- Chinese (Mandarin, Cantonese)
- Japanese, Korean
- Hindi, Arabic
- Russian, Polish, Dutch
- And many more
Specify the language in your prompt:
"Generate this text in Japanese with a native female speaker using shimmer voice: 'いらっしゃいませ...'"
Tips for Better Audio
- Choose the right voice: Match the voice to your content type. Cedar/marin for professional, ballad/sage for storytelling, coral for energy.
- Provide the complete script: Don't say "something about our product" - write out exactly what should be said.
- Include style instructions: "Confident but warm", "slow and deliberate", "with slight excitement" helps shape delivery.
- For music: Specify duration, tempo (BPM if you know it), mood, and genre.
- Pronunciation guidance: For names or technical terms, add hints: "CellCog (pronounced SELL-kog)"
- Emotional beats: For longer voiceovers, indicate tone shifts: "[excited] And now for the big reveal... [serious] But there's a catch."
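The pronunciation and emotional-beat tips can be applied mechanically before a script is sent. This is a minimal sketch: both helpers are hypothetical and not part of the CellCog SDK. They only produce the bracket and "(pronounced ...)" notation shown in the tips above.

```python
# Hypothetical script-preparation helpers, not part of the CellCog SDK.
def add_pronunciation_hints(script: str, hints: dict) -> str:
    """Annotate the first occurrence of each term with a pronunciation hint."""
    for term, hint in hints.items():
        script = script.replace(term, f"{term} (pronounced {hint})", 1)
    return script


def with_emotional_beats(segments) -> str:
    """Join (tone, text) pairs into one script with bracketed tone markers."""
    return " ".join(f"[{tone}] {text}" for tone, text in segments)


print(add_pronunciation_hints("Welcome to CellCog.", {"CellCog": "SELL-kog"}))
# prints: Welcome to CellCog (pronounced SELL-kog).

print(with_emotional_beats([
    ("excited", "And now for the big reveal..."),
    ("serious", "But there's a catch."),
]))
# prints: [excited] And now for the big reveal... [serious] But there's a catch.
```

Only the first occurrence of each term is annotated, so repeated names do not clutter the script.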