Android Automation Tool
tencent-cvp
by cobeizailin
All-in-one Android phone automation via ADB: screen analysis, touch/input, foreground app detection, app install. Use for any task that involves operating the Android device.
Installation
claude skill add --url github.com/openclaw/skills/tree/main/skills/cobeizailin/tencent-cvp
Documentation
Tencent CVP — Cloud Virtual Phone Automation
Operate the Tencent Cloud Virtual Phone: observe the screen, interact with it, detect apps, and install new ones.
Core loop: Observe -> Act -> Verify. Always check the screen before and after actions.
1. Screen Analysis
UI Layout Dump (preferred)
Structured XML with every element's text, coordinates, and properties. Always try this first.
adb shell uiautomator dump && adb shell cat /sdcard/window_dump.xml
Each XML node has:
- text: visible text
- resource-id: element identifier
- class: widget type (e.g. android.widget.TextView)
- bounds: coordinates as [left,top][right,bottom]
- clickable, enabled, focused: interaction state
Compute tap target from bounds: x = (left + right) / 2, y = (top + bottom) / 2.
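The midpoint rule above can be sketched as a pair of shell helpers. This is a minimal sketch rather than part of the skill itself: find_bounds_by_text and bounds_center are hypothetical names, and the parsing assumes each node's attributes sit inside a single tag and that coordinates are non-negative integers.

```shell
# Print the bounds of the first node whose text attribute equals $1.
# Reads a uiautomator XML dump on stdin.
find_bounds_by_text() {
  # Put every tag on its own line, keep the first one with a matching text.
  tr '>' '\n' | grep -F " text=\"$1\"" | head -n 1 |
    sed -n 's/.* bounds="\([^"]*\)".*/\1/p'
}

# Convert "[left,top][right,bottom]" into "x y" at the center of the box.
bounds_center() {
  # Replace everything except digits with spaces, then average the pairs.
  echo "$1" | tr -c '0-9\n' ' ' |
    awk '{ print int(($1 + $3) / 2), int(($2 + $4) / 2) }'
}
```

For example, piping the dump through find_bounds_by_text "Install" and passing the result to bounds_center gives the x and y values to hand to adb shell input tap.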
Screenshot (fallback only)
Use only when uiautomator returns empty or partial XML — common with games, video players, WebView, or custom-rendered surfaces.
adb shell screencap -p /sdcard/screen.png && adb pull /sdcard/screen.png /tmp/screen.png
Then read /tmp/screen.png for visual analysis.
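One possible way to decide when the fallback is needed is to count the nodes in the dump. The sketch below is a heuristic under stated assumptions: count_nodes and needs_screenshot are hypothetical helpers, and the MIN_NODES threshold of 3 is an arbitrary value, not something the tool documents.

```shell
# Count <node> elements in a uiautomator dump read from stdin.
count_nodes() {
  grep -o '<node ' | wc -l | tr -d ' '
}

# Succeed (exit 0) when the dump on stdin looks too sparse to trust.
# MIN_NODES is a hypothetical threshold; tune it for the target app.
needs_screenshot() {
  [ "$(count_nodes)" -lt "${MIN_NODES:-3}" ]
}

# Possible use on a device (not executed here):
#   adb shell uiautomator dump >/dev/null
#   if adb shell cat /sdcard/window_dump.xml | needs_screenshot; then
#     adb shell screencap -p /sdcard/screen.png
#     adb pull /sdcard/screen.png /tmp/screen.png
#   fi
```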
Tips
- Wake screen first: adb shell input keyevent KEYCODE_WAKEUP
- Get resolution: adb shell wm size
- uiautomator dump takes ~1-2s; don't spam it.
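The output of wm size is a line such as "Physical size: 1080x1920", optionally joined by an "Override size" line. A small parser, sketched here with the hypothetical name parse_wm_size, turns that into plain numbers and prefers the override when one is present.

```shell
# Extract "width height" from `adb shell wm size` output on stdin.
# Prefers an "Override size" line when present, else "Physical size".
parse_wm_size() {
  tr -d '\r' | awk -F'[: x]+' '
    /Physical size/ { pw = $3; ph = $4 }
    /Override size/ { ow = $3; oh = $4 }
    END {
      if (ow != "") print ow, oh
      else if (pw != "") print pw, ph
    }'
}

# Possible use on a device (not executed here):
#   set -- $(adb shell wm size | parse_wm_size)   # $1 = width, $2 = height
```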
2. Input and Interaction
Touch
# Tap
adb shell input tap <x> <y>
# Long press (~1s)
adb shell input swipe <x> <y> <x> <y> 1000
# Swipe
adb shell input swipe <x1> <y1> <x2> <y2> <duration_ms>
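Swipe coordinates are usually derived from the screen size rather than hard-coded. The sketch below builds a scroll gesture from a width and height; scroll_up_cmd is a hypothetical helper, and the 70% to 30% travel and the 300 ms default duration are assumptions, not values the tool prescribes. It prints the command instead of running it, so the result can be inspected first.

```shell
# Print an `input swipe` command that scrolls content up by dragging
# from 70% to 30% of the screen height along the vertical center line.
# Arguments: width height [duration_ms]
scroll_up_cmd() (
  w=$1
  h=$2
  ms=${3:-300}
  x=$((w / 2))
  echo "adb shell input swipe $x $((h * 7 / 10)) $x $((h * 3 / 10)) $ms"
)
```

Calling scroll_up_cmd 1080 1920 prints adb shell input swipe 540 1344 540 576 300.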
Text Input
# ASCII only
adb shell input text "hello"
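CJK / non-ASCII: input text does not support Chinese or other non-ASCII characters. Use the clipboard instead. The clipper.set broadcast is handled by a clipboard helper app (for example Clipper), which is assumed to be installed on the device: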
adb shell am broadcast -a clipper.set -e text "中文内容"
adb shell input keyevent KEYCODE_PASTE
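Choosing between the two input paths can be automated with a printable-ASCII check. The helpers below are a sketch with hypothetical names (is_ascii, escape_spaces). The second one relies on input text treating %s as a space, which is the usual workaround because a literal space is split into separate arguments by the device shell.

```shell
# Succeed when $1 contains only printable ASCII (space through tilde).
# Runs in a subshell so the locale change does not leak to the caller.
is_ascii() (
  LC_ALL=C
  case "$1" in
    *[!\ -~]*) exit 1 ;;
  esac
  exit 0
)

# Replace spaces with %s, the placeholder `input text` turns into a space.
escape_spaces() {
  printf '%s\n' "$1" | sed 's/ /%s/g'
}

# Possible use on a device (not executed here):
#   if is_ascii "$msg"; then
#     adb shell input text "$(escape_spaces "$msg")"
#   else
#     adb shell am broadcast -a clipper.set -e text "$msg"
#     adb shell input keyevent KEYCODE_PASTE
#   fi
```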
Key Events
adb shell input keyevent KEYCODE_HOME
adb shell input keyevent KEYCODE_BACK
adb shell input keyevent KEYCODE_ENTER
adb shell input keyevent KEYCODE_WAKEUP
adb shell input keyevent KEYCODE_POWER
adb shell input keyevent KEYCODE_APP_SWITCH
adb shell input keyevent KEYCODE_VOLUME_UP
adb shell input keyevent KEYCODE_VOLUME_DOWN
Launch Apps
# By package + activity
adb shell am start -n <package>/<activity>
# By intent (open URL)
adb shell am start -a android.intent.action.VIEW -d "https://example.com"
# From launcher (package only)
adb shell monkey -p <package> -c android.intent.category.LAUNCHER 1
3. Foreground App Detection
adb shell dumpsys window | grep mCurrentFocus | grep -v null
Output example:
mCurrentFocus=Window{abcdef0 u0 com.tencent.mm/com.tencent.mm.ui.LauncherUI}
Output may be empty — this happens when:
- Screen is off or locked
- System transition in progress
- No focused window (e.g. during boot)
When empty: wake screen (KEYCODE_WAKEUP), wait, retry. If still empty, use uiautomator dump.
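The wake, wait, retry sequence can be sketched as follows. focused_package and wait_for_focus are hypothetical helpers; the three attempts and the one-second delay are arbitrary choices, and the parser assumes the Window{hash user package/activity} layout shown in the example output.

```shell
# Extract the package name from a dumpsys mCurrentFocus line on stdin.
focused_package() {
  sed -n 's/.*mCurrentFocus=Window{[^ ]* [^ ]* \([^/} ]*\).*/\1/p' | head -n 1
}

# Print the focused package, waking the screen and retrying if none is found.
# Returns non-zero when every attempt comes back empty.
wait_for_focus() {
  for attempt in 1 2 3; do
    pkg=$(adb shell dumpsys window | grep mCurrentFocus | grep -v null | focused_package)
    if [ -n "$pkg" ]; then
      echo "$pkg"
      return 0
    fi
    adb shell input keyevent KEYCODE_WAKEUP
    sleep 1
  done
  return 1
}
```

If wait_for_focus fails, the caller can fall back to uiautomator dump as described above.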
Common Package Names
| App | Package |
|---|---|
| Home/Launcher | com.android.launcher or vendor variant |
| Settings | com.android.settings |
| Chrome | com.android.chrome |
| WeChat | com.tencent.mm |
| Alipay | com.eg.android.AlipayGphone |
| Douyin | com.ss.android.ugc.aweme |
| Bilibili | tv.danmaku.bili |
4. App Install
Priority: MyApp (应用宝) first, then browser, then web search.
Via MyApp (应用宝)
adb shell am start -a android.intent.action.VIEW -d "market://details?id=<package_name>" -p com.tencent.android.qqdownloader
Examples:
# WeChat
adb shell am start -a android.intent.action.VIEW -d "market://details?id=com.tencent.mm" -p com.tencent.android.qqdownloader
# Alipay
adb shell am start -a android.intent.action.VIEW -d "market://details?id=com.eg.android.AlipayGphone" -p com.tencent.android.qqdownloader
After opening:
- Use uiautomator dump to verify the page loaded
- Find and tap the install/download button
- Wait for install, then verify
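For the final verification step, one option is to check the package list. pm list packages prints one package:<name> line per match and treats its argument as a substring filter, so the sketch below (is_installed is a hypothetical helper) insists on an exact line match.

```shell
# Succeed when `pm list packages` output on stdin lists exactly package $1.
is_installed() {
  # Strip carriage returns that some adb versions add, then match whole lines.
  tr -d '\r' | grep -Fxq "package:$1"
}

# Possible use on a device (not executed here):
#   adb shell pm list packages com.tencent.mm | is_installed com.tencent.mm
```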
Finding Package Names
If unknown, search the web for <app name> android package name. Common pattern: reverse domain (com.company.appname).
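A candidate name found this way can be sanity-checked before it is used in a market:// link. The sketch below, with the hypothetical helper is_package_name, encodes the Android application ID rules: at least two dot-separated segments, each starting with a letter and containing only letters, digits, or underscores. It checks the shape only, not whether the app exists.

```shell
# Succeed when $1 has the shape of an Android application ID.
is_package_name() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z][A-Za-z0-9_]*(\.[A-Za-z][A-Za-z0-9_]*)+$'
}
```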
When MyApp Fails
- Try browser: adb shell am start -a android.intent.action.VIEW -d "https://official-site.com"
- Last resort: web search for APK download