Burner Phone Control
Use this skill for ANY request involving phone screens or mobile app automation.
Vision Feedback Loop
ALWAYS follow this pattern:
1. Screenshot: Capture the current screen
bash(cmd="adb exec-out screencap -p > ./assets/screen.png")
2. Analyze: Use the vision model to understand the screen
bash(cmd="python3 ./scripts/vision_helper.py ./assets/screen.png \"Describe the screen and list coordinates (x,y) for interactable elements.\"")
3. Act: Perform the action using the exact coordinates from step 2
bash(cmd="adb shell input tap <x> <y>")
4. Verify: Screenshot again to confirm the action worked
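The four steps above can be sketched as one Python function. This is a minimal illustration, not part of the skill's scripts: `run_shell`, `tap_with_feedback`, and the `pick_target` callback are hypothetical names, and the sketch assumes `vision_helper.py` prints its description to stdout.

```python
import shlex
import subprocess
from typing import Callable, Tuple

SCREEN = "./assets/screen.png"  # path used by the skill, relative to the skill root


def run_shell(cmd: str) -> str:
    """Run a shell command and return its stdout as text."""
    result = subprocess.run(cmd, shell=True, check=True,
                            capture_output=True, text=True)
    return result.stdout


def tap_with_feedback(prompt: str,
                      pick_target: Callable[[str], Tuple[int, int]],
                      run: Callable[[str], str] = run_shell) -> str:
    """One pass of the loop: screenshot, analyze, tap, verify (hypothetical helper)."""
    # 1. Screenshot: capture the current screen.
    run(f"adb exec-out screencap -p > {SCREEN}")
    # 2. Analyze: ask the vision helper to describe the screen.
    description = run(
        f"python3 ./scripts/vision_helper.py {SCREEN} {shlex.quote(prompt)}")
    # 3. Act: coordinates come only from the step 2 description, never from a guess.
    x, y = pick_target(description)
    run(f"adb shell input tap {x} {y}")
    # 4. Verify: screenshot again and return the new description for inspection.
    run(f"adb exec-out screencap -p > {SCREEN}")
    return run(
        f"python3 ./scripts/vision_helper.py {SCREEN} {shlex.quote(prompt)}")
```

Passing `run` as a parameter lets the sequence be dry-run with a stub before any command reaches a device.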
Available Commands
Tapping
bash(cmd="adb shell input tap <x> <y>")
Swiping
bash(cmd="adb shell input swipe <x1> <y1> <x2> <y2> <duration_ms>")
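A small Python sketch of building the swipe command with basic argument checks. `swipe_command` is a hypothetical helper, not something the skill provides, and the coordinates are still expected to come from the vision step.

```python
from typing import Tuple

Point = Tuple[int, int]


def swipe_command(start: Point, end: Point, duration_ms: int = 300) -> str:
    """Build an `adb shell input swipe` command string (hypothetical helper)."""
    for value in (*start, *end):
        if not isinstance(value, int) or value < 0:
            raise ValueError(
                f"coordinates must be non-negative integers, got {value!r}")
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    (x1, y1), (x2, y2) = start, end
    return f"adb shell input swipe {x1} {y1} {x2} {y2} {duration_ms}"
```

For example, `swipe_command((540, 1440), (540, 480))` yields `adb shell input swipe 540 1440 540 480 300`, an upward drag along x = 540.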
Typing Text
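bash(cmd="adb shell input text 'your%stext%shere'")
Write each space as %s: the device shell splits the text on literal spaces, and input text converts %s back to a space.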
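Text passed to `adb shell input text` is parsed twice, once by the local shell and once by the shell on the device, and `input text` reads `%s` as a space. A hedged Python sketch of preparing such a command follows; `input_text_command` is a hypothetical helper, and it does not handle text that already contains a literal `%s`.

```python
import shlex


def input_text_command(text: str) -> str:
    """Build an `adb shell input text` command string (hypothetical helper)."""
    # `input text` converts %s to a space, so encode spaces up front.
    encoded = text.replace(" ", "%s")
    # Quote once for the shell on the device...
    remote = f"input text {shlex.quote(encoded)}"
    # ...and once more for the local shell that launches adb.
    return f"adb shell {shlex.quote(remote)}"
```

For example, `input_text_command("hello world")` yields `adb shell 'input text hello%sworld'`.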
Key Events
bash(cmd="adb shell input keyevent KEYCODE_HOME")
bash(cmd="adb shell input keyevent KEYCODE_BACK")
bash(cmd="adb shell input keyevent KEYCODE_ENTER")
Launch App
bash(cmd="adb shell am start -n com.package.name/.MainActivity")
Rules
- ALWAYS screenshot before acting; never guess coordinates
- ALWAYS use vision_helper.py to get coordinates
- Use coordinates provided by the vision tool EXACTLY
- All paths are relative to the skill root directory
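One way to honor the coordinate rules in code is to take the tap target straight from the vision output and refuse to act when none is present. The Python sketch below assumes the output contains `(x,y)` pairs, as the analysis prompt requests; the real output format of `vision_helper.py` may differ, and both function names are hypothetical.

```python
import re
from typing import List, Tuple

# Matches pairs such as "(540, 1200)" or "(200,1200)".
COORD = re.compile(r"\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def extract_coordinates(description: str) -> List[Tuple[int, int]]:
    """Return every (x, y) pair found in the vision output, in order."""
    return [(int(x), int(y)) for x, y in COORD.findall(description)]


def tap_command_from(description: str, index: int = 0) -> str:
    """Build a tap command from the vision output, without altering the values."""
    coords = extract_coordinates(description)
    if not coords:
        # No coordinates reported: take a new screenshot rather than guess.
        raise ValueError("vision output contains no coordinates")
    x, y = coords[index]
    return f"adb shell input tap {x} {y}"
```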