AgentSkillsCN

Llm Computer Use

LLM 计算机使用

SKILL.md

LLM Computer Use Skill

Desktop automation via LLM vision. The AI sees your screen, decides what to click/type, and executes actions.

When to Use

Apply when automating desktop interactions that require visual understanding:

  • Opening and controlling applications
  • Clicking buttons, menus, or UI elements
  • Typing text into fields
  • Navigating file systems
  • Browser automation
  • System tasks

Requirements

  • Python 3.10+
  • Anthropic API key (ANTHROPIC_API_KEY environment variable)
  • Windows

CLI Usage

bash
# Navigate to skill folder first
cd .windsurf/skills/llm-computer-use

# Dry-run (safe, no actions executed)
python -m llm_computer_use "Click the Start button"

# Execute mode
python -m llm_computer_use -x "Open Notepad and type Hello World"

# With API key file
python -m llm_computer_use -x -k path/to/api-keys.txt "Open Calculator"

Options

  • -x, --execute - Execute actions (default: dry-run)
  • -n, --max-iterations - Max iterations (default: 10, ~$0.01-0.02 each)
  • -m, --model - Model: claude-sonnet-4-5 (default), claude-haiku-4-5, claude-opus-4
  • -k, --keys-file - API keys file path
  • -s, --save-log - Save session log as JSON
  • -q, --quiet - Minimal output

Programmatic Usage

python
import sys
sys.path.insert(0, ".windsurf/skills/llm-computer-use")

from llm_computer_use import AgentSession

session = AgentSession(
    task_prompt="Open Notepad",
    max_iterations=5,
    dry_run=False,
    model="claude-sonnet-4-5"
)

summary = session.run()
print(f"Cost: ${summary['estimated_cost_usd']:.4f}")

Cost Estimate

ModelCost per Iteration10 Iterations
claude-opus-4~$0.05-0.10~$0.50-1.00
claude-sonnet-4-5~$0.01-0.02~$0.10-0.20
claude-haiku-4-5~$0.003-0.005~$0.03-0.05

Safety

  • Dry-run by default (no actions executed)
  • High-risk action confirmation (Alt+F4, delete, shutdown)
  • Iteration limits prevent runaway costs
  • pyautogui fail-safe (move mouse to corner to abort)

Version

0.5.0