AgentSkillsCN

simulang

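Use Simulang, a JavaScript-based DSL designed for Simular Pro, to carry out desktop automation tasks. Use this skill when the user asks about desktop automation, UI scripting, application control, keyboard/mouse automation, or writing Simular Pro scripts.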

SKILL.md
---
name: simulang
description: Automate desktop tasks using Simulang, a JavaScript-based DSL for Simular Pro. Use when the user asks about desktop automation, UI scripting, controlling applications, keyboard/mouse automation, or writing Simular Pro scripts.
---

Simulang (Simular Pro Action Language)

Simulang is a JavaScript-based DSL for Simular Pro. It controls desktop environments through natural-language-style functions for keyboard, mouse, perception, and application control.

Quick Reference

| Category | Key Functions |
| --- | --- |
| Application | open({app, url}) |
| Keyboard | type({text, withReturn}), press({key, cmd, ctrl, shift}) |
| Mouse | click({at, mode, spatialRelation}), scroll({direction}) |
| Perception | pageContent(), ask({prompt, context}), conceptsExist({concepts}) |
| Wait | wait({waitTime}), waitForConcepts({concepts}) |
| User | respond({message, requireConfirm}) |
| Files | readFile({path}), writeToFile({text, path}) |
| Google Sheets | getGoogleSheetCellValue({cell}), setGoogleSheetCellValue({cell, value}) |
| Background Browser | browser.newtab(url), page.click(), page.type(), page.content(), page.ask() |
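
To show how several of these functions might combine, here is a minimal sketch of a sign-in flow. It is an illustration, not an official example: the element descriptions, the `signIn` and `makeRecorder` helpers, and the assumption that `concepts` takes an array of plain-text descriptions are all hypothetical. The actions are passed in as a parameter so the flow can be dry-run in plain Node.js; inside Simular Pro the built-in globals would presumably be called directly.

```javascript
// Sketch only: click, type and waitForConcepts are injected so the flow can
// be dry-run outside Simular Pro. Names of UI elements are made up.
function signIn(actions, credentials) {
    const { click, type, waitForConcepts } = actions;

    click({ at: "email field" });
    type({ text: credentials.email });

    click({ at: "password field" });
    // Assumption: withReturn: true submits by pressing Return after typing.
    type({ text: credentials.password, withReturn: true });

    // Assumption: concepts is an array of plain-text descriptions.
    waitForConcepts({ concepts: ["account dashboard"] });
}

// Hypothetical recorder that stands in for the real actions during a dry run.
function makeRecorder() {
    const calls = [];
    const record = (name) => (args) => { calls.push({ name, args }); };
    return {
        calls,
        actions: {
            click: record("click"),
            type: record("type"),
            waitForConcepts: record("waitForConcepts"),
        },
    };
}
```

Calling `signIn(makeRecorder().actions, {...})` records the sequence of actions without touching the desktop, which can help when reviewing a script before running it for real.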

Click Modes

```javascript
// Default: text-based grounding
click({at: "sign in button"})

// Vision: for images/icons without text labels
click({at: "logo icon", mode: "vision"})

// Text + Screenshot: combines both methods
click({at: "Introduction in Simular Browser section", mode: "textAndScreenshot"})

// Spatial relation: disambiguate similar elements
click({at: "close button", spatialRelation: "containedIn", anchorConcept: "dialog"})
```

Spatial options: closest, furthest, above, below, left, right, contains, containedIn
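
Because the spatial options are plain strings, a typo such as "containedin" would be easy to miss. One possible guard, sketched below, checks the relation against the list above before building the argument object. The helper name `spatialClickArgs` is hypothetical and is not part of Simulang.

```javascript
// The spatial relations listed above.
const SPATIAL_RELATIONS = [
    "closest", "furthest", "above", "below",
    "left", "right", "contains", "containedIn",
];

// Hypothetical helper: builds the argument object for click() and rejects
// relation names that are not in the documented list.
function spatialClickArgs(at, spatialRelation, anchorConcept) {
    if (!SPATIAL_RELATIONS.includes(spatialRelation)) {
        throw new Error(
            'Unknown spatialRelation "' + spatialRelation +
            '"; expected one of: ' + SPATIAL_RELATIONS.join(", ")
        );
    }
    return { at, spatialRelation, anchorConcept };
}
```

In a script this could be used as `click(spatialClickArgs("close button", "containedIn", "dialog"))`.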

Common Pattern: Extract & Process

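```javascript
function main() {
    open({url: "https://example.com"});
    wait({waitTime: 3}); // give the page time to load

    // Read the page content, then let ask() extract structured data from it
    var content = pageContent();
    var result = ask({
        prompt: "Extract all items. Return as JSON array.",
        context: content
    });

    // JSON.parse throws if the reply is not valid JSON
    return JSON.parse(result);
}
```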
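
The pattern above passes the reply from ask() straight to JSON.parse, which throws if the reply is not clean JSON. A more defensive variant might look like the sketch below. It assumes, without confirmation from the Simulang documentation, that the reply is a string and may sometimes arrive wrapped in a Markdown code fence; `parseJsonArray` is a hypothetical helper name.

```javascript
// Three backticks, built at runtime so this example stays easy to embed.
const FENCE = "`".repeat(3);

// Hypothetical helper: turn an ask() reply into a JavaScript array, or throw
// a descriptive error if that is not possible.
function parseJsonArray(raw) {
    let text = String(raw).trim();

    // Assumption: the reply may be wrapped in a Markdown code fence.
    if (text.startsWith(FENCE)) {
        // Drop the opening fence line, including any language tag.
        const firstNewline = text.indexOf("\n");
        text = firstNewline === -1 ? "" : text.slice(firstNewline + 1);
    }
    if (text.endsWith(FENCE)) {
        text = text.slice(0, -FENCE.length);
    }
    text = text.trim();

    let parsed;
    try {
        parsed = JSON.parse(text);
    } catch (err) {
        throw new Error("Reply was not valid JSON: " + err.message);
    }
    if (!Array.isArray(parsed)) {
        throw new Error("Expected a JSON array but got " + typeof parsed);
    }
    return parsed;
}
```

With this helper, the last line of main() could become `return parseJsonArray(result);`, so a malformed reply produces a clear error instead of a bare SyntaxError.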

Background Browser Control

For web automation tasks that can run in parallel without GUI focus, use the background browser API:

```javascript
async function main() {
    // Open multiple pages in parallel
    const pages = await Promise.all([
        browser.newtab("https://news.google.com/"),
        browser.newtab("https://www.bbc.com/news")
    ]);

    // Wait for pages to load
    await Promise.all(pages.map(p => p.wait({ waitTime: 2 })));

    // Get content from all pages
    const contents = await Promise.all(pages.map(p => p.content()));

    // Summarize with LLM
    const summary = await pages[0].ask({
        prompt: "Summarize the key headlines",
        context: { text: contents.join("\n\n") }
    });

    console.log(summary);
}
```

Key Differences from Desktop Mode:

  • Use browser.newtab(url) instead of open({url})
  • All actions are async (await required)
  • Pages run independently in parallel
  • Use page.ask() with context: {text: ...} format
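
Since pages run independently, one practical consequence is that Promise.all rejects as soon as any single tab fails. A possible way to keep the pages that did load is to use the standard Promise.allSettled and sort the outcomes afterwards, as sketched below. The helper `splitSettled` is hypothetical, and the sketch assumes that browser.newtab(url) returns a promise that rejects when a page cannot be opened, which the source does not state.

```javascript
// Hypothetical helper: separate Promise.allSettled results into pages that
// opened and URLs that failed, keeping each result paired with its URL.
function splitSettled(urls, settled) {
    const loaded = [];
    const failed = [];
    settled.forEach((result, i) => {
        if (result.status === "fulfilled") {
            loaded.push({ url: urls[i], page: result.value });
        } else {
            failed.push({ url: urls[i], reason: String(result.reason) });
        }
    });
    return { loaded, failed };
}

// Intended use inside an async main() (not executed here):
//   const settled = await Promise.allSettled(urls.map(u => browser.newtab(u)));
//   const { loaded, failed } = splitSettled(urls, settled);
//   const contents = await Promise.all(loaded.map(item => item.page.content()));
```

The failed list can then be reported to the user instead of aborting the whole run.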

Best Practices

  1. Wait for loads: After navigation, call wait() with a waitTime of 2 to 3, e.g. wait({waitTime: 3})
  2. Be specific: Name both the element's role and its label in descriptions ("sign in button", not "button")
  3. Use ask() for extraction: Pair with pageContent() for structured data
  4. Handle errors: Use respond({message, requireConfirm: true}) to report a problem and ask the user to confirm before continuing
  5. Parallelize when possible: Open multiple browser tabs concurrently for faster results
  6. Minimize clicks: Use URLs with parameters to navigate directly when possible
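
As an illustration of the last point, a search results page can often be reached directly by putting the query in the URL rather than clicking through a form. The sketch below builds such a URL with the standard URL and URLSearchParams APIs. The helper name `buildUrl`, the example site, and its parameter names are hypothetical, and it is an assumption that these APIs are available in the Simular Pro runtime (they are in Node.js and browsers).

```javascript
// Hypothetical helper: append query parameters to a base URL, letting the
// URL API handle the encoding.
function buildUrl(base, params) {
    const url = new URL(base);
    for (const [key, value] of Object.entries(params)) {
        url.searchParams.set(key, String(value));
    }
    return url.toString();
}

// Example: jump straight to page 2 of a search instead of clicking through.
const target = buildUrl("https://example.com/search", {
    q: "desktop automation",
    page: 2,
});
// target is "https://example.com/search?q=desktop+automation&page=2"
```

In a script, the result would be passed to open({url: target}) or browser.newtab(target).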

Resources