AgentSkillsCN

workflow-recording

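Records and documents UI workflows as reproducible, step-by-step procedures. Use this skill when the user says "record workflow", "capture steps", "document process", "automate workflow", "show me how to reproduce", "save these steps", "create a runbook", "document this procedure", or needs to capture a multi-step UI interaction for later replay or documentation.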

SKILL.md
---
name: workflow-recording
description: >
  Records and documents UI workflows as reproducible, step-by-step procedures.
  Use this skill when the user says "record workflow", "capture steps",
  "document process", "automate workflow", "show me how to reproduce",
  "save these steps", "create a runbook", "document this procedure",
  or needs to capture a multi-step UI interaction for replay or documentation.
---

Workflow Recording

You are recording a UI workflow as a series of reproducible steps. Guide the user through the process and capture comprehensive identification data at each step.

1. Start the Recording Session

  • Take an initial screenshot with mcp__hijak__screenshot to document the starting state.
  • Use mcp__hijak__system_active_window and mcp__hijak__window_list to record which applications and windows are open.
  • Ask the user to describe the overall goal of the workflow (e.g., "Submit an expense report in the HR app").
  • Confirm the starting conditions (which app should be open, what state it should be in).

2. Capture Each Step

For every step in the workflow, capture three layers of identification (a to c), then the action itself (d) and the resulting state (e):

a. Visual capture:

  • Take a screenshot with mcp__hijak__screenshot showing the state before the action.
  • Note visual landmarks (what the screen looks like, where the target element is).

b. Accessibility data:

  • Use mcp__hijak__accessibility_tree to capture the full UI tree of the relevant application.
  • Use mcp__hijak__accessibility_find to locate the specific target element and record its role, title, value, and position in the hierarchy.
  • Record the accessibility path to the element (e.g., "Window > Toolbar > Button[title='Submit']").
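A recorded accessibility path is easier to match later if it is stored as structured segments rather than a raw string. The following is a minimal sketch that assumes paths look like the example above (roles separated by " > ", with at most one `[attr='value']` filter per segment). The function name `parse_a11y_path` and this path grammar are illustrative assumptions, not something the hijak tools define.

```python
import re

# Assumed segment shape: Role or Role[attr='value'] (hypothetical grammar).
_SEGMENT = re.compile(r"^(?P<role>\w+)(?:\[(?P<attr>\w+)='(?P<value>[^']*)'\])?$")


def parse_a11y_path(path):
    """Split a recorded path into (role, attributes) pairs.

    Limitation: a title that itself contains " > " would be split incorrectly.
    """
    segments = []
    for raw in path.split(" > "):
        match = _SEGMENT.match(raw.strip())
        if match is None:
            raise ValueError(f"unrecognized path segment: {raw!r}")
        attrs = {match["attr"]: match["value"]} if match["attr"] else {}
        segments.append((match["role"], attrs))
    return segments


parsed = parse_a11y_path("Window > Toolbar > Button[title='Submit']")
```

Storing the segments separately lets a replay compare role and title independently, which may tolerate small hierarchy changes better than an exact string match.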

c. Text/OCR data:

  • Use mcp__hijak__ocr_window to capture all visible text and bounding boxes.
  • Record the text content of and around the target element for text-based matching.

d. Action details:

  • Record the exact action performed: click (with coordinates), type (with text), key press, hotkey, scroll, drag, etc.
  • Record the element coordinates as a fallback identification method.

e. Post-action state:

  • Use mcp__hijak__wait_for_idle first so the UI has settled before you capture anything.
  • Then take a post-action screenshot to document the result.
  • Note any state transitions: new windows, dialogs, page changes, loading indicators.
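One way to keep the layers from a to e together is a single record per step. The sketch below is an assumed data model, not part of the skill: the class names (`A11yTarget`, `TextTarget`, `RecordedStep`) and field names are hypothetical, and the preference order in `strategies()` (accessibility, then text, then coordinates) is an assumption based only on coordinates being described as the fallback.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class A11yTarget:
    role: str
    title: str
    value: Optional[str]
    path: str  # e.g. "Window > Toolbar > Button[title='Submit']"


@dataclass
class TextTarget:
    text: str
    bbox: Tuple[int, int, int, int]  # assumed layout: x, y, width, height


@dataclass
class RecordedStep:
    description: str
    action: str  # "click", "type", "press", "scroll", ...
    a11y: Optional[A11yTarget] = None
    text: Optional[TextTarget] = None
    coords: Optional[Tuple[int, int]] = None
    before_screenshot: Optional[str] = None  # file path or ID of the capture
    after_screenshot: Optional[str] = None
    transitions: List[str] = field(default_factory=list)

    def strategies(self) -> List[str]:
        """List the identification strategies recorded for this step."""
        order = []
        if self.a11y is not None:
            order.append("a11y")
        if self.text is not None:
            order.append("text")
        if self.coords is not None:
            order.append("coords")  # fallback only
        return order


step = RecordedStep(
    description="Submit the report",
    action="click",
    a11y=A11yTarget("Button", "Submit", None, "Window > Toolbar > Button[title='Submit']"),
    coords=(120, 48),
)
```

A step whose `strategies()` list contains only `"coords"` is a good candidate for a second look before the recording is finalized.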

3. Capture Wait Conditions

Between steps, record what must be true before proceeding:

  • Element appearance: "Wait for the 'Confirmation' dialog to appear."
  • Element disappearance: "Wait for the loading spinner to disappear."
  • Text presence: "Wait for 'Upload complete' to appear on screen."
  • Window state: "Wait for the 'Settings' window to open."
  • Idle state: "Wait for the application to finish processing."

Use the corresponding wait tools (mcp__hijak__wait_for_element, mcp__hijak__wait_for_text, mcp__hijak__wait_for_window, mcp__hijak__wait_for_idle) to determine appropriate wait conditions.
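The wait conditions above can be recorded as small typed values so that each one names the tool a replay would need. This is a sketch under assumptions: the tool names come from the text, but the `WaitCondition` class, the `kind` labels, and the choice of `mcp__hijak__wait_for_element` for disappearance are hypothetical, since the text does not say which tool covers that case.

```python
from dataclasses import dataclass
from typing import Optional

# kind -> (tool name from the text, sentence template)
_WAITS = {
    "element_appears": ("mcp__hijak__wait_for_element", "Wait for {target} to appear."),
    # Assumption: disappearance is also handled by the element tool.
    "element_disappears": ("mcp__hijak__wait_for_element", "Wait for {target} to disappear."),
    "text_present": ("mcp__hijak__wait_for_text", "Wait for '{target}' to appear on screen."),
    "window_open": ("mcp__hijak__wait_for_window", "Wait for the '{target}' window to open."),
    "idle": ("mcp__hijak__wait_for_idle", "Wait for the application to finish processing."),
}


@dataclass(frozen=True)
class WaitCondition:
    kind: str
    target: Optional[str] = None

    def __post_init__(self):
        if self.kind not in _WAITS:
            raise ValueError(f"unknown wait kind: {self.kind!r}")
        if self.kind != "idle" and not self.target:
            raise ValueError(f"{self.kind} needs a target")

    @property
    def tool(self) -> str:
        return _WAITS[self.kind][0]

    def describe(self) -> str:
        return _WAITS[self.kind][1].format(target=self.target)


wait = WaitCondition("text_present", "Upload complete")
```

Here `wait.tool` names the wait tool to call and `wait.describe()` gives the sentence that goes into the "Wait after" line of the summary.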

4. Note Variability and Conditional Paths

As you record, identify:

  • Steps that might differ between runs (e.g., selecting a date, entering a variable amount).
  • Conditional branches (e.g., "If the confirmation dialog appears, click OK; otherwise proceed").
  • Dynamic content that changes (e.g., timestamps, counters, user-specific data).
  • Potential failure points (e.g., network-dependent steps, authentication prompts).

Mark these as parameters or decision points in the workflow.
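Marking a value as a parameter can be as simple as replacing it with a named placeholder in the recorded step. The sketch below assumes a `${name}` placeholder convention handled by Python's `string.Template`; the convention and the helper names `required_params` and `fill` are illustrative choices, not something the skill prescribes.

```python
import re
from string import Template

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def required_params(recorded_value):
    """Return placeholder names in order of first appearance."""
    names = []
    for name in _PLACEHOLDER.findall(recorded_value):
        if name not in names:
            names.append(name)
    return names


def fill(recorded_value, params):
    """Substitute parameters; raises KeyError if one is missing."""
    return Template(recorded_value).substitute(params)


# A recorded "type" action whose text varies between runs.
recorded = "Expense for ${date}: ${amount}"
names = required_params(recorded)
typed = fill(recorded, {"date": "2024-05-01", "amount": "42.50"})
```

Failing loudly on a missing parameter is deliberate here: a replay that silently types a literal `${amount}` into a form would be harder to notice than an error.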

5. Generate the Workflow Summary

After all steps are captured, produce a structured summary:

```markdown
## Workflow: [Name]

### Prerequisites
- [App] must be open at [starting state]
- [Any required data or credentials]

### Steps

#### Step 1: [Action Description]
- **Action**: [click/type/press/etc.] on [element description]
- **Target (a11y)**: [role, title, value, hierarchy path]
- **Target (text)**: "[visible text]" at [bounding box]
- **Target (coords)**: [x, y] (fallback)
- **Wait after**: [wait condition]
- **Expected result**: [what should happen]

#### Step 2: [Action Description]
...

### Notes
- [Variability, conditional paths, failure modes]
```
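If the steps were captured as structured data, the per-step section of this template can be generated instead of written by hand. The sketch below assumes each step is a plain dictionary with hypothetical keys (`description`, `action`, `element`, `a11y`, `text`, `bbox`, `coords`, `wait_after`, `expected`); only the output layout comes from the template above.

```python
def render_step(number, step):
    """Render one step in the summary template's Markdown layout."""
    x, y = step["coords"]
    left, top, width, height = step["bbox"]
    text = step["text"]
    return "\n".join([
        f"#### Step {number}: {step['description']}",
        f"- **Action**: {step['action']} on {step['element']}",
        f"- **Target (a11y)**: {step['a11y']}",
        f'- **Target (text)**: "{text}" at [{left}, {top}, {width}, {height}]',
        f"- **Target (coords)**: [{x}, {y}] (fallback)",
        f"- **Wait after**: {step['wait_after']}",
        f"- **Expected result**: {step['expected']}",
    ])


example = {
    "description": "Submit the report",
    "action": "click",
    "element": "the Submit button",
    "a11y": "Button, title 'Submit', Window > Toolbar > Button[title='Submit']",
    "text": "Submit",
    "bbox": (100, 40, 60, 24),
    "coords": (130, 52),
    "wait_after": "Wait for the 'Confirmation' dialog to appear.",
    "expected": "A confirmation dialog opens",
}
rendered = render_step(1, example)
```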

6. Validate the Recording

  • Review the full step list with the user.
  • Confirm that each step's identification strategies are accurate.
  • Ask the user to verify the wait conditions and expected results.
  • Note any steps that may need adjustment for different environments or screen sizes.
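Part of this review can be prepared before talking to the user by scanning the recorded steps for obvious gaps. The following is a minimal sketch that assumes steps are dictionaries with hypothetical keys `a11y`, `text`, `coords`, `wait_after`, and `expected`; it only flags candidates for discussion and does not replace the user's confirmation.

```python
def review_issues(steps):
    """Return human-readable warnings for steps that look incomplete."""
    issues = []
    for number, step in enumerate(steps, start=1):
        targets = [key for key in ("a11y", "text", "coords") if step.get(key)]
        if not targets:
            issues.append(f"Step {number}: no identification strategy recorded")
        elif targets == ["coords"]:
            issues.append(
                f"Step {number}: coordinates only, may break on other screen sizes"
            )
        if not step.get("wait_after"):
            issues.append(f"Step {number}: no wait condition recorded")
        if not step.get("expected"):
            issues.append(f"Step {number}: no expected result recorded")
    return issues


recorded_steps = [
    {"a11y": "Button[title='Submit']", "wait_after": "idle", "expected": "Dialog opens"},
    {"coords": (10, 20)},
]
found = review_issues(recorded_steps)
```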