Overview
Parse the current screen using OmniParser v2 to detect all visible UI elements. Returns structured data including element types, text content, interactivity levels, and bounding box coordinates.
This method analyzes the entire screen and returns every detected element. It's useful for:
- Understanding the full UI layout of a screen
- Finding all clickable or interactive elements
- Building custom element-based logic
- Debugging what elements TestDriver can detect
- Accessibility auditing
Syntax
javascript
const result = await testdriver.parse()
Parameters
None.
Returns
Promise<ParseResult> - Object containing detected UI elements
ParseResult
| Property | Type | Description |
|---|---|---|
| `elements` | `ParsedElement[]` | Array of detected UI elements |
| `annotatedImageUrl` | `string` | URL of the annotated screenshot with bounding boxes |
| `imageWidth` | `number` | Width of the analyzed screenshot |
| `imageHeight` | `number` | Height of the analyzed screenshot |
ParsedElement
| Property | Type | Description |
|---|---|---|
| `index` | `number` | Element index |
| `type` | `string` | Element type (e.g. "text", "icon", "button") |
| `content` | `string` | Text content or description of the element |
| `interactivity` | `string` | Interactivity level (e.g. "clickable", "non-interactive") |
| `bbox` | `object` | Bounding box in pixel coordinates `{x0, y0, x1, y1}` |
| `boundingBox` | `object` | Bounding box as `{left, top, width, height}` |
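To make the two tables concrete, here is a hand-written object with the shape they describe. Every value is invented for illustration and is not real parser output, the URL is a placeholder, and the JSDoc typedef only restates the `ParsedElement` table.

```javascript
/**
 * @typedef {Object} ParsedElement
 * @property {number} index
 * @property {string} type
 * @property {string} content
 * @property {string} interactivity
 * @property {{x0: number, y0: number, x1: number, y1: number}} bbox
 * @property {{left: number, top: number, width: number, height: number}} boundingBox
 */

// Illustrative values only (not captured from a real screen).
const exampleResult = {
  elements: [
    {
      index: 0,
      type: 'text',
      content: 'Email',
      interactivity: 'non-interactive',
      bbox: { x0: 100, y0: 200, x1: 160, y1: 220 },
      boundingBox: { left: 100, top: 200, width: 60, height: 20 },
    },
    {
      index: 1,
      type: 'button',
      content: 'Submit',
      interactivity: 'clickable',
      bbox: { x0: 100, y0: 300, x1: 220, y1: 340 },
      boundingBox: { left: 100, top: 300, width: 120, height: 40 },
    },
  ],
  annotatedImageUrl: 'https://example.com/annotated.png', // placeholder URL
  imageWidth: 1280,
  imageHeight: 720,
};

console.log(exampleResult.elements.length); // 2
```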
Examples
Get All Elements on Screen
javascript
const result = await testdriver.parse();
console.log(`Found ${result.elements.length} elements`);
result.elements.forEach((el, i) => {
  console.log(`${i + 1}. [${el.type}] "${el.content}" (${el.interactivity})`);
});
Find Clickable Elements
javascript
const result = await testdriver.parse();
const clickable = result.elements.filter(e => e.interactivity === 'clickable');
console.log(`Found ${clickable.length} clickable elements`);
clickable.forEach(el => {
  console.log(`- "${el.content}" at (${el.bbox.x0}, ${el.bbox.y0})`);
});
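The same filter can support a rough accessibility check. The sketch below flags clickable elements that have no text or description. `findUnlabeledClickables` is a hypothetical helper, not part of TestDriver, and whether the parser ever returns blank `content` is an assumption, so treat the result as a hint rather than an audit verdict.

```javascript
// Return clickable elements whose content is missing or blank.
function findUnlabeledClickables(elements) {
  return elements.filter(
    (el) =>
      el.interactivity === 'clickable' &&
      (typeof el.content !== 'string' || el.content.trim() === '')
  );
}

// Hand-written sample data for illustration.
const sampleElements = [
  { index: 0, type: 'button', content: 'Submit', interactivity: 'clickable' },
  { index: 1, type: 'icon', content: '', interactivity: 'clickable' },
  { index: 2, type: 'text', content: 'Welcome', interactivity: 'non-interactive' },
];

const unlabeled = findUnlabeledClickables(sampleElements);
console.log(unlabeled.map((el) => el.index)); // [ 1 ]
```

In a test, the same function would take `result.elements` from `testdriver.parse()`.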
Find and Click an Element by Content
javascript
const result = await testdriver.parse();
// Find a "Submit" button
const submitBtn = result.elements.find(e =>
  e.content.toLowerCase().includes('submit') && e.interactivity === 'clickable'
);
if (submitBtn) {
  // Calculate center of the bounding box
  const x = Math.round((submitBtn.bbox.x0 + submitBtn.bbox.x1) / 2);
  const y = Math.round((submitBtn.bbox.y0 + submitBtn.bbox.y1) / 2);
  await testdriver.click({ x, y });
}
}
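If this lookup is needed in several tests, it could be factored into two small helpers. This is a sketch: `centerOf` and `findClickableByContent` are hypothetical names, and the lookup treats a missing `content` value as an empty string, which is a defensive assumption rather than documented parser behavior.

```javascript
// Center point of a {x0, y0, x1, y1} bounding box, rounded to whole pixels.
function centerOf(bbox) {
  return {
    x: Math.round((bbox.x0 + bbox.x1) / 2),
    y: Math.round((bbox.y0 + bbox.y1) / 2),
  };
}

// First clickable element whose content contains `needle` (case-insensitive).
// A missing content value is treated as an empty string.
function findClickableByContent(elements, needle) {
  const wanted = needle.toLowerCase();
  return elements.find(
    (el) =>
      el.interactivity === 'clickable' &&
      String(el.content ?? '').toLowerCase().includes(wanted)
  );
}

// Usage inside a test (requires a TestDriver instance):
//   const result = await testdriver.parse();
//   const target = findClickableByContent(result.elements, 'submit');
//   if (target) await testdriver.click(centerOf(target.bbox));
```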
Filter by Element Type
javascript
const result = await testdriver.parse();
// Get all text elements
const textElements = result.elements.filter(e => e.type === 'text');
textElements.forEach(e => console.log(`Text: "${e.content}"`));
// Get all icons
const icons = result.elements.filter(e => e.type === 'icon');
console.log(`Found ${icons.length} icons`);
// Get all buttons
const buttons = result.elements.filter(e => e.type === 'button');
console.log(`Found ${buttons.length} buttons`);
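When several types are of interest at once, a single pass that groups elements by `type` avoids repeating the filter. The sketch below uses a hypothetical `groupByType` helper and hand-written sample data; the type names are the ones listed in the `ParsedElement` table.

```javascript
// Group elements into a Map keyed by their `type` field.
function groupByType(elements) {
  const groups = new Map();
  for (const el of elements) {
    if (!groups.has(el.type)) {
      groups.set(el.type, []);
    }
    groups.get(el.type).push(el);
  }
  return groups;
}

// Hand-written sample data for illustration.
const sample = [
  { type: 'text', content: 'Email' },
  { type: 'icon', content: 'Search' },
  { type: 'button', content: 'Submit' },
  { type: 'text', content: 'Password' },
];

const groups = groupByType(sample);
const counts = Object.fromEntries(
  [...groups].map(([type, els]) => [type, els.length])
);
console.log(counts); // { text: 2, icon: 1, button: 1 }
```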
Build Custom Assertions
javascript
import { describe, expect, it } from "vitest";
import { TestDriver } from "testdriverai/lib/vitest/hooks.mjs";
describe("Login Page", () => {
  it("should have expected form elements", async (context) => {
    const testdriver = TestDriver(context);
    await testdriver.provision.chrome({
      url: 'https://myapp.com/login',
    });
    const result = await testdriver.parse();
    // Assert expected elements exist
    const textContent = result.elements.map(e => e.content.toLowerCase());
    expect(textContent).toContain('email');
    expect(textContent).toContain('password');
    // Assert there are clickable elements
    const clickable = result.elements.filter(e => e.interactivity === 'clickable');
    expect(clickable.length).toBeGreaterThan(0);
  });
});
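Note that `toContain` on an array matches whole items, so `'email'` would not match an element whose content is `'Email address'`. If partial matches are what you want, a substring check is one option. `hasContent` below is a hypothetical helper, sketched on the assumption that `content` may occasionally be missing.

```javascript
// True when any element's content contains `needle`, ignoring case.
// A missing content value is treated as an empty string.
function hasContent(elements, needle) {
  const wanted = needle.toLowerCase();
  return elements.some((el) =>
    String(el.content ?? '').toLowerCase().includes(wanted)
  );
}

// Hand-written sample data for illustration.
const loginElements = [
  { content: 'Email address' },
  { content: 'Password' },
  { content: 'Sign in' },
];

console.log(hasContent(loginElements, 'email')); // true
console.log(hasContent(loginElements, 'username')); // false
```

Inside the test above, this would read `expect(hasContent(result.elements, 'email')).toBe(true);`.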
Use Bounding Box Coordinates
javascript
const result = await testdriver.parse();
result.elements.forEach(el => {
  // Pixel coordinates
  console.log(`Element "${el.content}":`);
  console.log(`  bbox: (${el.bbox.x0}, ${el.bbox.y0}) to (${el.bbox.x1}, ${el.bbox.y1})`);
  console.log(`  size: ${el.boundingBox.width}x${el.boundingBox.height}`);
  console.log(`  position: left=${el.boundingBox.left}, top=${el.boundingBox.top}`);
});
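The two box formats appear to describe the same rectangle. Assuming that `left`/`top` correspond to `x0`/`y0` and that `width`/`height` are the differences between the corners (an assumption, since this page does not state the relationship), the conversion would look like the following sketch. Both function names are hypothetical.

```javascript
// Corner format {x0, y0, x1, y1} -> {left, top, width, height}.
// Assumes both formats use the same pixel coordinate space.
function toBoundingBox({ x0, y0, x1, y1 }) {
  return { left: x0, top: y0, width: x1 - x0, height: y1 - y0 };
}

// {left, top, width, height} -> corner format {x0, y0, x1, y1}.
function toBbox({ left, top, width, height }) {
  return { x0: left, y0: top, x1: left + width, y1: top + height };
}

const box = toBoundingBox({ x0: 100, y0: 300, x1: 220, y1: 340 });
console.log(box); // { left: 100, top: 300, width: 120, height: 40 }
```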
View Annotated Screenshot
javascript
const result = await testdriver.parse();
// The annotated image shows all detected elements with bounding boxes
console.log('Annotated screenshot:', result.annotatedImageUrl);
console.log(`Image dimensions: ${result.imageWidth}x${result.imageHeight}`);
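`imageWidth` and `imageHeight` also make it possible to express a bounding box as a fraction of the screenshot, which can help when comparing layouts captured at different resolutions. This is a sketch with a hypothetical helper name and invented sample numbers.

```javascript
// Convert a pixel bbox into fractions (0..1) of the analyzed screenshot size.
function toRelative(bbox, imageWidth, imageHeight) {
  return {
    x0: bbox.x0 / imageWidth,
    y0: bbox.y0 / imageHeight,
    x1: bbox.x1 / imageWidth,
    y1: bbox.y1 / imageHeight,
  };
}

const rel = toRelative({ x0: 320, y0: 180, x1: 640, y1: 360 }, 1280, 720);
console.log(rel); // { x0: 0.25, y0: 0.25, x1: 0.5, y1: 0.5 }
```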
How It Works
- TestDriver captures a screenshot of the current screen
- The image is sent to the TestDriver API
- OmniParser v2 analyzes the image to detect all UI elements
- Each element is classified by type (text, icon, button, etc.) and interactivity
- Bounding boxes are returned in pixel coordinates that match the screen resolution
Best Practices
Use find() for targeting specific elements
To locate and interact with a specific element, prefer `find()`, which uses AI vision. Use `parse()` when you need a complete inventory of all elements on screen.
```javascript
// Prefer this for clicking a specific element
await testdriver.find("Submit button").click();

// Use parse() for full UI analysis
const result = await testdriver.parse();
const allButtons = result.elements.filter(e => e.type === 'button');
```
Use the `interactivity` field to separate clickable elements from non-interactive ones:
```javascript
const result = await testdriver.parse();
const interactive = result.elements.filter(e => e.interactivity === 'clickable');
const static_ = result.elements.filter(e => e.interactivity === 'non-interactive');
```
Give the page time to stabilize before parsing:
```javascript
// Wait for page to stabilize
await testdriver.wait(2000);

// Then parse
const result = await testdriver.parse();
```
Open the annotated screenshot to see what was detected:
```javascript
const result = await testdriver.parse();
console.log('View annotated screenshot:', result.annotatedImageUrl);
```
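A fixed wait before parsing can be too short on slow pages and wasteful on fast ones. One possible alternative, sketched here under stated assumptions, is to parse repeatedly until two consecutive results report the same number of elements. `parseWhenStable` is a hypothetical helper, not a TestDriver API, and an unchanged element count is only a rough heuristic for stability.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `parse` until two consecutive results report the same element count,
// or until `maxAttempts` calls have been made. Returns the last result.
async function parseWhenStable(parse, { maxAttempts = 5, delayMs = 500 } = {}) {
  let previous = await parse();
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    await sleep(delayMs);
    const current = await parse();
    if (current.elements.length === previous.elements.length) {
      return current;
    }
    previous = current;
  }
  return previous;
}

// Usage inside a test (requires a TestDriver instance):
//   const result = await parseWhenStable(() => testdriver.parse());
```

Passing `parse` in as a function keeps the helper independent of the SDK, so it can be exercised with a stub.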
Related
- find() - AI-powered element location
- assert() - Make AI-powered assertions about screen state
- screenshot() - Capture screenshots
- Elements Reference - Complete Element API