Evolve SDK

Run terminal-based AI agents in secure sandboxes with built-in observability.

Repo: https://github.com/evolving-machines-lab/evolve — cookbooks in cookbooks/, skills in skills/

SDK Choice

Language	Package	Syntax Reference
TypeScript	`@evolvingmachines/sdk`	references/typescript.md
Python	`evolve`	references/python.md

Both SDKs are functionally identical. Choose based on application language.

Quick Start

bash

# TypeScript
npm install @evolvingmachines/sdk

# Python
pip install evolve-sdk

Core Concepts

1. Evolve (Single Agent)

For single-agent tasks with multi-turn conversations.

code

Evolve → run() → getOutputFiles()

Key capabilities:

•Agent configuration (type, model, reasoning effort)
•Skills (pdf, docx, pptx, etc.)
•Composio integrations (GitHub, Gmail, Slack, etc.)
•MCP servers for custom tools
•Structured output via schema
•Context/files upload, session management

2. Swarm (Parallel Agents)

For parallel processing with functional abstractions.

Operation	Type	Description
`map`	transform	Process items in parallel → outputs
`filter`	gate	Evaluate items, condition decides pass/fail
`reduce`	synthesize	Many items → single output
`best_of`	select	N candidates → judge picks winner

When to use:

•map: Batch processing (analyze 100 docs)
•filter: Quality gates (keep only critical items)
•reduce: Synthesis (combine into report)
•best_of: Quality (run 3 agents, pick best)

3. Pipeline (Chained Operations)

Preferred API for most workflows. Fluent, readable, reusable.

code

Pipeline → .map() → .filter() → .reduce() → .run()

Reusable across different data batches.

Pipeline vs standalone Swarm calls:

•Use Pipeline for chains (map → filter → reduce) - cleaner API
•Use Swarm.bestOf() only for standalone best-of-N on single item (not available as pipeline step)

Pipeline restrictions:

•best_of only available as option within .map({ bestOf }), not as standalone step
•.reduce() is terminal - no steps after

Authentication

API keys auto-resolve from environment variables.

Mode	Setup	Included
Gateway	`EVOLVE_API_KEY`	E2B, browser-use, tracing
BYOK	Provider key + `E2B_API_KEY`	Direct provider access
Claude Max	`CLAUDE_CODE_OAUTH_TOKEN` + `E2B_API_KEY`	Use existing subscription
Codex	`CODEX_OAUTH_FILE_PATH` + `E2B_API_KEY`	Use existing subscription
Gemini	`GEMINI_OAUTH_FILE_PATH` + `E2B_API_KEY`	Use existing subscription

Note: In Gateway mode, EVOLVE_API_KEY is automatically injected into this sandbox by the parent Evolve process. The SDK picks it up from the environment—no manual configuration needed.

Gateway Mode

bash

# .env
EVOLVE_API_KEY=sk-...

code

Evolve().withAgent({ type: "claude", apiKey: EVOLVE_API_KEY })

Included: E2B sandbox auto-provisioned, browser-use integrated, tracing at dashboard.evolvingmachines.ai

BYOK Mode

bash

# .env
ANTHROPIC_API_KEY=sk-...   # or OPENAI_API_KEY, GEMINI_API_KEY
E2B_API_KEY=e2b_...

code

sandbox = E2BProvider({ apiKey: E2B_API_KEY })
Evolve().withAgent({ type: "claude", providerApiKey: ANTHROPIC_API_KEY }).withSandbox(sandbox)

Claude Max (OAuth)

bash

claude --setup-token  # Run in terminal → receive token

bash

# .env
CLAUDE_CODE_OAUTH_TOKEN=sk-ant-...
E2B_API_KEY=e2b_...

code

Evolve().withAgent({ type: "claude" }).withSandbox(sandbox)  # Auto-picks from env

Codex (OAuth)

bash

codex auth --provider openai  # Creates ~/.codex/auth.json

bash

# .env
CODEX_OAUTH_FILE_PATH=~/.codex/auth.json
E2B_API_KEY=e2b_...

code

Evolve().withAgent({ type: "codex" }).withSandbox(sandbox)  # Auto-picks from env

Gemini (OAuth)

bash

gemini auth login  # Creates ~/.gemini/oauth_creds.json

bash

# .env
GEMINI_OAUTH_FILE_PATH=~/.gemini/oauth_creds.json
E2B_API_KEY=e2b_...

code

Evolve().withAgent({ type: "gemini" }).withSandbox(sandbox)  # Auto-picks from env

Full docs: See references/typescript.md or references/python.md

BYOK Mode: Set provider env vars (see Agent Types table) + E2B_API_KEY.

Agent Types

Type	Models	Default	Env Var
`"claude"`	`"opus"` `"sonnet"` `"haiku"`	`"opus"`	`ANTHROPIC_API_KEY` or `CLAUDE_CODE_OAUTH_TOKEN`
`"codex"`	`"gpt-5.2"` `"gpt-5.2-codex"`	`"gpt-5.2"`	`OPENAI_API_KEY` or `CODEX_OAUTH_FILE_PATH`
`"gemini"`	`"gemini-3-pro-preview"` `"gemini-3-flash-preview"`	`"gemini-3-flash-preview"`	`GEMINI_API_KEY` or `GEMINI_OAUTH_FILE_PATH`
`"qwen"`	`"qwen3-coder-plus"`	`"qwen3-coder-plus"`	`OPENAI_API_KEY`

Workspace Structure

Agents run in sandboxes with this filesystem:

code

/home/user/workspace/
├── context/     # Input files (read-only)
├── scripts/     # Agent code
├── temp/        # Scratch space
└── output/      # Final deliverables

Structured Output

Provide a schema (Zod/JSON Schema for TS, Pydantic/dict for Python) and the agent writes output/result.json.

code

withSchema(schema) → run() → getOutputFiles() → { files, data, error }

Key Patterns

Skills + Composio + MCP hierarchy

code

.withSkills([...])           # Agent capabilities (browser-use included by default with Gateway)
.withComposio(userId, {...}) # 1000+ integrations
.withMcpServers({...})       # Custom tools (advanced)

Note: With EVOLVE_API_KEY, browser-use is already integrated. Additional browser skills (dev-browser, agent-browser) are optional.

Streaming Events

Event	Description
`content`	Parsed events (recommended)
`stdout`	Raw JSONL output
`stderr`	Error output

Content events: agent_message_chunk, agent_thought_chunk, tool_call, tool_call_update, plan

Session Management

code

run() → run() → run()     # Same session, maintains history
getSession() → save       # Persist session ID
setSession(id) → run()    # Reconnect later
pause() / resume()        # Suspend/reactivate billing
kill()                    # Destroy sandbox

Swarm Operations Detail

Input: FileMap

code

FileMap = { "path": content }  # path in context/, content is string or bytes

Items can be: single file, multiple files, or entire folders per worker.

Result Types

SwarmResult (from map, filter, best_of):

•status: 'success' | 'filtered' | 'error'
•data: Parsed schema or null
•files: Output files
•meta: Operation metadata

SwarmResultList (from map, filter):

•.success / .filtered / .error accessors

ReduceResult (from reduce):

•Single result with data, files, meta

Chaining

When chaining, result.json from previous step → data.json in next step's context.

Quality Options

verify: LLM-as-judge verifies output, retries with feedback

code

verify: { criteria: "...", maxAttempts: 3 }

bestOf: Run N candidates, judge picks best

code

bestOf: { n: 3, judgeCriteria: "..." }

retry: Auto-retry on error with backoff

code

retry: { maxAttempts: 3, backoffMs: 1000 }

Language-Specific Syntax

For complete syntax and examples:

•TypeScript: references/typescript.md
•Python: references/python.md