shipyard-neo

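Usage guide for the Shipyard Neo sandbox MCP tools. Use this skill when the user needs to execute code in a sandbox, manage files in a sandbox workspace, automate browser operations in a sandbox, manage execution history, or manage the skill lifecycle (candidates, evaluations, releases, rollbacks) through the Shipyard Neo MCP server. Triggers include "run code in a sandbox", "create a sandbox", "execute Python/Shell scripts in a sandbox", "automate browser operations in a sandbox", "manage skills", "view execution history", or any task involving the Shipyard Neo MCP tools.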

SKILL.md
--- frontmatter
name: shipyard-neo
description: "Shipyard Neo sandbox MCP tools usage guide. This skill should be used when the user needs to execute code in sandboxes, manage files in sandbox workspaces, automate browsers in sandboxes, manage execution history, or work with skill lifecycle (candidates, evaluations, releases, rollbacks) through the shipyard-neo MCP server. Triggers include requests to 'run code in a sandbox', 'create a sandbox', 'execute Python/Shell in sandbox', 'automate browser in sandbox', 'manage skills', 'check execution history', or any task involving the shipyard-neo MCP tools."

Shipyard Neo Sandbox MCP Tools

Shipyard Neo provides isolated sandbox environments for executing Python, Shell, browser automation, and file operations through MCP tools. All MCP tools are prefixed with mcp--shipyard___neo--.
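As a small illustration of the naming rule above, a client-side helper could map the short tool names used in this guide to their prefixed form. The helper name `qualified_tool_name` is hypothetical; only the prefix string is taken from this guide.

```python
# Prefix quoted from this guide; the helper itself is illustrative only.
TOOL_PREFIX = "mcp--shipyard___neo--"


def qualified_tool_name(short_name: str) -> str:
    """Return the prefixed tool name for a short name such as 'execute_python'."""
    if short_name.startswith(TOOL_PREFIX):
        return short_name  # already qualified, leave unchanged
    return TOOL_PREFIX + short_name


print(qualified_tool_name("create_sandbox"))
```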

Architecture Overview

A sandbox's container topology depends on the profile used to create it. Call list_profiles to discover available profiles and their container layout.

Single-Container Profile (e.g., python-default)

Only a Ship container — supports Python, Shell, and Filesystem operations. No browser capability.

Multi-Container Profile (e.g., browser-python)

code
┌──────────────────────────────────────────────────────────────┐
│                        Sandbox                               │
│  ┌──────────────────┐      ┌──────────────────┐              │
│  │  Ship Container   │      │  Gull Container   │             │
│  │  (code execution) │      │ (browser automat.)│             │
│  │  Python, Shell,   │      │  agent-browser    │             │
│  │  Filesystem       │      │  Chromium headless│             │
│  └────────┬──────────┘      └────────┬──────────┘             │
│           └──────────┬───────────────┘                        │
│           ┌──────────┴──────────┐                             │
│           │   Cargo Volume      │                             │
│           │   /workspace        │                             │
│           └─────────────────────┘                             │
└──────────────────────────────────────────────────────────────┘

Container Isolation Rules

| Container | Responsibility | MCP Tools | Cannot Do |
| --- | --- | --- | --- |
| Ship | Python / Shell / Filesystem | execute_python, execute_shell, read_file, write_file, upload_file, download_file, list_files, delete_file | No agent-browser installed; cannot run browser commands |
| Gull | Browser automation | execute_browser, execute_browser_batch | No Python/Shell; cannot execute code |

Critical rules:

  • Never run agent-browser commands in execute_shell: the Ship container does not have agent-browser installed.
  • Never prefix execute_browser commands with agent-browser: Gull injects it automatically, and a duplicated prefix fails as agent-browser agent-browser ...
  • Both containers share the Cargo Volume at /workspace
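The two prefix rules above can be enforced on the agent side before a tool call is made. The following is a minimal sketch, not part of Shipyard Neo: the function names `strip_browser_prefix` and `shell_command_uses_browser` are hypothetical, and the shell check is a deliberately conservative token match.

```python
def strip_browser_prefix(cmd: str) -> str:
    """Return cmd without a leading 'agent-browser' token, since Gull adds it itself."""
    stripped = cmd.strip()
    if not stripped:
        raise ValueError("empty browser command")
    parts = stripped.split(maxsplit=1)
    if parts[0] != "agent-browser":
        return stripped
    if len(parts) == 1:
        raise ValueError("command contains only the agent-browser prefix")
    return parts[1]


def shell_command_uses_browser(command: str) -> bool:
    """Conservative check: flag any execute_shell command that mentions agent-browser."""
    return "agent-browser" in command.split()


print(strip_browser_prefix("agent-browser open https://example.com"))
print(shell_command_uses_browser("agent-browser snapshot -i"))
```

The prefix is removed with a plain string split rather than by re-quoting the arguments, so the rest of the command reaches Gull byte for byte as the caller wrote it.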

Cross-Container Data Sharing

Both containers exchange files through the shared /workspace volume:

code
# Browser screenshot → Python processing
execute_browser(cmd="screenshot /workspace/page.png")  → Gull writes file
read_file(path="page.png")                             → Ship reads file
execute_python(code="from PIL import Image; img = Image.open('page.png')")

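# Python generates data → Browser uses it
# (assumes `data` was defined by an earlier execute_python call and that report.html loads data.json)
execute_python(code="import json, pathlib; pathlib.Path('data.json').write_text(json.dumps(data))")
execute_browser(cmd="open file:///workspace/report.html")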
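The examples above mix two ways of naming the same file: a path relative to /workspace (as passed to read_file) and an absolute file:// URL (as passed to the browser). A hedged sketch of that mapping follows; the helper names are hypothetical, and rejecting absolute or parent-relative paths is an assumption rather than documented tool behavior.

```python
from pathlib import PurePosixPath
from urllib.parse import quote

WORKSPACE = PurePosixPath("/workspace")  # shared Cargo Volume mount point


def workspace_path(relative: str) -> PurePosixPath:
    """Map a sandbox-relative path (as passed to read_file) to its absolute path."""
    rel = PurePosixPath(relative)
    # Assumption: only relative paths that stay inside /workspace are meaningful.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"expected a path relative to /workspace, got {relative!r}")
    return WORKSPACE / rel


def workspace_file_url(relative: str) -> str:
    """Build the file:// URL the browser container would use for the same file."""
    return "file://" + quote(str(workspace_path(relative)))


print(workspace_path("page.png"))         # /workspace/page.png
print(workspace_file_url("report.html"))  # file:///workspace/report.html
```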

Ship Container Pre-installed Environment

The Ship container is based on python:3.13-slim-bookworm and ships with a broad set of pre-installed tools. See references/sandbox-environment.md for details.

Language Runtimes

| Runtime | Details |
| --- | --- |
| Python 3.13 | Executed via IPython kernel; variables persist across calls within the same sandbox |
| Node.js LTS | Includes npm, pnpm, vercel |

Pre-installed Python Libraries

| Category | Libraries |
| --- | --- |
| Data Science | numpy, pandas, scikit-learn, matplotlib, seaborn |
| Image Processing | Pillow, opencv-python-headless, imageio |
| Document Processing | python-docx, python-pptx, openpyxl, xlrd, pypdf, pdfplumber, reportlab |
| Web/XML | beautifulsoup4, lxml, jinja2 |
| Utilities | tomli, pydantic |

System Tools

git, curl, vim-tiny, nano, less, htop, procps, sudo

Core Workflows

1. Sandbox Lifecycle

code
list_profiles → create_sandbox → [operations] → delete_sandbox
  1. Call list_profiles to discover available profiles (e.g., python-default for Ship only, browser-python for Ship + Gull)
  2. Call create_sandbox to create a sandbox and obtain sandbox_id
  3. Use sandbox_id for all subsequent operations
  4. Call delete_sandbox when finished to release resources
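The four steps above map naturally onto a context manager, which guarantees that delete_sandbox runs even when an operation fails. This is a sketch under stated assumptions: `FakeClient` is a stand-in that only records calls, and the `profile` argument name and the response shapes are assumptions, not the documented tool schema (see references/tools-reference.md for the real parameters).

```python
from contextlib import contextmanager


class FakeClient:
    """Stand-in that records tool calls; a real agent would invoke the MCP tools."""

    def __init__(self):
        self.calls = []

    def call(self, tool, **args):
        self.calls.append(tool)
        if tool == "list_profiles":
            return ["python-default", "browser-python"]
        if tool == "create_sandbox":
            return {"sandbox_id": "demo-sandbox"}  # assumed response shape
        return {}


@contextmanager
def sandbox(client, profile):
    """list_profiles, create_sandbox, yield the id, then always delete_sandbox."""
    if profile not in client.call("list_profiles"):
        raise ValueError(f"unknown profile: {profile}")
    sandbox_id = client.call("create_sandbox", profile=profile)["sandbox_id"]
    try:
        yield sandbox_id
    finally:
        client.call("delete_sandbox", sandbox_id=sandbox_id)


client = FakeClient()
with sandbox(client, "python-default") as sid:
    client.call("execute_python", sandbox_id=sid, code="print(1 + 1)")
print(client.calls)
```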

2. Code Execution

Python (via IPython — variables persist across calls):

python
# First call
execute_python(sandbox_id="xxx", code="import pandas as pd; df = pd.read_csv('data.csv')")

# Subsequent call can use df directly — variables persist within the same sandbox
execute_python(sandbox_id="xxx", code="print(df.describe())")

Shell:

python
execute_shell(sandbox_id="xxx", command="ls -la", cwd="src")
execute_shell(sandbox_id="xxx", command="npm install && npm run build")
execute_shell(sandbox_id="xxx", command="git init && git add .")

3. File Operations

All sandbox paths are relative to /workspace:

python
# Text file operations (content passed through MCP protocol)
write_file(sandbox_id="xxx", path="src/main.py", content="print('hello')")
read_file(sandbox_id="xxx", path="src/main.py")
list_files(sandbox_id="xxx", path="src")
delete_file(sandbox_id="xxx", path="src/temp.py")

# Binary/local file transfer (reads/writes actual files on the MCP server's filesystem)
upload_file(sandbox_id="xxx", local_path="/path/to/data.csv", sandbox_path="data/input.csv")
download_file(sandbox_id="xxx", sandbox_path="output/result.png", local_path="./result.png")

When to use upload_file/download_file vs write_file/read_file:

| Scenario | Use |
| --- | --- |
| Writing text/code content inline | write_file |
| Reading text file content for analysis | read_file |
| Uploading existing local files (images, datasets, archives) | upload_file |
| Downloading binary outputs (images, PDFs, compiled artifacts) | download_file |
| Large files (>5MB text or any binary) | upload_file/download_file |
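The selection rules in the table can be written down as a small decision function. The sketch below is illustrative only: `choose_send_tool` is a hypothetical helper, it treats "MB" as 1024 × 1024 bytes (an assumption), and it uses the 5MB and 50MB limits listed under Key Constraints.

```python
WRITE_FILE_LIMIT = 5 * 1024 * 1024   # write_file limit from this guide (MB size assumed)
TRANSFER_LIMIT = 50 * 1024 * 1024    # upload/download limit from this guide


def is_utf8_text(payload: bytes) -> bool:
    """True if the payload decodes cleanly as UTF-8."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def choose_send_tool(payload: bytes) -> str:
    """Pick write_file for small text, upload_file for binary or large content."""
    if len(payload) > TRANSFER_LIMIT:
        raise ValueError("payload exceeds the 50MB per-file transfer limit")
    if is_utf8_text(payload) and len(payload) <= WRITE_FILE_LIMIT:
        return "write_file"
    return "upload_file"


print(choose_send_tool(b"print('hello')"))
print(choose_send_tool(b"\x89PNG\r\n\x1a\n\xff"))
```

Note that upload_file reads from the MCP server's filesystem, so content routed to it must exist as a file there first.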

4. Browser Automation

Browser commands execute in the Gull container. Do NOT add the agent-browser prefix.

Operational model (do this to avoid flaky runs):

  • Treat cmd as a single command line split into args and executed directly (not a shell script).
  • Orchestrate branching/loops in the agent logic, not inside cmd.
  • Always follow the cadence: Navigate → Snapshot → Interact → Wait → Re-snapshot.

Standard workflow:

  1. execute_browser(cmd="open https://example.com") — Navigate
  2. execute_browser(cmd="wait --load networkidle") — Stabilize (optional but recommended)
  3. execute_browser(cmd="snapshot -i") — Get interactive element refs (@e1, @e2, ...)
  4. Analyze snapshot output to determine next action
  5. execute_browser(cmd="click ..." | "fill ..." | "select ...") — Interact using refs
  6. If the DOM may have changed, run execute_browser(cmd="snapshot -i") again before further interactions
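Step 6 exists because element refs go stale when the page changes. One way an agent might guard against that is to check, before each interaction, that every ref in the command appears in the most recent snapshot. The sketch below assumes only that refs look like @e1, @e2, ...; the sample snapshot text is invented for illustration and does not reflect the real snapshot format.

```python
import re

REF_PATTERN = re.compile(r"@e\d+\b")


def refs_in(text: str) -> set[str]:
    """Collect every @eN-style ref mentioned in a piece of text."""
    return set(REF_PATTERN.findall(text))


def stale_refs(cmd: str, latest_snapshot: str) -> set[str]:
    """Refs used by cmd that do not appear in the latest snapshot output."""
    return refs_in(cmd) - refs_in(latest_snapshot)


# Made-up snapshot text; the real output format is not specified here.
snapshot = 'textbox "Email" @e1\nbutton "Sign in" @e2'

print(stale_refs("click @e2", snapshot))  # set()
print(stale_refs("click @e7", snapshot))  # {'@e7'}
```

A non-empty result is a signal to run snapshot -i again before interacting.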

When to use single vs batch:

| Scenario | Recommended |
| --- | --- |
| Need intermediate reasoning (snapshot → analyze → decide) | Multiple single execute_browser calls |
| Deterministic sequence (open → fill → click → wait) | execute_browser_batch |
| Complex conditional flows (login, error recovery) | Agent orchestrates multiple single calls |

See references/browser.md for the detailed command reference, patterns, and artifact handling.

5. Execution History

Track and retrieve past executions for debugging, auditing, or skill creation:

  • get_execution_history — Query with filters (exec_type, success_only, tags, limit)
  • get_execution — Get full details of one execution by ID
  • get_last_execution — Get the most recent execution
  • annotate_execution — Add/update description, tags, notes
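As an illustration of how the get_execution_history filters might be assembled, the sketch below builds an argument dictionary that includes only the filters actually set. The filter names come from the list above; their types, the example value "python" for exec_type, and the helper `history_filters` are assumptions, so check references/tools-reference.md before relying on them.

```python
def history_filters(exec_type=None, success_only=False, tags=None, limit=None):
    """Collect only the filters that were actually set."""
    args = {}
    if exec_type is not None:
        args["exec_type"] = exec_type
    if success_only:
        args["success_only"] = True
    if tags:
        args["tags"] = list(tags)  # assumption: the tool accepts a list of tags
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        args["limit"] = limit
    return args


print(history_filters(exec_type="python", success_only=True, limit=5))
```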

6. Skill Self-Update Lifecycle

Turn proven execution patterns into reusable, versioned skills:

  1. Execute tasks → collect execution_ids
  2. annotate_execution — Tag and describe executions
  3. create_skill_candidate — Bundle execution IDs into a candidate
  4. evaluate_skill_candidate — Record evaluation results (pass/fail, score, report)
  5. promote_skill_candidate — Release as canary or stable
  6. rollback_skill_release — Revert to previous version if needed
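The ordering of the lifecycle tools above can be checked mechanically. The following sketch verifies that a sequence of tool calls uses the lifecycle tools in the documented order; treating that order as strict is an assumption made for illustration, and `follows_lifecycle` is a hypothetical helper.

```python
LIFECYCLE_ORDER = [
    "annotate_execution",
    "create_skill_candidate",
    "evaluate_skill_candidate",
    "promote_skill_candidate",
    "rollback_skill_release",
]


def follows_lifecycle(calls: list[str]) -> bool:
    """True if the lifecycle tools in `calls` never step backwards in the documented order."""
    # Non-lifecycle tools (execute_python, etc.) are ignored; repeats are allowed.
    positions = [LIFECYCLE_ORDER.index(c) for c in calls if c in LIFECYCLE_ORDER]
    return positions == sorted(positions)


ok = ["execute_python", "annotate_execution", "create_skill_candidate",
      "evaluate_skill_candidate", "promote_skill_candidate"]
bad = ["create_skill_candidate", "promote_skill_candidate", "evaluate_skill_candidate"]

print(follows_lifecycle(ok))   # True
print(follows_lifecycle(bad))  # False
```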

See references/skills-lifecycle.md for the complete workflow.

Key Constraints

| Constraint | Value |
| --- | --- |
| sandbox_id format | 1-128 chars, only [a-zA-Z0-9_-] |
| Execution timeout | Single: 1-300s (default 30); Batch: 1-600s (default 60) |
| Output truncation | Auto-truncated beyond 12,000 characters |
| write_file limit | 5MB max (UTF-8 encoded) |
| upload/download limit | 50MB max per file |
| Browser prefix | Never include agent-browser prefix |
| Ref lifecycle | Invalidated after page navigation or DOM changes; always re-snapshot |
| Container isolation | Ship cannot run browser commands; Gull cannot run Python/Shell |
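Several of these constraints can be validated before a tool call is issued. The sketch below encodes the sandbox_id format and the timeout ranges exactly as listed in the table; the function names are hypothetical, and how the server itself reports violations is not covered here.

```python
import re

# 1-128 characters drawn from [a-zA-Z0-9_-], as listed in the constraints table.
SANDBOX_ID_RE = re.compile(r"[a-zA-Z0-9_-]{1,128}")


def is_valid_sandbox_id(sandbox_id: str) -> bool:
    """True if the whole string matches the documented sandbox_id format."""
    return SANDBOX_ID_RE.fullmatch(sandbox_id) is not None


def check_timeout(seconds=None, batch=False):
    """Apply the documented default and range: single 1-300s (30), batch 1-600s (60)."""
    upper, default = (600, 60) if batch else (300, 30)
    if seconds is None:
        return default
    if not 1 <= seconds <= upper:
        raise ValueError(f"timeout must be between 1 and {upper} seconds")
    return seconds


print(is_valid_sandbox_id("demo_sandbox-01"))
print(check_timeout(), check_timeout(batch=True))
```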

Deep-Dive Documentation

| Reference | When to Use |
| --- | --- |
| references/tools-reference.md | Full parameter reference for all 21 MCP tools |
| references/browser.md | Browser automation commands, patterns, and troubleshooting |
| references/skills-lifecycle.md | Skill candidate → evaluate → promote → rollback workflow |
| references/sandbox-environment.md | Ship/Gull container pre-installed environment and capability details |