Codex Agent Teams
Overview
Run one lead agent and multiple specialist teammates to emulate Anthropic Agent Teams patterns in Codex. Keep teammates narrowly scoped, use explicit communication and task-state protocols, and converge through shared quality gates.
This skill mirrors documented Anthropic patterns such as:
- •shared task list with
pending,in_progress,completed - •teammates claiming independent tasks
- •direct teammate messaging and broadcasts
- •explicit debate lifecycle (
start-debate->add-position->decide/apply) - •automatic debate loop orchestration with decision-to-task reflection
- •explicit approval checkpoints before implementation
- •delegate mode where lead coordinates and does not code
See references/anthropic-pattern-map.md for native-vs-emulated feature mapping.
Routing Gate
Choose the execution mode before dispatch:
- •
single-agent: one clear change, no coordination benefit. - •
single-subagent: moderate scope, one main workstream. - •
agent-teams: 2+ independent workstreams, high ambiguity, or strong parallelism benefit.
Promote from single-subagent to agent-teams if scope expands across subsystems or repeated blockers appear.
Startup Sequence
0) Path-safe command setup
Set script paths once so commands work from any directory:
TEAM_SKILL_DIR="${TEAM_SKILL_DIR:-$HOME/.agents/skills/skills_by_openai/codex-agent-teams}"
TEAM_OPS_SCRIPT="$TEAM_SKILL_DIR/scripts/team_ops.py"
TEAM_BRIEF_SCRIPT="$TEAM_SKILL_DIR/scripts/create_team_brief.py"
# Optional explicit override. Leave unset to use auto-discovery.
# TEAM_OPS_ROOT="/absolute/path/to/<workspace>/.codex/teams"
team_ops.py auto-discovers the nearest .codex workspace root when TEAM_OPS_ROOT is not set.
Set TEAM_OPS_ROOT only when you intentionally want a non-default storage location.
1) Team bootstrap
Initialize a shared team workspace:
python3 "$TEAM_OPS_SCRIPT" init \ --team-name "<team-name>" \ --goal "<goal>" \ --members "lead,implementer-a,implementer-b,reviewer"
If the team already exists, init is a no-op by default.
Use --reset only when you intentionally want to wipe and recreate team state:
python3 "$TEAM_OPS_SCRIPT" init \ --team-name "<team-name>" \ --goal "<goal>" \ --members "lead,implementer-a,implementer-b,reviewer" \ --reset
If team metadata is partial/corrupted, init automatically recovers state even without --reset.
2) Charter and initial backlog
Create team charter and seed workstreams:
python3 "$TEAM_BRIEF_SCRIPT" \ --team-name "<team-name>" \ --goal "<goal>" \ --workstreams "stream-a,stream-b" \ --roles "lead,implementer,reviewer,tester" \ --communication-mode "direct" \ --delegate-mode \ --output "<teams-root>/<team-name>/brief.md"
<teams-root> is the same root used by team_ops.py:
- •explicit
TEAM_OPS_ROOT, if set - •otherwise the auto-discovered
<workspace>/.codex/teams
For each work item:
python3 "$TEAM_OPS_SCRIPT" add-task \ --team-name "<team-name>" \ --title "Implement X" \ --owner "unassigned" \ --status "pending" \ --depends-on ""
Role prompt hand-off (manual, no auto-generator):
- •Build per-role prompts from
references/prompt-templates.mdby filling placeholders frombrief.md, task ownership, and current debate/task state. - •
team_ops.pymanages state/logs only; it does not synthesize role prompts. - •For traceability, include role id, owned scope, required outputs, and exact
team_ops.pycommands each sub-agent should run.
3) Approval checkpoint
Before coding, present team plan and wait for explicit user approval. For strict planning phases, pair with:
- •
superpowers:brainstorming - •
superpowers:writing-plans
4) Optional delegate mode
If delegate mode is on, lead does orchestration only:
- •assign and re-balance work
- •resolve cross-stream conflicts
- •enforce convergence gates
- •avoid direct code edits unless user asks
Communication Protocol
Use shared mailbox commands to emulate direct teammate communication:
Send direct message:
python3 "$TEAM_OPS_SCRIPT" message \ --team-name "<team-name>" \ --from "implementer-a" \ --to "reviewer" \ --body "Please review task-3 patch and edge-case notes."
Broadcast update:
python3 "$TEAM_OPS_SCRIPT" broadcast \ --team-name "<team-name>" \ --from "lead" \ --body "Task dependency changed: api-schema now blocks frontend-sync."
Read inbox:
python3 "$TEAM_OPS_SCRIPT" inbox --team-name "<team-name>" --member "reviewer"
Rules:
- •Every teammate reports decisions and blockers through messages.
- •Critical dependency changes are broadcast to all members.
- •Lead records final decisions in brief and task board.
- •For visible team behavior, prefer multiple message rounds (direct messages and inbox reads) instead of a single end-state broadcast.
Sub-agent guardrails:
- •
--team-namemust be a single identifier (no/,\\,.,.., whitespace, or control characters) for bothteam_ops.pyandcreate_team_brief.py. - •
--from,--to,--member,--members,--decidermust refer to registered team members. - •
--owneraccepts either a registered team member orunassigned. - •
create_team_brief.py --rolesdefines descriptive role labels only; command flags still require actual team member IDs fromteam.json. - •
--owner-mapowners must be registered team members orunassigned(one mapping per option; duplicates are rejected). Owners are not limited to debate participants. - •
start-debateand new-debateorchestrate-debatedefault--decidertolead; if team has nolead, the command falls back to the first debate member with a warning (explicit--decideris still recommended). - •New debate creation requires at least two unique
--optionsand at least two unique registered--members; otherwise commands fail fast with explicit input guidance. - •
add-position --membermust be one of the debate's registered members (not just any team member). - •Unknown member/task/debate ids include contextual guidance (registered members or available ids, plus typo suggestions when detectable), including
--notify-fromvalidation errors. - •If a team is missing under the current root and
--team-rootis not explicitly set, commands auto-switch to a uniquely discovered ancestor.codex/teamsroot with a warning. - •If
--team-rootis explicitly set (orTEAM_OPS_ROOTis set), commands do not auto-switch roots; they fail under the specified root for deterministic behavior. - •
--team-root(orTEAM_OPS_ROOT) must resolve to a directory path (not a file path). - •Mutating commands serialize writes with a per-team lock at the resolved team root (after implicit auto-switch when applicable) to prevent concurrent state clobbering; if lock wait times out, retry or tune
TEAM_OPS_LOCK_WAIT_SECONDS. - •Stale lock eviction checks recorded PID liveness and lock identity; active or newly replaced locks are not reclaimed solely by age.
- •Commands fail fast on non-member identities to prevent mailbox/task-board drift.
Task-State Protocol
State model:
- •
pending: not started or waiting on dependency - •
in_progress: claimed by one teammate - •
completed: done and verified
Claim task:
python3 "$TEAM_OPS_SCRIPT" claim \ --team-name "<team-name>" \ --task-id "task-2" \ --member "implementer-b"
Update task status:
python3 "$TEAM_OPS_SCRIPT" update-task \ --team-name "<team-name>" \ --task-id "task-2" \ --status "completed" \ --note "Unit and integration checks passed."
List task board:
python3 "$TEAM_OPS_SCRIPT" list-tasks --team-name "<team-name>"
Debate Protocol (Conflict to Decision)
When two or more approaches conflict, run a formal debate and persist the outcome.
Debates are stored in <teams-root>/<team-name>/debates.json.
Storage boundaries:
- •
debates.jsonstores structured debate state only (topic, options, members, positions, decision, applied state). - •
messages.jsonlstores communication events only (direct/broadcast messages). - •
start-debate --notifyfans out one direct message per debate member inmessages.jsonl(not a single broadcast record). - •Sending a message does not create a debate position; only
add-positionappends todebates.json.
Start a debate:
python3 "$TEAM_OPS_SCRIPT" start-debate \ --team-name "<team-name>" \ --topic "Choose retry strategy for API sync" \ --task-id "task-2" \ --options "fixed-backoff,exponential-backoff" \ --members "lead,implementer-a,implementer-b,reviewer" \ --decider "lead" \ --notify
Each member submits a position:
python3 "$TEAM_OPS_SCRIPT" add-position \ --team-name "<team-name>" \ --debate-id "debate-1" \ --member "implementer-a" \ --option "exponential-backoff" \ --confidence 0.8 \ --rationale "Lower p95 under burst failures."
--confidence must be a finite numeric value in [0, 1].
Lead decides and applies to the linked task:
python3 "$TEAM_OPS_SCRIPT" decide-debate \ --team-name "<team-name>" \ --debate-id "debate-1" \ --rationale "Best reliability/cost tradeoff from submitted evidence." \ --require-all-positions \ --apply \ --status-on-apply "in_progress" \ --owner-map "fixed-backoff:implementer-a,exponential-backoff:implementer-b"
Review debate state:
python3 "$TEAM_OPS_SCRIPT" list-debates --team-name "<team-name>" python3 "$TEAM_OPS_SCRIPT" show-debate --team-name "<team-name>" --debate-id "debate-1"
Interaction fidelity recommendation:
- •When you want explicit teammate interaction traces, it is desirable to run multiple rounds of
messageandadd-position, while keepingdecide-debateas a single final step. - •Suggested pattern:
- •
start-debatewith--notifyto open the decision and notify participants. - •Each member reads
inboxand sends at least one directmessageto challenge or clarify assumptions. - •Each member submits
add-positionwhile debate status isopen(revisions are allowed only before decision). - •Lead sends a final synthesis
messageorbroadcast, then runsdecide-debate.
- •This produces an auditable collaboration trail in
messages.jsonlanddebates.jsonrather than a one-shot decision record.
Decision semantics:
- •
decide-debatedefaults the decider to the debate's configured decider when--decideris omitted. - •
--rationaleis required when recording a new decision. - •For an already
decideddebate,decide-debate --applyreflects the existing decision to task state; it does not allow changing the selected option. - •For an already
applieddebate,decide-debateexits as no-op before--owner-mapparsing/validation. - •
--owner-mapis validated only when applying to a linked task (task_idpresent); debates without linked tasks ignore owner mapping. - •To change direction after a decision, open a new debate linked to the blocked task.
Automatic Orchestration Loop
Use one command to run the loop:
- •create/load debate
- •remind missing members
- •auto-decide when all positions are present (weighted confidence)
- •reflect decision into linked task (
status, optional owner mapping, note, broadcast)
python3 "$TEAM_OPS_SCRIPT" orchestrate-debate \ --team-name "<team-name>" \ --topic "Choose cache invalidation strategy" \ --task-id "task-4" \ --options "ttl-only,event-driven" \ --members "lead,implementer-a,implementer-b,reviewer" \ --decider "lead" \ --send-reminders \ --status-on-apply "in_progress" \ --owner-map "ttl-only:implementer-a,event-driven:implementer-b"
Re-run orchestrate-debate until applied. It is idempotent once a decision is applied.
For an already applied debate, orchestrate-debate exits as no-op before --owner-map parsing/validation.
For debates that are still waiting on positions, orchestrate-debate defers --owner-map validation until the apply step.
If an existing decision payload is corrupted (for example, decision option not in debate options), orchestration fails fast instead of applying invalid state.
Reminder/broadcast sender behavior:
- •
--notify-fromis validated only when reminders or apply broadcasts are actually emitted. - •If
--notify-fromis left as defaultleadandleadis not in team members, sender falls back to the active decider. - •The same fallback rule applies to
start-debate --notify.
Monitoring (Optional, Default OFF)
team_ops.py supports opt-in monitor logging for command-level observability.
Monitoring is disabled by default and no monitor file is created unless explicitly enabled.
Enable monitoring for a command:
python3 "$TEAM_OPS_SCRIPT" orchestrate-debate \ --team-name "<team-name>" \ --debate-id "debate-1" \ --monitoring
Override monitor file path:
python3 "$TEAM_OPS_SCRIPT" update-task \ --team-name "<team-name>" \ --task-id "task-1" \ --status "completed" \ --monitoring \ --monitor-log-file "/tmp/team-monitor.jsonl"
Environment alternatives:
- •
TEAM_OPS_MONITORING=1 - •
TEAM_OPS_MONITOR_LOG_FILE=/path/to/monitor.jsonl - •
TEAM_OPS_ROOT=/path/to/.codex/teams - •
TEAM_OPS_LOCK_WAIT_SECONDS=30(optional write-lock wait override) - •
TEAM_OPS_LOCK_STALE_SECONDS=600(optional stale-lock eviction window)
Generate monitor summary report:
python3 "$TEAM_OPS_SCRIPT" monitor-report \ --team-name "<team-name>" \ --output "<teams-root>/<team-name>/monitor-report.json"
monitor-report is team-filtered by team_name, and reports:
- •
invalid_event_lines: malformed lines skipped (including missing/invalidteam_name) - •
other_team_event_lines: schema-valid lines skipped because they belong to another team - •ISO timestamps without timezone offset are interpreted as UTC when computing reflection latency.
- •
ataccepts common ISO-8601 UTC/offset forms (for example...Z,...+00,...+00:00,...+0000). - •Missing required event fields (for example
at,event_type,command,actor,entity_type,entity_id) or invalidattimestamp formats are treated as invalid monitor lines.
Monitoring event schema (monitor.jsonl, one JSON object per line):
| Field | Type | Description |
|---|---|---|
at | string (ISO-8601 timestamp) | Event timestamp |
event_type | string | Event name (for example message.sent, debate.started, debate.applied, task.updated) |
command | string | team_ops.py command that emitted the event |
team_name | string | Team identifier |
actor | string | Member or system actor that performed the action |
entity_type | string | Domain object type (team, task, debate, message) |
entity_id | string | Object identifier (task-1, debate-2, lead->reviewer) |
before | object or null | Snapshot before change (when applicable) |
after | object or null | Snapshot after change (when applicable) |
metadata | object | Additional event context (option, rationale, missing members, etc.) |
correlation_id | string | Correlation id to group related command runs |
For debate.applied, command is logged as apply-decision (internal apply step used by both decide-debate --apply and orchestrate-debate).
Example line:
{"at":"2026-02-16T06:00:00+00:00","event_type":"debate.applied","command":"apply-decision","team_name":"example-team","actor":"lead","entity_type":"debate","entity_id":"debate-1","before":{"task":{"owner":"unassigned","status":"pending"}},"after":{"task":{"owner":"implementer-a","status":"in_progress"}},"metadata":{"task_id":"task-1","option":"session","status":"in_progress","sender":"lead"},"correlation_id":"2e5db8f0d25d4f08b6f2f4f13a4ed8ad"}
Superpowers Integration
Map workstreams to one control skill:
- •plan refinement and ambiguity reduction:
superpowers:brainstormingthensuperpowers:writing-plans - •independent implementation in this session:
superpowers:subagent-driven-development - •multiple unrelated failures or investigations:
superpowers:dispatching-parallel-agents - •execution from approved written plan in batches:
superpowers:executing-plans - •isolated workspace setup:
superpowers:using-git-worktrees - •pre-completion validation:
superpowers:verification-before-completion - •final quality check:
superpowers:requesting-code-review - •closeout:
superpowers:finishing-a-development-branch
Convergence Gates
Before merging teammate outputs:
- •Spec compliance gate
- •Code quality gate
- •Verification gate (
superpowers:verification-before-completion) - •Final review for major diffs (
superpowers:requesting-code-review) - •Branch closeout (
superpowers:finishing-a-development-branch)
Operating Rules
- •Keep teammates specialized and replaceable.
- •Prefer teams of 2 to 5 members.
- •Keep one source of truth for status in team task board.
- •Do not mark task complete without command output.
- •Escalate unresolved blockers after two failed attempts.
- •If two solutions conflict, start a debate and link it to the blocked task.
- •Do not close debates without recorded rationale and selected option.
- •Reflect debate outcomes into task state before resuming implementation.
Limitations
This skill emulates Agent Teams patterns in Codex, but some native Anthropic UX features may not exist in every runtime. When unavailable, fall back to:
- •explicit message commands via
scripts/team_ops.py - •shared files for task board and decisions
- •lead-mediated routing while preserving protocol semantics
Resources
- •
scripts/create_team_brief.py: generate a reusable team charter and workstream summary template (canonical task board state lives inscripts/team_ops.py). - •
scripts/team_ops.py: manage team members, task states, mailbox, debates, and auto-orchestration. - •
references/anthropic-pattern-map.md: what is mirrored from Anthropic docs. - •
references/team-topologies.md: choose collaboration pattern by coupling and risk profile. - •
references/prompt-templates.md: copy/paste lead, specialist, reviewer, and challenger prompts.