MAS Decision Gate
Meta-Rule
If a single agent can do the job, do not use MAS. Multi-agent systems are organizational scaling tools, not capability multipliers by default. Research shows 80% of AI projects fail due to premature architectural complexity.
12 Factor Agents Perspective (Factor 10)
The 12 Factor Agents framework reinforces this principle:
Factor 10: Small Focused Agents
Smaller focused prompts with controlled context always beat long autonomous runs.
This applies at two levels:
- Single vs Multi-Agent: Start with single agent
- Within Multi-Agent: Each agent should be small and focused
The Progression:
Level 0: Deterministic workflow (no agent)
↓ Only if judgment needed
Level 1: Single focused agent
↓ Only if tools needed
Level 2: Single agent with tools
↓ Only if verification critical
Level 3: Minimal MAS (Planner→Executor→Verifier)
↓ Only if multiple domains
Level 4: Full MAS (when justified by evidence)
Only advance levels when evidence supports it. Most tasks belong at Level 0-2.
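The progression can be read as a series of gates, each of which must be passed before moving up a level. The sketch below is one possible encoding, not a prescribed implementation: `TaskProfile`, `LEVELS`, and `select_level` are hypothetical names, and treating the gates as strictly sequential is an assumption drawn from the arrows in the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical checklist; one flag per arrow in the progression."""
    needs_judgment: bool = False
    needs_tools: bool = False
    verification_critical: bool = False
    multiple_domains: bool = False


LEVELS = [
    "Level 0: Deterministic workflow (no agent)",
    "Level 1: Single focused agent",
    "Level 2: Single agent with tools",
    "Level 3: Minimal MAS (Planner→Executor→Verifier)",
    "Level 4: Full MAS (when justified by evidence)",
]


def select_level(task: TaskProfile) -> int:
    """Climb one level per satisfied gate; stop at the first gate not met."""
    gates = [
        task.needs_judgment,
        task.needs_tools,
        task.verification_critical,
        task.multiple_domains,
    ]
    level = 0
    for gate in gates:
        if not gate:
            break  # assumption: a later gate cannot skip an earlier one
        level += 1
    return level
```

For example, `LEVELS[select_level(TaskProfile(needs_judgment=True, needs_tools=True))]` yields the Level 2 entry, while a task that needs no judgment stays at Level 0 regardless of the other flags.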
Decision Criteria
Use MAS only if at least one is true:
- Natural decomposition: Tasks split into semi-independent roles
- Parallel benefit: Concurrent reasoning materially reduces latency (≥40%)
- Distinct world models: Agents need different knowledge bases or incentives
- Internal verification: Long-horizon work requires checks and balances
If none apply → build a single agent with structured tools.
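As a rough illustration, the "at least one is true" rule can be written as a small check that also records which criteria fired. `MasCriteria`, `mas_reasons`, and `recommend` are hypothetical names, and expressing the latency reduction as a fraction is an assumption; only the four criteria and the 40% figure come from the list above.

```python
from dataclasses import dataclass

PARALLEL_BENEFIT_THRESHOLD = 0.40  # the ≥40% latency reduction from the criteria


@dataclass(frozen=True)
class MasCriteria:
    """Hypothetical container for the four decision criteria."""
    natural_decomposition: bool
    expected_latency_reduction: float  # fraction, e.g. 0.45 for 45%
    distinct_world_models: bool
    needs_internal_verification: bool


def mas_reasons(c: MasCriteria) -> list[str]:
    """Return the criteria that are satisfied, in the order listed above."""
    reasons = []
    if c.natural_decomposition:
        reasons.append("natural decomposition")
    if c.expected_latency_reduction >= PARALLEL_BENEFIT_THRESHOLD:
        reasons.append("parallel benefit")
    if c.distinct_world_models:
        reasons.append("distinct world models")
    if c.needs_internal_verification:
        reasons.append("internal verification")
    return reasons


def recommend(c: MasCriteria) -> str:
    """MAS only if at least one criterion holds; otherwise a single agent."""
    return "multi-agent" if mas_reasons(c) else "single agent with structured tools"
```

Returning the list of reasons rather than a bare boolean keeps the justification available for the decision record.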
Quantitative Thresholds
Deploy Single-Agent When:
| Factor | Threshold |
|---|---|
| Domain complexity | < 3 distinct domains |
| Reasoning steps | < 10 required steps |
| Context needs | < 8K tokens |
| Parallel execution | Not required |
| Budget | Tight constraints (MAS costs 2-4x more) |
| Team expertise | Limited distributed systems experience |
Deploy Multi-Agent When:
| Factor | Threshold |
|---|---|
| Domain complexity | ≥3 distinct domains requiring different expertise |
| Parallel benefit | Reduces latency by ≥40% |
| Verification needs | Long-horizon tasks requiring internal checkpoints |
| Model specialization | Need for expert ensembling (code + security, etc.) |
| Concurrency benefit | Outweighs coordination costs |
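The numeric rows of the single-agent table lend themselves to a mechanical check. The sketch below only reports which single-agent limits a task estimate exceeds; it does not make the final decision, since the tables do not say how the factors combine. `TaskEstimate` and `exceeded_single_agent_limits` are hypothetical names, and reading "8K tokens" as 8,000 is an assumption.

```python
from dataclasses import dataclass

# Upper bounds (exclusive) taken from the single-agent table.
SINGLE_AGENT_LIMITS = {
    "distinct_domains": 3,
    "reasoning_steps": 10,
    "context_tokens": 8_000,  # assumption: "8K" interpreted as 8,000
}


@dataclass(frozen=True)
class TaskEstimate:
    """Hypothetical up-front estimate of a task's size."""
    distinct_domains: int
    reasoning_steps: int
    context_tokens: int
    parallel_required: bool


def exceeded_single_agent_limits(t: TaskEstimate) -> list[str]:
    """List the single-agent thresholds this estimate meets or exceeds."""
    exceeded = [
        name
        for name, limit in SINGLE_AGENT_LIMITS.items()
        if getattr(t, name) >= limit
    ]
    if t.parallel_required:
        exceeded.append("parallel_execution")
    return exceeded
```

An empty result suggests the task fits the single-agent profile; a non-empty one is a prompt to look at the multi-agent table, not an automatic verdict.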
Decision Questions
To determine whether MAS is appropriate, answer these questions:
Question 1: Can a single agent complete this task?
- If yes with reasonable quality → use single agent
- If struggling with scope/quality → consider MAS
Question 2: What distinct expertise areas are needed?
- Count domains requiring specialized knowledge
- 1-2 domains → single agent with tools
- 3+ domains → MAS may be justified
Question 3: Are subtasks truly independent?
- Can work proceed in parallel without dependencies?
- Yes → MAS provides latency benefit
- No → MAS adds coordination overhead without benefit
Question 4: Is internal verification critical?
- Would self-checking be insufficient?
- Do outputs need adversarial review?
- Yes → MAS with separate verifier agent
Question 5: What is the failure cost?
- Low-stakes task → prefer simplicity (single agent)
- High-stakes task → MAS verification may justify complexity
Simplicity Test
Before building any agent system, answer these sanity-check questions:
Core Questions
- Could this just be a deterministic workflow or cron job?
  - If yes → use traditional automation, not agents
- Where does uncertainty or judgment actually exist?
  - If nowhere → scripted workflow is sufficient
  - If bounded → single agent with tools
  - If distributed across domains → MAS may be justified
- What would happen if the agent vanished tomorrow? Could you survive?
  - If operations stop → high value, proceed carefully
  - If minor inconvenience → question the investment
- What's the simplest version that would provide value?
  - Build that first, then add complexity only when evidence supports it
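The first two core questions map fairly directly onto a lookup. The following is a minimal sketch under that reading; `Uncertainty` and `simplicity_recommendation` are hypothetical names, and the remaining two questions (impact if the agent vanished, simplest valuable version) are left to human judgment rather than encoded.

```python
from enum import Enum


class Uncertainty(Enum):
    """Where judgment is actually needed, per the second core question."""
    NONE = "nowhere"
    BOUNDED = "bounded"
    DISTRIBUTED = "distributed across domains"


def simplicity_recommendation(deterministic_possible: bool,
                              uncertainty: Uncertainty) -> str:
    """Map the first two core questions to a starting recommendation."""
    if deterministic_possible or uncertainty is Uncertainty.NONE:
        return "traditional automation or scripted workflow"
    if uncertainty is Uncertainty.BOUNDED:
        return "single agent with tools"
    return "MAS may be justified"
```

Note that a feasible deterministic alternative wins even when some uncertainty is claimed, which mirrors the ordering of the questions.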
12 Factor Simplicity Questions
In addition to the core questions, apply these 12 Factor Agents checks:
- Could you own the control flow with code? (Factor 8)
  - If yes → code + single agent likely suffices
  - If no → MAS may help distribute complexity
  - Key insight: Code-controlled DAG beats LLM-controlled DAG
- Can state be modeled as (state, event) → new_state? (Factor 12)
  - If yes → cleaner architecture possible with reducers
  - If no → complexity is intrinsic, MAS may help
  - Key insight: Stateless reducers enable debugging and replay
- Is context building well-understood? (Factor 3)
  - If yes → single agent with explicit context
  - If no → need to understand context before adding agents
  - Key insight: Context engineering is the core of agent quality
- At what level does the task belong? (Factor 10)
  - Level 0: Deterministic (script/workflow)
  - Level 1-2: Single agent (with or without tools)
  - Level 3-4: Multi-agent (only if justified)
  - Key insight: Start simple, add agents only when evidence supports it
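To make the Factor 12 question concrete, here is a minimal sketch of a stateless reducer of the form (state, event) → new_state, with replay from an event log. The `AgentState` fields, the `Event` kinds, and the function names are illustrative assumptions, not part of the 12 Factor Agents framework itself.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class AgentState:
    """Hypothetical immutable agent state."""
    status: str = "idle"
    steps: tuple[str, ...] = ()


@dataclass(frozen=True)
class Event:
    """Hypothetical event; `kind` selects the transition."""
    kind: str
    payload: str = ""


def step(state: AgentState, event: Event) -> AgentState:
    """Pure reducer: same (state, event) always yields the same new state."""
    if event.kind == "tool_called":
        return replace(state, status="running",
                       steps=state.steps + (event.payload,))
    if event.kind == "finished":
        return replace(state, status="done")
    if event.kind == "failed":
        return replace(state, status="error")
    return state  # unknown events leave the state unchanged


def replay(events: list[Event]) -> AgentState:
    """Rebuild the final state by folding the event log from the start."""
    return reduce(step, events, AgentState())
```

Because `step` has no side effects, replaying a recorded log reproduces the same state every time, which is what makes debugging and replay straightforward.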
Red Flags (Agent May Be Overkill)
| Flag | Implication |
|---|---|
| Task can be fully specified with if/then rules | Use deterministic code |
| No variability in inputs or required responses | Use templates/scripts |
| Human oversight would be faster than building | Skip the agent |
| The "intelligence" needed is just API orchestration | Use workflow automation |
Green Flags (Agent Justified)
| Flag | Implication |
|---|---|
| Genuine ambiguity in how to respond | Agent reasoning needed |
| Need to adapt to novel situations | Learning/flexibility required |
| Complex reasoning across multiple inputs | Agent synthesis valuable |
| Learning from feedback improves outcomes | Agent adaptation worthwhile |
Simplicity Test Output
Document the simplicity assessment:
## Simplicity Assessment

**Task**: [Brief description]
**Deterministic alternative?**: [Yes/No - what would it look like?]
**Where is judgment needed?**: [Specific points]
**If agent vanished?**: [Impact assessment]
**Minimum viable version**: [Description]
**Conclusion**: [Proceed with agent / Use simpler alternative]
Common Anti-Patterns
Anti-Pattern: MAS for Capability
Wrong: "Multiple agents will be smarter than one"
Reality: Coordination overhead often exceeds capability gains. Reported results include 25% correctness for ChatDev and failure rates of 60-87% across frameworks.
Anti-Pattern: Premature Decomposition
Wrong: "Let's split this into 5 agents for better organization"
Reality: Every agent boundary introduces failure points (specification, alignment, verification). Start simple, add agents only when evidence supports it.
Anti-Pattern: Personality-Based Splitting
Wrong: "Creative agent, analytical agent, careful agent"
Reality: Split by functional orthogonality, not personality. Planner, Executor, Verifier—not "smart" vs "creative."
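A functional split can be sketched with three roles that differ in what they do, not in how they "think". In the hedged example below, plain Python callables stand in for the Planner, Executor, and Verifier (in practice each would wrap a model call), and ordinary code owns the control flow. `run_pipeline`, the stub functions, and the retry policy are hypothetical.

```python
from typing import Callable

Planner = Callable[[str], list[str]]   # task -> ordered steps
Executor = Callable[[str], str]        # step -> output
Verifier = Callable[[str, str], bool]  # (step, output) -> accepted?


def run_pipeline(task: str, plan: Planner, execute: Executor,
                 verify: Verifier, max_retries: int = 1) -> list[str]:
    """Code-controlled loop: plan once, then execute and verify each step."""
    results = []
    for step in plan(task):
        for _attempt in range(max_retries + 1):
            output = execute(step)
            if verify(step, output):
                results.append(output)
                break
        else:
            # Every attempt was rejected by the verifier.
            raise RuntimeError(f"step failed verification: {step}")
    return results


# Stand-in roles for illustration only.
def stub_plan(task: str) -> list[str]:
    return [f"{task}: part {i}" for i in (1, 2)]


def stub_execute(step: str) -> str:
    return step.upper()


def stub_verify(step: str, output: str) -> bool:
    return output == step.upper()
```

Each role has one responsibility and a narrow interface, so a failure can be attributed to planning, execution, or verification rather than to a vaguely defined persona.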
Specialist vs Generalist
2026 consensus strongly favors specialists:
Why specialists win:
- 40-60% fewer tokens for domain tasks
- Higher accuracy in specialized domains
- Clearer audit trails and governance
- Reduced computational waste
Architecture pattern: Specialist agents orchestrated by a coordinator that handles delegation.
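One way to picture the coordinator pattern is a registry that maps a domain to a specialist and hands each request to exactly one of them. This is a minimal sketch: `Coordinator`, `delegate`, and the example domains are hypothetical, and the specialists are plain functions standing in for focused agents.

```python
from typing import Callable

Specialist = Callable[[str], str]  # request -> response


class Coordinator:
    """Hypothetical coordinator that only routes; it does no domain work."""

    def __init__(self, specialists: dict[str, Specialist]):
        self._specialists = dict(specialists)

    def delegate(self, domain: str, request: str) -> str:
        """Hand the request to the specialist registered for the domain."""
        try:
            handler = self._specialists[domain]
        except KeyError:
            raise ValueError(
                f"no specialist registered for domain {domain!r}"
            ) from None
        return handler(request)


# Example wiring with stand-in specialists.
coordinator = Coordinator({
    "code": lambda req: f"[code] {req}",
    "security": lambda req: f"[security] {req}",
})
```

Failing loudly on an unknown domain, instead of falling back to a generalist, keeps the delegation path explicit and easy to audit.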
Decision Output Format
After analysis, document the decision:
## MAS Decision

**Task**: [Brief description]
**Decision**: [Single-Agent / Multi-Agent]
**Rationale**:
- Domain count: [X] domains
- Parallel benefit: [Yes/No - expected %]
- Verification need: [Low/Medium/High]
- Failure cost: [Low/Medium/High]

**If Multi-Agent, justify each agent**:
- Agent 1: [Role] - [Why separate agent needed]
- Agent 2: [Role] - [Why separate agent needed]
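If decisions are recorded programmatically, the template above can be filled from a small record type that also enforces the "justify each agent" requirement. `MasDecision` and its field names are hypothetical; the rendered labels follow the template.

```python
from dataclasses import dataclass, field


@dataclass
class MasDecision:
    """Hypothetical record mirroring the decision template."""
    task: str
    decision: str  # "Single-Agent" or "Multi-Agent"
    domain_count: int
    parallel_benefit: str
    verification_need: str
    failure_cost: str
    agents: dict[str, str] = field(default_factory=dict)  # role -> justification

    def render(self) -> str:
        """Render the record as text; refuse unjustified agents."""
        multi = self.decision == "Multi-Agent"
        if multi:
            missing = [r for r, why in self.agents.items() if not why.strip()]
            if not self.agents or missing:
                raise ValueError(
                    "every agent in a Multi-Agent decision needs a justification"
                )
        lines = [
            "## MAS Decision",
            f"**Task**: {self.task}",
            f"**Decision**: {self.decision}",
            "**Rationale**:",
            f"- Domain count: {self.domain_count} domains",
            f"- Parallel benefit: {self.parallel_benefit}",
            f"- Verification need: {self.verification_need}",
            f"- Failure cost: {self.failure_cost}",
        ]
        if multi:
            lines.append("**If Multi-Agent, justify each agent**:")
            lines += [
                f"- Agent {i}: {role} - {why}"
                for i, (role, why) in enumerate(self.agents.items(), 1)
            ]
        return "\n".join(lines)
```

Raising on a missing justification turns the template's last section from a suggestion into a check that runs before the decision is written down.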
Additional Resources
Reference Files
For detailed decision frameworks and evidence:
- references/evidence.md - Research data supporting thresholds
- references/decision-tree.md - Step-by-step decision flowchart
- ../agent-specification/references/twelve-factor-agents.md - Quick reference for all 12 factors
Related Skills
After deciding on MAS, use:
- agent-specification - For writing proper agent specs (Factors 1, 2, 4, 7)
- coordination-patterns - For choosing architecture (Factors 3, 5/6, 8, 12)
- production-readiness - For cost/observability planning (Factors 9, 11)