Red/Blue Team Validator
"Find weaknesses before reality does."
Every proposition—whether a decision, strategy, architecture, or plan—has vulnerabilities. This skill systematically exposes them through iterative adversarial cycles. Red Team attacks with substantive, steel-manned challenges. Blue Team defends with mitigations and hardening. The cycle continues until convergence: a battle-tested proposition with documented defenses.
1. Purpose
Core Value Proposition
Static analysis misses what adversarial pressure reveals. Red/Blue validation simulates the attacks your proposition will face—from competitors, critics, reality itself—and forces you to build defenses before you need them. The output is not just a risk list, but a hardened proposition that has survived systematic assault.
Capabilities
| # | Capability | Phase | Value |
|---|---|---|---|
| 1 | Proposition intake with attack surface mapping | Pre-Round | Define what can be attacked |
| 2 | Experience pool loading (domain failure patterns) | Pre-Round | Avoid reinventing known failures |
| 3 | Multi-category attack generation | Round N: Red | Surface vulnerabilities systematically |
| 4 | Steel-manning attacks to maximum strength | Round N: Red | Ensure attacks are not strawmen |
| 5 | Severity scoring (CRITICAL/HIGH/MEDIUM/LOW) | Round N: Red | Prioritize responses |
| 6 | Defense generation (REFUTE/MITIGATE/ACCEPT/HARDEN) | Round N: Blue | Address each attack |
| 7 | Proposition hardening through iterative refinement | Round N: Blue | Strengthen against attacks |
| 8 | Convergence evaluation with explicit criteria | Round N: Eval | Know when to stop |
| 9 | RISK-ASSESSMENT synthesis (CONTRACT-08) | Post-Round | Standardized output |
| 10 | Hardened proposition generation | Post-Round | Battle-tested version |
| 11 | Attack/defense log compilation | Post-Round | Audit trail |
| 12 | Go/no-go recommendation | Post-Round | Decision support |
2. When to Use
Ideal Use Cases
| Scenario | Why Red/Blue Validation Matters |
|---|---|
| Pre-commitment decision review | Simulate objections before committing resources |
| Strategy validation | War-game competitive responses and market realities |
| Architecture decision hardening | Stress-test technical choices before implementation |
| Proposal defense preparation | Anticipate and prepare for stakeholder pushback |
| Investment due diligence | Adversarial review of financial projections and market assumptions |
| Security posture assessment | Systematic attack surface enumeration |
| Go/no-go decisions | High-stakes decisions need adversarial pressure |
| Policy/process validation | Find edge cases and failure modes |
| Product launch readiness | Anticipate market, competitive, and operational challenges |
| M&A target evaluation | Adversarial review of synergy claims |
Anti-Patterns (When NOT to Use)
| Anti-Pattern | Why It's Ineffective | Better Alternative |
|---|---|---|
| Low-stakes decisions | Over-engineering for trivial choices | Just decide and iterate |
| Time-critical emergencies | Fires need extinguishing, not philosophy | Act, then debrief |
| Already committed | Adversarial review after commitment creates conflict | Use for future decisions |
| Early exploration | Premature to attack ideas still forming | Use after initial validation |
| Confirmation theater | Going through motions without genuine adversarial intent | Either commit to true adversarial thinking or skip |
| Reversible decisions | Two-way doors don't need siege testing | Save intensity for one-way doors |
3. Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `subject_type` | enum | yes | — | `decision` \| `strategy` \| `architecture` \| `plan` \| `policy` \| `investment` \| `security` |
| `max_rounds` | integer | no | 3 | Maximum red/blue cycles (1-5) |
| `attack_intensity` | enum | no | `standard` | `light` \| `standard` \| `aggressive` |
| `attack_categories` | list | no | `auto` | Categories to probe (see catalog); `auto` selects by `subject_type` |
| `convergence_mode` | enum | no | `no_new_critical` | `no_new_critical` \| `all_addressed` \| `round_limit` |
| `include_experience_pool` | boolean | no | `true` | Load domain-specific failure patterns |
| `steel_manning_level` | enum | no | `standard` | `minimal` \| `standard` \| `maximum` |
| `output_mode` | enum | no | `full` | `risk_assessment` \| `hardened_proposition` \| `full_log` |
Parameter Effects Matrix
| Parameter | Red Phase Effect | Blue Phase Effect | Convergence Effect |
|---|---|---|---|
| `attack_intensity: light` | Top 3 attack categories | Quick defenses | `max_rounds` capped at 2 |
| `attack_intensity: standard` | Top 5 attack categories | Full defense protocol | Normal convergence |
| `attack_intensity: aggressive` | All applicable categories | Exhaustive defense | Requires `no_new_critical` |
| `steel_manning_level: minimal` | 1-pass attacks | — | Faster rounds |
| `steel_manning_level: standard` | 2-pass steel-manning | — | Normal rounds |
| `steel_manning_level: maximum` | 3-pass with ideological Turing test | — | Thorough rounds |
| `convergence_mode: no_new_critical` | — | — | Stop when 0 new CRITICAL/HIGH |
| `convergence_mode: all_addressed` | — | Must address all | Stop when no ACCEPT responses |
| `convergence_mode: round_limit` | — | — | Stop at `max_rounds` |
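The `attack_intensity` rows of the matrix can be read as a normalization step applied to the parameters before the first round. The following is a minimal sketch under that reading; `ValidatorConfig` and `apply_intensity` are hypothetical names introduced for illustration, not part of the skill.

```python
from dataclasses import dataclass, replace


# Hypothetical config type; field names mirror the Parameters table.
@dataclass(frozen=True)
class ValidatorConfig:
    subject_type: str
    max_rounds: int = 3
    attack_intensity: str = "standard"
    convergence_mode: str = "no_new_critical"


def apply_intensity(cfg: ValidatorConfig) -> ValidatorConfig:
    """Apply the 'Convergence Effect' column for the chosen intensity."""
    if not 1 <= cfg.max_rounds <= 5:
        raise ValueError("max_rounds must be between 1 and 5")
    if cfg.attack_intensity == "light":
        # light: max_rounds capped at 2
        return replace(cfg, max_rounds=min(cfg.max_rounds, 2))
    if cfg.attack_intensity == "aggressive":
        # aggressive: requires no_new_critical
        return replace(cfg, convergence_mode="no_new_critical")
    if cfg.attack_intensity == "standard":
        # standard: normal convergence, nothing to adjust
        return cfg
    raise ValueError(f"unknown attack_intensity: {cfg.attack_intensity!r}")
```

Returning a new frozen config rather than mutating the input keeps the user's original request available for the audit trail.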
Auto-Selected Attack Categories by Subject Type
| Subject Type | Default Attack Categories |
|---|---|
| `decision` | ASSUMPTIONS, ALTERNATIVES, REVERSIBILITY, CONSEQUENCES, TIMING |
| `strategy` | COMPETITIVE, MARKET, EXECUTION, DEPENDENCIES, TIMELINE |
| `architecture` | SCALABILITY, SECURITY, DEPENDENCIES, OPERATIONAL, EDGE_CASES |
| `plan` | FEASIBILITY, RESOURCES, TIMELINE, DEPENDENCIES, RISKS |
| `policy` | EDGE_CASES, ENFORCEMENT, UNINTENDED_CONSEQUENCES, POLITICAL |
| `investment` | ECONOMIC, MARKET, EXECUTION, COMPETITIVE, ASSUMPTIONS |
| `security` | ATTACK_SURFACE, VULNERABILITIES, DEPENDENCIES, OPERATIONAL |
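Category selection combines this table with the per-intensity category counts from the Parameter Effects Matrix. The sketch below is illustrative only: `select_categories` is a hypothetical helper, and it assumes that "top N" means the first N entries in table order and that `aggressive` uses the full default list.

```python
# Defaults copied from the table above.
DEFAULT_CATEGORIES = {
    "decision": ["ASSUMPTIONS", "ALTERNATIVES", "REVERSIBILITY", "CONSEQUENCES", "TIMING"],
    "strategy": ["COMPETITIVE", "MARKET", "EXECUTION", "DEPENDENCIES", "TIMELINE"],
    "architecture": ["SCALABILITY", "SECURITY", "DEPENDENCIES", "OPERATIONAL", "EDGE_CASES"],
    "plan": ["FEASIBILITY", "RESOURCES", "TIMELINE", "DEPENDENCIES", "RISKS"],
    "policy": ["EDGE_CASES", "ENFORCEMENT", "UNINTENDED_CONSEQUENCES", "POLITICAL"],
    "investment": ["ECONOMIC", "MARKET", "EXECUTION", "COMPETITIVE", "ASSUMPTIONS"],
    "security": ["ATTACK_SURFACE", "VULNERABILITIES", "DEPENDENCIES", "OPERATIONAL"],
}

# Category caps per intensity (None means all applicable categories).
CATEGORY_CAP = {"light": 3, "standard": 5, "aggressive": None}


def select_categories(subject_type, attack_intensity="standard", attack_categories="auto"):
    """Return the attack categories to probe for this run."""
    if attack_categories != "auto":
        # An explicit list overrides auto-selection.
        return list(attack_categories)
    defaults = DEFAULT_CATEGORIES[subject_type]
    cap = CATEGORY_CAP[attack_intensity]
    # Assumption: "top N" is interpreted as the first N defaults in table order.
    return list(defaults) if cap is None else defaults[:cap]
```

For example, `select_categories("architecture", "light")` yields the first three architecture defaults, while `policy` and `security` return all four of their defaults even at `standard` intensity.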
4. Checkpoints
This skill uses interactive checkpoints (see references/checkpoints.yaml) to resolve ambiguity:
- `subject_type_classification`: When proposition type is ambiguous
- `attack_intensity_selection`: When attack intensity not specified
- `convergence_mode_selection`: When convergence criteria not specified
- `premature_convergence_check`: When convergence met but warning signs present
- `infinite_loop_risk`: When defenses generate more attacks than they resolve
- `output_mode_selection`: When output format not specified
5. Iterative Workflow
Workflow Overview
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                           RED/BLUE TEAM VALIDATOR                           │
    ├─────────────────────────────────────────────────────────────────────────────┤
    │                                                                             │
    │  ╔══════════════════════════════════════════════════════════════════════╗   │
    │  ║                           PRE-ROUND SETUP                            ║   │
    │  ║          ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           ║   │
    │  ║          │ Proposition │  │   Attack    │  │ Experience  │           ║   │
    │  ║          │   Intake    │─▶│   Surface   │─▶│    Pool     │           ║   │
    │  ║          │             │  │   Mapping   │  │   Loading   │           ║   │
    │  ║          └─────────────┘  └─────────────┘  └─────────────┘           ║   │
    │  ╚══════════════════════════════════════════════════════════════════════╝   │
    │                                      │                                      │
    │                                      ▼                                      │
    │  ╔══════════════════════════════════════════════════════════════════════╗   │
    │  ║                               ROUND N                                ║   │
    │  ║          ┌─────────────┐                    ┌─────────────┐          ║   │
    │  ║          │  RED TEAM   │                    │  BLUE TEAM  │          ║   │
    │  ║          │   ATTACK    │───── Attacks ─────▶│   DEFENSE   │          ║   │
    │  ║          │ (Generate & │                    │ (Respond &  │          ║   │
    │  ║          │ Steel-man)  │                    │   Harden)   │          ║   │
    │  ║          └─────────────┘                    └─────────────┘          ║   │
    │  ║                 │                                  │                 ║   │
    │  ║                 └────────────────┬─────────────────┘                 ║   │
    │  ║                                  ▼                                   ║   │
    │  ║                           ┌─────────────┐                            ║   │
    │  ║                           │ EVALUATION  │                            ║   │
    │  ║                           │ & Converge? │                            ║   │
    │  ║                           └─────────────┘                            ║   │
    │  ║                                  │                                   ║   │
    │  ║                       ┌──────────┴──────────┐                        ║   │
    │  ║                       ▼                     ▼                        ║   │
    │  ║                [NOT CONVERGED]         [CONVERGED]                   ║   │
    │  ║                  → Round N+1           → Exit loop                   ║   │
    │  ╚══════════════════════════════════════════════════════════════════════╝   │
    │                                      │                                      │
    │                                      ▼                                      │
    │  ╔══════════════════════════════════════════════════════════════════════╗   │
    │  ║                         POST-ROUND SYNTHESIS                         ║   │
    │  ║          ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           ║   │
    │  ║          │    RISK     │  │  Hardened   │  │   Attack/   │           ║   │
    │  ║          │ ASSESSMENT  │  │ Proposition │  │ Defense Log │           ║   │
    │  ║          │(CONTRACT-08)│  │   Output    │  │             │           ║   │
    │  ║          └─────────────┘  └─────────────┘  └─────────────┘           ║   │
    │  ╚══════════════════════════════════════════════════════════════════════╝   │
    │                                                                             │
    └─────────────────────────────────────────────────────────────────────────────┘
Pre-Round Setup
Purpose: Prepare the battlefield—understand what's being tested and load relevant knowledge.
Steps:
1. Proposition Intake
   - Receive subject (decision, strategy, architecture, plan, etc.)
   - If verbal, request written summary or create one together
   - Extract key claims and assertions to be defended
   - Identify stakeholders and constraints
   - Note: Proposition should be specific enough to attack
2. Attack Surface Mapping
   - Identify dimensions available for attack (from attack-vector-catalog)
   - Map proposition claims to attackable surfaces
   - Select attack categories based on `subject_type` or explicit `attack_categories`
   - See: `references/attack-vector-catalog.md` for categories
3. Experience Pool Loading (if `include_experience_pool: true`)
   - Load domain-specific failure patterns
   - Reference historical failures in similar contexts
   - Prepare anti-patterns to probe
   - See: `references/experience-pool-patterns.md` for patterns
4. Set Parameters
   - Confirm attack intensity, convergence mode, steel-manning level
   - Estimate expected rounds based on complexity
CHECKPOINT: subject_type_classification
- If `subject_type` not specified or ambiguous: AskUserQuestion
- Present subject type options with attack category implications

CHECKPOINT: attack_intensity_selection
- If `attack_intensity` not specified: AskUserQuestion
- Present intensity options with effort implications

CHECKPOINT: convergence_mode_selection
- If `convergence_mode` not specified: AskUserQuestion
- Present convergence options with trade-offs

Quality Gate: Attack Surface Mapped
- [ ] Proposition boundaries explicitly defined
- [ ] Attack categories selected (minimum 3)
- [ ] Experience pool loaded (if enabled)
- [ ] Parameters confirmed
Output: Attack-ready proposition with mapped attack surface
Round N: Red Team Phase
Purpose: Generate substantive, steel-manned attacks on the proposition.
Reference: See references/red-team-techniques.md and references/steel-manning-protocol.md.
Steps:
1. Attack Generation

   For each attack category in scope, generate attacks:

   | Technique | When to Use | Expected Yield |
   |---|---|---|
   | Pre-mortem | Always | 3-5 attacks |
   | Inversion | Strategy, Decision | 2-4 attacks |
   | Competitor Simulation | Strategy, Investment | 2-3 attacks |
   | Stress Test Amplification | Architecture, Plan | 2-4 attacks |
   | Devil's Advocate | Policy, Decision | 2-3 attacks |
   | Blind Spot Hunter | All | 1-3 attacks |
   | Historical Pattern Matching | All (with experience pool) | 2-4 attacks |
   | Black Hat Thinking | Security, Competitive | 3-5 attacks |

   - See: `references/red-team-techniques.md` for protocols

2. Steel-Manning (per `steel_manning_level`)

   For each attack, strengthen to maximum potency:

   | Level | Passes | Protocol |
   |---|---|---|
   | minimal | 1 | Basic attack formulation |
   | standard | 2 | + "How can this be more damaging?" |
   | maximum | 3 | + Ideological Turing test: "Would a true opponent accept this?" |

   Steel-manning checklist:

   - [ ] Attack is specific, not vague
   - [ ] Attack has clear mechanism of harm
   - [ ] Attack includes realistic trigger conditions
   - [ ] Attack would concern a reasonable proponent
   - [ ] Attack is not easily dismissed
   - See: `references/steel-manning-protocol.md` for full protocol

3. Severity Scoring

   Score each attack using SEVERITY-SCORING (RUBRIC-07):

   | Severity | Definition | Response Urgency |
   |---|---|---|
   | CRITICAL | Blocks primary objective; cannot proceed | Must address in Blue Phase |
   | HIGH | Significant impact; major rework required | Should address in Blue Phase |
   | MEDIUM | Degrades quality; should fix but can proceed | Address if time permits |
   | LOW | Minor issue; cosmetic | Document and monitor |

   Scoring dimensions:

   - Impact (0.5 weight): How damaging if attack succeeds?
   - Likelihood (0.3 weight): How likely is this attack vector?
   - Detectability (0.2 weight): How hard to see this coming?

4. Attack Documentation

   For each attack:

       Attack ID: ATK-[round]-[number]
       Category: [From attack-vector-catalog]
       Target: [What aspect of proposition]
       Statement: [Clear attack formulation]
       Mechanism: [How this would cause harm]
       Severity: [CRITICAL | HIGH | MEDIUM | LOW]
       Steel-manning: [minimal | standard | maximum] - [notes]
       Experience pool match: [Pattern ID if applicable]
Quality Gate: Attacks Substantive
- [ ] Minimum 3 attacks generated
- [ ] At least 2 different attack categories represented
- [ ] Steel-manning applied per level
- [ ] No trivial or easily dismissed attacks
- [ ] Severities assigned with rationale
Output: Prioritized attack list for Blue Team
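The three scoring dimensions from the Severity Scoring step combine into a weighted sum. The sketch below shows that arithmetic; the 0-1 input scale and the cut-offs that map a score to a severity label are assumptions made for illustration, since RUBRIC-07 is not reproduced in this document.

```python
# Weights as stated in the Severity Scoring step.
WEIGHTS = {"impact": 0.5, "likelihood": 0.3, "detectability": 0.2}

# Assumed cut-offs (not taken from RUBRIC-07): score >= cut-off gets the label.
THRESHOLDS = [(0.8, "CRITICAL"), (0.6, "HIGH"), (0.4, "MEDIUM")]


def severity_score(impact: float, likelihood: float, detectability: float) -> float:
    """Weighted sum of the three dimensions, each assumed to be on a 0-1 scale."""
    dims = {"impact": impact, "likelihood": likelihood, "detectability": detectability}
    for name, value in dims.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in dims.items())


def severity_label(score: float) -> str:
    """Map a weighted score to a severity label using the assumed cut-offs."""
    for cutoff, label in THRESHOLDS:
        if score >= cutoff:
            return label
    return "LOW"


def attack_id(round_no: int, number: int) -> str:
    """Format an ID following the ATK-[round]-[number] convention."""
    return f"ATK-{round_no}-{number}"
```

With these assumptions, an attack rated 0.9 for impact and 0.5 for both likelihood and detectability scores 0.45 + 0.15 + 0.10 = 0.70, which falls in the assumed HIGH band.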
Round N: Blue Team Phase
Purpose: Respond to each attack with defenses, mitigations, or proposition hardening.
Reference: See references/blue-team-techniques.md for defense protocols.
Steps:
1. Attack Triage

   Prioritize attacks by severity:

   - CRITICAL: Must address this round
   - HIGH: Should address this round
   - MEDIUM: Address if time/capacity permits
   - LOW: Document for monitoring

2. Defense Generation

   For each attack, determine response type:

   | Response Type | When to Use | Effect |
   |---|---|---|
   | REFUTE | Attack is invalid; evidence proves it wrong | Attack dismissed |
   | MITIGATE | Attack is valid; add safeguards | Risk reduced |
   | ACCEPT | Attack is valid; insufficient mitigation possible | Residual risk documented |
   | HARDEN | Modify proposition to eliminate vulnerability | Proposition strengthened |

   Defense techniques:

   | Technique | Response Type | When to Use |
   |---|---|---|
   | Evidence-Based Refutation | REFUTE | When data contradicts attack |
   | Mitigation Design | MITIGATE | When attack is valid but manageable |
   | Contingency Planning | MITIGATE | When fallback is needed |
   | Monitoring/Detection | MITIGATE | When early warning helps |
   | Hardening Protocol | HARDEN | When proposition can be strengthened |
   | Risk Transfer | MITIGATE | When others can absorb risk |
   | Staged Commitment | MITIGATE | When phasing reduces exposure |
   | Kill Switch Design | MITIGATE | When reversibility is critical |

   - See: `references/blue-team-techniques.md` for detailed protocols

3. Defense Documentation

   For each defense:

       Defense ID: DEF-[round]-[number]
       Attack Addressed: ATK-[round]-[number]
       Response Type: [REFUTE | MITIGATE | ACCEPT | HARDEN]
       Defense: [Specific response]
       Evidence/Rationale: [Why this defense works]
       Residual Risk: [ELIMINATED | REDUCED | UNCHANGED]
       Proposition Change: [If HARDEN, what changed]

4. Proposition Hardening

   Apply all HARDEN responses to proposition:

   - Document each modification
   - Track changes between rounds
   - Maintain hardened proposition version

5. Defense Quality Check

   For each defense, verify:

   - [ ] Defense actually addresses the attack (not adjacent issue)
   - [ ] REFUTE claims have supporting evidence
   - [ ] MITIGATE responses are actionable
   - [ ] ACCEPT responses have residual risk documented
   - [ ] HARDEN changes don't introduce new vulnerabilities
Quality Gate: Attacks Addressed
- [ ] Every attack has a defense response
- [ ] CRITICAL attacks have REFUTE, MITIGATE, or HARDEN (not ACCEPT)
- [ ] HIGH attacks have REFUTE, MITIGATE, HARDEN, or documented ACCEPT with rationale
- [ ] Hardening changes documented
- [ ] No hand-waving defenses
Output: Defense log with updated (hardened) proposition
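The mechanical parts of the "Attacks Addressed" gate can be checked automatically; judging whether a defense is hand-waving still needs a reviewer. A minimal sketch, assuming a hypothetical `Defense` record and `check_attacks_addressed` helper that are not defined by the skill itself:

```python
from dataclasses import dataclass


# Hypothetical record; fields follow the Defense Documentation template.
@dataclass
class Defense:
    attack_id: str
    response_type: str  # REFUTE | MITIGATE | ACCEPT | HARDEN
    rationale: str = ""


def check_attacks_addressed(severities: dict, defenses: list) -> list:
    """Return a list of gate violations; an empty list means the gate passes.

    `severities` maps attack IDs (e.g. "ATK-1-1") to severity labels.
    """
    by_attack = {d.attack_id: d for d in defenses}
    problems = []
    for atk, severity in severities.items():
        defense = by_attack.get(atk)
        if defense is None:
            problems.append(f"{atk}: no defense response")
        elif severity == "CRITICAL" and defense.response_type == "ACCEPT":
            problems.append(f"{atk}: CRITICAL attack cannot be ACCEPTed")
        elif (severity == "HIGH" and defense.response_type == "ACCEPT"
              and not defense.rationale.strip()):
            problems.append(f"{atk}: HIGH attack ACCEPTed without rationale")
    return problems
```

Returning the full list of violations, rather than failing on the first one, lets Blue Team fix every gap in a single pass.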
Round N: Evaluation Phase
Purpose: Determine if another round is needed or convergence achieved.
Reference: See references/convergence-criteria.md for detailed criteria.
Steps:
1. Assess Round Quality

   Red Team assessment:

   - Were attacks substantive or rehashes of previous rounds?
   - Are there novel attack angles remaining?
   - Is Red Team finding diminishing returns?

   Blue Team assessment:

   - Were defenses genuine or hand-waving?
   - Are mitigations actionable?
   - Has proposition been strengthened?

2. Apply Convergence Criteria

   | Mode | Stop When | Continue When |
   |---|---|---|
   | `no_new_critical` | Round produced 0 new CRITICAL or HIGH attacks | New CRITICAL or HIGH attacks found |
   | `all_addressed` | No ACCEPT responses remain (all REFUTE/MITIGATE/HARDEN) | Any ACCEPT responses remain |
   | `round_limit` | `max_rounds` reached | Below `max_rounds` |

   Override conditions (continue despite convergence):

   - Obvious attack categories not yet explored
   - Blue Team defenses appear inadequate
   - Stakeholder requests additional scrutiny

   Premature termination signs (don't stop too early):

   - Fewer than 2 rounds completed
   - CRITICAL attacks still have ACCEPT responses
   - Key attack categories unexplored

3. Document Convergence Decision

       Round [N] Evaluation:
       - New CRITICAL attacks: [count]
       - New HIGH attacks: [count]
       - ACCEPT responses remaining: [count]
       - Convergence mode: [mode]
       - Decision: [CONTINUE | CONVERGED]
       - Rationale: [explanation]

4. Proceed or Exit

   - If NOT CONVERGED: Increment round, return to Red Phase
   - If CONVERGED: Proceed to Post-Round Synthesis
CHECKPOINT: premature_convergence_check
- If convergence met but warning signs present: AskUserQuestion
- Warning signs: <2 rounds, CRITICAL ACCEPTs remain, key categories unexplored

CHECKPOINT: infinite_loop_risk
- If new attacks from defenses exceed previous round: AskUserQuestion
- May indicate fundamental proposition issues

Quality Gate: Convergence Evaluated
- [ ] Explicit continue/stop decision documented
- [ ] Rationale provided
- [ ] Override conditions checked
- [ ] Premature termination signs checked
Output: Convergence decision with rationale
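The two Evaluation Phase checkpoints reduce to simple trigger conditions. The sketch below is one possible reading: the function names are hypothetical, and `infinite_loop_risk` assumes the trigger compares the count of new attacks in the latest round against the round before it.

```python
def premature_convergence_warnings(rounds_completed: int,
                                   critical_accepts: int,
                                   unexplored_categories: list) -> list:
    """Collect the warning signs that should trigger premature_convergence_check."""
    warnings = []
    if rounds_completed < 2:
        warnings.append("fewer than 2 rounds completed")
    if critical_accepts > 0:
        warnings.append("CRITICAL attacks still have ACCEPT responses")
    if unexplored_categories:
        warnings.append("key categories unexplored: " + ", ".join(unexplored_categories))
    return warnings


def infinite_loop_risk(new_attacks_per_round: list) -> bool:
    """True when the latest round produced more new attacks than the previous one.

    Assumption: attack counts are compared round over round; the source does
    not specify how attacks caused by defenses are counted separately.
    """
    return (len(new_attacks_per_round) >= 2
            and new_attacks_per_round[-1] > new_attacks_per_round[-2])
```

If either function reports a problem when the convergence criteria are otherwise met, the skill would ask the user before exiting the loop.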
Post-Round Synthesis
Purpose: Compile findings into actionable outputs.
Reference: See templates/ for output formats.
CHECKPOINT: output_mode_selection
- If `output_mode` not specified: AskUserQuestion
- Options: `risk_assessment`, `hardened_proposition`, `full_log`
Steps:
1. Compile Attack/Defense Log

   Consolidate all rounds:

   - All attacks with responses
   - Round-by-round progression
   - Convergence trajectory
   - See: `templates/attack-defense-log.md`

2. Derive RISK-ASSESSMENT (CONTRACT-08)

   Transform unresolved attacks into risks:

   | Attack Status | Risk Derivation |
   |---|---|
   | ACCEPT response | Direct risk: attack remains valid |
   | MITIGATE with residual | Risk: partially addressed |
   | MITIGATE with ELIMINATED | No risk (resolved) |
   | REFUTE | No risk (invalid attack) |
   | HARDEN | No risk (vulnerability removed) |

   Score each derived risk using SEVERITY-SCORING:

   - Include mitigations from Blue Team responses
   - See: `templates/risk-assessment-output.md`

3. Generate Hardened Proposition

   Compile final version:

   - Original proposition + all HARDEN modifications
   - List of accepted residual risks
   - Battle-tested confidence score
   - Conditions for validity
   - Review triggers
   - See: `templates/hardened-proposition-output.md`

4. Calculate Battle-Tested Confidence

   Score based on:

   - Rounds completed (more = higher confidence)
   - Attack quality (substantive attacks survived)
   - Defense quality (genuine defenses, not hand-waving)
   - Residual risk profile (fewer ACCEPT = higher confidence)

   | Score | Meaning |
   |---|---|
   | 80-100 | High confidence: withstood aggressive scrutiny |
   | 60-79 | Moderate confidence: key challenges addressed |
   | 40-59 | Low confidence: significant risks remain |
   | 0-39 | Very low confidence: fundamental issues unresolved |

5. Generate Go/No-Go Recommendation

   | Recommendation | When |
   |---|---|
   | PROCEED | Low/very low residual risk; proposition battle-tested |
   | PROCEED_WITH_CAUTION | Moderate risk; mitigations in place |
   | SIGNIFICANT_CONCERNS | High risk; key attacks unresolved |
   | DO_NOT_PROCEED | Very high risk; fundamental flaws exposed |
Quality Gates:
- [ ] RISK-ASSESSMENT complete with go/no-go
- [ ] All attacks traced to risks or resolutions
- [ ] Hardened proposition documented
- [ ] Battle-tested confidence calculated
- [ ] Attack/defense log compiled
Output: RISK-ASSESSMENT (CONTRACT-08), Hardened Proposition, Attack/Defense Log
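The risk derivation table and the confidence bands from the synthesis steps are both lookups, sketched below. `derive_risk` and `confidence_band` are hypothetical helpers; the status strings they return are shorthand chosen for this example rather than CONTRACT-08 vocabulary, and how the 0-100 score itself is computed is left open because the source lists only its inputs.

```python
def derive_risk(response_type: str, residual: str = "UNCHANGED") -> str:
    """Map a Blue Team response to a risk status per the risk derivation table."""
    if response_type == "ACCEPT":
        return "direct risk"  # attack remains valid
    if response_type == "MITIGATE":
        # Only a fully eliminated residual counts as resolved.
        return "no risk" if residual == "ELIMINATED" else "partial risk"
    if response_type in ("REFUTE", "HARDEN"):
        return "no risk"
    raise ValueError(f"unknown response type: {response_type!r}")


# Lower bound of each band, from the battle-tested confidence table.
CONFIDENCE_BANDS = [
    (80, "High confidence"),
    (60, "Moderate confidence"),
    (40, "Low confidence"),
    (0, "Very low confidence"),
]


def confidence_band(score: float) -> str:
    """Return the band label for a 0-100 battle-tested confidence score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, label in CONFIDENCE_BANDS:
        if score >= floor:
            return label
    raise AssertionError("unreachable: bands cover 0-100")
```

Every attack in the log should map through `derive_risk` to exactly one status, which is what the "All attacks traced to risks or resolutions" gate asks for.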
5. Attack Vector Catalog
Ten categories of attacks, applicable across subject types:
5.1 ASSUMPTIONS
Definition: Attacks targeting hidden, unstated, or fragile assumptions.
| Attack Pattern | Target | Example |
|---|---|---|
| Hidden assumption exposure | Unstated beliefs | "You're assuming customers want this feature" |
| Load-bearing challenge | Critical assumptions | "If this assumption fails, the whole plan collapses" |
| Temporal decay | Time-sensitive assumptions | "This assumption won't hold in 2 years" |
| Behavioral assumptions | Human behavior predictions | "You're assuming the team will change behavior" |
| Counterfactual reversal | Any assumption | "What if the opposite is true?" |
Risk Level: HIGH (often invisible until failure)
5.2 DEPENDENCIES
Definition: Attacks targeting external or internal dependencies.
| Attack Pattern | Target | Example |
|---|---|---|
| External dependency failure | Third parties | "What if the vendor goes out of business?" |
| Technology obsolescence | Tech dependencies | "This framework may not be maintained in 3 years" |
| Team capability dependency | People | "This requires skills the team doesn't have" |
| Resource availability | Resources | "What if the budget is cut 30%?" |
| Single point of failure | Critical dependencies | "Everything depends on this one system" |
Risk Level: HIGH (external factors often uncontrollable)
5.3 EDGE_CASES
Definition: Attacks targeting boundary conditions and unusual scenarios.
| Attack Pattern | Target | Example |
|---|---|---|
| Boundary conditions | Limits | "What happens at 0? At max capacity?" |
| Scale extremes | Very large/small | "Does this work with 1 user? 1 million?" |
| Timing edge cases | Timing | "What if these events happen simultaneously?" |
| Data quality | Inputs | "What if the input data is garbage?" |
| Concurrency/race conditions | Parallel operations | "What if two users do this at the same time?" |
Risk Level: MEDIUM (often discoverable through testing)
5.4 SCALABILITY
Definition: Attacks targeting ability to grow or shrink.
| Attack Pattern | Target | Example |
|---|---|---|
| Horizontal scaling limits | Adding instances | "Can you just add more servers?" |
| Vertical scaling limits | Bigger instances | "What if you need 10x the memory?" |
| Cost scaling non-linearity | Economics | "Costs grow O(n²) with users" |
| Operational complexity | Team capacity | "Can the team manage 50 services?" |
| Data volume scaling | Storage/processing | "What happens with 10TB of data?" |
Risk Level: HIGH (often not discovered until growth happens)
5.5 SECURITY
Definition: Attacks targeting security posture and vulnerabilities.
| Attack Pattern | Target | Example |
|---|---|---|
| Attack surface exposure | Entry points | "Every API is an attack vector" |
| Data breach scenarios | Data protection | "What if this database is compromised?" |
| Authentication gaps | Identity | "How do you prevent unauthorized access?" |
| Authorization gaps | Permissions | "Can users access others' data?" |
| Compliance violations | Regulations | "Does this violate GDPR?" |
Risk Level: CRITICAL (security failures can be catastrophic)
5.6 COMPETITIVE
Definition: Attacks targeting competitive dynamics.
| Attack Pattern | Target | Example |
|---|---|---|
| Competitor response | Competitive reaction | "What will [competitor] do when they see this?" |
| Market timing | Windows | "The market window may close before launch" |
| Differentiation erosion | Uniqueness | "This feature can be copied in weeks" |
| Pricing pressure | Economics | "Competitor can undercut by 50%" |
| Acquisition/partnership disruption | Strategic moves | "What if competitor acquires your key partner?" |
Risk Level: HIGH (competitive dynamics are unpredictable)
5.7 OPERATIONAL
Definition: Attacks targeting day-to-day operations.
| Attack Pattern | Target | Example |
|---|---|---|
| Complexity explosion | Manageability | "This will be impossible to debug" |
| Incident scenarios | Failure recovery | "What's the MTTR when this breaks at 3 AM?" |
| Recovery time | Resilience | "Can you recover within SLA?" |
| Monitoring gaps | Observability | "How would you even know it's failing?" |
| On-call burden | Team health | "This will burn out the team" |
Risk Level: MEDIUM-HIGH (operational issues compound)
5.8 ECONOMIC
Definition: Attacks targeting financial viability.
| Attack Pattern | Target | Example |
|---|---|---|
| Unit economics failure | Per-unit costs | "Each customer costs more than they pay" |
| Cost structure vulnerability | Fixed costs | "Break-even requires 10x current volume" |
| Revenue model fragility | Income sources | "Revenue depends on one customer segment" |
| Funding/cash flow | Capital | "You'll run out of runway in 8 months" |
| Market size overestimation | TAM/SAM/SOM | "Your market is 1/10th the claimed size" |
Risk Level: HIGH (financial failure is existential)
5.9 ORGANIZATIONAL
Definition: Attacks targeting people and organization.
| Attack Pattern | Target | Example |
|---|---|---|
| Capability gaps | Skills | "No one on the team has done this before" |
| Key person dependency | Individuals | "If [person] leaves, this fails" |
| Cultural resistance | Adoption | "The organization will reject this change" |
| Political opposition | Stakeholders | "[Executive] will block this" |
| Change management | Transition | "Users will refuse to migrate" |
Risk Level: MEDIUM-HIGH (organizational dynamics are complex)
5.10 TEMPORAL
Definition: Attacks targeting timing and duration.
| Attack Pattern | Target | Example |
|---|---|---|
| Timeline compression | Deadlines | "What if the deadline is moved up 3 months?" |
| Timeline extension impact | Delays | "What if this takes twice as long?" |
| Market window closure | Timing | "The opportunity won't exist in 12 months" |
| Technology obsolescence | Tech lifecycle | "This technology will be obsolete" |
| Regulatory timeline | External deadlines | "Regulation changes in 6 months" |
Risk Level: HIGH (timing failures are often unrecoverable)
6. Convergence Criteria
Mode Definitions
| Mode | Definition | Best For |
|---|---|---|
| `no_new_critical` | Stop when round produces 0 new CRITICAL or HIGH attacks | Most use cases |
| `all_addressed` | Stop when no ACCEPT responses remain | High-stakes decisions |
| `round_limit` | Stop at `max_rounds` regardless | Time-constrained reviews |
Measurement Methods
no_new_critical:
- Count CRITICAL attacks generated this round: must be 0
- Count HIGH attacks generated this round: must be 0
- Attacks that are variants of previous attacks don't count as "new"

all_addressed:
- Count ACCEPT responses across all rounds
- Must be 0 (all attacks have REFUTE, MITIGATE, or HARDEN)

round_limit:
- Simply check `current_round >= max_rounds`
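The three measurement methods can be expressed as one predicate. This is a sketch rather than the skill's implementation: `has_converged` is a hypothetical name, and treating `max_rounds` as a hard ceiling under every mode is an assumption based on its description as the maximum number of cycles. Override conditions and premature termination signs still need separate review.

```python
def has_converged(mode: str, *, new_critical: int, new_high: int,
                  accept_count: int, current_round: int, max_rounds: int) -> bool:
    """Evaluate the stop condition for the given convergence mode.

    Counts of "new" attacks should already exclude variants of earlier attacks.
    """
    # Assumption: max_rounds bounds the loop regardless of mode.
    if current_round >= max_rounds:
        return True
    if mode == "no_new_critical":
        return new_critical == 0 and new_high == 0
    if mode == "all_addressed":
        return accept_count == 0
    if mode == "round_limit":
        return False  # only the round ceiling above can stop this mode
    raise ValueError(f"unknown convergence mode: {mode!r}")
```

For instance, a first round that surfaces 2 new CRITICAL and 2 new HIGH attacks under `no_new_critical` with `max_rounds=3` does not converge, so the loop returns to the Red Phase.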
Override Conditions (Continue Despite Convergence)
- Obvious attack categories not yet explored
- Stakeholder requests additional rounds
- Blue Team defenses appear superficial
- Recent hardening changes may introduce new vulnerabilities
Premature Termination Signs (Don't Stop Too Early)
- Fewer than 2 rounds completed
- CRITICAL attacks still have ACCEPT responses
- Attack quality improving (not diminishing) each round
- Key experience pool patterns not yet probed
7. Output Specifications
7.1 Primary Output: RISK-ASSESSMENT
Compliant with CONTRACT-08 from artifact-contracts.yaml.
See: templates/risk-assessment-output.md for complete XML template.
Key extensions for adversarial validation:
- `<adversarial_summary>` with attack/defense statistics
- Risks traced to source attacks (ATK-X-Y)
- Battle-tested confidence score
- Defense quality assessment
7.2 Secondary Output: Hardened Proposition
See: templates/hardened-proposition-output.md for complete template.
Includes:
- Original proposition vs. battle-tested version
- All modifications with rationale
- Accepted residual risks
- Conditions for validity
- Review triggers
7.3 Secondary Output: Attack/Defense Log
See: templates/attack-defense-log.md for complete template.
Includes:
- Round-by-round attack and defense tables
- Convergence evaluation per round
- Severity distribution
- Resolution statistics
8. Quality Gates Summary
| # | Gate | Criterion | Phase |
|---|---|---|---|
| 1 | Attack Surface Mapped | Proposition boundaries defined, categories selected | Pre-Round |
| 2 | Experience Pool Loaded | Domain patterns available (if enabled) | Pre-Round |
| 3 | Attacks Substantive | Attacks are non-trivial, steel-manned | Round N: Red |
| 4 | Attacks Diverse | At least 2 different categories represented | Round N: Red |
| 5 | Severities Assigned | All attacks have severity with rationale | Round N: Red |
| 6 | All Attacks Addressed | Every attack has a defense response | Round N: Blue |
| 7 | Critical Attacks Defended | CRITICAL have REFUTE, MITIGATE, or HARDEN; HIGH may ACCEPT only with documented rationale | Round N: Blue |
| 8 | No Hand-Waving | Defenses are actionable, not vague | Round N: Blue |
| 9 | Convergence Evaluated | Explicit continue/stop decision | Round N: Eval |
| 10 | Risks Derived | Unresolved attacks become risks | Post-Round |
| 11 | Go/No-Go Issued | Clear recommendation | Post-Round |
| 12 | Hardened Proposition | Battle-tested version documented | Post-Round |
Gate Requirements by Intensity
| Gate | Light | Standard | Aggressive |
|---|---|---|---|
| Attack categories | 3 | 5 | All applicable |
| Minimum attacks | 5 | 10 | 15+ |
| Steel-manning level | minimal | standard | maximum |
| Convergence mode | round_limit (2) | no_new_critical | no_new_critical |
| Max rounds | 2 | 3 | 5 |
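The intensity table works as a lookup that downstream gates can consult. A minimal sketch follows; `GATE_REQUIREMENTS` and `meets_attack_minimum` are hypothetical names, and because the table does not say whether the attack minimum applies per round or across the whole run, the caller decides which count to pass in.

```python
# Values copied from the table above; None means "all applicable" categories.
GATE_REQUIREMENTS = {
    "light": {
        "categories": 3, "min_attacks": 5, "steel_manning_level": "minimal",
        "convergence_mode": "round_limit", "max_rounds": 2,
    },
    "standard": {
        "categories": 5, "min_attacks": 10, "steel_manning_level": "standard",
        "convergence_mode": "no_new_critical", "max_rounds": 3,
    },
    "aggressive": {
        "categories": None, "min_attacks": 15, "steel_manning_level": "maximum",
        "convergence_mode": "no_new_critical", "max_rounds": 5,
    },
}


def meets_attack_minimum(intensity: str, attacks_generated: int) -> bool:
    """Check a count of generated attacks against the intensity's minimum."""
    return attacks_generated >= GATE_REQUIREMENTS[intensity]["min_attacks"]
```

Keeping the requirements in one table-shaped structure makes it easy to compare against the prose table when either changes.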
9. Behavioral Guidelines
Red Team Principles
- Steel-man, don't strawman: Make attacks as strong as possible
- Attack the proposition, not the proposer: Focus on ideas, not people
- Be creative but realistic: Novel attacks should be plausible
- Prioritize ruthlessly: CRITICAL issues first
- Use the experience pool: Don't reinvent known failures
- Ideological Turing test: Would a true critic accept this attack?
Blue Team Principles
- Defend genuinely, don't dismiss: Every attack deserves honest consideration
- Evidence over assertion: REFUTE claims need proof
- Actionable mitigations: MITIGATE responses must be specific
- Honest acceptance: If you can't defend, ACCEPT the risk
- Harden proactively: Don't wait for attacks to strengthen
- Avoid defensive denial: Admitting weakness is strength
Tone Calibration
| Intensity | Red Team Tone | Blue Team Tone |
|---|---|---|
| Light | Collaborative skeptic | Quick sanity check |
| Standard | Professional adversary | Thorough defense |
| Aggressive | Determined opponent | Comprehensive rebuttal |
10. Workflow Integration
Upstream Skills
| Skill | Provides | Use Case |
|---|---|---|
| `assumption-validator` | Assumption inventory | Attack assumptions already surfaced |
| `expert-panel-deliberation` | Multi-perspective input | Diverse attack/defense viewpoints |
| `research-interviewer` | KNOWLEDGE-CORPUS | Domain knowledge for attacks |
Downstream Skills
| Skill | Receives | Use Case |
|---|---|---|
| `expert-panel-deliberation` | RISK-ASSESSMENT | Panel review of risks |
| `generate-ideas` | Attack gaps | Generate alternatives for failed propositions |
Skill Chaining Example
assumption-validator → RISK-ASSESSMENT (assumption-derived)
↓
red-blue-validator → RISK-ASSESSMENT (adversarial-derived)
↓
expert-panel-deliberation → Final recommendation with multi-expert review
11. References
| Document | Purpose |
|---|---|
| `references/attack-vector-catalog.md` | 10 attack categories with specific attacks |
| `references/red-team-techniques.md` | 8 attack generation techniques |
| `references/blue-team-techniques.md` | 8 defense techniques |
| `references/steel-manning-protocol.md` | Protocol for maximizing attack strength |
| `references/convergence-criteria.md` | Detailed criteria for stopping |
| `references/experience-pool-patterns.md` | 50+ failure patterns by domain |
Core Library References
| Library | Element | Usage |
|---|---|---|
| `core/skill-patterns.yaml` | PATTERN-06: ADVERSARIAL-VALIDATE | Workflow pattern |
| `core/artifact-contracts.yaml` | CONTRACT-08: RISK-ASSESSMENT | Output format |
| `core/scoring-rubrics.yaml` | RUBRIC-07: SEVERITY-SCORING | Attack severity |
| `core/technique-taxonomy.yaml` | CAT-UR, CAT-PP | Adversarial techniques |
12. Templates
| Template | Purpose |
|---|---|
| `templates/risk-assessment-output.md` | CONTRACT-08 compliant RISK-ASSESSMENT with adversarial extensions |
| `templates/attack-defense-log.md` | Round-by-round attack/defense documentation |
| `templates/hardened-proposition-output.md` | Battle-tested proposition with modifications |
13. Examples
Example 1: Architecture Decision — Microservices Migration
input:
subject: "Migrate payment processing from monolith to microservices"
subject_type: architecture
max_rounds: 3
attack_intensity: standard
convergence_mode: no_new_critical
include_experience_pool: true
steel_manning_level: standard
flow:
pre_round:
proposition: "Decompose payment monolith into 5 microservices over 12 months"
attack_surface:
- ASSUMPTIONS: Team capability, timeline, complexity estimates
- DEPENDENCIES: Infrastructure, vendor APIs, data consistency
- SCALABILITY: Service coordination overhead
- OPERATIONAL: Debugging distributed systems
- EDGE_CASES: Partial failures, network partitions
experience_pool_loaded:
- "Distributed monolith anti-pattern"
- "Service boundary misalignment"
- "Operational complexity explosion"
round_1:
red_team:
attacks:
- ATK-1-1: "Team has zero production microservices experience"
Category: ORGANIZATIONAL
Severity: CRITICAL
Steel-manned: "Even with training, production microservices require
tacit knowledge that only comes from operating them"
- ATK-1-2: "12-month timeline ignores learning curve and unknowns"
Category: TEMPORAL
Severity: HIGH
Steel-manned: "Industry benchmarks show microservices migrations
typically take 2-3x initial estimates"
- ATK-1-3: "Distributed transactions will break payment consistency"
Category: EDGE_CASES
Severity: CRITICAL
Steel-manned: "Payment systems require ACID guarantees that
eventual consistency cannot provide"
- ATK-1-4: "Debugging distributed payment failures at 3 AM"
Category: OPERATIONAL
Severity: HIGH
Steel-manned: "When payments fail across service boundaries,
MTTR could exceed SLA without distributed tracing expertise"
new_critical: 2
new_high: 2
blue_team:
defenses:
- DEF-1-1: Response to ATK-1-1
Type: MITIGATE
Defense: "Hire 2 senior engineers with microservices experience.
Engage architecture consultancy for first 6 months."
Residual: REDUCED
- DEF-1-2: Response to ATK-1-2
Type: HARDEN
Defense: "Extend timeline to 18 months. Add 3-month buffer for unknowns."
Proposition Change: "12 months" → "18 months with 3-month buffer"
Residual: ELIMINATED
- DEF-1-3: Response to ATK-1-3
Type: HARDEN
Defense: "Keep payment processing in single service with ACID guarantees.
Only extract non-critical services to microservices."
Proposition Change: "5 microservices" → "3 microservices + 1 payment service"
Residual: ELIMINATED
- DEF-1-4: Response to ATK-1-4
Type: MITIGATE
Defense: "Implement distributed tracing (Jaeger) before migration.
Establish on-call runbooks. Require observability as launch gate."
Residual: REDUCED
evaluation:
new_critical: 2
new_high: 2
convergence_mode: no_new_critical
decision: CONTINUE
rationale: "New critical attacks found; continue to Round 2"
round_2:
red_team:
attacks:
- ATK-2-1: "Hiring 2 senior engineers within the 6-month consultancy window is optimistic"
Category: ORGANIZATIONAL
Severity: HIGH
Steel-manned: "Market for microservices expertise is extremely
competitive; hiring may slip past the 6-month consultancy engagement"
- ATK-2-2: "Distributed tracing adds operational complexity itself"
Category: OPERATIONAL
Severity: MEDIUM
Steel-manned: "Jaeger requires infrastructure, maintenance,
and expertise to operate"
- ATK-2-3: "Service boundary around payments may be wrong"
Category: ASSUMPTIONS
Severity: MEDIUM
Steel-manned: "Without event storming, service boundaries
are guesses that may need rework"
new_critical: 0
new_high: 1
blue_team:
defenses:
- DEF-2-1: Response to ATK-2-1
Type: MITIGATE
Defense: "Begin hiring immediately. Have contingency: extend
consultancy or use contractor bridge if hiring slips."
Residual: REDUCED
- DEF-2-2: Response to ATK-2-2
Type: ACCEPT
Defense: "Accept additional complexity as cost of observability.
Allocate 0.5 FTE for observability platform."
Residual: UNCHANGED (but monitored)
- DEF-2-3: Response to ATK-2-3
Type: HARDEN
Defense: "Conduct event storming workshop before finalizing
service boundaries. Add 4 weeks for domain modeling."
Proposition Change: Add event storming phase
Residual: ELIMINATED
evaluation:
new_critical: 0
new_high: 1
convergence_mode: no_new_critical
decision: CONTINUE
rationale: "No new CRITICAL attacks, but a new HIGH attack (ATK-2-1) was found; continue to Round 3"
round_3:
red_team:
attacks:
- ATK-3-1: "Event storming may reveal the migration is unnecessary"
Category: ASSUMPTIONS
Severity: MEDIUM
Steel-manned: "Domain modeling might show modular monolith
is sufficient for scaling needs"
- ATK-3-2: "Consultancy dependency creates knowledge transfer risk"
Category: DEPENDENCIES
Severity: MEDIUM
Steel-manned: "If consultants leave, tacit knowledge leaves with them"
new_critical: 0
new_high: 0
blue_team:
defenses:
- DEF-3-1: Response to ATK-3-1
Type: ACCEPT
Defense: "Valid point. Event storming is a gate; if it reveals
microservices aren't needed, we pivot to modular monolith."
Residual: UNCHANGED (but this is a feature, not a bug)
- DEF-3-2: Response to ATK-3-2
Type: MITIGATE
Defense: "Require knowledge transfer sessions, documentation
deliverables, and pair programming in consultancy contract."
Residual: REDUCED
evaluation:
new_critical: 0
new_high: 0
convergence_mode: no_new_critical
decision: CONVERGED
rationale: "No new CRITICAL or HIGH attacks. Proposition has been hardened."
post_round:
hardened_proposition:
original: "Migrate payment processing from monolith to 5 microservices
over 12 months"
battle_tested: "Migrate to 3 microservices + 1 payment service over
18 months (with 3-month buffer), after event storming
confirms boundaries. Requires: 2 hired senior engineers,
architecture consultancy, distributed tracing infrastructure,
0.5 FTE observability platform maintenance."
modifications:
- MOD-1: 5 services → 3 + 1 payment (Response to ATK-1-3)
- MOD-2: 12 months → 18 months + buffer (Response to ATK-1-2)
- MOD-3: Added event storming prerequisite (Response to ATK-2-3)
- MOD-4: Added observability infrastructure requirement (Response to ATK-1-4)
accepted_residual_risks:
- "Hiring timeline may slip (mitigated by contingency)"
- "Observability platform adds operational overhead (accepted)"
- "Event storming may reveal migration unnecessary (feature)"
- "Consultancy knowledge transfer requires active management"
battle_tested_confidence: 72
confidence_rationale: "Proposition survived 3 rounds of substantive attacks.
Critical issues addressed through hardening.
Residual risks are manageable and monitored."
risk_assessment:
total_risks: 4
critical_risks: 0
high_risks: 1 (hiring timeline)
moderate_risks: 3
risk_profile: MODERATE
go_no_go: PROCEED_WITH_CAUTION
recommendation: |
PROCEED_WITH_CAUTION — The original proposition had critical flaws
(team capability, payment consistency, timeline). The hardened
proposition addresses these through:
- Scoped migration (keeping payments transactional)
- Extended timeline with buffer
- Event storming validation gate
- Observability prerequisites
Key risks to monitor:
1. Hiring: Start immediately; have contingency ready
2. Consultancy knowledge transfer: Contract requirements
3. Event storming outcome: Be prepared to pivot if boundaries don't hold
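The evaluation steps in this example follow a simple rule that can be sketched in code. The sketch below assumes, based only on the decisions shown in these examples, that a round converges when it surfaces no new CRITICAL and no new HIGH attacks; the authoritative rules live in references/convergence-criteria.md. `RoundResult`, `evaluate`, and the `MAX_ROUNDS_REACHED` label are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RoundResult:
    new_critical: int
    new_high: int


def evaluate(result: RoundResult, round_number: int, max_rounds: int) -> str:
    """Return the decision for one round under the assumed convergence rule."""
    # Assumption inferred from the examples: a new HIGH attack also blocks convergence.
    if result.new_critical == 0 and result.new_high == 0:
        return "CONVERGED"
    # Hypothetical label for hitting the round cap without converging.
    if round_number >= max_rounds:
        return "MAX_ROUNDS_REACHED"
    return "CONTINUE"


# Attack counts taken from the three rounds of Example 1.
rounds: List[RoundResult] = [RoundResult(2, 2), RoundResult(0, 1), RoundResult(0, 0)]
decisions = [evaluate(r, i, max_rounds=3) for i, r in enumerate(rounds, start=1)]
print(decisions)  # ['CONTINUE', 'CONTINUE', 'CONVERGED']
```

Checking convergence before the round cap means a proposition that converges on its final allowed round is still reported as converged rather than as out of rounds.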
Example 2: Strategy Validation — Market Expansion
input:
subject: "Expand to European market in Q3 with existing product"
subject_type: strategy
max_rounds: 2
attack_intensity: standard
convergence_mode: no_new_critical
flow:
round_1:
red_team:
attacks:
- ATK-1-1: "GDPR compliance timeline is unrealistic"
Severity: CRITICAL
- ATK-1-2: "Competitor X already dominates EU market"
Severity: HIGH
- ATK-1-3: "Pricing model assumes US willingness-to-pay"
Severity: HIGH
- ATK-1-4: "No local sales team or market knowledge"
Severity: MEDIUM
blue_team:
defenses:
- DEF-1-1: HARDEN - Push launch to Q4; engage GDPR consultancy
- DEF-1-2: MITIGATE - Focus on underserved segments competitor ignores
- DEF-1-3: HARDEN - Conduct pricing research; plan EU-specific pricing
- DEF-1-4: MITIGATE - Partner with EU distributor initially
evaluation:
new_critical: 1
new_high: 2
decision: CONTINUE
round_2:
red_team:
attacks:
- ATK-2-1: "EU distributor takes 40% margin"
Severity: MEDIUM
- ATK-2-2: "Underserved segments may be underserved for good reason"
Severity: MEDIUM
blue_team:
defenses:
- DEF-2-1: ACCEPT - Cost of market entry; build direct sales in Year 2
- DEF-2-2: MITIGATE - Validate segment with EU market research
evaluation:
new_critical: 0
new_high: 0
decision: CONVERGED
post_round:
hardened_proposition: |
Launch EU expansion in Q4 (not Q3) with:
- GDPR compliance verified by consultancy
- EU-specific pricing based on market research
- Initial distribution through EU partner
- Focus on [specific underserved segments]
- Year 2: Build direct sales capability
risk_assessment:
profile: MODERATE
go_no_go: PROCEED_WITH_CAUTION
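Two bookkeeping checks are implicit in every round: tallying attacks by severity and confirming that each attack received a defense. The sketch below applies both to Round 1 of Example 2. It relies on the ID pairing seen throughout these examples (DEF-r-n responds to ATK-r-n); the `undefended` helper is a hypothetical name, not part of the skill.

```python
from collections import Counter
from typing import Dict, List

# Round 1 of Example 2: attack ID -> severity, defense ID -> defense type.
attacks: Dict[str, str] = {
    "ATK-1-1": "CRITICAL",
    "ATK-1-2": "HIGH",
    "ATK-1-3": "HIGH",
    "ATK-1-4": "MEDIUM",
}
defenses: Dict[str, str] = {
    "DEF-1-1": "HARDEN",
    "DEF-1-2": "MITIGATE",
    "DEF-1-3": "HARDEN",
    "DEF-1-4": "MITIGATE",
}


def undefended(attacks: Dict[str, str], defenses: Dict[str, str]) -> List[str]:
    """List attack IDs with no matching DEF-r-n entry."""
    answered = {"ATK-" + def_id[len("DEF-"):] for def_id in defenses}
    return sorted(atk for atk in attacks if atk not in answered)


severity_counts = Counter(attacks.values())
print(dict(severity_counts))          # {'CRITICAL': 1, 'HIGH': 2, 'MEDIUM': 1}
print(undefended(attacks, defenses))  # []
```

A non-empty result from `undefended` would indicate that Blue Team skipped an attack, which matters most under a convergence mode such as all_addressed.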
Example 3: Investment Decision — Series B Funding
input:
subject: "Accept Series B term sheet at $50M valuation"
subject_type: investment
max_rounds: 2
attack_intensity: standard
convergence_mode: no_new_critical
flow:
pre_round:
proposition: "Accept $15M Series B at $50M pre-money valuation from [VC Firm]"
attack_surface:
- ECONOMIC: Valuation, dilution, runway
- ASSUMPTIONS: Growth projections, market size
- DEPENDENCIES: VC firm reputation, board dynamics
- COMPETITIVE: Market timing, competitor funding
round_1:
red_team:
attacks:
- ATK-1-1: "Valuation assumes 3x YoY growth; current trajectory is 1.8x"
Category: ASSUMPTIONS
Severity: HIGH
Steel-manned: "At 1.8x growth, next round valuation math doesn't work;
down round likely in 18 months"
- ATK-1-2: "15-month runway at current burn; need to hit milestones or raise bridge"
Category: ECONOMIC
Severity: HIGH
Steel-manned: "Milestones require growth acceleration you haven't demonstrated"
- ATK-1-3: "[VC Firm] has reputation for replacing founders at Series C"
Category: DEPENDENCIES
Severity: MEDIUM
Steel-manned: "3 of their last 5 Series B companies had founder transitions"
- ATK-1-4: "Competitor just raised $40M; will outspend on customer acquisition"
Category: COMPETITIVE
Severity: HIGH
Steel-manned: "Their CAC advantage compounds; market share gap widens"
blue_team:
defenses:
- DEF-1-1: Response to ATK-1-1
Type: HARDEN
Defense: "Negotiate milestone-based valuation adjustment; lower initial
valuation with ratchet up if growth targets hit"
Proposition Change: Add milestone ratchet provision
- DEF-1-2: Response to ATK-1-2
Type: MITIGATE
Defense: "Negotiate 18-month runway minimum; reduce burn by 20% through
hiring pause, extending runway from 15 to roughly 18.75 months"
Residual: REDUCED
- DEF-1-3: Response to ATK-1-3
Type: MITIGATE
Defense: "Negotiate founder-friendly protective provisions; 2-year
employment agreements; board composition safeguards"
Residual: REDUCED
- DEF-1-4: Response to ATK-1-4
Type: ACCEPT
Defense: "Competitive pressure is real but unavoidable. Focus on
capital-efficient growth and product differentiation over
a CAC war. This is market reality, not a term sheet issue."
Residual: UNCHANGED
evaluation:
new_critical: 0
new_high: 3
decision: CONTINUE
round_2:
red_team:
attacks:
- ATK-2-1: "Milestone ratchet creates misaligned incentives; may optimize
for metrics over business health"
Category: ASSUMPTIONS
Severity: MEDIUM
- ATK-2-2: "Hiring pause delays product roadmap; competitive gap widens"
Category: TEMPORAL
Severity: MEDIUM
blue_team:
defenses:
- DEF-2-1: Response to ATK-2-1
Type: MITIGATE
Defense: "Structure milestones around leading indicators (retention,
NPS) not just growth metrics"
Residual: REDUCED
- DEF-2-2: Response to ATK-2-2
Type: ACCEPT
Defense: "Trade-off accepted; survival > speed. Revisit hiring
after 6-month runway checkpoint."
Residual: UNCHANGED
evaluation:
new_critical: 0
new_high: 0
decision: CONVERGED
post_round:
hardened_proposition: |
Accept Series B with modifications:
- Milestone-based valuation: $45M base + $10M ratchet if 2.5x growth
- 18-month minimum runway commitment
- Founder protective provisions (2-year agreements, board balance)
- Hiring pause for 6 months; revisit at runway checkpoint
- Milestones tied to retention/NPS, not just growth
risk_assessment:
total_risks: 4
risk_profile: MODERATE
go_no_go: PROCEED_WITH_CAUTION
recommendation: |
PROCEED_WITH_CAUTION — Accept modified term sheet. Key risks:
1. Competitive pressure (accepted as market reality)
2. Growth trajectory uncertainty (mitigated by ratchet)
3. Founder/board dynamics (mitigated by provisions)
Negotiate the hardened terms before signing. Walk away if the
milestone ratchet or protective provisions are rejected.
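The numbers in this example can be sanity-checked with a few lines of arithmetic. The sketch below makes three assumptions that the example does not state: the investment stays at $15M in every scenario, the ratchet moves the pre-money valuation from $45M to $55M, and runway is cash divided by monthly burn with cash held fixed. All function names are hypothetical.

```python
def post_money(pre_money_m: float, investment_m: float) -> float:
    """Post-money valuation in $M."""
    return pre_money_m + investment_m


def investor_stake(pre_money_m: float, investment_m: float) -> float:
    """Fraction of the company the new investor holds after the round."""
    return investment_m / post_money(pre_money_m, investment_m)


def runway_after_burn_cut(runway_months: float, burn_cut: float) -> float:
    """Runway after reducing burn by the given fraction, with cash held fixed."""
    return runway_months / (1 - burn_cut)


INVESTMENT_M = 15  # assumed constant across scenarios

original = investor_stake(50, INVESTMENT_M)   # original term sheet
base = investor_stake(45, INVESTMENT_M)       # hardened base valuation
ratcheted = investor_stake(55, INVESTMENT_M)  # assumed: base plus $10M ratchet

print(f"original: {original:.1%}")    # original: 23.1%
print(f"base: {base:.1%}")            # base: 25.0%
print(f"ratcheted: {ratcheted:.1%}")  # ratcheted: 21.4%
print(round(runway_after_burn_cut(15, 0.20), 2))  # 18.75
```

Under these assumptions the founders give up slightly more at the base valuation than in the original offer and slightly less if the ratchet triggers, and a 20% burn reduction clears the 18-month runway minimum with little margin.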
14. Quick Start
Minimal Invocation
Red team this: [paste proposition]
Standard Invocation
subject_type: decision
attack_intensity: standard
convergence_mode: no_new_critical
Proposition: [description or document]
Full Parameter Invocation
subject_type: architecture
max_rounds: 4
attack_intensity: aggressive
attack_categories: [ASSUMPTIONS, SCALABILITY, SECURITY, OPERATIONAL]
convergence_mode: all_addressed
include_experience_pool: true
steel_manning_level: maximum
output_mode: full
Proposition: [detailed description]
Context:
- Stakes: [why this matters]
- Constraints: [limitations]
- Stakeholders: [who cares]
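For tooling that wraps this skill, the parameter lines of an invocation can be read into a typed dictionary. The following is a minimal sketch, not the skill's own parser. It assumes parameters are lowercase `key: value` lines and treats capitalized entries such as Proposition and Context as free text to be passed through separately. `parse_value` and `parse_invocation` are hypothetical names.

```python
from typing import Dict, List, Union

Value = Union[str, int, bool, List[str]]


def parse_value(raw: str) -> Value:
    """Convert one raw value: [A, B] lists, true/false, integers, else plain text."""
    raw = raw.strip()
    if raw.startswith("[") and raw.endswith("]"):
        return [item.strip() for item in raw[1:-1].split(",") if item.strip()]
    if raw in ("true", "false"):
        return raw == "true"
    if raw.isdigit():
        return int(raw)
    return raw


def parse_invocation(text: str) -> Dict[str, Value]:
    """Collect lowercase `key: value` parameter lines and skip everything else."""
    params: Dict[str, Value] = {}
    for line in text.splitlines():
        key, sep, raw = line.strip().partition(":")
        key = key.strip()
        # Assumption: parameters use lowercase snake_case keys; capitalized
        # entries (Proposition, Context) and bullet lines are free text.
        if not sep or not key or not key.islower() or key.startswith("-"):
            continue
        params[key] = parse_value(raw)
    return params


invocation = """
subject_type: architecture
max_rounds: 4
attack_categories: [ASSUMPTIONS, SCALABILITY, SECURITY, OPERATIONAL]
include_experience_pool: true
Proposition: [detailed description]
Context:
- Stakes: [why this matters]
"""
params = parse_invocation(invocation)
print(params)
```

Parsing into native types lets a caller check `max_rounds` or `include_experience_pool` directly before starting a validation run.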