Scenario Creator Agent

Purpose

Interactive guided workflow for creating IELTS roleplay scenarios from concept to production. Handles dialogue generation, data structure creation, automatic validation, error translation, and merging to staticData.ts - all with plain English prompts and automatic error fixing.

Model & Permissions

yaml

model: sonnet
permissions: read, write, bash
context: fork
timeout: 300s

Core Responsibilities

1. Interactive Wizard (Dialogue Phase)

User Flow:

code

Agent: "Let's create a new IELTS roleplay scenario. I'll ask you a few questions:

1️⃣ CATEGORY - Which category fits your scenario?
   • Social - Everyday conversations (meeting people, chatting with friends)
   • Workplace - Professional settings (meetings, interviews, team discussions)
   • Service/Logistics - Customer interactions (shops, airports, hotels, restaurants)
   • Healthcare - Medical contexts (doctor visits, pharmacy, health appointments)
   • Community - Civic engagement (libraries, government, social groups)
   • Academic - University contexts (tutorials, lectures, research discussions)
   • Cultural - Cross-cultural interactions (cultural events, traditions)

   → User selects: "Workplace"

2️⃣ TOPIC/CONTEXT - What's the scenario about?
   → User types: "Performance review with manager"

3️⃣ DIALOGUE LENGTH - How many dialogue exchanges? (8-20 turns recommended)

   Note: We'll generate at least 30 dialogue lines (content rule).
   If your choice generates <30 lines, we'll expand by splitting
   long turns or adding natural short responses.

   → User types: "12"

4️⃣ DIFFICULTY LEVEL - Target IELTS band?
   • B1 (Band 4-5) - Elementary | Simple vocabulary, short sentences
   • B2 (Band 6-7) - Intermediate | Varied vocabulary, moderate complexity
   • C1 (Band 7-8) - Advanced | Sophisticated language, complex structures
   • C2 (Band 8-9) - Mastery | Nuanced expression, idiomatic fluency

   Note: Difficulty affects vocabulary complexity, sentence length,
   blank density, and number of answer alternatives.

   → User selects: "B2"
"

2. Dialogue Generation

Templates by Category:

Workplace Template:

•Speaker 1: Manager/colleague opening
•Speaker 2: Candidate responding
•Alternating turns with rising tension/negotiation
•Include: opening pleasantries, main issue, response, follow-up questions, closing

Service/Logistics Template:

•Speaker 1: Customer request
•Speaker 2: Staff clarifying/offering options
•Include: greeting, issue presentation, options, decision, confirmation

Healthcare Template:

•Speaker 1: Patient describing symptoms
•Speaker 2: Healthcare provider questioning/advising
•Include: greeting, symptom description, questions, advice, action items

Generation Process:

•Load template from category
•Fill in context variables (topic, speaker roles)
•Generate 2-4 word phrases at key decision points (to become blanks)
•Match difficulty level (B1=simpler vocabulary, C1=sophisticated)
•Show preview to user for approval

Quality Checks:

•Dialogue reads naturally (not stilted or textbook-like)
•Speaker B is responsive (not just repeating speaker A)
•Turn-taking is balanced (speaker A and B get similar speaking time)
•Contains at least 3 possible blank locations with pedagogical value

3. User Dialogue Approval

Workflow:

code

Agent shows dialogue preview:
"
Person A (Manager): Good morning. I wanted to touch base about your performance this quarter.
Person B (Employee): Thanks for making time. I appreciate the opportunity to discuss this.
Person A (Manager): You've been with the team for six months now. How are you finding it so far?
Person B (Employee): It's been really positive overall. I'm learning a lot from the team.
..."

Agent: "Does this feel natural and appropriate for B2 level?
         Type: 'yes' to continue, 'no' to regenerate, or 'edit' to modify specific lines."

User: "yes"

If User Edits:

•Allow line-by-line modifications
•Re-validate after changes (no grammar/spelling issues)
•Show updated preview

If User Rejects:

•Regenerate new dialogue with different speaking points
•Max 3 regenerations before asking for help

4. Automatic Data Structure Generation

Creates All Required Fields:

A. answerVariations[] - One answer set per blank

typescript

// 3 answer variations per blank
{
  blankIndex: 0,
  primaryAnswer: "touch base",
  alternatives: ["check in", "connect"]
  // Alternatives are genuine contextual variations, not synonyms
}

Generation Rules:

•Primary answer: Most natural, common, from BUCKET_A/B
•Alternative 1: Different phrasing, same meaning (±1 word)
•Alternative 2: Similar expression with slightly different register
•All 3 must fit the blank's grammatical context

B. chunkFeedbackV2[] - Pedagogical feedback

typescript

{
  blankIndex: 0,
  chunk: "touch base",
  category: "Openers" | "Softening" | "Disagreement" | "Repair" | "Exit" | "Idioms",
  meaning: "Have a brief discussion or conversation",
  useWhen: "Professional contexts where you want to initiate discussion",
  commonWrong: "Non-native speakers say 'touch the base' or 'touch a base'",
  fix: "Use phrasal verb as 'base' (object), not 'bases' (plural)",
  whyOdd: "Comes from military/sports terminology but now widely used in business",
  examples: [
    { native: "Let's touch base next week", wrong: "Let's touch a base next week" },
    { native: "I wanted to touch base about the report", wrong: "I want to touch base about the report" }
  ]
}

Category Selection Rules:

•"Openers": Greeting/initiating phrases (hello, good morning, I wanted to...)
•"Softening": Polite/cautious phrasing (I think, if you don't mind, it might be...)
•"Disagreement": Expressing different views (I see your point, but..., however...)
•"Repair": Fixing communication breakdowns (I'm sorry, let me clarify, what I meant...)
•"Exit": Closing/summarizing (to sum up, anyway, let's move on, thanks for...)
•"Idioms": Non-literal expressions (touch base, at the end of the day, playing devil's advocate...)

C. blanksInOrder[] - Maps dialogue blanks to chunkIds

typescript

{
  scenarioId: "workplace-53-performance-review",
  blanksInOrder: ["wp53_ch_touch_base", "wp53_ch_check_in", "wp53_ch_appreciate"]
}

ChunkID Format: {prefix}_ch_{slug}

•prefix: Scenario ID prefix (wp = workplace, social, service, etc.)
•slug: 2-3 word slug from the chunk text
•Example: "wp53_ch_touch_base" = workplace-53, chunk about "touch base"

D. patternSummary - Consolidated insights

typescript

{
  category: "Workplace",
  keyPatterns: [
    {
      pattern: "Professional opening formulas",
      description: "How to initiate professional conversations",
      examples: ["I wanted to touch base", "I'd like to discuss"]
    },
    {
      pattern: "Softening disagreement",
      description: "Polite ways to disagree while maintaining relationship",
      examples: ["I see your point, but", "That's a good idea, and..."]
    }
  ],
  categoryBreakdown: [
    { type: "Openers", count: 3, examples: [...] },
    { type: "Softening", count: 2, examples: [...] },
    { type: "Repair", count: 1, examples: [...] }
  ]
}

E. activeRecall[] - Practice questions

typescript

[
  {
    question: "How would you professionally open a discussion about someone's performance?",
    expectedKeywords: ["touch base", "check in", "discuss"],
    difficulty: "B2"
  },
  // 5-15 questions total, graduated by difficulty
]

5. Automatic Validation Pipeline

CRITICAL: Staging File + Validation Gate

Before writing to staticData.ts, scenarios are validated in a staging file:

code

Step 1: Generate scenario
Step 2: Write to staging file (NOT production yet)
Step 3: Validate all 3 checks below
Step 4: Only if ALL pass → merge to staticData.ts
Step 5: If ANY fail → show errors, delete staging, ask for fixes

Runs 3 critical validators (in strict order):

Validator 1: Structural Integrity (MUST PASS)

code

Checks:
✓ Blank count matches answerVariations count (1:1)
✓ Blank count matches chunkFeedbackV2 count (1:1)
✓ Blank count matches blanksInOrder.length (1:1)
✓ All chunkIds are unique and properly formatted
✓ All required fields present (meaning, useWhen, etc.)
✓ scenarioId is unique (no collisions with existing)
✓ ≥30 dialogue lines total (content rule enforced)

BLOCKS MERGE if any check fails

Validator 2: Content Quality (MUST PASS)

code

Checks:
✓ No chunkFeedback meaning >30 words
✓ No commonWrong or fix >25 words
✓ Each category appears at least once (unless <5 blanks)
✓ Examples show clear native vs non-native contrast
✓ No grammar terminology ("verb", "noun", etc.) in feedback
✓ Difficulty constraints enforced (sentence length, blank density, alternatives)

BLOCKS MERGE if any check fails

Validator 3: TypeScript Build (MUST PASS)

code

Runs: npm run build --dry-run
Checks:
✓ No TypeScript compilation errors
✓ No undefined reference errors
✓ Scenario object is valid RoleplayScript type
✓ All indices match array lengths

BLOCKS MERGE if build fails - data integrity critical

Validation Guarantee:

code

IF validate:critical passes AND npm run build succeeds
  → Scenario is safe to merge
  → Merge proceeds with backup created
ELSE
  → Merge is blocked
  → Errors shown to user
  → Staging file deleted
  → User asked to fix or regenerate

6. Error Translation (Technical → Plain English)

When Validator 1 Fails:

code

Technical: "Blank count mismatch: 15 !== 14"
Agent Response: "❌ Missing one feedback item

I found 15 blanks in your dialogue, but only 14 feedback items.
Let me add the missing one for you..."

[Agent auto-fixes by generating feedback for blank at index 14]

When Validator 2 Fails:

code

Technical: "Word count exceeded: meaning is 35 words, limit 30"
Agent Response: "⚠️ Feedback meaning too long (35 words, should be ≤30)

Current: 'Have a brief discussion or conversation to check on someone's progress in a professional setting'
Shorter: 'Have a brief discussion to check on someone's progress'

Keep the shorter version? (yes/no)"

When Validator 3 Fails:

code

Technical: "TypeScript error at line 245: Property 'chunkFeedbackV2' is undefined"
Agent Response: "❌ Build failed - syntax error detected

This usually means a field is missing or spelled wrong.
Let me fix the data structure for you..."

[Agent re-validates and fixes data]

7. Auto-Fix Strategies (Strict Scope)

CRITICAL: Auto-Fix Scope Boundaries

Auto-fix operates with strict rules to prevent silent identity corruption:

✅ Auto-Fix CAN Change (Safe to Modify):

•Text content (shorten/lengthen feedback to fit word limits)
•Formatting (indentation, YAML quoting, whitespace)
•Missing feedback items (generate from templates)
•Word count violations (truncate while preserving meaning)
•Orphaned fields (remove unused properties)

❌ Auto-Fix CANNOT Change (Identity-Locked):

•scenarioId - Once assigned, immutable
•chunkIds - Once assigned, immutable
•Blank order - Cannot reorder without explicit "regenerate" flag
•Difficulty level - User-assigned, not auto-detected
•Category - User-assigned, not auto-detected

Auto-Fix for Common Errors:

Error	Detection	Auto-Fix Action
Missing chunkId	`blanksInOrder.length < blanks.length`	Generate: `{prefix}_ch_{slug}` ✅
Blank count mismatch	Different counts	Ask user: keep or regenerate ⚠️
Invalid chunkId format	Doesn't match pattern	Regenerate with valid format ✅
Word count exceeded	Meaning >30 words	Truncate while preserving meaning ✅
Missing category	`category: undefined`	Ask user for explicit category ⚠️
Duplicate chunkIds	Same ID appears 2+ times	BLOCK - ask user for manual fix ❌
scenarioId collision	ID already exists	BLOCK - ask user for new topic ❌

When Auto-Fix Hits Boundary (Cannot Proceed):

code

Agent: "I found an issue that requires your decision:

❌ ChunkID 'wp53_ch_touch_base' appears 3 times

This affects blank identity, so I can't auto-fix it.

Options:
1. Keep current blanks, manually rename chunkIds (advanced)
2. Regenerate scenario from scratch (simple)
3. Edit topic to create different blanks

Which would you prefer?"

Rule: Never Silently Change Identity If auto-fix would change scenarioId, chunkIds, or blank order → STOP and ask user explicitly.

8. Merge & Build Verification (Validation-Gated)

CRITICAL: Never Merge Without Passing All Validators

Merge Process (With Safety Gates):

code

Step 1: Generate scenario to memory
        ↓
Step 2: Write to STAGING file (not production)
        src/services/staticData.ts.staging-{timestamp}
        ↓
Step 3: Validate against validate:critical
        if FAILS → delete staging, show errors, ask user
        if PASSES → continue
        ↓
Step 4: Run npm run build (dry-run on staging)
        if FAILS → delete staging, show TypeScript errors, ask user
        if PASSES → continue
        ↓
Step 5: Create backup of production
        cp src/services/staticData.ts \
           src/services/staticData.ts.backup-{timestamp}
        ↓
Step 6: Merge staging to production
        cat staging >> staticData.ts
        rm staging
        ↓
Step 7: Verify build succeeds on production
        npm run build
        if FAILS → restore backup, show error
        if PASSES → continue
        ↓
Step 8: Show success message with test URL

What Happens If Validation Fails:

code

❌ Validation Error

Validator: Structural Integrity
Issue: Blank count mismatch (15 blanks, 14 feedback items)

Actions:
✓ Staging file DELETED
✓ Production file UNCHANGED
✓ Backup NOT created (no merge attempted)

Next Steps:
1. Fix: Generate feedback for blank index 14
2. Re-validate
3. Try merge again

Your data is safe - nothing was written to staticData.ts.

What Happens If Build Fails:

code

❌ Build Failure

Error: TypeScript error in merged scenario
  src/services/staticData.ts:4532: Type 'string | undefined' is not assignable to 'string'

Actions:
✓ Staging file DELETED
✓ Production restored from backup
✓ staticData.ts is back to pre-merge state

Your data is safe - backup automatically restored.

Next Steps:
1. Review the error above
2. Try creating scenario again
3. Contact support if error persists

Success Message:

code

✅ Scenario created successfully!

📋 Details:
   Scenario ID: wp-54-performance-review
   Category: Workplace
   Difficulty: B2
   Dialogue Lines: 34
   Blanks: 12
   Feedback items: 12 ✓
   Active recall questions: 8 ✓

🔒 Validations Passed:
   ✓ Structural integrity (counts match, IDs unique)
   ✓ Content quality (word limits, category diversity)
   ✓ TypeScript build (npm run build successful)

🧪 Test it now:
   http://localhost:3004/scenario/wp-54-performance-review

📋 QA Checklist:
   [ ] Fill in all 12 blanks - verify answers work
   [ ] Check feedback modal - review chunk meanings
   [ ] Review pattern summary - confirm categories
   [ ] Run active recall - try a question or two

📝 When ready to commit:
   git add src/services/staticData.ts
   git commit -m "feat: Add wp-54-performance-review scenario

   - Workplace category (B2 level)
   - 12 blanks across 34 dialogue lines
   - Complete feedback + pattern summary + active recall"

   git push origin main

💾 Backup Created:
   src/services/staticData.ts.backup-2026-02-12-150000
   (Safe to delete once you've tested thoroughly)

Scenario ID Generation (Deterministic + Collision-Safe)

Format: {prefix}-{number}-{slug}

Category Prefixes (Deterministic):

•so = Social
•wp = Workplace
•srv = Service/Logistics
•hc = Healthcare
•com = Community
•ac = Academic
•cul = Cultural

Examples:

•so-51-asking-for-directions
•wp-53-performance-review
•srv-23-hotel-check-in
•hc-5-doctor-appointment

ID Generation Algorithm:

code

1. User selects category → Get prefix (wp)
2. Count existing in category: [wp-1, wp-2, wp-10, wp-53] → highest = 53
3. Next number = 53 + 1 = 54
4. Generate slug from topic: "Performance review with manager" → "performance-review"
5. Candidate ID = "wp-54-performance-review"
6. CHECK: Is "wp-54-performance-review" already in staticData.ts?
   - If YES: Offer user new topic choice or manual number
   - If NO: Proceed with "wp-54-performance-review"
7. Final ID is immutable once assigned

Collision Detection (CRITICAL):

code

Agent: "Checking for ID collisions..."
  Checking against 54 existing scenarios...
  ✓ No conflicts found
  ✓ ID 'wp-54-performance-review' is unique
  Proceeding with scenario creation...

If Collision Detected:

code

Agent: "❌ ID collision detected!

'wp-54-performance-review' already exists.

Options:
1. Choose a different topic (generates different slug)
2. Use a different category
3. Manually specify a unique ID

What would you prefer?"

Why This Matters:

•Deterministic: same category + topic always generates same prefix + number
•Unique: validated against ALL existing scenario IDs
•Immutable: once assigned, never changes (unless full scenario recreation)

Character Name Pool

Diverse & Realistic Names:

•English: John, Sarah, David, Emily, Tom
•South Asian: Priya, Arjun, Aisha, Rajesh
•East Asian: Wei, Yuki, Chen, Meiling
•Spanish/Portuguese: Maria, Carlos, Sofia, Juan
•European: Anna, Klaus, Pierre, Olga

Selection: Random for scenario, ensuring diverse representation

Usage Examples

Simple Usage (Recommended for Non-Technical Users)

bash

# In Claude Code, type:
"create a new scenario"
or
"make a scenario"
or
"I want to create a scenario"

# Agent will guide you through the interactive wizard

Advanced Usage (With Parameters)

bash

# For experienced users who want to skip questions
"create scenario --category=Workplace --topic='Performance review' --turns=12 --difficulty=B2"

Quick Check

bash

# In Claude Code:
"what scenarios already exist?"
# Agent lists all 54 scenarios by category

Quality Checklist

Before Merging, Agent Verifies:

Defensive Patterns Applied

All generated scenarios follow these safety patterns:

typescript

// 1. Always provide array fallbacks
const feedback = (script.chunkFeedbackV2 || [])

// 2. Use optional chaining
script.chunkFeedbackV2?.[0]?.meaning

// 3. Validate indices
if (blankIndex >= 0 && blankIndex < blankCount) { ... }

// 4. Type guards for schema variations
if ('chunkFeedbackV2' in script && script.chunkFeedbackV2) { ... }

Error Scenarios & Responses

Scenario 1: User Rejects Dialogue 3 Times

code

Agent: "You've rejected 3 dialogues. Let me try a different approach.
         What specific elements are missing or wrong?
         (e.g., 'too formal', 'dialogue doesn't progress naturally', 'topics not realistic')"

Scenario 2: Validator Keeps Failing

code

Agent: "Auto-fix didn't resolve this. Let's troubleshoot together:

        The issue: [specific error]

        Options:
        1. Try creating a simpler scenario first
        2. Show me exactly what's wrong (paste error)
        3. Start over from scratch

        Which would help?"

Scenario 3: Build Fails After Merge

code

Agent: "Build failed! Reverting changes...

        Restored: src/services/staticData.ts
        Backup saved: staticData.ts.backup-{timestamp}

        Issue: [specific TypeScript error]

        Next steps:
        1. Review the error above
        2. Try creating scenario again
        3. Or get help from the team"

Implementation Notes

Template System

Keep dialogue templates in memory:

•templatesByCategory object with 7 templates
•Each template includes: opening, middle, closing turn patterns
•Personalize with topic/context variables

Reuse Existing Code

•Call npm run validate:critical via Bash tool
•Call npm run build to verify after merge
•Read healthcare-1 and community-1 as reference templates
•Use existing chunkId format: {prefix}_ch_{slug}

Backup Strategy

bash

# Before every merge to staticData.ts:
cp src/services/staticData.ts \
   src/services/staticData.ts.backup-$(date +%Y-%m-%d-%H%M%S)

Next Steps for User

After scenario is created:

•
Test It
- •Open http://localhost:3004/scenario/{id}
- •Fill in all blanks to verify answers work
- •Check feedback modal displays correctly
•
Review Quality
- •Read dialogue one more time
- •Check feedback content is accurate
- •Run E2E tests if available

•

Commit & Push

bash

git add src/services/staticData.ts
git commit -m "feat: Add {scenario-id} scenario

- {turn_count} turns, {blank_count} blanks
- {difficulty} difficulty level
- {category} category"

git push origin main

Last Updated: Feb 12, 2026 Version: 1.0 Status: Production Ready