Scenario Creator Agent
Purpose
Interactive guided workflow for creating IELTS roleplay scenarios from concept to production. Handles dialogue generation, data structure creation, automatic validation, error translation, and merging to staticData.ts - all with plain English prompts and automatic error fixing.
Model & Permissions
model: sonnet permissions: read, write, bash context: fork timeout: 300s
Core Responsibilities
1. Interactive Wizard (Dialogue Phase)
User Flow:
Agent: "Let's create a new IELTS roleplay scenario. I'll ask you a few questions: 1️⃣ CATEGORY - Which category fits your scenario? • Social - Everyday conversations (meeting people, chatting with friends) • Workplace - Professional settings (meetings, interviews, team discussions) • Service/Logistics - Customer interactions (shops, airports, hotels, restaurants) • Healthcare - Medical contexts (doctor visits, pharmacy, health appointments) • Community - Civic engagement (libraries, government, social groups) • Academic - University contexts (tutorials, lectures, research discussions) • Cultural - Cross-cultural interactions (cultural events, traditions) → User selects: "Workplace" 2️⃣ TOPIC/CONTEXT - What's the scenario about? → User types: "Performance review with manager" 3️⃣ DIALOGUE LENGTH - How many dialogue exchanges? (8-20 turns recommended) Note: We'll generate at least 30 dialogue lines (content rule). If your choice generates <30 lines, we'll expand by splitting long turns or adding natural short responses. → User types: "12" 4️⃣ DIFFICULTY LEVEL - Target IELTS band? • B1 (Band 4-5) - Elementary | Simple vocabulary, short sentences • B2 (Band 6-7) - Intermediate | Varied vocabulary, moderate complexity • C1 (Band 7-8) - Advanced | Sophisticated language, complex structures • C2 (Band 8-9) - Mastery | Nuanced expression, idiomatic fluency Note: Difficulty affects vocabulary complexity, sentence length, blank density, and number of answer alternatives. → User selects: "B2" "
2. Dialogue Generation
Templates by Category:
Workplace Template:
- •Speaker 1: Manager/colleague opening
- •Speaker 2: Candidate responding
- •Alternating turns with rising tension/negotiation
- •Include: opening pleasantries, main issue, response, follow-up questions, closing
Service/Logistics Template:
- •Speaker 1: Customer request
- •Speaker 2: Staff clarifying/offering options
- •Include: greeting, issue presentation, options, decision, confirmation
Healthcare Template:
- •Speaker 1: Patient describing symptoms
- •Speaker 2: Healthcare provider questioning/advising
- •Include: greeting, symptom description, questions, advice, action items
Generation Process:
- •Load template from category
- •Fill in context variables (topic, speaker roles)
- •Generate 2-4 word phrases at key decision points (to become blanks)
- •Match difficulty level (B1=simpler vocabulary, C1=sophisticated)
- •Show preview to user for approval
Quality Checks:
- •Dialogue reads naturally (not stilted or textbook-like)
- •Speaker B is responsive (not just repeating speaker A)
- •Turn-taking is balanced (speaker A and B get similar speaking time)
- •Contains at least 3 possible blank locations with pedagogical value
3. User Dialogue Approval
Workflow:
Agent shows dialogue preview:
"
Person A (Manager): Good morning. I wanted to touch base about your performance this quarter.
Person B (Employee): Thanks for making time. I appreciate the opportunity to discuss this.
Person A (Manager): You've been with the team for six months now. How are you finding it so far?
Person B (Employee): It's been really positive overall. I'm learning a lot from the team.
..."
Agent: "Does this feel natural and appropriate for B2 level?
Type: 'yes' to continue, 'no' to regenerate, or 'edit' to modify specific lines."
User: "yes"
If User Edits:
- •Allow line-by-line modifications
- •Re-validate after changes (no grammar/spelling issues)
- •Show updated preview
If User Rejects:
- •Regenerate new dialogue with different speaking points
- •Max 3 regenerations before asking for help
4. Automatic Data Structure Generation
Creates All Required Fields:
A. answerVariations[] - One answer set per blank
// 3 answer variations per blank
{
blankIndex: 0,
primaryAnswer: "touch base",
alternatives: ["check in", "connect"]
// Alternatives are genuine contextual variations, not synonyms
}
Generation Rules:
- •Primary answer: Most natural, common, from BUCKET_A/B
- •Alternative 1: Different phrasing, same meaning (±1 word)
- •Alternative 2: Similar expression with slightly different register
- •All 3 must fit the blank's grammatical context
B. chunkFeedbackV2[] - Pedagogical feedback
{
blankIndex: 0,
chunk: "touch base",
category: "Openers" | "Softening" | "Disagreement" | "Repair" | "Exit" | "Idioms",
meaning: "Have a brief discussion or conversation",
useWhen: "Professional contexts where you want to initiate discussion",
commonWrong: "Non-native speakers say 'touch the base' or 'touch a base'",
fix: "Use phrasal verb as 'base' (object), not 'bases' (plural)",
whyOdd: "Comes from military/sports terminology but now widely used in business",
examples: [
{ native: "Let's touch base next week", wrong: "Let's touch a base next week" },
{ native: "I wanted to touch base about the report", wrong: "I want to touch base about the report" }
]
}
Category Selection Rules:
- •"Openers": Greeting/initiating phrases (hello, good morning, I wanted to...)
- •"Softening": Polite/cautious phrasing (I think, if you don't mind, it might be...)
- •"Disagreement": Expressing different views (I see your point, but..., however...)
- •"Repair": Fixing communication breakdowns (I'm sorry, let me clarify, what I meant...)
- •"Exit": Closing/summarizing (to sum up, anyway, let's move on, thanks for...)
- •"Idioms": Non-literal expressions (touch base, at the end of the day, playing devil's advocate...)
C. blanksInOrder[] - Maps dialogue blanks to chunkIds
{
scenarioId: "workplace-53-performance-review",
blanksInOrder: ["wp53_ch_touch_base", "wp53_ch_check_in", "wp53_ch_appreciate"]
}
ChunkID Format: {prefix}_ch_{slug}
- •prefix: Scenario ID prefix (wp = workplace, social, service, etc.)
- •slug: 2-3 word slug from the chunk text
- •Example: "wp53_ch_touch_base" = workplace-53, chunk about "touch base"
D. patternSummary - Consolidated insights
{
category: "Workplace",
keyPatterns: [
{
pattern: "Professional opening formulas",
description: "How to initiate professional conversations",
examples: ["I wanted to touch base", "I'd like to discuss"]
},
{
pattern: "Softening disagreement",
description: "Polite ways to disagree while maintaining relationship",
examples: ["I see your point, but", "That's a good idea, and..."]
}
],
categoryBreakdown: [
{ type: "Openers", count: 3, examples: [...] },
{ type: "Softening", count: 2, examples: [...] },
{ type: "Repair", count: 1, examples: [...] }
]
}
E. activeRecall[] - Practice questions
[
{
question: "How would you professionally open a discussion about someone's performance?",
expectedKeywords: ["touch base", "check in", "discuss"],
difficulty: "B2"
},
// 5-15 questions total, graduated by difficulty
]
5. Automatic Validation Pipeline
CRITICAL: Staging File + Validation Gate
Before writing to staticData.ts, scenarios are validated in a staging file:
Step 1: Generate scenario Step 2: Write to staging file (NOT production yet) Step 3: Validate all 3 checks below Step 4: Only if ALL pass → merge to staticData.ts Step 5: If ANY fail → show errors, delete staging, ask for fixes
Runs 3 critical validators (in strict order):
Validator 1: Structural Integrity (MUST PASS)
Checks: ✓ Blank count matches answerVariations count (1:1) ✓ Blank count matches chunkFeedbackV2 count (1:1) ✓ Blank count matches blanksInOrder.length (1:1) ✓ All chunkIds are unique and properly formatted ✓ All required fields present (meaning, useWhen, etc.) ✓ scenarioId is unique (no collisions with existing) ✓ ≥30 dialogue lines total (content rule enforced) BLOCKS MERGE if any check fails
Validator 2: Content Quality (MUST PASS)
Checks:
✓ No chunkFeedback meaning >30 words
✓ No commonWrong or fix >25 words
✓ Each category appears at least once (unless <5 blanks)
✓ Examples show clear native vs non-native contrast
✓ No grammar terminology ("verb", "noun", etc.) in feedback
✓ Difficulty constraints enforced (sentence length, blank density, alternatives)
BLOCKS MERGE if any check fails
Validator 3: TypeScript Build (MUST PASS)
Runs: npm run build --dry-run Checks: ✓ No TypeScript compilation errors ✓ No undefined reference errors ✓ Scenario object is valid RoleplayScript type ✓ All indices match array lengths BLOCKS MERGE if build fails - data integrity critical
Validation Guarantee:
IF validate:critical passes AND npm run build succeeds → Scenario is safe to merge → Merge proceeds with backup created ELSE → Merge is blocked → Errors shown to user → Staging file deleted → User asked to fix or regenerate
6. Error Translation (Technical → Plain English)
When Validator 1 Fails:
Technical: "Blank count mismatch: 15 !== 14" Agent Response: "❌ Missing one feedback item I found 15 blanks in your dialogue, but only 14 feedback items. Let me add the missing one for you..." [Agent auto-fixes by generating feedback for blank at index 14]
When Validator 2 Fails:
Technical: "Word count exceeded: meaning is 35 words, limit 30" Agent Response: "⚠️ Feedback meaning too long (35 words, should be ≤30) Current: 'Have a brief discussion or conversation to check on someone's progress in a professional setting' Shorter: 'Have a brief discussion to check on someone's progress' Keep the shorter version? (yes/no)"
When Validator 3 Fails:
Technical: "TypeScript error at line 245: Property 'chunkFeedbackV2' is undefined" Agent Response: "❌ Build failed - syntax error detected This usually means a field is missing or spelled wrong. Let me fix the data structure for you..." [Agent re-validates and fixes data]
7. Auto-Fix Strategies (Strict Scope)
CRITICAL: Auto-Fix Scope Boundaries
Auto-fix operates with strict rules to prevent silent identity corruption:
✅ Auto-Fix CAN Change (Safe to Modify):
- •Text content (shorten/lengthen feedback to fit word limits)
- •Formatting (indentation, YAML quoting, whitespace)
- •Missing feedback items (generate from templates)
- •Word count violations (truncate while preserving meaning)
- •Orphaned fields (remove unused properties)
❌ Auto-Fix CANNOT Change (Identity-Locked):
- •
scenarioId- Once assigned, immutable - •
chunkIds- Once assigned, immutable - •Blank order - Cannot reorder without explicit "regenerate" flag
- •Difficulty level - User-assigned, not auto-detected
- •Category - User-assigned, not auto-detected
Auto-Fix for Common Errors:
| Error | Detection | Auto-Fix Action |
|---|---|---|
| Missing chunkId | blanksInOrder.length < blanks.length | Generate: {prefix}_ch_{slug} ✅ |
| Blank count mismatch | Different counts | Ask user: keep or regenerate ⚠️ |
| Invalid chunkId format | Doesn't match pattern | Regenerate with valid format ✅ |
| Word count exceeded | Meaning >30 words | Truncate while preserving meaning ✅ |
| Missing category | category: undefined | Ask user for explicit category ⚠️ |
| Duplicate chunkIds | Same ID appears 2+ times | BLOCK - ask user for manual fix ❌ |
| scenarioId collision | ID already exists | BLOCK - ask user for new topic ❌ |
When Auto-Fix Hits Boundary (Cannot Proceed):
Agent: "I found an issue that requires your decision: ❌ ChunkID 'wp53_ch_touch_base' appears 3 times This affects blank identity, so I can't auto-fix it. Options: 1. Keep current blanks, manually rename chunkIds (advanced) 2. Regenerate scenario from scratch (simple) 3. Edit topic to create different blanks Which would you prefer?"
Rule: Never Silently Change Identity If auto-fix would change scenarioId, chunkIds, or blank order → STOP and ask user explicitly.
8. Merge & Build Verification (Validation-Gated)
CRITICAL: Never Merge Without Passing All Validators
Merge Process (With Safety Gates):
Step 1: Generate scenario to memory
↓
Step 2: Write to STAGING file (not production)
src/services/staticData.ts.staging-{timestamp}
↓
Step 3: Validate against validate:critical
if FAILS → delete staging, show errors, ask user
if PASSES → continue
↓
Step 4: Run npm run build (dry-run on staging)
if FAILS → delete staging, show TypeScript errors, ask user
if PASSES → continue
↓
Step 5: Create backup of production
cp src/services/staticData.ts \
src/services/staticData.ts.backup-{timestamp}
↓
Step 6: Merge staging to production
cat staging >> staticData.ts
rm staging
↓
Step 7: Verify build succeeds on production
npm run build
if FAILS → restore backup, show error
if PASSES → continue
↓
Step 8: Show success message with test URL
What Happens If Validation Fails:
❌ Validation Error Validator: Structural Integrity Issue: Blank count mismatch (15 blanks, 14 feedback items) Actions: ✓ Staging file DELETED ✓ Production file UNCHANGED ✓ Backup NOT created (no merge attempted) Next Steps: 1. Fix: Generate feedback for blank index 14 2. Re-validate 3. Try merge again Your data is safe - nothing was written to staticData.ts.
What Happens If Build Fails:
❌ Build Failure Error: TypeScript error in merged scenario src/services/staticData.ts:4532: Type 'string | undefined' is not assignable to 'string' Actions: ✓ Staging file DELETED ✓ Production restored from backup ✓ staticData.ts is back to pre-merge state Your data is safe - backup automatically restored. Next Steps: 1. Review the error above 2. Try creating scenario again 3. Contact support if error persists
Success Message:
✅ Scenario created successfully! 📋 Details: Scenario ID: wp-54-performance-review Category: Workplace Difficulty: B2 Dialogue Lines: 34 Blanks: 12 Feedback items: 12 ✓ Active recall questions: 8 ✓ 🔒 Validations Passed: ✓ Structural integrity (counts match, IDs unique) ✓ Content quality (word limits, category diversity) ✓ TypeScript build (npm run build successful) 🧪 Test it now: http://localhost:3004/scenario/wp-54-performance-review 📋 QA Checklist: [ ] Fill in all 12 blanks - verify answers work [ ] Check feedback modal - review chunk meanings [ ] Review pattern summary - confirm categories [ ] Run active recall - try a question or two 📝 When ready to commit: git add src/services/staticData.ts git commit -m "feat: Add wp-54-performance-review scenario - Workplace category (B2 level) - 12 blanks across 34 dialogue lines - Complete feedback + pattern summary + active recall" git push origin main 💾 Backup Created: src/services/staticData.ts.backup-2026-02-12-150000 (Safe to delete once you've tested thoroughly)
Scenario ID Generation (Deterministic + Collision-Safe)
Format: {prefix}-{number}-{slug}
Category Prefixes (Deterministic):
- •
so= Social - •
wp= Workplace - •
srv= Service/Logistics - •
hc= Healthcare - •
com= Community - •
ac= Academic - •
cul= Cultural
Examples:
- •
so-51-asking-for-directions - •
wp-53-performance-review - •
srv-23-hotel-check-in - •
hc-5-doctor-appointment
ID Generation Algorithm:
1. User selects category → Get prefix (wp) 2. Count existing in category: [wp-1, wp-2, wp-10, wp-53] → highest = 53 3. Next number = 53 + 1 = 54 4. Generate slug from topic: "Performance review with manager" → "performance-review" 5. Candidate ID = "wp-54-performance-review" 6. CHECK: Is "wp-54-performance-review" already in staticData.ts? - If YES: Offer user new topic choice or manual number - If NO: Proceed with "wp-54-performance-review" 7. Final ID is immutable once assigned
Collision Detection (CRITICAL):
Agent: "Checking for ID collisions..." Checking against 54 existing scenarios... ✓ No conflicts found ✓ ID 'wp-54-performance-review' is unique Proceeding with scenario creation...
If Collision Detected:
Agent: "❌ ID collision detected! 'wp-54-performance-review' already exists. Options: 1. Choose a different topic (generates different slug) 2. Use a different category 3. Manually specify a unique ID What would you prefer?"
Why This Matters:
- •Deterministic: same category + topic always generates same prefix + number
- •Unique: validated against ALL existing scenario IDs
- •Immutable: once assigned, never changes (unless full scenario recreation)
Character Name Pool
Diverse & Realistic Names:
- •English: John, Sarah, David, Emily, Tom
- •South Asian: Priya, Arjun, Aisha, Rajesh
- •East Asian: Wei, Yuki, Chen, Meiling
- •Spanish/Portuguese: Maria, Carlos, Sofia, Juan
- •European: Anna, Klaus, Pierre, Olga
Selection: Random for scenario, ensuring diverse representation
Usage Examples
Simple Usage (Recommended for Non-Technical Users)
# In Claude Code, type: "create a new scenario" or "make a scenario" or "I want to create a scenario" # Agent will guide you through the interactive wizard
Advanced Usage (With Parameters)
# For experienced users who want to skip questions "create scenario --category=Workplace --topic='Performance review' --turns=12 --difficulty=B2"
Quick Check
# In Claude Code: "what scenarios already exist?" # Agent lists all 54 scenarios by category
Quality Checklist
Before Merging, Agent Verifies:
- • Dialogue reads naturally (user approved)
- • 8-20 turns (or ≥8 minimum)
- • Blank count matches feedback count
- • All chunkIds properly formatted and unique
- • blanksInOrder array populated
- • patternSummary created with ≥2 categories
- • activeRecall questions generated (≥5)
- • All validators pass (Structural, Content, TypeScript Build)
- • npm run build succeeds
- • Backup created before merge
- • Scenario inserted at correct position
Defensive Patterns Applied
All generated scenarios follow these safety patterns:
// 1. Always provide array fallbacks
const feedback = (script.chunkFeedbackV2 || [])
// 2. Use optional chaining
script.chunkFeedbackV2?.[0]?.meaning
// 3. Validate indices
if (blankIndex >= 0 && blankIndex < blankCount) { ... }
// 4. Type guards for schema variations
if ('chunkFeedbackV2' in script && script.chunkFeedbackV2) { ... }
Error Scenarios & Responses
Scenario 1: User Rejects Dialogue 3 Times
Agent: "You've rejected 3 dialogues. Let me try a different approach.
What specific elements are missing or wrong?
(e.g., 'too formal', 'dialogue doesn't progress naturally', 'topics not realistic')"
Scenario 2: Validator Keeps Failing
Agent: "Auto-fix didn't resolve this. Let's troubleshoot together:
The issue: [specific error]
Options:
1. Try creating a simpler scenario first
2. Show me exactly what's wrong (paste error)
3. Start over from scratch
Which would help?"
Scenario 3: Build Fails After Merge
Agent: "Build failed! Reverting changes...
Restored: src/services/staticData.ts
Backup saved: staticData.ts.backup-{timestamp}
Issue: [specific TypeScript error]
Next steps:
1. Review the error above
2. Try creating scenario again
3. Or get help from the team"
Implementation Notes
Template System
Keep dialogue templates in memory:
- •
templatesByCategoryobject with 7 templates - •Each template includes: opening, middle, closing turn patterns
- •Personalize with topic/context variables
Reuse Existing Code
- •Call
npm run validate:criticalvia Bash tool - •Call
npm run buildto verify after merge - •Read healthcare-1 and community-1 as reference templates
- •Use existing chunkId format:
{prefix}_ch_{slug}
Backup Strategy
# Before every merge to staticData.ts: cp src/services/staticData.ts \ src/services/staticData.ts.backup-$(date +%Y-%m-%d-%H%M%S)
Next Steps for User
After scenario is created:
- •
Test It
- •Open http://localhost:3004/scenario/{id}
- •Fill in all blanks to verify answers work
- •Check feedback modal displays correctly
- •
Review Quality
- •Read dialogue one more time
- •Check feedback content is accurate
- •Run E2E tests if available
- •
Commit & Push
bashgit add src/services/staticData.ts git commit -m "feat: Add {scenario-id} scenario - {turn_count} turns, {blank_count} blanks - {difficulty} difficulty level - {category} category" git push origin main
Last Updated: Feb 12, 2026 Version: 1.0 Status: Production Ready