Adding TTS Audio to a Feature
This skill walks you through adding text-to-speech audio to a feature in the app. The TTS system handles playback (pre-generated mp3s first, browser SpeechSynthesis fallback), collection of (text, tone) pairs, and admin generation of high-quality OpenAI TTS mp3s.
Before You Start
Read the integration guide: apps/web/.claude/reference/tts-audio-system.md
It contains the full API reference, patterns, anti-patterns, and existing implementations.
Your Job
- Understand what text the feature needs spoken and when
- Create a feature-specific audio hook
- Wire it into the component
- Verify with TypeScript
Step 1: Design the Utterances
For each piece of audio the feature needs, determine:
- What text to speak (static string or dynamic from state/props)
- When to speak it (on mount, on state change, on user action, on completion)
- How it should sound (the tone, written as voice-actor stage directions)
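One way to record these three decisions before writing the hook is a small typed table. The sketch below is only a planning aid: `Trigger`, `FeatureState`, `UtterancePlan`, and `resolveText` are hypothetical names and are not part of the TTS API.

```typescript
// Hypothetical planning types; none of these names exist in the TTS system.
type Trigger = 'mount' | 'state-change' | 'user-action' | 'completion'

interface FeatureState {
  stepText: string
}

interface UtterancePlan {
  // Static string, or a function deriving the text from state/props
  text: string | ((state: FeatureState) => string)
  // When the utterance should play
  trigger: Trigger
  // Voice-actor stage direction; keep it a constant
  tone: string
}

const INSTRUCTION_TONE =
  'Patiently guiding a young child. Clear, slow, friendly.'
const CELEBRATION_TONE =
  'Warmly congratulating a child. Genuinely encouraging and happy.'

const plan: UtterancePlan[] = [
  { text: (s) => s.stepText, trigger: 'state-change', tone: INSTRUCTION_TONE },
  { text: 'Well done!', trigger: 'completion', tone: CELEBRATION_TONE },
]

// Resolve the concrete text for one planned utterance
function resolveText(u: UtterancePlan, state: FeatureState): string {
  return typeof u.text === 'function' ? u.text(state) : u.text
}
```

Each row of such a plan would typically map to one `useTTS` call in the feature hook, with the trigger deciding which effect invokes it.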
Step 2: Create a Feature Audio Hook
Create a hook in the feature's hooks/ directory. This hook owns text construction, tone strings, auto-play logic, and cleanup.
Read the reference implementation first:
apps/web/src/components/practice/hooks/usePracticeAudioHelp.ts
Key rules:
- Tone strings must be module-level constants: never compute them dynamically per render
- Always clean up on unmount: call `stop()` in a cleanup effect
- Use refs to track previous values: this prevents re-playing on every render
- Guard with `isEnabled`: respect the user's audio toggle
Template:
'use client'

import { useEffect, useRef } from 'react'
import { useTTS } from '@/hooks/useTTS'
import { useAudioManager } from '@/hooks/useAudioManager'

// Stable tone constants: changing these creates new clips
const INSTRUCTION_TONE =
  'Patiently guiding a young child. Clear, slow, friendly.'
const CELEBRATION_TONE =
  'Warmly congratulating a child. Genuinely encouraging and happy.'

interface UseMyFeatureAudioHelpOptions {
  currentStep: string
  isComplete: boolean
}

export function useMyFeatureAudioHelp({
  currentStep,
  isComplete,
}: UseMyFeatureAudioHelpOptions) {
  const { isEnabled, stop } = useAudioManager()

  // Declare utterances
  const sayInstruction = useTTS(currentStep, { tone: INSTRUCTION_TONE })
  const sayCelebration = useTTS(
    isComplete ? 'Well done!' : '',
    { tone: CELEBRATION_TONE },
  )

  // Auto-play when step changes
  const prevStepRef = useRef<string>('')
  useEffect(() => {
    if (!isEnabled || !currentStep || currentStep === prevStepRef.current) return
    prevStepRef.current = currentStep
    sayInstruction()
  }, [isEnabled, currentStep, sayInstruction])

  // Auto-play celebration on completion
  useEffect(() => {
    if (!isEnabled || !isComplete) return
    sayCelebration()
  }, [isEnabled, isComplete, sayCelebration])

  // Stop audio on unmount
  useEffect(() => {
    return () => stop()
  }, [stop])

  return { replay: sayInstruction }
}
Step 3: Wire Into the Component
import { useMyFeatureAudioHelp } from './hooks/useMyFeatureAudioHelp'
import { useAudioManager } from '@/hooks/useAudioManager'

function MyFeature() {
  const { isEnabled, isPlaying } = useAudioManager()
  const { replay } = useMyFeatureAudioHelp({
    currentStep: 'Tap the bead to move it up',
    isComplete: false,
  })

  return (
    <div>
      {isEnabled && (
        <button onClick={replay} disabled={isPlaying}>
          {isPlaying ? 'Speaking...' : 'Replay'}
        </button>
      )}
    </div>
  )
}
Step 4: Verify
cd apps/web && npx tsc --noEmit
Common Patterns
Dynamic text from state
const text = useMemo(
  () => (terms ? termsToSentence(terms) : ''),
  [terms],
)
const sayProblem = useTTS(text, { tone: MATH_TONE })
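To make the text-construction step concrete, here is a minimal sketch of what helpers like `termsToSentence` and `numberToEnglish` might do. The `Sketch` functions below are hypothetical stand-ins written for illustration. The real helpers in `src/lib/audio/` may cover a wider range and may treat negative terms differently; reading a negative term as "minus" is an assumption.

```typescript
// Hypothetical stand-ins; the real helpers in src/lib/audio/ may differ.
const ONES = [
  'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight',
  'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen',
  'sixteen', 'seventeen', 'eighteen', 'nineteen',
]
const TENS = [
  '', '', 'twenty', 'thirty', 'forty', 'fifty',
  'sixty', 'seventy', 'eighty', 'ninety',
]

// Spell out an integer in the range 0..99 (the sketch covers nothing larger)
function numberToEnglishSketch(n: number): string {
  if (!Number.isInteger(n) || n < 0 || n > 99) {
    throw new RangeError('sketch only covers integers 0..99')
  }
  if (n < 20) return ONES[n]
  const tens = TENS[Math.floor(n / 10)]
  const ones = n % 10
  return ones === 0 ? tens : `${tens} ${ONES[ones]}`
}

// Join terms into a spoken sentence.
// Assumption: a negative term is read as "minus <absolute value>".
function termsToSentenceSketch(terms: number[]): string {
  return terms
    .map((term, i) => {
      const word = numberToEnglishSketch(Math.abs(term))
      if (i === 0) return term < 0 ? `minus ${word}` : word
      return `${term < 0 ? 'minus' : 'plus'} ${word}`
    })
    .join(' ')
}
```

Because the resulting sentence is a pure function of `terms`, memoizing it keeps the text passed to `useTTS` stable between renders.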
One-shot playback (play once, don't repeat)
const playedRef = useRef(false)
useEffect(() => {
  if (!shouldPlay || playedRef.current) return
  playedRef.current = true
  sayIt()
}, [shouldPlay, sayIt])

// Reset when trigger resets
useEffect(() => {
  if (!shouldPlay) playedRef.current = false
}, [shouldPlay])
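The behaviour of the one-shot guard can be checked without React. The sketch below is a framework-free model of the same logic, where the `played` flag stands in for `playedRef.current`. The name `createOneShot` is hypothetical and exists only for this illustration.

```typescript
// Framework-free model of the one-shot pattern.
// `played` plays the role of playedRef.current in the hook version.
function createOneShot(play: () => void) {
  let played = false
  return function update(shouldPlay: boolean): void {
    if (!shouldPlay) {
      played = false // trigger reset: arm the guard again
      return
    }
    if (played) return // already played for this activation
    played = true
    play()
  }
}

let count = 0
const update = createOneShot(() => {
  count += 1
})

update(true) // plays
update(true) // simulated re-render: no replay
update(false) // trigger resets
update(true) // plays again
```

After these four calls `play` has run twice: once per activation of the trigger, not once per render.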
Multiple utterances — play the right one
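const sayStep1 = useTTS('First, look at the abacus', { tone: INST })
const sayStep2 = useTTS('Now tap the bead', { tone: INST })

// Choose the utterance inside an effect, not during render.
// speak() stops the previous utterance before starting.
useEffect(() => {
  if (step === 0) sayStep1()
  if (step === 1) sayStep2()
}, [step, sayStep1, sayStep2])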
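The "stops previous before starting" behaviour is what makes it safe to fire the next step's utterance without an explicit `stop()`. The sketch below models that behaviour only. `MiniSpeaker` and `speakStep` are hypothetical and are not the real `TtsAudioManager`, which is assumed here to behave as the comment above describes.

```typescript
// Hypothetical model; MiniSpeaker is not the real TtsAudioManager.
class MiniSpeaker {
  current: string | null = null
  log: string[] = []

  speak(text: string): void {
    // Stop whatever is playing before starting the new utterance
    if (this.current !== null) this.log.push(`stop:${this.current}`)
    this.current = text
    this.log.push(`start:${text}`)
  }
}

const STEP_TEXTS = ['First, look at the abacus', 'Now tap the bead']

// Speak the utterance for a step; does nothing for unknown steps
function speakStep(speaker: MiniSpeaker, step: number): void {
  const text = STEP_TEXTS[step]
  if (text !== undefined) speaker.speak(text)
}

const speaker = new MiniSpeaker()
speakStep(speaker, 0)
speakStep(speaker, 1)
// speaker.log now records a start, then a stop of the first
// utterance, then the start of the second.
```

In the hook itself every `useTTS` call stays at the top level. Only the choice of which returned function to invoke is conditional.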
Tone String Guidelines
Write tones as voice-actor stage directions. Be specific about emotion, pace, and audience.
Good examples:
- `'Speaking clearly and steadily, reading a math problem to a young child. Pause slightly between each number and operator.'`
- `'Warmly congratulating a child. Genuinely encouraging and happy.'`
- `'Gently guiding a child after a wrong answer. Kind, not disappointed.'`
- `'Patiently guiding a young child through an abacus tutorial. Clear, slow, friendly.'`
Bad examples:
- `'Read this text'`: too vague
- `` `Speaking ${mood}` ``: dynamic per render, creates new clips every time
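The cost of a dynamic tone is easiest to see by counting clips. The sketch below assumes a clip is identified by its (text, tone) pair, as the collection of (text, tone) pairs suggests. The `collected` set and `collect` function are hypothetical, and the real manager may key clips differently.

```typescript
// Hypothetical model of clip collection; not the real TtsAudioManager.
const collected = new Set<string>()

function collect(text: string, tone: string): void {
  // Assumption: a clip is identified by its (text, tone) pair
  collected.add(JSON.stringify([text, tone]))
}

const INSTRUCTION_TONE =
  'Patiently guiding a young child. Clear, slow, friendly.'

// Constant tone: three renders still register a single clip
for (let render = 0; render < 3; render++) {
  collect('Now tap the bead', INSTRUCTION_TONE)
}
const withConstantTone = collected.size

// Dynamic tone: every distinct mood string registers another clip
for (const mood of ['happily', 'calmly', 'slowly']) {
  collect('Now tap the bead', `Speaking ${mood}`)
}
const addedByDynamicTone = collected.size - withConstantTone
```

Under this assumption the constant tone yields one clip for the sentence, while the three mood variants add three more clips for the same text, each of which would need its own generated mp3.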
Anti-Patterns to Avoid
- Never use raw `speechSynthesis`: always go through `useTTS` so the voice chain and collection work
- Never forget cleanup: always `useEffect(() => () => stop(), [stop])`
- Never use dynamic tone strings: keep them as module-level constants
- Never call `speak()` unconditionally in render: always guard with refs and `isEnabled`
Key Files
| File | Role |
|---|---|
| `src/hooks/useTTS.ts` | Primary hook: declare (text, tone), get speak function |
| `src/hooks/useAudioManager.ts` | Reactive state: `isEnabled`, `isPlaying`, `volume`, `stop()` |
| `src/lib/audio/TtsAudioManager.ts` | Core engine: voice chain, playback, collection |
| `src/contexts/AudioManagerContext.tsx` | React context: singleton manager |
| `src/lib/audio/termsToSentence.ts` | `[5, 3]` → "five plus three" |
| `src/lib/audio/buildFeedbackText.ts` | Correct/incorrect feedback sentences |
| `src/lib/audio/numberToEnglish.ts` | `42` → "forty two" |
Reference Implementations
| Hook | Location | What it does |
|---|---|---|
| `usePracticeAudioHelp` | `src/components/practice/hooks/` | Reads math problems, correct/incorrect feedback |
| `useTutorialAudioHelp` | `src/components/tutorial/hooks/` | Speaks tutorial step instructions |
Follow usePracticeAudioHelp as the most complete example.