TTS LiveKit Plugin Skill
This skill provides a complete solution for building self-hosted Text-to-Speech (TTS) systems integrated with LiveKit voice agents.
What This Skill Does
- •Creates a Self-Hosted TTS API Server
  - •FastAPI-based REST API
  - •Uses MeloTTS model from Hugging Face
  - •Supports streaming audio responses
  - •Multi-language and multi-voice support
  - •Production-ready with proper error handling
- •Builds a LiveKit TTS Plugin
  - •Fully compatible with LiveKit agents framework
  - •Implements standard TTS interface
  - •Streaming audio support
  - •Proper error handling and retries
  - •Drop-in replacement for commercial TTS providers
- •Provides Complete Testing
  - •Comprehensive test suite for API
  - •Integration tests for plugin
  - •No mocked functions - all real implementations
  - •Performance and concurrency tests
- •Includes Full Documentation
  - •API documentation with examples
  - •Plugin usage guide
  - •Deployment guide for production
  - •Multiple usage examples
Components
API Server (api/)
- •server.py: FastAPI server with MeloTTS integration
- •requirements.txt: Python dependencies
- •Endpoints:
  - •GET /: Health check
  - •GET /voices: List available voices
  - •POST /synthesize: Full audio synthesis
  - •POST /synthesize/stream: Streaming synthesis
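As a rough illustration of how a client might call the two synthesis endpoints, the sketch below builds a POST request with the standard library. The JSON field names (`text`, `language`, `speaker`, `speed`) are an assumption modelled on the plugin's constructor arguments; docs/API.md is the authority on the actual request schema. The helper name `build_synthesize_request` is hypothetical.

```python
import json
from urllib import request


def build_synthesize_request(base_url, text, language="EN", speaker="EN-US",
                             speed=1.0, stream=False):
    """Build (but do not send) a POST request for one of the synthesis endpoints.

    The payload keys are assumed, not taken from the server's schema.
    """
    path = "/synthesize/stream" if stream else "/synthesize"
    payload = {
        "text": text,
        "language": language,
        "speaker": speaker,
        "speed": speed,
    }
    return request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_synthesize_request("http://localhost:8000", "Hello!", stream=True)
    print(req.get_method(), req.full_url)
```

With the server running, passing the returned object to `urllib.request.urlopen` would send it; the response body would then be read as audio bytes.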
LiveKit Plugin (plugin/)
- •melotts_plugin.py: LiveKit TTS plugin implementation
- •Extends the livekit.agents.tts.TTS base class
- •Implements ChunkedStream for audio streaming
- •Uses aiohttp for HTTP requests
- •Proper exception handling (APIConnectionError, APITimeoutError, APIStatusError)
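The exception handling listed above could follow a pattern like the one sketched here. To keep the example self-contained, the three exception classes are local stand-ins that only borrow the names of the LiveKit exceptions; the plugin itself would import the real ones from `livekit.agents`. The helper names `check_status` and `translate_error` are hypothetical and not taken from melotts_plugin.py.

```python
import asyncio


# Local stand-ins so the sketch runs without livekit-agents installed.
class APIConnectionError(Exception):
    pass


class APITimeoutError(Exception):
    pass


class APIStatusError(Exception):
    def __init__(self, message, status_code):
        super().__init__(message)
        self.status_code = status_code


def check_status(status_code, body=""):
    """Raise APIStatusError for any HTTP error status; do nothing otherwise."""
    if status_code >= 400:
        raise APIStatusError(
            f"TTS server returned {status_code}: {body}", status_code
        )


def translate_error(exc):
    """Map a low-level failure to one of the three error types."""
    # Timeouts are checked first: on recent Python versions they are also OSErrors.
    if isinstance(exc, asyncio.TimeoutError):
        return APITimeoutError("TTS request timed out")
    if isinstance(exc, APIStatusError):
        return exc
    return APIConnectionError(f"could not reach TTS server: {exc}")
```

A caller would typically wrap the HTTP request in `try/except Exception as exc` and re-raise `translate_error(exc)`, so that the agent framework only ever sees these three error types.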
Tests (tests/)
- •test_api.py: API server tests
  - •Health checks
  - •Voice listing
  - •Synthesis (streaming and non-streaming)
  - •Multiple languages
  - •Error handling
  - •Concurrency
- •test_plugin.py: Plugin integration tests
  - •Plugin initialization
  - •Synthesis with real API
  - •Multiple languages
  - •Error handling
  - •Concurrency
  - •Timeout handling
Examples (examples/)
- •test_api_client.py: Standalone API testing script
- •simple_agent.py: Basic LiveKit agent example
- •voice_assistant.py: Complete voice assistant implementation
Documentation (docs/)
- •API.md: Complete API reference
- •PLUGIN.md: Plugin usage guide
- •DEPLOYMENT.md: Production deployment guide
Quick Start
1. Start the TTS API Server
cd api
pip install -r requirements.txt
python -m unidic download
python server.py
Server runs on http://localhost:8000
2. Test the API
cd examples
python test_api_client.py
3. Use in LiveKit Agent
from melotts_plugin import TTS
tts = TTS(
api_base_url="http://localhost:8000",
language="EN",
speaker="EN-US",
speed=1.0
)
stream = tts.synthesize("Hello from LiveKit!")
Features
- •✅ Self-hosted (no external API dependencies)
- •✅ High-quality natural speech (MeloTTS)
- •✅ 6 languages: English, Spanish, French, Chinese, Japanese, Korean
- •✅ Multiple voices per language
- •✅ Streaming audio for low latency
- •✅ CPU-friendly (optimized for real-time inference)
- •✅ GPU support (automatic if available)
- •✅ LiveKit agents framework compatible
- •✅ Production-ready error handling
- •✅ Comprehensive test coverage
- •✅ Full documentation
Architecture
┌─────────────────┐ HTTP POST ┌──────────────────┐
│ LiveKit Agent │ ──────────────────► │ TTS API │
│ │ │ Server │
│ ┌───────────┐ │ │ │
│ │ MeloTTS │ │ Audio Stream │ ┌────────────┐ │
│ │ Plugin │ │ ◄────────────────── │ │ MeloTTS │ │
│ └───────────┘ │ (WAV chunks) │ │ Model │ │
└─────────────────┘ │ └────────────┘ │
└──────────────────┘
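To make the "WAV chunks" arrow in the diagram more concrete, here is a minimal sketch of reassembling chunked WAV bytes on the receiving side using only the standard library. It assumes the server sends a single WAV file split into arbitrary byte chunks; if each chunk were instead a standalone WAV, the parsing would differ. The 16 kHz mono test signal and all function names are illustrative only.

```python
import io
import wave


def make_wav(num_frames=1600, sample_rate=16000):
    """Create a short silent 16-bit mono WAV in memory (stand-in for server audio)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


def split_chunks(data, size):
    """Split bytes into fixed-size pieces, as an HTTP chunked response might."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def read_wav_chunks(chunks):
    """Join received chunks and return (sample_rate, channels, pcm_bytes)."""
    data = b"".join(chunks)
    with wave.open(io.BytesIO(data), "rb") as w:
        return w.getframerate(), w.getnchannels(), w.readframes(w.getnframes())


if __name__ == "__main__":
    chunks = split_chunks(make_wav(), 512)
    rate, channels, pcm = read_wav_chunks(chunks)
    print(len(chunks), rate, channels, len(pcm))
```

This version buffers the whole response before parsing; a true low-latency consumer would parse the header from the first chunk and forward PCM data as it arrives.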
Why MeloTTS?
- •High Quality: Natural-sounding speech
- •Fast: Optimized for real-time inference
- •CPU-Friendly: Works well even without GPU
- •Multi-lingual: 6 languages supported
- •Low Latency: ~150-200ms TTFB
- •Open Source: Free to use and modify
Performance
- •Latency: 150-200ms time-to-first-byte
- •CPU Usage: Optimized for real-time on CPUs
- •GPU Support: Automatic acceleration if available
- •Streaming: Chunked delivery for low latency
- •Concurrent Requests: Handles multiple simultaneous requests
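The latency figure above is the kind of number that can be checked locally. The sketch below times the gap between starting to read a stream and receiving its first chunk. `fake_stream` is a hypothetical stand-in that only makes the example runnable on its own; for a real measurement, the iterator would be the chunked body of a `/synthesize/stream` response.

```python
import time


def measure_ttfb(chunks):
    """Consume an iterable of byte chunks.

    Returns (seconds until first chunk, total seconds, total bytes).
    """
    start = time.perf_counter()
    ttfb = None
    total_bytes = 0
    for chunk in chunks:
        if ttfb is None:
            ttfb = time.perf_counter() - start
        total_bytes += len(chunk)
    return ttfb, time.perf_counter() - start, total_bytes


def fake_stream(delay=0.01, count=3, size=1024):
    """Stand-in generator: yields `count` chunks, pausing before each one."""
    for _ in range(count):
        time.sleep(delay)
        yield b"\x00" * size


if __name__ == "__main__":
    ttfb, total, nbytes = measure_ttfb(fake_stream())
    print(f"TTFB {ttfb * 1000:.1f} ms, total {total * 1000:.1f} ms, {nbytes} bytes")
```

Measured values depend on hardware, text length, and whether the model is already loaded, so several runs after a warm-up request give a fairer picture than a single call.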
Supported Languages
| Language | Code | Speakers |
|---|---|---|
| English | EN | EN-US, EN-BR, EN-AU, EN-IN |
| Spanish | ES | ES |
| French | FR | FR |
| Chinese | ZH | ZH |
| Japanese | JP | JP |
| Korean | KR | KR |
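The table can be expressed as a small lookup for validating a language and speaker pair before a request is sent. This is a sketch built only from the table above; the helper name `resolve_voice` and the choice to default to the first listed speaker are assumptions, and the server's GET /voices endpoint remains the authoritative list.

```python
# Language code -> speakers, copied from the table above.
SPEAKERS = {
    "EN": ["EN-US", "EN-BR", "EN-AU", "EN-IN"],
    "ES": ["ES"],
    "FR": ["FR"],
    "ZH": ["ZH"],
    "JP": ["JP"],
    "KR": ["KR"],
}


def resolve_voice(language, speaker=None):
    """Return a validated (language, speaker) pair or raise ValueError."""
    code = language.upper()
    if code not in SPEAKERS:
        raise ValueError(f"unsupported language: {language!r}")
    allowed = SPEAKERS[code]
    if speaker is None:
        # Assumption: fall back to the first speaker listed for the language.
        return code, allowed[0]
    if speaker not in allowed:
        raise ValueError(f"speaker {speaker!r} is not available for {code}")
    return code, speaker


if __name__ == "__main__":
    print(resolve_voice("en"))
    print(resolve_voice("JP", "JP"))
```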
Testing
All tests use real implementations - no mocks:
# Start the API server (keep it running in its own terminal)
cd api && python server.py

# In a second terminal, from the repository root, run the API and plugin tests
cd tests
pytest test_api.py -v
pytest test_plugin.py -v
Deployment
Multiple deployment options:
- •Standalone: Run directly with Python/Uvicorn
- •Docker: Containerized deployment
- •Kubernetes: Scalable cloud deployment
- •Cloud: AWS, GCP, Azure support
See docs/DEPLOYMENT.md for detailed guides.
Integration with LiveKit
The plugin is a drop-in replacement for other TTS providers:
# Instead of:
# from livekit.plugins import openai
# tts = openai.TTS()

# Use:
from melotts_plugin import TTS
tts = TTS(api_base_url="http://localhost:8000")
# Same interface, self-hosted!
Use Cases
- •Voice assistants
- •Interactive voice response (IVR) systems
- •Accessibility tools
- •Educational applications
- •Multilingual customer service bots
- •Real-time voice agents
- •Live streaming with voice synthesis
Requirements
API Server:
- •Python 3.9+
- •2GB+ RAM
- •FastAPI, MeloTTS, Uvicorn
- •Optional: GPU for faster inference
LiveKit Plugin:
- •Python 3.9+
- •livekit-agents >= 0.8.0
- •aiohttp >= 3.9.0
Security
For production:
- •Add API authentication
- •Enable HTTPS/TLS
- •Implement rate limiting
- •Configure CORS
- •Set up monitoring
See docs/DEPLOYMENT.md#security for details.
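Two of the items above, API authentication and rate limiting, can be sketched with the standard library alone. This is not the approach described in docs/DEPLOYMENT.md, only one possible shape: a constant-time API key comparison and a token-bucket limiter. The names `api_key_is_valid` and `TokenBucket` are hypothetical; in a FastAPI server, checks like these would typically be wired in as a dependency or middleware.

```python
import hmac
import time


def api_key_is_valid(presented, expected):
    """Compare keys in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True and consume a token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)
    print(api_key_is_valid("secret", "secret"), bucket.allow())
```

A per-client limiter would keep one bucket per API key or IP address; a single shared bucket as shown only caps the server's overall request rate.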
When to Use This Skill
Use this skill when you need to:
- •Build a self-hosted TTS solution
- •Create LiveKit voice agents with custom TTS
- •Avoid commercial TTS API costs
- •Have full control over voice synthesis
- •Support multiple languages
- •Deploy TTS in private/air-gapped environments
- •Build voice assistants
- •Integrate TTS into existing applications
Troubleshooting
Server won't start:
- •Run python -m unidic download
- •Check port 8000 is available
- •Verify dependencies installed
Plugin connection errors:
- •Ensure API server is running
- •Check api_base_url configuration
- •Verify network connectivity
Audio quality issues:
- •Try different voices/speakers
- •Adjust speed parameter
- •Check sample rate configuration
See documentation for more troubleshooting tips.
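For the "Server won't start" case, the port check can be scripted. The sketch below tries to bind the port locally and reports whether it is free. The helper name `port_is_free` is hypothetical, and the default host of 127.0.0.1 is an assumption; a server bound to a different interface would need that address passed in instead.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
        return True


if __name__ == "__main__":
    state = "free" if port_is_free(8000) else "already in use"
    print(f"Port 8000 is {state}")
```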
License
Apache 2.0 License
Support
- •Check the documentation in docs/
- •Review examples in examples/
- •Run the test suite to verify setup
- •Check logs for error messages
This skill provides everything needed for production-ready, self-hosted TTS with LiveKit integration. All code is fully functional with no mocks or placeholders.