Development Patterns
Config-driven development
All tunable values live in config/*.yaml. Load them through the Config class:
```python
from utils.config import Config

config = Config()

# Dot-notation access
url = config.get('data.prometheus.url')
window_size = config.get('windowing.window_size', 20)  # with default
queries = config.get('data.metrics.queries', None)
```
Never hardcode values that could change between environments or experiments.
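For readers unfamiliar with dot-notation lookups, here is a minimal sketch of how such a `get` could resolve a key against nested settings. It is not the project's `Config` implementation: `get_dotted` and the `settings` dictionary (including its values) are hypothetical stand-ins for whatever `config/*.yaml` actually contains.

```python
from typing import Any, Mapping


def get_dotted(data: Mapping[str, Any], key: str, default: Any = None) -> Any:
    """Follow a dot-separated key through nested mappings.

    Returns `default` as soon as any segment of the key is missing.
    """
    node: Any = data
    for part in key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return default
        node = node[part]
    return node


# Hypothetical stand-in for the merged contents of config/*.yaml.
settings = {
    "data": {"prometheus": {"url": "http://prometheus:9090"}},
    "windowing": {"window_size": 30},
}

url = get_dotted(settings, "data.prometheus.url")
window_size = get_dotted(settings, "windowing.window_size", 20)  # present, so 30
queries = get_dotted(settings, "data.metrics.queries")           # missing, so None
```

The default only applies when the key is absent, so a value set in YAML always wins over the fallback written in code.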
Train/inference parity checklist
When modifying data processing, ensure both pipelines stay in sync:
- scripts/train.py -- training data path
- scripts/inference.py -- inference data path + synthetic fallback
- src/data/preprocessor.py -- shared preprocessor (saved/loaded via joblib)
- config/data.yaml -- shared config source
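One way to catch drift between the files above is a parity test that feeds the same raw rows through both code paths and compares the results. The sketch below assumes nothing about the real pipelines: `make_scaler`, `assert_parity`, and the sample rows are hypothetical, and the toy scaler only stands in for the shared preprocessor.

```python
import math
from typing import Callable, List, Sequence

Row = Sequence[float]
Transform = Callable[[Row], List[float]]


def make_scaler(mean: float, std: float) -> Transform:
    """Hypothetical fitted preprocessor: standardizes every value in a row."""
    def transform(row: Row) -> List[float]:
        return [(x - mean) / std for x in row]
    return transform


def assert_parity(train_fn: Transform, infer_fn: Transform,
                  rows: Sequence[Row], tol: float = 1e-9) -> None:
    """Raise AssertionError if the two paths disagree on any row."""
    for i, row in enumerate(rows):
        a, b = train_fn(row), infer_fn(row)
        if len(a) != len(b):
            raise AssertionError(f"row {i}: feature count {len(a)} != {len(b)}")
        for x, y in zip(a, b):
            if not math.isclose(x, y, rel_tol=tol, abs_tol=tol):
                raise AssertionError(f"row {i}: {x} != {y}")


# Both paths reuse the same fitted object, so parity holds by construction.
shared = make_scaler(mean=100.0, std=25.0)
train_path = shared
inference_path = shared
sample_rows = [[100.0, 125.0], [75.0, 150.0]]
assert_parity(train_path, inference_path, sample_rows)
```

In a real test, the two callables would wrap the actual training and inference data paths, with the inference side using the preprocessor loaded from disk.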
Docker workflow
```bash
# Start dev stack
docker-compose --profile dev up -d

# Rebuild after code changes
docker-compose build anomaly-detection

# Run training in container
docker-compose run --rm anomaly-detection python scripts/train.py

# Check inference logs
docker logs tv-anomaly-detector --since 5m

# Inject anomaly for testing
curl -X POST http://localhost:8000/anomaly -H "Content-Type: application/json" -d '{"type": "latency_spike", "duration": 120}'

# Stop stack (data persists in prometheus_data volume)
docker-compose --profile dev down
```
Testing anomaly detection
The mock service at localhost:8000 supports anomaly injection:
| Type | Duration | Effect |
|---|---|---|
| latency_spike | 1-3600s | Latency jumps to 2-10s |
| error_burst | 1-3600s | Error rate jumps to 30% |
| memory_spike | 1-3600s | Memory jumps to 3-4GB |
| traffic_drop | 1-3600s | 90% traffic reduction |
Note: the Prometheus queries smooth metrics with rate() over a 5-minute window ([5m]), so an injected anomaly takes about 5 minutes to fully appear in Prometheus and about 5 minutes to clear after it stops.
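For scripted tests, the same injection call can be built from Python using only the standard library. This is a sketch under assumptions: the endpoint and payload shape are taken from the curl command in the Docker workflow, the type names and the 1-3600s range come from the table, and whether the mock service itself rejects out-of-range values is not confirmed here. `build_anomaly_request` is a hypothetical helper name.

```python
import json
import urllib.request

ANOMALY_URL = "http://localhost:8000/anomaly"
VALID_TYPES = {"latency_spike", "error_burst", "memory_spike", "traffic_drop"}


def build_anomaly_request(anomaly_type: str, duration: int) -> urllib.request.Request:
    """Build (but do not send) a POST request for the mock service.

    Validates locally against the documented types and duration range.
    """
    if anomaly_type not in VALID_TYPES:
        raise ValueError(f"unknown anomaly type: {anomaly_type!r}")
    if not 1 <= duration <= 3600:
        raise ValueError("duration must be between 1 and 3600 seconds")
    body = json.dumps({"type": anomaly_type, "duration": duration}).encode("utf-8")
    return urllib.request.Request(
        ANOMALY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_anomaly_request("latency_spike", 120)

# Sending requires the dev stack to be running:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```

Given the smoothing delay noted above, a test that injects an anomaly should wait several minutes before asserting on detector output.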
File naming conventions
- Models: models/lstm_autoencoder.weights.h5, models/lstm_autoencoder_config.json
- Preprocessor: models/preprocessor.joblib
- Threshold: models/anomaly_threshold.npy
- Config: config/{domain}.yaml (data, model, windowing, alerting)
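Keeping these names in one place avoids string literals drifting apart across scripts. The following is a sketch of such a helper, not existing project code: `artifact_paths` and `config_path` are hypothetical names, and only the file names themselves come from the conventions above.

```python
from pathlib import Path
from typing import Dict

MODELS_DIR = Path("models")
CONFIG_DIR = Path("config")
CONFIG_DOMAINS = ("data", "model", "windowing", "alerting")


def artifact_paths(models_dir: Path = MODELS_DIR) -> Dict[str, Path]:
    """Return the conventional location of each saved artifact."""
    return {
        "weights": models_dir / "lstm_autoencoder.weights.h5",
        "model_config": models_dir / "lstm_autoencoder_config.json",
        "preprocessor": models_dir / "preprocessor.joblib",
        "threshold": models_dir / "anomaly_threshold.npy",
    }


def config_path(domain: str, config_dir: Path = CONFIG_DIR) -> Path:
    """Return config/{domain}.yaml for one of the known domains."""
    if domain not in CONFIG_DOMAINS:
        raise ValueError(f"unknown config domain: {domain!r}")
    return config_dir / f"{domain}.yaml"


paths = artifact_paths()
data_config = config_path("data")
```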