Audio Analyzer
A toolkit for analyzing audio files. It extracts tempo, musical key, frequency content, and loudness metrics, and generates waveform, spectrogram, chromagram, and beat-grid visualizations.
Quick Start
python
from scripts.audio_analyzer import AudioAnalyzer
# Analyze an audio file
analyzer = AudioAnalyzer("song.mp3")
analyzer.analyze()
# Get all analysis results
results = analyzer.get_results()
print(f"BPM: {results['tempo']['bpm']}")
print(f"Key: {results['key']['key']} {results['key']['mode']}")
# Generate visualizations
analyzer.plot_waveform("waveform.png")
analyzer.plot_spectrogram("spectrogram.png")
# Full report
analyzer.save_report("analysis_report.json")
Features
- Tempo/BPM Detection: Beat tracking with a confidence score
- Key Detection: Musical key and mode (major/minor) identification
- Frequency Analysis: Spectrum, dominant frequencies, frequency bands
- Loudness Metrics: RMS, peak, LUFS, dynamic range
- Waveform Visualization: Multi-channel waveform plots
- Spectrogram: Time-frequency visualization with configurable colormap, frequency scale, and frequency limit
- Chromagram: Pitch class visualization for harmonic analysis
- Beat Grid: Visual beat markers overlaid on waveform
- Export Formats: JSON report, PNG/SVG visualizations
API Reference
Initialization
python
# From file
analyzer = AudioAnalyzer("audio.mp3")
# With custom sample rate
analyzer = AudioAnalyzer("audio.wav", sr=44100)
Analysis Methods
python
# Run full analysis
analyzer.analyze()
# Individual analyses
analyzer.analyze_tempo()      # BPM and beat positions
analyzer.analyze_key()        # Musical key detection
analyzer.analyze_loudness()   # RMS, peak, LUFS
analyzer.analyze_frequency()  # Spectrum analysis
analyzer.analyze_dynamics()   # Dynamic range
Results Access
python
# Get all results as dict
results = analyzer.get_results()
# Individual results
tempo = analyzer.get_tempo() # {'bpm': 120, 'confidence': 0.85, 'beats': [...]}
key = analyzer.get_key() # {'key': 'C', 'mode': 'major', 'confidence': 0.72}
loudness = analyzer.get_loudness() # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0}
freq = analyzer.get_frequency() # {'dominant_freq': 440, 'spectrum': [...]}
Visualization Methods
python
# Waveform
analyzer.plot_waveform(
    output="waveform.png",
    figsize=(12, 4),
    color="#1f77b4",
    show_rms=True
)
# Spectrogram
analyzer.plot_spectrogram(
    output="spectrogram.png",
    figsize=(12, 6),
    cmap="magma",       # viridis, plasma, inferno, magma
    freq_scale="log",   # linear, log, mel
    max_freq=8000       # Hz
)
# Chromagram (pitch classes)
analyzer.plot_chromagram(
    output="chromagram.png",
    figsize=(12, 4)
)
# Onset strength / beat grid
analyzer.plot_beats(
    output="beats.png",
    figsize=(12, 4),
    show_strength=True
)
# Combined dashboard
analyzer.plot_dashboard(
    output="dashboard.png",
    figsize=(14, 10)
)
Export
python
# JSON report with all analysis
analyzer.save_report("report.json")
# Summary text
summary = analyzer.get_summary()
print(summary)
Analysis Details
Tempo Detection
Uses a beat-tracking algorithm to detect:
- BPM: Beats per minute (tempo)
- Beat positions: Timestamps of detected beats, in seconds
- Confidence: Reliability score (0-1)
python
tempo = analyzer.get_tempo()
# {
# 'bpm': 128.0,
# 'confidence': 0.89,
# 'beats': [0.0, 0.469, 0.938, 1.406, ...], # seconds
# 'beat_count': 256
# }
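The relationship between the `beats` timestamps and the `bpm` value can be sketched as follows. This is a minimal illustration, not the analyzer's actual method: the helper name `bpm_from_beats` is hypothetical, and it assumes tempo is derived from the median interval between consecutive beats, which tolerates a few mistimed detections.

```python
import numpy as np

def bpm_from_beats(beat_times):
    """Estimate BPM from beat timestamps (seconds) via the median inter-beat interval.

    Hypothetical helper for illustration; the analyzer's own estimator may differ.
    """
    beats = np.asarray(beat_times, dtype=float)
    if beats.size < 2:
        raise ValueError("need at least two beats to estimate tempo")
    intervals = np.diff(beats)  # seconds between consecutive beats
    return 60.0 / float(np.median(intervals))

# Evenly spaced beats 0.46875 s apart correspond to 128 BPM.
beats = [i * 0.46875 for i in range(8)]
estimated_bpm = bpm_from_beats(beats)
print(f"{estimated_bpm:.1f} BPM")
```

The median is used instead of the mean so that a single skipped or doubled beat does not skew the estimate.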
Key Detection
Analyzes harmonic content to identify:
- Key: Root note (C, C#, D, etc.)
- Mode: Major or minor
- Confidence: Confidence score for the detected key
- Key profile: Correlation score for each candidate key
python
key = analyzer.get_key()
# {
# 'key': 'A',
# 'mode': 'minor',
# 'confidence': 0.76,
# 'profile': {'C': 0.12, 'C#': 0.08, ...}
# }
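One common way to produce a per-key correlation profile is to compare a 12-bin pitch-class (chroma) vector against rotated Krumhansl-Kessler key templates. The sketch below shows that approach with NumPy. It is an assumption that the analyzer works this way; the function name `estimate_key` and the use of the Pearson correlation as the confidence value are illustrative choices.

```python
import numpy as np

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Krumhansl-Kessler key profiles; index 0 is the tonic.
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17])

def estimate_key(chroma):
    """Pick the key whose rotated template correlates best with a 12-bin chroma vector.

    Hypothetical sketch: 'confidence' here is simply the best Pearson correlation.
    """
    chroma = np.asarray(chroma, dtype=float)
    profile = {}
    best = None
    for mode, template in (("major", MAJOR), ("minor", MINOR)):
        for tonic in range(12):
            # Rotate the template so its tonic weight lands on pitch class `tonic`.
            r = float(np.corrcoef(chroma, np.roll(template, tonic))[0, 1])
            profile[f"{NOTES[tonic]} {mode}"] = r
            if best is None or r > best[2]:
                best = (NOTES[tonic], mode, r)
    return {"key": best[0], "mode": best[1], "confidence": best[2], "profile": profile}

# A chroma vector shaped exactly like the minor template rooted on A (pitch class 9).
result = estimate_key(np.roll(MINOR, 9))
print(result["key"], result["mode"])
```

In practice the chroma vector would be averaged over the whole track, which is why percussive material with little pitched content tends to give low correlations for every key.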
Loudness Metrics
Comprehensive loudness analysis:
- RMS dB: Root mean square level
- Peak dB: Maximum sample level
- LUFS: Integrated loudness (broadcast standard)
- Dynamic Range: Difference in level between the loudest and quietest sections
python
loudness = analyzer.get_loudness()
# {
# 'rms_db': -14.2,
# 'peak_db': -0.3,
# 'lufs': -14.0,
# 'dynamic_range_db': 12.5,
# 'crest_factor': 8.2
# }
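The sample-level metrics can be illustrated with a few lines of NumPy. This sketch assumes samples normalized to the range -1.0 to 1.0 (so levels are in dB relative to full scale) and expresses crest factor in dB as peak minus RMS; the analyzer may use a different unit for `crest_factor`. The function name `level_metrics` is hypothetical, and LUFS is deliberately left out because it needs frequency weighting and gating.

```python
import numpy as np

def level_metrics(samples):
    """Compute RMS level, peak level, and crest factor (all in dB) for a mono signal.

    Hypothetical helper; assumes float samples in [-1.0, 1.0].
    """
    x = np.asarray(samples, dtype=float)
    eps = 1e-12  # avoids log10(0) on digital silence
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    rms_db = 20.0 * np.log10(rms + eps)
    peak_db = 20.0 * np.log10(peak + eps)
    return {
        "rms_db": float(rms_db),
        "peak_db": float(peak_db),
        "crest_factor_db": float(peak_db - rms_db),
    }

# One second of a full-scale 440 Hz sine at 22050 Hz.
sr = 22050
t = np.arange(sr) / sr
metrics = level_metrics(np.sin(2 * np.pi * 440.0 * t))
print(metrics)
```

A full-scale sine has an RMS level about 3 dB below its peak, so it makes a convenient sanity check for any implementation of these metrics.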
Frequency Analysis
Spectrum analysis including:
- Dominant frequency: Strongest frequency component
- Frequency bands: Energy in bass, mid, treble
- Spectral centroid: "Brightness" of the audio (energy-weighted mean frequency)
- Spectral rolloff: Frequency below which 85% of the spectral energy lies
python
freq = analyzer.get_frequency()
# {
# 'dominant_freq': 440.0,
# 'spectral_centroid': 2150.3,
# 'spectral_rolloff': 4200.5,
# 'bands': {
# 'sub_bass': -28.5, # 20-60 Hz
# 'bass': -18.2, # 60-250 Hz
# 'low_mid': -12.1, # 250-500 Hz
# 'mid': -10.8, # 500-2000 Hz
# 'high_mid': -14.3, # 2000-4000 Hz
# 'high': -22.1 # 4000-20000 Hz
# }
# }
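The quantities in this section can be approximated from a single FFT over the whole clip. The sketch below is illustrative only: `spectral_summary` is a hypothetical name, the centroid is power-weighted, and band levels are expressed in dB relative to total energy, which is an assumed convention. A frame-by-frame analysis, as librosa performs, would give somewhat different numbers.

```python
import numpy as np

# Band edges in Hz, taken from the comments in the example output above.
BANDS = {
    "sub_bass": (20, 60),
    "bass": (60, 250),
    "low_mid": (250, 500),
    "mid": (500, 2000),
    "high_mid": (2000, 4000),
    "high": (4000, 20000),
}

def spectral_summary(samples, sr, rolloff_pct=0.85):
    """Summarize a mono signal's spectrum from one FFT over the whole clip."""
    x = np.asarray(samples, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2          # power per frequency bin
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sr)  # bin frequencies in Hz
    total = power.sum()
    if total == 0:
        raise ValueError("signal is silent; spectral statistics are undefined")

    # First bin at which the cumulative energy reaches rolloff_pct of the total.
    cumulative = np.cumsum(power)
    rolloff_bin = min(int(np.searchsorted(cumulative, rolloff_pct * total)), freqs.size - 1)

    bands = {}
    for name, (lo, hi) in BANDS.items():
        in_band = (freqs >= lo) & (freqs < hi)
        # Band energy in dB relative to total energy (assumed convention).
        bands[name] = float(10.0 * np.log10(power[in_band].sum() / total + 1e-12))

    return {
        "dominant_freq": float(freqs[np.argmax(power)]),
        "spectral_centroid": float((freqs * power).sum() / total),
        "spectral_rolloff": float(freqs[rolloff_bin]),
        "bands": bands,
    }

# One second of a 440 Hz sine at 22050 Hz: all energy sits in the 250-500 Hz band.
sr = 22050
t = np.arange(sr) / sr
summary = spectral_summary(np.sin(2 * np.pi * 440.0 * t), sr)
print(summary["dominant_freq"], summary["bands"]["low_mid"])
```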
CLI Usage
bash
# Full analysis with all visualizations
python audio_analyzer.py --input song.mp3 --output-dir ./analysis/
# Just tempo and key
python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json
# Generate specific visualization
python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png
# Dashboard view
python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png
# Batch analyze directory
python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/
CLI Arguments
| Argument | Description | Default |
|---|---|---|
| --input | Input audio file | Required |
| --input-dir | Directory of audio files | - |
| --output | Output file path | - |
| --output-dir | Output directory | . |
| --analyze | Analysis types: tempo, key, loudness, frequency, all | all |
| --plot | Plot type: waveform, spectrogram, chromagram, beats, dashboard | - |
| --format | Output format: json, txt | json |
| --sr | Sample rate for analysis | 22050 |
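For readers who want to wrap or reimplement the command line, the table above maps naturally onto `argparse`. The following is a sketch under stated assumptions, not the script's actual parser: it covers only the arguments listed in the table, and it assumes `--input` and `--input-dir` are mutually exclusive alternatives, which the table does not state explicitly.

```python
import argparse

def build_parser():
    """Build a parser mirroring the documented arguments (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Analyze audio files")
    # Assumption: exactly one of --input / --input-dir must be supplied.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Input audio file")
    source.add_argument("--input-dir", help="Directory of audio files")
    parser.add_argument("--output", help="Output file path")
    parser.add_argument("--output-dir", default=".", help="Output directory")
    parser.add_argument(
        "--analyze",
        nargs="+",
        default=["all"],
        choices=["tempo", "key", "loudness", "frequency", "all"],
        help="Analysis types to run",
    )
    parser.add_argument(
        "--plot",
        choices=["waveform", "spectrogram", "chromagram", "beats", "dashboard"],
        help="Plot type to generate",
    )
    parser.add_argument("--format", choices=["json", "txt"], default="json", help="Output format")
    parser.add_argument("--sr", type=int, default=22050, help="Sample rate for analysis")
    return parser

# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["--input", "song.mp3", "--analyze", "tempo", "key"])
print(args.analyze, args.sr, args.format)
```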
Examples
Song Analysis
python
from scripts.audio_analyzer import AudioAnalyzer
analyzer = AudioAnalyzer("track.mp3")
analyzer.analyze()
# Fetch the key once and reuse it for both fields
key = analyzer.get_key()
print(f"Tempo: {analyzer.get_tempo()['bpm']:.1f} BPM")
print(f"Key: {key['key']} {key['mode']}")
print(f"Loudness: {analyzer.get_loudness()['lufs']:.1f} LUFS")
analyzer.plot_dashboard("track_analysis.png")
Podcast Quality Check
python
from scripts.audio_analyzer import AudioAnalyzer
analyzer = AudioAnalyzer("podcast.mp3")
analyzer.analyze_loudness()
loudness = analyzer.get_loudness()
# Target window used here: -20 to -16 LUFS
if loudness['lufs'] > -16:
    print("Warning: Audio may be too loud for podcast standards")
elif loudness['lufs'] < -20:
    print("Warning: Audio may be too quiet")
else:
    print("Loudness is within podcast standards (-20 to -16 LUFS)")
Batch Analysis
python
import os
from scripts.audio_analyzer import AudioAnalyzer
results = []
for filename in os.listdir("./songs"):
    # Match extensions case-insensitively so "Track.MP3" is not skipped
    if filename.lower().endswith(('.mp3', '.wav', '.flac')):
        analyzer = AudioAnalyzer(os.path.join("./songs", filename))
        analyzer.analyze()
        key = analyzer.get_key()
        results.append({
            'file': filename,
            'bpm': analyzer.get_tempo()['bpm'],
            'key': f"{key['key']} {key['mode']}",
            'lufs': analyzer.get_loudness()['lufs']
        })
# Sort by BPM for DJ set
results.sort(key=lambda x: x['bpm'])
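Once the sorted list exists, it can be written out for use in a spreadsheet or DJ software. The sketch below uses only the standard library and the same four keys as the batch loop. The helper name `results_to_csv` is hypothetical, and the two rows are placeholder values for illustration, not real analysis output.

```python
import csv
import io

def results_to_csv(results, stream):
    """Write batch results (dicts with file/bpm/key/lufs) to a text stream as CSV."""
    writer = csv.DictWriter(stream, fieldnames=["file", "bpm", "key", "lufs"])
    writer.writeheader()
    writer.writerows(results)

# Placeholder rows standing in for the output of the batch loop.
sample = [
    {"file": "b.mp3", "bpm": 128.0, "key": "A minor", "lufs": -9.5},
    {"file": "a.wav", "bpm": 120.0, "key": "C major", "lufs": -14.0},
]
sample.sort(key=lambda row: row["bpm"])

buffer = io.StringIO()
results_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with `open("setlist.csv", "w", newline="")` in place of the `StringIO` buffer.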
Supported Formats
Input formats (via librosa/soundfile):
- MP3
- WAV
- FLAC
- OGG
- M4A/AAC
- AIFF
Output formats:
- JSON (analysis report)
- PNG (visualizations)
- SVG (visualizations)
- TXT (summary)
Dependencies
code
librosa>=0.10.0
soundfile>=0.12.0
matplotlib>=3.7.0
numpy>=1.24.0
scipy>=1.10.0
Limitations
- Key detection works best with melodic content and is less accurate for drums and other percussion
- BPM detection may struggle with free-tempo material or complex time signatures
- Very short clips (<5 seconds) may yield less accurate results
- The LUFS calculation is simplified and is not a full ITU-R BS.1770-4 implementation