ML Configuration System

Plain dataclass hierarchy for ML experiments. No mixins, no magic — just inheritance and composition.

Hierarchy

code

MLBaseConfig                    (name, seed, device, output_dir)
├── SLConfig                    (num_epochs, batch_size, learning_rate, optimizer, scheduler, early stopping)
├── RLConfig                    (total_timesteps, gamma, num_envs, normalization)
│   └── PPOConfig               (clip_epsilon, gae_lambda, entropy_coef, value_coef)
└── (your task config inherits from any of these)
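
The tree above maps directly onto dataclass inheritance. The following is a condensed sketch, not the template files themselves: it uses the defaults listed in the Field Reference and shows only a subset of the PPO fields. Because every field has a default, a subclass can append its own defaulted fields without running into dataclass ordering errors.

```python
from dataclasses import dataclass, fields


@dataclass
class MLBaseConfig:
    name: str = "experiment"
    seed: int = 42
    device: str = "auto"
    output_dir: str = "output"


@dataclass
class RLConfig(MLBaseConfig):
    total_timesteps: int = 1_000_000
    gamma: float = 0.99
    learning_rate: float = 3e-4
    num_envs: int = 1
    normalize_obs: bool = False
    normalize_reward: bool = False


@dataclass
class PPOConfig(RLConfig):  # subset of the documented PPO fields
    clip_epsilon: float = 0.2
    gae_lambda: float = 0.95
    entropy_coef: float = 0.01
    value_coef: float = 0.5


# Inherited fields come first, in definition order, followed by the subclass's own.
ppo_field_names = [f.name for f in fields(PPOConfig)]

# Any field from any level of the hierarchy can be set by keyword.
cfg = PPOConfig(name="demo", gamma=0.995)
```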

Composable Pieces

Standalone dataclasses attached via fields — not part of the hierarchy:

Piece	Fields	Purpose
`OutputDir`	`base_dir`, `save_config`, `timestamp_format`, `subdirs`	Timestamped run directory
`ConsoleLogging`	`enabled`, `filename`, `tee_to_console`, `separate_streams`	Console output capture
`Checkpointing`	`enabled`, `save_best`, `save_last`, `save_frequency`, `metric`, `mode`, filenames	Model saving
`TensorBoard`	`enabled`, `log_dir`, `flush_secs`, `log_interval`	Metric logging

Add more composable pieces as needed (e.g., WandbConfig, EvalConfig).
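
As a rough illustration of what "attached via fields" means, the sketch below uses a trimmed stand-in for `Checkpointing` and a hypothetical `TaskConfig`. The piece is attached with `default_factory` so each config instance gets its own object; a plain instance default would either be shared across configs or rejected outright, depending on the Python version.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Checkpointing:  # trimmed stand-in for the real piece
    enabled: bool = True
    metric: str = "loss"
    mode: str = "min"


@dataclass
class TaskConfig:  # hypothetical config; parent class omitted for brevity
    name: str = "experiment"
    # default_factory builds a fresh Checkpointing for every TaskConfig
    checkpointing: Checkpointing = field(default_factory=Checkpointing)


a = TaskConfig()
a.checkpointing.enabled = False  # affects only this instance

# Override a whole piece at construction time
b = TaskConfig(checkpointing=Checkpointing(metric="accuracy", mode="max"))

# replace() copies a config or piece with selected fields changed
c = replace(b, checkpointing=replace(b.checkpointing, enabled=False))
```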

Default Output Directory Structure

Every experiment creates a timestamped run directory:

code

{output.base_dir}/{config.name}_{YYYYMMDD_HHMMSS}/
├── config.json           # full config snapshot
├── console.log           # captured stdout/stderr
├── checkpoints/          # model weights
│   ├── model_best.pt
│   └── model_last.pt
└── tensorboard/          # tfevents files
    └── events.out.tfevents...

Created automatically by setup_output_dir(cfg).
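
The actual helper lives in base_config_template.py and is not reproduced here. A minimal sketch of the behaviour described above could look like the following, assuming the config carries an `OutputDir` piece in a field called `output`. `setup_output_dir_sketch`, `DemoConfig`, and the `now` parameter are illustrative names, and the fallback to `output_dir` is left out.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Dict, Optional


@dataclass
class OutputDir:
    base_dir: str = "output"
    save_config: bool = True
    timestamp_format: str = "%Y%m%d_%H%M%S"
    subdirs: Dict[str, str] = field(
        default_factory=lambda: {"tensorboard": "tensorboard", "checkpoints": "checkpoints"}
    )


@dataclass
class DemoConfig:  # hypothetical stand-in for a task config
    name: str = "experiment"
    output: OutputDir = field(default_factory=OutputDir)


def setup_output_dir_sketch(cfg, now: Optional[datetime] = None) -> Path:
    """Create <base_dir>/<name>_<timestamp>/ with its subdirs and config.json."""
    stamp = (now or datetime.now()).strftime(cfg.output.timestamp_format)
    run_dir = Path(cfg.output.base_dir) / f"{cfg.name}_{stamp}"
    run_dir.mkdir(parents=True, exist_ok=False)  # fail loudly on a name collision
    for subdir in cfg.output.subdirs.values():
        (run_dir / subdir).mkdir(parents=True, exist_ok=True)
    if cfg.output.save_config:
        # asdict() turns nested pieces into plain dicts, so the result is JSON-friendly
        (run_dir / "config.json").write_text(json.dumps(asdict(cfg), indent=2))
    return run_dir
```

Passing `now` explicitly is only there to make the directory name reproducible in tests; in normal use the current time is taken.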

Files

code

ml-config-system/
    SKILL.md                    -- This file (overview + field reference)
    base_config_template.py     -- MLBaseConfig, composable pieces, helpers
    sl_config_template.py       -- SLConfig (supervised learning)
    rl_config_template.py       -- RLConfig, PPOConfig (reinforcement learning)

How to Create a Task-Specific Config

• Pick a parent class (SLConfig, RLConfig, PPOConfig, or MLBaseConfig)
• Inherit from it
• Add composable pieces as fields
• Add task-specific fields
• Override defaults as needed

python

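from dataclasses import dataclass, field

# Module names follow the template files listed above; adjust to your layout.
from base_config_template import Checkpointing, ConsoleLogging, OutputDir, TensorBoard
from rl_config_template import PPOConfig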

@dataclass
class MyTaskConfig(PPOConfig):
    # Override parent defaults
    name: str = "my_task"
    total_timesteps: int = 2_000_000
    num_envs: int = 8

    # Attach composable pieces (always include output + console)
    output: OutputDir = field(default_factory=OutputDir)
    console: ConsoleLogging = field(default_factory=ConsoleLogging)
    checkpointing: Checkpointing = field(default_factory=Checkpointing)
    tensorboard: TensorBoard = field(default_factory=TensorBoard)

    # Task-specific fields
    reward_scale: float = 1.0
    use_curriculum: bool = False

Setting Up a Run

python

# Create config
cfg = MyTaskConfig(name="ppo_snake_v2")

# Set up output directory (creates timestamped dir + subdirs + saves config.json)
run_dir = setup_output_dir(cfg)
# -> output/ppo_snake_v2_20260221_143000/

# Set up console logging (captures stdout/stderr to console.log)
cleanup = setup_console_logging(cfg, run_dir)

try:
    ...  # train
finally:
    # Restore original stdout/stderr even if training raises
    cleanup()
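
How the console capture works internally is not spelled out here. One common way to get the capture-and-restore behaviour is a small tee object that replaces `sys.stdout` and `sys.stderr`. The sketch below is an assumption about the approach, not the shipped helper: `capture_console` and `_Tee` are hypothetical names, and the `separate_streams`, `line_timestamps`, and `flush_frequency` options are ignored.

```python
import sys
from pathlib import Path
from typing import Callable


class _Tee:
    """File-like object that forwards writes to several underlying streams."""

    def __init__(self, *streams):
        self._streams = streams

    def write(self, text: str) -> int:
        for stream in self._streams:
            stream.write(text)
            stream.flush()
        return len(text)

    def flush(self) -> None:
        for stream in self._streams:
            stream.flush()


def capture_console(run_dir, filename: str = "console.log",
                    tee_to_console: bool = True) -> Callable[[], None]:
    """Redirect stdout/stderr into run_dir/filename and return a restore callback."""
    log_file = open(Path(run_dir) / filename, "a", encoding="utf-8")
    original_out, original_err = sys.stdout, sys.stderr
    if tee_to_console:
        sys.stdout = _Tee(original_out, log_file)
        sys.stderr = _Tee(original_err, log_file)
    else:
        sys.stdout = sys.stderr = _Tee(log_file)

    def cleanup() -> None:
        # Put the original streams back before closing the log file
        sys.stdout, sys.stderr = original_out, original_err
        log_file.close()

    return cleanup
```

Returning a callback rather than a context manager matches the `cleanup()` call pattern used above.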

Saving and Loading

python

from dataclasses import asdict

# Save (also done automatically by setup_output_dir)
save_config(cfg, "output/config.json")

# Load
cfg = load_config(MyTaskConfig, "output/config.json")

# Manual serialization
d = asdict(cfg)  # standard dataclasses.asdict; nested pieces become plain dicts
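
Because `asdict` converts composable pieces into nested dicts, loading has to rebuild them as dataclass instances. The internals of `load_config` are not shown in this file; the sketch below illustrates one way the reconstruction can be done. `from_dict` and `DemoConfig` are hypothetical names, and the approach assumes field annotations are real classes (it would need extra work under `from __future__ import annotations`, where they are strings).

```python
import json
from dataclasses import asdict, dataclass, field, fields, is_dataclass


@dataclass
class TensorBoard:  # trimmed stand-in
    enabled: bool = True
    log_interval: int = 100


@dataclass
class DemoConfig:  # hypothetical task config
    name: str = "experiment"
    tensorboard: TensorBoard = field(default_factory=TensorBoard)


def from_dict(cls, data: dict):
    """Rebuild cls from a plain dict, recursing into nested dataclass fields."""
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            continue  # missing keys fall back to the field default
        value = data[f.name]
        if is_dataclass(f.type) and isinstance(value, dict):
            value = from_dict(f.type, value)
        kwargs[f.name] = value
    return cls(**kwargs)


# Round trip through JSON text, as save/load would do via a file
original = DemoConfig(name="run1", tensorboard=TensorBoard(log_interval=10))
restored = from_dict(DemoConfig, json.loads(json.dumps(asdict(original))))
```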

Field Reference

MLBaseConfig

Field	Type	Default	Description
`name`	`str`	`"experiment"`	Experiment name (used in run directory)
`seed`	`int`	`42`	Random seed
`device`	`str`	`"auto"`	`"auto"`, `"cpu"`, `"cuda"`, `"cuda:0"`
`output_dir`	`str`	`"output"`	Output directory (fallback if no `OutputDir` piece)
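
The `"auto"` device value has to be resolved to a concrete device before training. A minimal sketch, assuming PyTorch is the backend (the `.pt` checkpoint names suggest this, but it is an assumption) and using a hypothetical helper name `resolve_device`:

```python
def resolve_device(device: str = "auto") -> str:
    """Map "auto" to "cuda" when PyTorch reports a GPU, otherwise to "cpu"."""
    if device != "auto":
        return device  # explicit choices such as "cpu" or "cuda:0" pass through
    try:
        import torch  # treated as an optional dependency in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```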

OutputDir (composable)

Field	Type	Default	Description
`base_dir`	`str`	`"output"`	Parent directory for all runs
`save_config`	`bool`	`True`	Save config.json to run directory
`timestamp_format`	`str`	`"%Y%m%d_%H%M%S"`	Timestamp format for directory naming
`subdirs`	`Dict[str, str]`	`{"tensorboard": "tensorboard", "checkpoints": "checkpoints"}`	Subdirectories to create

ConsoleLogging (composable)

Field	Type	Default	Description
`enabled`	`bool`	`True`	Enable console capture
`filename`	`str`	`"console.log"`	Log file name in run directory
`separate_streams`	`bool`	`False`	Split stdout/stderr into separate files
`stdout_filename`	`str`	`"stdout.log"`	Stdout file (when `separate_streams=True`)
`stderr_filename`	`str`	`"stderr.log"`	Stderr file (when `separate_streams=True`)
`tee_to_console`	`bool`	`True`	Also print to terminal
`line_timestamps`	`bool`	`False`	Prefix each line with timestamp
`timestamp_format`	`str`	`"%H:%M:%S"`	Timestamp format for line prefixes
`flush_frequency`	`int`	`1`	Flush every N writes

Checkpointing (composable)

Field	Type	Default	Description
`enabled`	`bool`	`True`	Enable checkpointing
`save_best`	`bool`	`True`	Save best model
`save_last`	`bool`	`True`	Save last model
`save_frequency`	`int`	`0`	Save every N epochs; `0` = disabled
`metric`	`str`	`"loss"`	Metric to track for best model
`mode`	`str`	`"min"`	`"min"` or `"max"`
`best_filename`	`str`	`"model_best.pt"`	Best model filename
`last_filename`	`str`	`"model_last.pt"`	Last model filename
`epoch_filename_format`	`str`	`"model_epoch_{epoch}.pt"`	Periodic save filename
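
To make the interaction between `save_best`, `save_last`, `save_frequency`, and `mode` concrete, here is a sketch of the decision a training loop might make after each epoch. `files_to_write` is a hypothetical helper, the `Checkpointing` class is a trimmed stand-in, and the `enabled` and `metric` fields are left out; the real training code may order or combine these checks differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Checkpointing:  # trimmed stand-in with the documented defaults
    save_best: bool = True
    save_last: bool = True
    save_frequency: int = 0
    mode: str = "min"
    best_filename: str = "model_best.pt"
    last_filename: str = "model_last.pt"
    epoch_filename_format: str = "model_epoch_{epoch}.pt"


def files_to_write(ckpt: Checkpointing, epoch: int, value: float,
                   best_so_far: Optional[float]) -> List[str]:
    """Return the checkpoint filenames that should be written after this epoch."""
    names = []
    # The first measurement always counts as an improvement
    improved = best_so_far is None or (
        value < best_so_far if ckpt.mode == "min" else value > best_so_far
    )
    if ckpt.save_best and improved:
        names.append(ckpt.best_filename)
    if ckpt.save_last:
        names.append(ckpt.last_filename)
    # save_frequency == 0 disables periodic saves
    if ckpt.save_frequency > 0 and epoch % ckpt.save_frequency == 0:
        names.append(ckpt.epoch_filename_format.format(epoch=epoch))
    return names
```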

TensorBoard (composable)

Field	Type	Default	Description
`enabled`	`bool`	`True`	Enable TensorBoard logging
`log_dir`	`str`	`"tensorboard"`	Log directory (relative to run dir)
`flush_secs`	`int`	`120`	Flush interval in seconds
`log_interval`	`int`	`100`	Steps between log writes

SLConfig (extends MLBaseConfig)

Field	Type	Default	Description
`num_epochs`	`int`	`100`	Training epochs
`batch_size`	`int`	`32`	Batch size
`learning_rate`	`float`	`1e-3`	Learning rate
`weight_decay`	`float`	`0.0`	L2 regularization
`optimizer`	`str`	`"Adam"`	`"Adam"`, `"AdamW"`, `"SGD"`
`scheduler`	`Optional[str]`	`None`	`"cosine"`, `"linear"`, `"step"`, `None`
`scheduler_min_lr`	`float`	`1e-6`	Minimum LR for scheduler
`grad_clip_norm`	`Optional[float]`	`None`	Max gradient norm; `None` = disabled
`dropout`	`float`	`0.0`	Dropout rate
`early_stopping_patience`	`int`	`0`	Epochs without improvement before stopping; `0` = disabled
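
The `early_stopping_patience` semantics can be sketched as a small counter. This is an illustration only: `EarlyStopper` is a hypothetical class, it assumes a lower-is-better metric such as validation loss, and the config system itself only stores the patience value.

```python
class EarlyStopper:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, loss: float) -> bool:
        if self.patience <= 0:
            return False  # 0 disables early stopping
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopper(patience=2)
decisions = [stopper.should_stop(loss) for loss in [1.0, 0.8, 0.9, 0.85]]
# decisions == [False, False, False, True]: two epochs in a row failed to beat 0.8
```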

RLConfig (extends MLBaseConfig)

Field	Type	Default	Description
`total_timesteps`	`int`	`1_000_000`	Total training timesteps
`gamma`	`float`	`0.99`	Discount factor
`learning_rate`	`float`	`3e-4`	Learning rate
`num_envs`	`int`	`1`	Parallel environments
`normalize_obs`	`bool`	`False`	Normalize observations
`normalize_reward`	`bool`	`False`	Normalize rewards

PPOConfig (extends RLConfig)

Field	Type	Default	Description
`frames_per_batch`	`int`	`2048`	Frames per rollout batch
`num_epochs`	`int`	`10`	PPO epochs per batch
`mini_batch_size`	`int`	`64`	Mini-batch size
`clip_epsilon`	`float`	`0.2`	PPO clipping range
`gae_lambda`	`float`	`0.95`	GAE lambda
`normalize_advantage`	`bool`	`True`	Normalize advantages
`value_coef`	`float`	`0.5`	Value loss coefficient
`entropy_coef`	`float`	`0.01`	Entropy bonus coefficient
`max_grad_norm`	`float`	`0.5`	Gradient clipping norm
`target_kl`	`Optional[float]`	`None`	KL divergence threshold for early stopping; `None` = disabled
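
The PPO fields together determine how much optimisation work a run performs. The arithmetic below is a sketch under two assumptions that the actual trainer may not share: `frames_per_batch` counts frames summed over all environments, and a trailing partial batch is dropped. `ppo_update_counts` is a hypothetical helper.

```python
def ppo_update_counts(total_timesteps: int, frames_per_batch: int,
                      num_epochs: int, mini_batch_size: int) -> dict:
    """Derive rollout and gradient-step counts from the PPO config fields."""
    rollouts = total_timesteps // frames_per_batch      # full batches collected
    minibatches = frames_per_batch // mini_batch_size   # per pass over one batch
    steps_per_rollout = num_epochs * minibatches
    return {
        "rollouts": rollouts,
        "minibatches_per_epoch": minibatches,
        "gradient_steps_per_rollout": steps_per_rollout,
        "total_gradient_steps": rollouts * steps_per_rollout,
    }


# Documented defaults: 1_000_000 timesteps, 2048 frames, 10 epochs, mini-batch 64
counts = ppo_update_counts(1_000_000, 2048, 10, 64)
# -> 488 rollouts, 32 mini-batches per epoch, 320 steps per rollout, 156_160 in total
```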

Extending the Hierarchy

To add a new algorithm (e.g., SAC):

python

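from dataclasses import dataclass

from rl_config_template import RLConfig

@dataclass
class SACConfig(RLConfig):
    tau: float = 0.005              # soft update coefficient
    alpha: float = 0.2              # entropy temperature
    auto_alpha: bool = True         # auto-tune alpha
    buffer_size: int = 1_000_000    # replay buffer size
    batch_size: int = 256           # samples drawn from the buffer per update
    learning_rate: float = 3e-4     # same as the RLConfig default; restated for clarity
    num_epochs: int = 1             # gradient steps per env step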
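
For readers unfamiliar with the `tau` field: a soft update coefficient is typically used to blend target-network parameters toward the online network after each update (Polyak averaging). The sketch below shows that formula with plain Python lists standing in for parameter tensors; `soft_update` is a hypothetical helper and is not part of the config system.

```python
from typing import List


def soft_update(target: List[float], online: List[float], tau: float = 0.005) -> List[float]:
    """Return target parameters moved a fraction tau toward the online parameters."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


# With tau = 0.005 the target moves 0.5% of the way toward the online values
updated = soft_update([0.0, 10.0], [1.0, 0.0], tau=0.005)
```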

To add a new composable piece:

python

from dataclasses import dataclass
from typing import Optional

@dataclass
class WandbConfig:
    enabled: bool = False
    project: str = "my-project"
    entity: Optional[str] = None
    log_interval: int = 100