TOML Config
All prime-rl commands use pydantic-settings with TOML configs and CLI overrides.
Running with configs
```bash
# Load a config file with @ syntax
uv run inference @ configs/debug/infer.toml
uv run sft @ configs/debug/sft/train.toml
uv run rl @ configs/debug/rl/train.toml

# CLI overrides (take precedence over TOML)
uv run inference @ config.toml --model.name Qwen/Qwen3-0.6B --server.port 8001

# Boolean flags: no value needed
uv run inference --model.enforce_eager      # sets to true
uv run inference --no-model.enforce_eager   # sets to false

# CLI-only (no TOML file)
uv run inference --model.name Qwen/Qwen3-0.6B --model.max_model_len 2048
```
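The precedence rule (CLI flags win over TOML values) can be illustrated with a small sketch. This is not prime-rl's parser: `apply_overrides` is a hypothetical helper, and it leaves values as strings, whereas the real commands validate and coerce them through pydantic-settings.

```python
from typing import Any


def apply_overrides(config: dict[str, Any], argv: list[str]) -> dict[str, Any]:
    """Apply dotted CLI flags on top of an already-loaded config dict (hypothetical helper)."""
    i = 0
    while i < len(argv):
        flag = argv[i]
        if not flag.startswith("--"):
            raise ValueError(f"expected a --flag, got {flag!r}")
        key = flag[2:]
        # A flag followed by another flag (or by nothing) is treated as a boolean.
        has_value = i + 1 < len(argv) and not argv[i + 1].startswith("--")
        if has_value:
            value: Any = argv[i + 1]  # kept as a string in this sketch
            i += 2
        else:
            value = not key.startswith("no-")  # --no-x means False, --x means True
            key = key.removeprefix("no-")
            i += 1
        # Walk the dotted path, creating nested tables as needed.
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


# Values as they might come out of a TOML file (illustrative only).
toml_values = {"model": {"name": "from-toml", "enforce_eager": True}, "server": {"port": 8000}}
merged = apply_overrides(
    toml_values,
    ["--model.name", "Qwen/Qwen3-0.6B", "--server.port", "8001", "--no-model.enforce_eager"],
)
print(merged)
```

Each flag overwrites whatever the TOML file set at the same dotted path, and keys the CLI does not mention keep their TOML values.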
TOML structure
Top-level fields must come before any [section] header. This is a rule of TOML itself, not of prime-rl.
```toml
# Top-level fields first
gpu_memory_utilization = 0.5
seed = 42

# Then sections
[model]
name = "Qwen/Qwen3-0.6B"
max_model_len = 4096

[server]
port = 8000
```
Putting a top-level field after a section header nests it inside that section, which causes validation errors.
Config inheritance
Configs can inherit from other TOML files:
```toml
toml_files = ["base.toml"]

[model]
name = "Qwen/Qwen3-0.6B"  # overrides base
```
Paths in toml_files are relative to the file containing the field.
Setting None
Use the string "None" in TOML to set a field to None:
```toml
max_model_len = "None"
```
Available commands
All accept @ config.toml and CLI overrides:
| Command | Config class | Description |
|---|---|---|
| uv run rl | full RL pipeline | Orchestrator + inference + trainer |
| uv run inference | InferenceConfig | vLLM inference server |
| uv run trainer | trainer config | RL trainer |
| uv run orchestrator | orchestrator config | Rollout orchestrator |
| uv run env-server | env server config | Environment server |
| uv run sft | SFT config | Supervised fine-tuning |
Key files
- src/prime_rl/utils/pydantic_config.py: parse_argv, BaseSettings, @ syntax parsing
- configs/: all config files, organized by task