Kebab GEMM Config Tuning
When to Use This Skill
- User asks to change GEMM benchmark parameters
- User asks to compare layout modes (`RR`, `RC`, `CR`, `CC`)
- User asks to test precision (`float16`, `bfloat16`, `float32`) or kernel version lists
- User asks for tile-size and occupancy-oriented tuning plans
Key Configuration Surface
Main file: `config.yaml`
Important keys under `operators.gemm`:
- `impl`
- `versions`
- `matrix_sizes`
- `modes`
- `precisions`
- `tile_sizes`
- `init_method`
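
As a rough illustration of how these keys combine, the sketch below enumerates the benchmark grid implied by a configuration. The dictionary reuses the key names listed above, but every value (and the shape of each value) is a hypothetical placeholder, not the repository's actual `config.yaml` schema. `enumerate_runs` is likewise an illustrative helper, not part of the project.

```python
from itertools import product

# Hypothetical stand-in for the operators.gemm section of config.yaml.
# Key names come from the list above; all values are illustrative assumptions.
gemm_config = {
    "impl": "cute",
    "versions": [1, 2],
    "matrix_sizes": [1024, 4096],
    "modes": ["RR", "RC", "CR", "CC"],
    "precisions": ["float16", "bfloat16"],
    "tile_sizes": [[128, 128, 64]],
    "init_method": "random",
}

def enumerate_runs(cfg):
    """Return one dict per (version, size, mode, precision, tile) combination."""
    axes = ("versions", "matrix_sizes", "modes", "precisions", "tile_sizes")
    return [dict(zip(axes, combo)) for combo in product(*(cfg[a] for a in axes))]

runs = enumerate_runs(gemm_config)
print(len(runs))  # number of benchmark combinations in the grid
```

The grid grows multiplicatively with each list-valued key, which is one reason to vary a single key per experiment.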
Step-by-Step Workflow
- Edit `config.yaml` with one controlled change at a time.
- Build and run the benchmark:
  - `make build`
  - `make bench-gemm`
- For deeper analysis, run:
  - `make tune-gemm-cute` (or cuda/ref)
- Save and compare outputs from `bench_results/`, `profiling/`, and `reports/`.
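
The "one controlled change at a time" rule can be checked mechanically before a run. The following is a minimal sketch, assuming the `operators.gemm` section has already been loaded into a plain dictionary; `changed_keys` and `is_controlled_change` are hypothetical helper names, and the snapshot values are made up for illustration.

```python
_MISSING = object()  # sentinel so an absent key differs from an explicit None

def changed_keys(baseline, candidate):
    """Return the sorted list of keys whose values differ between two configs."""
    keys = set(baseline) | set(candidate)
    return sorted(
        k for k in keys
        if baseline.get(k, _MISSING) != candidate.get(k, _MISSING)
    )

def is_controlled_change(baseline, candidate):
    """True when exactly one key differs, so the experiment isolates one variable."""
    return len(changed_keys(baseline, candidate)) == 1

# Hypothetical before/after snapshots of operators.gemm (values are assumptions).
baseline = {"modes": ["RR"], "precisions": ["float16"], "tile_sizes": [[128, 128, 64]]}
candidate = {"modes": ["RR"], "precisions": ["float16"], "tile_sizes": [[128, 256, 64]]}

print(changed_keys(baseline, candidate))
print(is_controlled_change(baseline, candidate))
```

Running a check like this before `make bench-gemm` makes it easier to attribute any difference in the saved outputs to a single key.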
Tuning Rules from Repository Context
- Hopper/WGMMA paths generally expect `sm_90a`.
- The tile-size documentation notes K-dimension compatibility requirements for GMMA pathways; check `docs/TILE_SIZE_CONFIGURATION.md` before changing `tile_sizes`.
- Keep experiment dimensions and precision explicit to ensure reproducibility.
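
The K-dimension note can be turned into a quick pre-flight check. The sketch below assumes the constraint is that the K tile must be a multiple of the WGMMA instruction K extent, which is 32 bytes divided by the input element size (16 for 16-bit inputs, 8 for TF32). Whether the repository's documented constraint is exactly this, and how `float32` maps onto a GMMA path, are assumptions; `docs/TILE_SIZE_CONFIGURATION.md` remains authoritative. `k_tile_is_compatible` is a hypothetical helper.

```python
# Assumed instruction-level K extent per input precision (32 bytes / element size).
# Treating "float32" as TF32 on the GMMA path is an assumption, not a repository fact.
WGMMA_K = {"float16": 16, "bfloat16": 16, "float32": 8}

def k_tile_is_compatible(tile_k, precision):
    """Check that a K tile is a positive multiple of the assumed instruction K."""
    step = WGMMA_K[precision]
    return tile_k > 0 and tile_k % step == 0

print(k_tile_is_compatible(64, "float16"))
print(k_tile_is_compatible(24, "bfloat16"))
```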
Troubleshooting
| Issue | Mitigation |
|---|---|
| Unexpected perf regression | Revert to previous versions/tile_sizes baseline and isolate one variable |
| Precision mismatch | Verify selected kernel path supports requested precision |
| Layout-specific anomaly | Test all modes and compare independently |
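
For the regression and layout rows above, comparing each mode against its own baseline keeps one layout's anomaly from being averaged away. This is a minimal sketch, assuming throughput per mode has already been extracted from the saved outputs into dictionaries; the file format is not specified here, the function names are hypothetical, and the numbers are invented for illustration, not measured results.

```python
def relative_change(baseline, candidate):
    """Per-mode relative throughput change, (candidate - baseline) / baseline."""
    return {
        m: (candidate[m] - baseline[m]) / baseline[m]
        for m in baseline
        if m in candidate
    }

def flag_regressions(baseline, candidate, tolerance=0.05):
    """Return modes whose throughput dropped by more than `tolerance` (a fraction)."""
    changes = relative_change(baseline, candidate)
    return sorted(m for m, delta in changes.items() if delta < -tolerance)

# Hypothetical throughput values (TFLOP/s) for illustration only.
baseline = {"RR": 100.0, "RC": 100.0, "CR": 100.0, "CC": 100.0}
candidate = {"RR": 102.0, "RC": 90.0, "CR": 99.0, "CC": 80.0}

print(flag_regressions(baseline, candidate))
```

The 5% tolerance is an arbitrary default; a sensible threshold depends on the run-to-run noise observed on the target machine.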
References
- `config.yaml`
- `docs/optimization_plan.md`
- `docs/TILE_SIZE_CONFIGURATION.md`
- `Makefile` GEMM benchmark/profile targets