Kebab Microbenchmark and Dump

Name: kebab-microbench-and-dump
Rating: 87
Author: LancerLab

When to Use This Skill

• User requests microbenchmark execution (mbench-* targets)
• User needs SASS/PTX dumps for kernels
• User wants to compare implementations such as the native, vectorized, PTX, and CuTe copy paths

• Build:
  - make build
• Run one microbenchmark:
  - make mbench-copy-gmem-to-smem
  - make mbench-mma-wgmma
  - make mbench-hgemm

Outputs are under dump/operator/.
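
The build-then-run flow can be wrapped in a small helper. The following is a minimal sketch, not part of the skill itself: the function name run_mbench and the MAKE override are hypothetical, and it assumes the script is run from the repository root where the Makefile lives. Only the make targets and the dump/operator/ path come from the text above.

```shell
#!/usr/bin/env bash
# Sketch: build, run one microbenchmark target, then list the dump directory.
# run_mbench and the MAKE override are illustrative names, not part of the repo.

run_mbench() {
    local target="$1"

    # Accept only targets that follow the mbench-* naming convention.
    case "$target" in
        mbench-?*) ;;
        *) echo "usage: run_mbench mbench-<name>" >&2; return 2 ;;
    esac

    # MAKE may name a single replacement command (e.g. MAKE=echo for a dry run).
    local make_cmd="${MAKE:-make}"
    "$make_cmd" build && "$make_cmd" "$target" || return 1

    # Outputs are documented to land under dump/operator/.
    if [ -d dump/operator ]; then
        ls -lh dump/operator
    fi
}
```

For example, `run_mbench mbench-hgemm` builds and runs the HGEMM microbenchmark, while `MAKE=echo run_mbench mbench-hgemm` only prints the targets that would be invoked.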

Issue	Mitigation
`cuobjdump not found`	Ensure the CUDA toolkit bin directory is on PATH, then retry
No PTX generated	Some binaries may not embed PTX; SASS output is still usable
Dump files too large	Focus on split kernel files instead of `all_kernels.*`
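
Two of the mitigations above can be sketched as shell helpers. This is an illustration under stated assumptions: the function names are hypothetical, CUDA_HOME is a conventional but not guaranteed environment variable, /usr/local/cuda is the usual default install location on Linux, and the exact file layout under dump/operator/ is not specified by this skill beyond the `all_kernels.*` aggregate files.

```shell
#!/usr/bin/env bash
# Sketch of two troubleshooting helpers; names and layout are assumptions.

# Make cuobjdump reachable, trying the toolkit's bin directory if needed.
ensure_cuobjdump() {
    if command -v cuobjdump >/dev/null 2>&1; then
        return 0
    fi
    # Assumption: the toolkit lives at $CUDA_HOME, or /usr/local/cuda by default.
    local cuda_bin="${CUDA_HOME:-/usr/local/cuda}/bin"
    if [ -x "$cuda_bin/cuobjdump" ]; then
        PATH="$cuda_bin:$PATH"
        export PATH
        return 0
    fi
    echo "cuobjdump not found; add the CUDA toolkit bin directory to PATH" >&2
    return 1
}

# List per-kernel dump files, skipping the aggregate all_kernels.* files.
list_split_dumps() {
    local dir="${1:-dump/operator}"
    find "$dir" -type f ! -name 'all_kernels.*' | sort
}
```

Calling `ensure_cuobjdump` before the dump step and `list_split_dumps` afterwards keeps attention on the smaller per-kernel files when the aggregate dump is too large to read comfortably.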