[ROCm]: Update rope+kvcache fusion conditions and disable custom op by default (#36716)

Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
Rohan Potdar
2026-03-25 15:58:44 -05:00
committed by GitHub
parent 70a2152830
commit a0e8c74005
5 changed files with 42 additions and 18 deletions

@@ -56,7 +56,6 @@ Fusions:
- `-cc.pass_config.fuse_norm_quant=True`*
- `-cc.pass_config.fuse_act_quant=True`*
- `-cc.pass_config.fuse_act_padding=True`
- `-cc.pass_config.fuse_rope_kvcache=True`† (will be moved to O2)
\* These fusions are only enabled when either op is using a custom kernel, otherwise Inductor fusion is better.</br>
† These fusions are ROCm-only and require AITER.
@@ -71,6 +70,9 @@ Settings (on top of `-O1`):
- `-cc.cudagraph_mode=FULL_AND_PIECEWISE`
- `-cc.pass_config.fuse_allreduce_rms=True`
- `-cc.pass_config.fuse_rope_kvcache=True`
† These fusions are ROCm-only and require AITER.
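As a rough illustration of how the `-O2` settings layer on top of `-O1`, the flags visible in the two hunks above can be composed as in the sketch below. It assumes, based on the commit title and the hunk line counts, that `fuse_rope_kvcache` now belongs to `-O2` rather than `-O1`. The helper `flags_for_level` and both dictionaries are hypothetical and not part of vLLM; only the flag strings come from the documentation, and the real per-level defaults may contain settings not shown in this diff.

```python
# Hypothetical sketch: only the flags visible in the hunks above are listed.
# The real per-level defaults are defined inside vLLM and may include more.
O1_FLAGS = {
    "pass_config.fuse_norm_quant": "True",
    "pass_config.fuse_act_quant": "True",
    "pass_config.fuse_act_padding": "True",
}

# Settings the docs list for -O2 "on top of -O1".
# Assumption: fuse_rope_kvcache lives here after this commit.
O2_EXTRA_FLAGS = {
    "cudagraph_mode": "FULL_AND_PIECEWISE",
    "pass_config.fuse_allreduce_rms": "True",
    "pass_config.fuse_rope_kvcache": "True",  # ROCm-only, requires AITER
}


def flags_for_level(level: int) -> list[str]:
    """Return the explicit -cc.* flags that would mirror a given -O level."""
    merged: dict[str, str] = {}
    if level >= 1:
        merged.update(O1_FLAGS)
    if level >= 2:
        merged.update(O2_EXTRA_FLAGS)
    return [f"-cc.{key}={value}" for key, value in merged.items()]


if __name__ == "__main__":
    for flag in flags_for_level(2):
        print(flag)
```

Under these assumptions, `flags_for_level(1)` omits `-cc.pass_config.fuse_rope_kvcache=True`, while `flags_for_level(2)` includes it. The sketch does not model the platform conditions in the footnotes, so the listed fusions may still be skipped at runtime.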
### `-O3`: Aggressive Optimization