[ROCm]: Update rope+kvcache fusion conditions and disable custom op by default (#36716)
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
@@ -56,7 +56,6 @@ Fusions:
 - `-cc.pass_config.fuse_norm_quant=True`*
 - `-cc.pass_config.fuse_act_quant=True`*
 - `-cc.pass_config.fuse_act_padding=True`†
-- `-cc.pass_config.fuse_rope_kvcache=True`† (will be moved to O2)
 
 \* These fusions are only enabled when either op is using a custom kernel, otherwise Inductor fusion is better.</br>
 † These fusions are ROCm-only and require AITER.
@@ -71,6 +70,9 @@ Settings (on top of `-O1`):
 
 - `-cc.cudagraph_mode=FULL_AND_PIECEWISE`
 - `-cc.pass_config.fuse_allreduce_rms=True`
+- `-cc.pass_config.fuse_rope_kvcache=True`†
+
+† These fusions are ROCm-only and require AITER.
 
 ### `-O3`: Aggressive Optimization
 
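For readers who want the net effect of the two hunks at a glance, here is a minimal Python sketch of where each flag sits after this change. It is an illustration only, not vLLM code: `O1_FLAGS`, `O2_EXTRA_FLAGS`, `ROCM_AITER_ONLY`, and `visible_flags` are hypothetical names, and the lists contain only the flags visible in this diff (the documented levels may set more). It assumes that `-O2` applies its settings on top of `-O1`, as the section heading states, and that flags marked † are simply dropped when ROCm with AITER is unavailable.

```python
from typing import List

# Flags visible in the -O1 "Fusions" hunk after this commit (hypothetical summary).
O1_FLAGS = [
    "-cc.pass_config.fuse_norm_quant=True",
    "-cc.pass_config.fuse_act_quant=True",
    "-cc.pass_config.fuse_act_padding=True",
]

# Flags visible in the -O2 "Settings (on top of -O1)" hunk after this commit.
O2_EXTRA_FLAGS = [
    "-cc.cudagraph_mode=FULL_AND_PIECEWISE",
    "-cc.pass_config.fuse_allreduce_rms=True",
    "-cc.pass_config.fuse_rope_kvcache=True",  # moved here from -O1
]

# Flags carrying the dagger footnote: ROCm-only, require AITER.
ROCM_AITER_ONLY = {
    "-cc.pass_config.fuse_act_padding=True",
    "-cc.pass_config.fuse_rope_kvcache=True",
}


def visible_flags(level: int, rocm_aiter: bool) -> List[str]:
    """Return the flags from this diff that would apply at a given level.

    Assumption: levels are cumulative, and dagger-marked flags are
    filtered out when ROCm with AITER is not available.
    """
    if level < 1:
        return []
    flags = list(O1_FLAGS)
    if level >= 2:
        flags += O2_EXTRA_FLAGS
    if not rocm_aiter:
        flags = [f for f in flags if f not in ROCM_AITER_ONLY]
    return flags
```

Under these assumptions, `visible_flags(1, True)` no longer contains the rope+kvcache fusion flag, while `visible_flags(2, True)` does.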