[LoRA] Update LoRA expand kernel block_n calculation (#32621)

Signed-off-by: Xin Yang <xyangx@amazon.com>
Xin Yang
2026-02-23 23:17:53 -08:00
committed by GitHub
parent 6af03f2394
commit c870eb9e0f

@@ -251,7 +251,7 @@ def get_lora_op_configs(
         else:
             default = {
                 "block_m": 64,
-                "block_n": max(64, next_power_of_2(128 // num_slices)),
+                "block_n": 64 if num_slices > 1 else 128,
                 "block_k": 16,
                 "num_warps": 4,
                 "num_ctas": 1,
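
To see what the two `block_n` expressions in the hunk above evaluate to, here is a minimal sketch that compares them for a range of `num_slices` values. It assumes `num_slices` is a positive integer and that `next_power_of_2` rounds up to the nearest power of two; the helper below is a local stand-in written for this comparison, not the project's own function, and the names `block_n_max_form` and `block_n_conditional_form` are hypothetical.

```python
def next_power_of_2(n: int) -> int:
    # Local stand-in (assumption): smallest power of two >= n, and 1 for n < 1.
    if n < 1:
        return 1
    return 1 << (n - 1).bit_length()


def block_n_max_form(num_slices: int) -> int:
    # The expression written with max() and next_power_of_2() in the hunk.
    return max(64, next_power_of_2(128 // num_slices))


def block_n_conditional_form(num_slices: int) -> int:
    # The expression written as a plain conditional in the hunk.
    return 64 if num_slices > 1 else 128


# Collect both results for a range of positive slice counts.
comparison = {
    n: (block_n_max_form(n), block_n_conditional_form(n))
    for n in range(1, 9)
}

# Under the assumptions above, both forms agree for every positive num_slices.
assert all(a == b for a, b in comparison.values())
```

Under these assumptions the two forms agree for every `num_slices >= 1`: a single slice gets 128, and anything larger is clamped to 64 by the `max(64, ...)` floor. They differ only for `num_slices == 0`, where the `128 // num_slices` form raises `ZeroDivisionError` and the conditional form returns 128.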