[fix] Remove trtllm ragged mla prefills (#36540)

Signed-off-by: Olya Kozlova <okozlova@nvidia.com>
Olya Kozlova
2026-03-31 21:30:27 +02:00
committed by GitHub
parent b779eb3363
commit 598190aac3
8 changed files with 185 additions and 35 deletions

@@ -73,7 +73,8 @@ TORCH_LIBRARY_EXPAND(TORCH_EXTENSION_NAME, ops) {
 " Tensor prefix_output,"
 " Tensor prefix_lse,"
 " Tensor suffix_output,"
-" Tensor suffix_lse) -> ()");
+" Tensor suffix_lse,"
+" int!? prefill_tokens_with_context) -> ()");
 ops.impl("merge_attn_states", torch::kCUDA, &merge_attn_states);
 #ifndef USE_ROCM
 ops.def(