[Kernel] Triton implementation of causal-conv1d for Mamba-based models (#18218)

Signed-off-by: Tuan M. Hoang-Trong <tmhoangt@us.ibm.com>
Co-authored-by: Tuan M. Hoang-Trong <tmhoangt@us.ibm.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
This commit is contained in:
Tuan, Hoang-Trong
2025-07-09 15:53:55 -04:00
committed by GitHub
parent 31b96d1c64
commit 47043eb678
15 changed files with 1117 additions and 1142 deletions

View File

@@ -159,7 +159,7 @@ class MambaMixer(CustomOp):
hidden_states = causal_conv1d_fn(
hidden_states,
conv_weights,
self.conv1d.bias,
bias=self.conv1d.bias,
activation=self.activation,
conv_states=mamba_cache_params.conv_state,
has_initial_state=attn_metadata.context_lens_tensor > 0,