vllm/vllm/model_executor/layers/mamba at 16786da7357327f44f3a8f23d17e3c84235d2952 - vllm

Files

Andreas Karatzas 3e472e81f9 [ROCm][Bugfix][CI] Fix hybrid models and their tests (Mamba/Jamba/Bamba) (#32710)

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com>

2026-02-05 10:01:23 +00:00

ops

[Performance] Tune Mamba selective scan kernel for B200 (#32873)

2026-01-26 05:56:54 -08:00

__init__.py

[Kernel/Model] Migrate mamba_ssm and causal_conv1d kernels to vLLM (#7651)

2024-08-28 15:06:52 -07:00

abstract.py

[V1][Hybrid] Mamba Prefix Caching with align mode (#30877)

2026-01-23 09:56:48 -08:00

linear_attn.py

[1/N][Attention] Restructure attention: move files (#31916)

2026-01-09 13:10:24 -08:00

mamba_mixer2.py

[MISC] Fix Tensor Parallelism for Quantized Mamba Models with n_groups=1 (#33257)

2026-02-03 15:10:31 -05:00

mamba_mixer.py

[ROCm][Bugfix][CI] Fix hybrid models and their tests (Mamba/Jamba/Bamba) (#32710)

2026-02-05 10:01:23 +00:00

mamba_utils.py

[PERF] Change GDN Attention State Layout from [N, HV, K, V] to [N, HV, V, K] (#33291)

2026-02-04 11:20:52 +00:00

short_conv.py

[1/N][Attention] Restructure attention: move files (#31916)

2026-01-09 13:10:24 -08:00