11 Commits

Author SHA1 Message Date
Russell Bryant
37e8182bfe [v1] Add Whisper model support (encoder-decoder) (#21088)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: NickLucche <nlucches@redhat.com>
2025-09-10 13:53:35 -07:00
Asaf Joseph Gardin
3663870c72 [V1][Mamba1] - Full CUDA and Piecewise CUDA Graphs Support (#23035)
Signed-off-by: asafg <asafg@ai21.com>
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
Co-authored-by: asafg <asafg@ai21.com>
2025-08-20 20:08:51 -07:00
Asaf Joseph Gardin
3d232dbd19 [Mamba] - refactor: Renamed mamba_attn to mamba2_attn (#22818)
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
2025-08-15 06:38:05 +00:00
Chen Zhang
fceafaf582 [Bugfix][mamba] Fix type annotation of Mamba2Metadata (#22787)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-08-13 06:07:09 -07:00
Thomas Parnell
61f67d8acd [V1] [Hybrid] Enable Full CUDA Graph (decode-only) for Mamba layers (#21401)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-08-09 20:16:11 -07:00
Thomas Parnell
4abfd8796f [V1] [Hybrid] Validate compatibility of attention backend batch reordering at init time (#21557)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-08-02 05:29:40 -07:00
Chen Zhang
555e7225bc [v1][attention] Support Hybrid Allocator + FlashInfer (#21412)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-07-30 01:45:29 +00:00
Lucas Wilkinson
76b494444f [Attention] Refactor attention metadata builder interface (#20466)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-07-17 04:44:25 +00:00
nopperl
5d09152ff1 [V1] Enable Mamba2 layers other than MambaMixer2 in the v1 engine (#20660)
Signed-off-by: nopperl <54780682+nopperl@users.noreply.github.com>
2025-07-11 05:53:31 +00:00
Tuan, Hoang-Trong
47043eb678 [Kernel] Triton implementation of causal-conv1d for Mamba-based models (#18218)
Signed-off-by: Tuan M. Hoang-Trong <tmhoangt@us.ibm.com>
Co-authored-by: Tuan M. Hoang-Trong <tmhoangt@us.ibm.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-07-09 12:53:55 -07:00
Chen Zhang
a89209b78d [v1] Support mamba2 (#19327)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-06-18 20:34:15 +00:00