Huamin Li | 157722da75 | [perf] Use pinned memory for async H2D transfer in do_mamba_copy_block (#35480); Signed-off-by: Huamin Li <3ericli@gmail.com> | 2026-02-28 01:50:37 +08:00
Thomas Parnell | d5fe3f702c | [Hybrid] Enable mamba prefix cache "align" mode with async scheduling (#33997); Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> | 2026-02-14 13:15:56 -08:00
Harry Huang | c027541eaf | [Hybrid] Enable spec decoding in mamba cache align mode (#33705); Signed-off-by: huanghaoyan.hhy <huanghaoyan.hhy@alibaba-inc.com> | 2026-02-13 13:02:28 -08:00
Harry Huang | 5206e5e28c | [V1][Hybrid] Mamba Prefix Caching with align mode (#30877); Signed-off-by: huanghaoyan.hhy <huanghaoyan.hhy@alibaba-inc.com>; Signed-off-by: Chen Zhang <zhangch99@outlook.com>; Co-authored-by: Chen Zhang <zhangch99@outlook.com> | 2026-01-23 09:56:48 -08:00