Matthew Bonanni
|
dc5fa77a4e
|
[Bugfix][MTP][Sparse MLA] Allow sparse MLA with MTP to run with FULL cudagraphs (#34457)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2026-02-17 14:01:27 -05:00 |
|
Matthew Bonanni
|
20228cb851
|
[3/N][Attention] Move AttentionMetadata-related code from utils.py to backend.py (#32054)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-01-12 09:13:56 -08:00 |
|
Didier Durand
|
1a55cfafcb
|
[Doc]: fixing typos in various files (#30540)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Didier Durand <2927957+didier-durand@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2025-12-14 02:14:37 -08:00 |
|
Lucas Wilkinson
|
56539cddac
|
[Core] Refactor padding logic and pad for CUDA graphs before attention metadata building (#28579)
|
2025-11-26 14:07:13 -05:00 |
|
Didier Durand
|
63fed55506
|
[Doc]: fix typos in various files (#28811)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-11-16 14:30:06 +00:00 |
|
Benjamin Chislett
|
304419576a
|
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-11-13 01:56:40 +09:00 |
|
Harry Mellor
|
a742134cc5
|
Remove deprecated fields from CompilationConfig (#27593)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 16:10:28 +00:00 |
|
fhl2000
|
85fee74b33
|
[Bugfix][CI] Move resolving cudagraph_mode before initializing attn_metadata_builder (#27427)
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
|
2025-10-23 20:31:14 -07:00 |
|
Harry Mellor
|
4ffd6e8942
|
[Docs] Reduce custom syntax used in docs (#27009)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-16 20:05:34 -07:00 |
|
Morrison Turnansky
|
96b9aa5aa0
|
[Frontend][torch.compile] CompilationConfig Overhaul (#20283): name change compilation level to compilation mode, deprecation compilation level (#26355)
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-15 02:51:16 +00:00 |
|
Cyrus Leung
|
ef9676a1f1
|
[Doc] ruff format some Python examples (#26767)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-14 03:21:53 -07:00 |
|
fhl2000
|
63773a6200
|
[Docs] add docs for cuda graph v1 (#24374)
Signed-off-by: fhl <2410591650@qq.com>
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-07 05:25:05 -07:00 |
|