Commit Graph

9 Commits

Author SHA1 Message Date
Lu Fang
c6703d1e0d [MISC] Remove unused variableds in C++ (#19609)
Signed-off-by: Lu Fang <lufang@fb.com>
2025-06-15 20:05:28 -07:00
Lu Fang
8c0d15d5c5 [Misc][Easy] Annotate unused vars in the csrc files (#14798)
Signed-off-by: Lu Fang <lufang@fb.com>
2025-03-15 12:40:09 +08:00
Elfie Guo
0794e7446e [Misc] Add multipstep chunked-prefill support for FlashInfer (#10467) 2025-01-15 12:47:49 +08:00
Pavani Majety
b6dde33019 [Core] Flashinfer - Remove advance step size restriction (#10282) 2024-11-13 16:29:32 +08:00
Varun Sundar Rabindranath
afb050b29d [Core] CUDA Graphs for Multi-Step + Chunked-Prefill (#8645)
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2024-10-02 19:44:39 +00:00
Varun Sundar Rabindranath
c2ec430ab5 [Core] Multi-Step + Single Step Prefills via Chunked Prefill code path (#8378)
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2024-09-27 13:32:07 -07:00
Tyler Michael Smith
71d21c73ab [Bugfix] Fixup advance_step.cu warning (#8815) 2024-09-26 16:23:45 -07:00
William Lin
a6c0f3658d [multi-step] add flashinfer backend (#7928) 2024-09-12 11:16:22 -07:00
Alexander Matveev
e76466dde2 [Core] draft_model_runner: Implement prepare_inputs on GPU for advance_step (#6338) 2024-07-17 14:30:28 -07:00