Wentao Ye
|
3352bf8b03
|
[CI Bug] Fix pre-commit issue in main (#39347)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-04-08 14:10:05 -07:00 |
|
Rishi Puri
|
ad05edfbca
|
tests/v1/e2e/spec_decode: assert async scheduling is used (#39206)
Signed-off-by: Rishi Puri <riship@nvidia.com>
Signed-off-by: Rishi Puri <puririshi98@berkeley.edu>
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>
Co-authored-by: Flora Feng <4florafeng@gmail.com>
|
2026-04-08 20:30:03 +00:00 |
|
Benjamin Chislett
|
494636b29d
|
[Feat][Spec Decode] DFlash (#36847)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-03-30 15:03:15 -04:00 |
|
Andrii Skliar
|
cd7643015e
|
[Feature] Support per-draft-model MoE backend via --speculative-config (#37880)
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Signed-off-by: [Andrii Skliar] <askliar@nvidia.com>
Co-authored-by: Andrii Skliar <askliar@nvidia.com>
|
2026-03-25 14:31:52 +00:00 |
|
Kevin H. Luu
|
f1816fb192
|
[CI] Split V1 e2e + engine (1 GPU) into separate jobs (#36945)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-13 14:16:02 -07:00 |
|