Lucas Kabela
|
ea6d067a2a
|
[Misc][LLaMa4] Compile LLaMa Vision Encoder (#30709)
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
|
2026-01-09 22:01:38 -05:00 |
|
Matthew Bonanni
|
2612ba9285
|
[1/N][Attention] Restructure attention: move files (#31916)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-01-09 13:10:24 -08:00 |
|
Robert Shaw
|
5825bbc1f7
|
[Quantization] Deprecate Long Tail of Schemes (#31688)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-01-08 19:07:45 -05:00 |
|
Lucas Wilkinson
|
6cdf015c3c
|
[Misc] Fix Current vLLM config is not set. warnings, assert to avoid issues in the future (#31747)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-01-08 15:20:49 -08:00 |
|
DevByteAI
|
1f214290d6
|
fix(compile): apply partition wrapper when loading AOT cached functions (#31536)
Signed-off-by: Devbyteai <abud6673@gmail.com>
Signed-off-by: DevByteAI <161969603+devbyteai@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-01-08 17:27:26 +08:00 |
|
Angela Yi
|
9a1d20a89c
|
[CI] Add warmup run in test_fusion_attn (#31183)
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-01-07 00:31:52 +00:00 |
|
Charlie Fu
|
c07163663d
|
[ROCm][CI] Fix tests/compile unit tests (#28895)
Signed-off-by: charlifu <charlifu@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Signed-off-by: Charlie Fu <Charlie.Fu@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-01-06 18:50:43 +00:00 |
|
wangxiyuan
|
bb4337b34c
|
[Platform] Deprecate seed_everything (#31659)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2026-01-04 18:34:04 -08:00 |
|
Boyuan Feng
|
2f12cd32c0
|
[BugFix] Fix cache issue in compilation_config (#31376)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
|
2025-12-27 09:30:39 -05:00 |
|
baonudesifeizhai
|
8711b21676
|
Fix/get raw stream patch #30905 (#30912)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-26 20:08:47 -08:00 |
|
vllmellm
|
f32cfd7d97
|
[ROCm][FEAT] Support AITER RMSNorm quantization fusion pass (#26575)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
|
2025-12-23 02:07:54 -08:00 |
|
Cyrus Leung
|
8cef137689
|
[Chore] Update more locations to use attention_config.backend (#31153)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-22 19:19:50 -08:00 |
|
Angela Yi
|
612d5ffdab
|
[ci] Fix Pytorch compilation test oom in 2.10 (#31194)
Signed-off-by: angelayi <yiangela7@gmail.com>
|
2025-12-23 01:56:47 +00:00 |
|
Lucas Wilkinson
|
30bb19a760
|
[BugFix] Partial revert of #29558 (DeepEP HT + PIECEWISE CG support) (#30910)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-17 23:50:15 -08:00 |
|
Zhengxu Chen
|
5f2f3fba1d
|
[compile] Fix CI for test_gpt2_cache_hit (#30902)
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
|
2025-12-17 20:22:23 -08:00 |
|
Matthew Bonanni
|
7eb6cb6c18
|
[Attention] Update tests to remove deprecated env vars (#30563)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-12-17 09:49:59 -08:00 |
|
Cyrus Leung
|
44d3b1df3d
|
[CI/Build] Fix compatibility between #30244 and #30396 (#30787)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-16 20:21:19 -08:00 |
|
Wentao Ye
|
b6ec077e05
|
[CI] Skip ci failure test (#30804)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-16 22:47:53 +00:00 |
|
ElizaWszola
|
994acec0cc
|
[Bugfix] Fix fusion for VL models (#30244)
Signed-off-by: ElizaWszola <ewszola@redhat.com>
|
2025-12-14 21:22:37 +08:00 |
|
Laith Sakka
|
f569c654e1
|
enable unbacked with aot_compile (#30462)
Signed-off-by: Laith Sakka <lsakka@meta.com>
|
2025-12-14 08:14:06 +00:00 |
|
Roberto L. Castro
|
4fa7ce46f3
|
[Feature] Add SM103 (Blackwell Ultra) Support to vLLM (#30484)
Signed-off-by: LopezCastroRoberto <robertol.c510@gmail.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-12-12 19:34:23 -08:00 |
|
rasmith
|
48661d275f
|
[CI/Build][AMD] Skip tests in test_fusions_e2e and test_dbo_dp_ep_gsm8k that require non-existing imports for ROCm (#30417)
Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
|
2025-12-12 00:24:20 +00:00 |
|
Harry Mellor
|
cf3eacfe58
|
Standardise get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim (#30389)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-11 20:45:23 +00:00 |
|
Cyrus Leung
|
5a87d8b9b1
|
[Deprecation] Remove deprecated plugin and compilation fields for v0.13 release (#30396)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-10 19:59:35 -08:00 |
|
Charlie Fu
|
3c680f4a17
|
[Rocm][torch.compile] Adding layernorm + fp8 block quant and silu + fp8 block quant for Aiter (#25693)
Signed-off-by: charlifu <charlifu@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Signed-off-by: Charlie Fu <Charlie.Fu@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: wuhuikx <hattie.wu@amd.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
|
2025-12-09 22:39:26 +00:00 |
|
Ilya Markov
|
0b6a8a304c
|
[BugFix] Fix non detected failing tests (#30277)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
|
2025-12-09 17:57:55 +00:00 |
|
Yanan Cao
|
7b35011ad1
|
Mark qwen2_5_vl as xfail (#30283)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
|
2025-12-09 01:14:10 +00:00 |
|
Laith Sakka
|
87aee9ed2b
|
Add evaluate_guards option to DynamicShapesConfig (#27432)
Signed-off-by: Laith Sakka <lsakka@meta.com>
|
2025-12-08 10:46:15 -05:00 |
|
ElizaWszola
|
af0444bf40
|
[Performance] Fused blockwise quant RMS norm (#27883)
Signed-off-by: ElizaWszola <ewszola@redhat.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-07 16:38:04 +00:00 |
|
Cyrus Leung
|
e83b7e379c
|
Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199)
|
2025-12-07 00:00:22 -08:00 |
|
Cyrus Leung
|
27f4c2fd46
|
[Renderer] Separate out RendererConfig from ModelConfig (#30145)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-06 23:15:42 -08:00 |
|
Wentao Ye
|
17eb25e327
|
[Perf] Enable cuda graph for deepepHT, 5.3% throughput improvement, 4.4% TTFT improvement (#29558)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-07 04:44:50 +00:00 |
|
Ilya Markov
|
4e26d3b09e
|
[Compile] Conditional compilation. Introduce compile_ranges (#24252)
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Luka Govedič <luka.govedic@gmail.com>
Signed-off-by: ProExpertProg <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
|
2025-12-05 18:17:32 +00:00 |
|
Matthew Bonanni
|
66e674cdd5
|
[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
|
2025-12-05 09:48:43 -08:00 |
|
Laith Sakka
|
1f0d184590
|
[aot_compile]change VLLM backend to read fake args from example_value (#29104)
Signed-off-by: Laith Sakka <lsakka@meta.com>
|
2025-12-04 17:33:45 -05:00 |
|
Arpit Khandelwal
|
dfdda96747
|
[Core] Remove forced None assignment for deprecated PassConfig flags (#29994)
Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-04 09:15:04 +00:00 |
|
Arpit Khandelwal
|
d7284a2604
|
[Core] Rename PassConfig flags as per RFC #27995 (#29646)
Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-03 03:38:55 +00:00 |
|
Harry Mellor
|
951445a52d
|
Remove default values from InitVars so that they're not stored (#29859)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 12:16:37 +00:00 |
|
Cyrus Leung
|
653591d5e7
|
[Chore] Move tokenizer initialization methods (#29793)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 13:33:37 +08:00 |
|
Yanan Cao
|
3461e7efd8
|
[Frontend] Remap -O to -cc commandline flag (#29557)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2025-11-28 21:51:12 +00:00 |
|
Morrison Turnansky
|
0838b52e2e
|
[Frontend][torch.compile] CompilationConfig Overhaul (#20283): Set up -O infrastructure (#26847)
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: adabeyta <aabeyta@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: adabeyta <aabeyta@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-27 01:55:58 -08:00 |
|
Matthew Bonanni
|
430dd4d9eb
|
[Attention] Remove imports from vllm/attention/__init__.py (#29342)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-11-26 10:53:15 -07:00 |
|
Huamin Li
|
70d5953f82
|
Revert "[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841)" (#29483)
Signed-off-by: Huamin Li <3ericli@gmail.com>
|
2025-11-26 22:27:26 +08:00 |
|
Harry Mellor
|
51fc9e017a
|
Scheduled removal of CompilationConfig.use_inductor (#29323)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-25 12:55:42 +00:00 |
|
elvischenv
|
6330f9477d
|
[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2025-11-25 07:59:40 +00:00 |
|
Laith Sakka
|
7a228b5305
|
Add option to use unbacked, and backed size obl dynamic shapes for more sounds compilation. (#26199)
Signed-off-by: Laith Sakka <lsakka@meta.com>
|
2025-11-24 10:12:41 -05:00 |
|
Copilot
|
61728cd1df
|
Re-enable FlashInfer for Llama4 on Blackwell in e2e fusion tests (#28966)
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-11-19 13:32:19 -05:00 |
|
Harry Mellor
|
a8b70304d6
|
Update rope_scaling to rope_parameters in preparation for Transformers v5 (#28542)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-19 09:06:36 -08:00 |
|
Yanan Cao
|
2c8b9182b5
|
[CI] Reorganize compile tests so new tests are automatically included in CI (#28625)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
|
2025-11-19 06:13:50 -08:00 |
|
Angela Yi
|
f36292dbee
|
[compile] Enable sequence parallelism matching w/o custom ops enabled (#27126)
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: ProExpertProg <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
|
2025-11-15 11:46:12 +00:00 |
|