Alexei-V-Ivanov-AMD
|
5f67361fd1
|
Reverting re-direction to amd_mi355_X. (#29914)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-12-03 00:40:02 +00:00 |
|
maang-h
|
5d91d2b292
|
[Doc] Add allocate_slots parameter docs (#29777)
Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
|
2025-12-02 23:23:09 +00:00 |
|
Micah Williamson
|
c014de1ec7
|
[ROCm][CI] Fix test_cudagraph_mode.py Failure For AMD CI (#29808)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2025-12-02 22:54:36 +00:00 |
|
Julien Denize
|
1b1e35aaf9
|
[BUGFIX] Fix regex pattern for Mistral Tool Call (#29918)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
|
2025-12-02 14:51:58 -08:00 |
|
Julien Denize
|
5e5646e206
|
[BUGFIX] llama_4_scaling wrongly passed to DeepseekAttention (#29908)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
|
2025-12-02 14:51:20 -08:00 |
|
Chauncey
|
0a9caca9f5
|
[Bugfix] fix --scheduling-policy=priority & n>1 crashes engine (#29764)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-12-02 22:42:28 +00:00 |
|
Sage Moore
|
e6f114ac25
|
[Bugfix][EPLB] Prevent user-provided EPLB config from being overwritten with defaults (#29911)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-12-02 13:20:22 -09:00 |
|
Harry Mellor
|
6fc5841db1
|
Fix some more Transformers nightly tests (#29872)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 21:49:44 +00:00 |
|
dependabot[bot]
|
3ff5b53bc2
|
Bump actions/setup-python from 6.0.0 to 6.1.0 (#29768)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2025-12-02 21:29:32 +00:00 |
|
jthomson04
|
1528e079e2
|
[Perf] Avoid pageable HtoD transfer in MinTokensLogitsProcessor (#29826)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
|
2025-12-02 21:25:52 +00:00 |
|
Divakar Verma
|
afb1e5b380
|
[CI][ROCm][tests/v1/e2e] Fix multiprocessing launch for the test (#29123)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-12-02 20:46:10 +00:00 |
|
Copilot
|
1c593e117d
|
Fix boolean nested params, add dict format support, and enhance plotting for vllm bench sweep (#29025)
Signed-off-by: Luka Govedič <luka.govedic@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-02 20:40:56 +00:00 |
|
Navanit Dubey
|
a2b053dc85
|
feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE (#29896)
Signed-off-by: navanit-git <navanitdubey@gmail.com>
|
2025-12-02 19:28:35 +00:00 |
|
Matthew Bonanni
|
1d93f11675
|
[Attention][CUDAGraph] Remove CG padding from attention backends (#29352)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-12-02 13:48:08 -05:00 |
|
Benjamin Bartels
|
2d613de9ae
|
[CI/Build] Fixes missing runtime dependencies (#29822)
Signed-off-by: bbartels <benjamin@bartels.dev>
|
2025-12-02 10:21:49 -08:00 |
|
Alexei-V-Ivanov-AMD
|
c77b9929a0
|
Update AMD-CI testing mirror (as of 2025-12-02) (#29898)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-12-02 08:52:54 -09:00 |
|
Isotr0py
|
63b1da76ba
|
[Chore]: Reorganize gguf utils funtions under transformers_utils (#29891)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-02 17:33:23 +00:00 |
|
Andrew Xia
|
52cb349fc0
|
[responsesAPI][3] ResponsesParser to set up non harmony MCP (#29413)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 11:24:45 -05:00 |
|
Isotr0py
|
0ec8422171
|
[Bugfix] Fix incorrect channel order for idefics3 in edge case (#29881)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-02 16:03:52 +00:00 |
|
wang.yuqi
|
2eb4fe9129
|
[examples] Resettle pooling examples. (#29365)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 15:54:28 +00:00 |
|
Matthew Bonanni
|
51c57b51dd
|
[Bugfix] Fix DeepSeek R1 MTP weight loading (#29545)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-12-02 15:52:18 +00:00 |
|
ImaGoodFella
|
60c3d413af
|
[Multimodal][Core] Optimize multimodal preprocessing cache by hashing image bytes instead of pixel values (#29621)
Signed-off-by: Rahul Steiger <rasteiger@ethz.ch>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-02 21:49:02 +08:00 |
|
Cyrus Leung
|
68ffbca7e4
|
[Chore] Use tokenizer.encode and tokenizer.decode directly (#29851)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 12:30:40 +00:00 |
|
Harry Mellor
|
951445a52d
|
Remove default values from InitVars so that they're not stored (#29859)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 12:16:37 +00:00 |
|
Julien Denize
|
d8c6210eea
|
Add Mistral Large 3 and Ministral 3 (#29757)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
|
2025-12-02 10:29:00 +00:00 |
|
Louie Tsai
|
8bbcf8b6e7
|
[vLLM Benchmark Suite] Add default parameters section and update CPU benchmark cases (#29381)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Signed-off-by: Louie Tsai <louie.tsai@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Li, Jiang <bigpyj64@gmail.com>
|
2025-12-02 09:00:23 +00:00 |
|
Boyuan Feng
|
70fb77b4dc
|
[BugFix] add max-num-batched-token to scheduler hash (#29829)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
|
2025-12-02 08:55:02 +00:00 |
|
杰兮
|
48d15a32aa
|
[CI] Fix Bad_words test for tokenizer encode/decode asymmetry (#28193)
Signed-off-by: zhyajie <yajizhan@amd.com>
Co-authored-by: zhyajie <yajizhan@amd.com>
|
2025-12-02 00:02:12 -08:00 |
|
Boyuan Feng
|
3b221cb661
|
[BugFix] respect VLLM_LOGGING_LEVEL in logger (#29761)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
|
2025-12-02 07:49:16 +00:00 |
|
Wushi Dong
|
0037b5746a
|
[Core] Eliminate redundant is_encoder_decoder lookups (20-40us/step) (#29800)
Signed-off-by: Wushi Dong <dongws@meta.com>
|
2025-12-02 07:08:07 +00:00 |
|
Harry Mellor
|
f5b0846ba0
|
Fix some Transformers nightly tests (#29802)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 07:05:27 +00:00 |
|
Zhang Xiangze
|
13ea39bc09
|
[CPU]Parallelize over tokens in int4 moe (#29600)
Signed-off-by: Zhang Xiangze <Xiangze.Zhang@arm.com>
|
2025-12-02 06:21:39 +00:00 |
|
Shengqi Chen
|
4b612664fd
|
[CI] Renovation of nightly wheel build & generation (take 2) (#29838)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
|
2025-12-01 22:17:10 -08:00 |
|
Cyrus Leung
|
653591d5e7
|
[Chore] Move tokenizer initialization methods (#29793)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 13:33:37 +08:00 |
|
Divakar Verma
|
e2fbfc955e
|
[CI][AMD] spec_decode:eagle skip FLASH_ATTN for deepseek on ROCm (#29827)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-12-02 05:27:46 +00:00 |
|
Divakar Verma
|
a690fb5bd6
|
[CI][ROCm] Fix test_correctness_sliding_window (#29243)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-02 04:53:27 +00:00 |
|
usberkeley
|
81fe3f82af
|
[BugFix] Fix index error in ngram_proposer (#29779)
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
|
2025-12-02 04:48:11 +00:00 |
|
Zuyi Zhao
|
53bf71b0f0
|
[Misc] Update conftest for entrypoints/sagemaker test folder (#29799)
Signed-off-by: Zuyi Zhao <zhaozuy@amazon.com>
|
2025-12-01 18:56:39 -09:00 |
|
Johnny Yang
|
f441d36cee
|
Add missing return in _check_vllm_model_embed_input_ids (#29834)
Signed-off-by: Johnny Yang <johnnyyang@google.com>
|
2025-12-01 19:22:50 -08:00 |
|
Seiji Eicher
|
22274b2184
|
[Misc] Add ReplicaId to Ray metrics (#24267)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: rongfu.leng <1275177125@qq.com>
|
2025-12-02 03:21:44 +00:00 |
|
Wei Wei
|
fc95521ba5
|
[Misc] Throw error on unintended access to scheduler_config.max_model_len (#29771)
Signed-off-by: Wei Wei <wwei6@meta.com>
|
2025-12-02 10:58:44 +08:00 |
|
Zhuohan Li
|
d0cd728907
|
[Core] Support reseting all running requests' KV while calling reset_prefix_cache (#28827)
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-12-02 02:25:05 +00:00 |
|
Andrew Xia
|
fa8804ad9c
|
[responsesAPI][4] fix responseOutputItem Kimi K2 thinking bug (#29555)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 02:11:35 +00:00 |
|
Divakar Verma
|
4b40924998
|
[ROCm] Fallback pytorch GELU with tanh approximation to GELU() (#29244)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
Signed-off-by: Divakar Verma <137818590+divakar-amd@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-02 02:02:22 +00:00 |
|
Hendrik Holtmann
|
c0dfc89485
|
SM120 / NVFP4: add device guard and runtime SM dispatch to cutlass_scaled_fp4_mm (#29711)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-12-01 17:24:18 -08:00 |
|
Nick Hill
|
44822d7ff2
|
[BugFix] Preserve spec decoding uniform decode when scheduling (#29759)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-12-01 17:15:52 -08:00 |
|
Alexei-V-Ivanov-AMD
|
342c4f1472
|
Updated CI mirror 2025-11-25 (#29434)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
Signed-off-by: Alexei-V-Ivanov-AMD <156011006+Alexei-V-Ivanov-AMD@users.noreply.github.com>
Co-authored-by: Kevin H. Luu <khluu000@gmail.com>
|
2025-12-01 23:44:33 +00:00 |
|
Kevin H. Luu
|
1336a1ea24
|
Revert #29787 and #29690 (#29815)
|
2025-12-01 13:42:03 -08:00 |
|
Nengjun Ma
|
eaf81485ed
|
[Ascend]: Fixed the issue where OOT Platform vllm-ascend could not enable SP in Eager mode (#28935)
Signed-off-by: leo-pony <nengjunma@outlook.com>
|
2025-12-01 15:02:18 -05:00 |
|
Finbarr Timbers
|
38caf7fa1a
|
Update FAQ on interleaving sliding windows support (#29796)
Signed-off-by: Finbarr Timbers <finbarrtimbers@gmail.com>
|
2025-12-01 19:15:19 +00:00 |
|