Isotr0py
|
a21cd9ed23
|
[Bugfix] Fix incorrect image_grid_thw rank for HunyuanOCR from missing merge_by_field_config=True (#29950)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-03 10:05:10 +00:00 |
|
WeiQing Chen
|
7fe9c1a223
|
[CI] Add Async Eplb nightly CI tests (#29385)
Signed-off-by: David Chen <530634352@qq.com>
Signed-off-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-03 09:51:08 +00:00 |
|
Chauncey
|
3f42b05fbc
|
[Refactor] [1/N] to simplify the vLLM serving architecture (#28040)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-03 01:26:39 -08:00 |
|
Yong Hoon Shin
|
69520bc695
|
Add logging for cudagraph related info (#29825)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
|
2025-12-03 01:01:48 -08:00 |
|
Andrew Xia
|
3a7751485b
|
[responsesAPI] support input output messages for non harmony models (#29549)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 23:59:23 -08:00 |
|
Cyrus Leung
|
bbfb55c29e
|
[Misc] Allow fetch_* utils to access local files by default (#29932)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-03 15:49:34 +08:00 |
|
JackieWu
|
0bec63fa31
|
[BugFix] fix imgs_pos in hunyuan_vl (#29879)
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-03 06:20:37 +00:00 |
|
elvischenv
|
c719c40540
|
[Bugfix] Defunctionalize TRTLLM AR+Norm op for avoiding extra clone kernel before it (#29631)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-03 05:15:50 +00:00 |
|
Russell Bryant
|
b08025a83b
|
[Docs] Discuss api key limitations in security guide (#29922)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-12-02 20:57:28 -08:00 |
|
Arpit Khandelwal
|
d7284a2604
|
[Core] Rename PassConfig flags as per RFC #27995 (#29646)
Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-03 03:38:55 +00:00 |
|
Andreas Karatzas
|
506ed87e87
|
[ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues (#29909)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2025-12-03 10:36:49 +08:00 |
|
Roger Wang
|
4dd7978374
|
[Bugfix] Fix regression on pooling models from PR#29621 (#29921)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-03 10:33:45 +08:00 |
|
Lucas Wilkinson
|
5cdd664509
|
[BugFix] Fix assert in build_for_cudagraph_capture (#29893)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-12-02 16:56:54 -08:00 |
|
Alexei-V-Ivanov-AMD
|
5f67361fd1
|
Reverting re-direction to amd_mi355_X. (#29914)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-12-03 00:40:02 +00:00 |
|
maang-h
|
5d91d2b292
|
[Doc] Add allocate_slots parameter docs (#29777)
Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
|
2025-12-02 23:23:09 +00:00 |
|
Micah Williamson
|
c014de1ec7
|
[ROCm][CI] Fix test_cudagraph_mode.py Failure For AMD CI (#29808)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2025-12-02 22:54:36 +00:00 |
|
Julien Denize
|
1b1e35aaf9
|
[BUGFIX] Fix regex pattern for Mistral Tool Call (#29918)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
|
2025-12-02 14:51:58 -08:00 |
|
Julien Denize
|
5e5646e206
|
[BUGFIX] llama_4_scaling wrongly passed to DeepseekAttention (#29908)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
|
2025-12-02 14:51:20 -08:00 |
|
Chauncey
|
0a9caca9f5
|
[Bugfix] fix --scheduling-policy=priority & n>1 crashes engine (#29764)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-12-02 22:42:28 +00:00 |
|
Sage Moore
|
e6f114ac25
|
[Bugfix][EPLB] Prevent user-provided EPLB config from being overwritten with defaults (#29911)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-12-02 13:20:22 -09:00 |
|
Harry Mellor
|
6fc5841db1
|
Fix some more Transformers nightly tests (#29872)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 21:49:44 +00:00 |
|
dependabot[bot]
|
3ff5b53bc2
|
Bump actions/setup-python from 6.0.0 to 6.1.0 (#29768)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2025-12-02 21:29:32 +00:00 |
|
jthomson04
|
1528e079e2
|
[Perf] Avoid pageable HtoD transfer in MinTokensLogitsProcessor (#29826)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
|
2025-12-02 21:25:52 +00:00 |
|
Divakar Verma
|
afb1e5b380
|
[CI][ROCm][tests/v1/e2e] Fix multiprocessing launch for the test (#29123)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-12-02 20:46:10 +00:00 |
|
Copilot
|
1c593e117d
|
Fix boolean nested params, add dict format support, and enhance plotting for vllm bench sweep (#29025)
Signed-off-by: Luka Govedič <luka.govedic@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-02 20:40:56 +00:00 |
|
Navanit Dubey
|
a2b053dc85
|
feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE (#29896)
Signed-off-by: navanit-git <navanitdubey@gmail.com>
|
2025-12-02 19:28:35 +00:00 |
|
Matthew Bonanni
|
1d93f11675
|
[Attention][CUDAGraph] Remove CG padding from attention backends (#29352)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-12-02 13:48:08 -05:00 |
|
Benjamin Bartels
|
2d613de9ae
|
[CI/Build] Fixes missing runtime dependencies (#29822)
Signed-off-by: bbartels <benjamin@bartels.dev>
|
2025-12-02 10:21:49 -08:00 |
|
Alexei-V-Ivanov-AMD
|
c77b9929a0
|
Update AMD-CI testing mirror (as of 2025-12-02) (#29898)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-12-02 08:52:54 -09:00 |
|
Isotr0py
|
63b1da76ba
|
[Chore]: Reorganize gguf utils functions under transformers_utils (#29891)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-02 17:33:23 +00:00 |
|
Andrew Xia
|
52cb349fc0
|
[responsesAPI][3] ResponsesParser to set up non harmony MCP (#29413)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 11:24:45 -05:00 |
|
Isotr0py
|
0ec8422171
|
[Bugfix] Fix incorrect channel order for idefics3 in edge case (#29881)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-02 16:03:52 +00:00 |
|
wang.yuqi
|
2eb4fe9129
|
[examples] Resettle pooling examples. (#29365)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 15:54:28 +00:00 |
|
Matthew Bonanni
|
51c57b51dd
|
[Bugfix] Fix DeepSeek R1 MTP weight loading (#29545)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-12-02 15:52:18 +00:00 |
|
ImaGoodFella
|
60c3d413af
|
[Multimodal][Core] Optimize multimodal preprocessing cache by hashing image bytes instead of pixel values (#29621)
Signed-off-by: Rahul Steiger <rasteiger@ethz.ch>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-02 21:49:02 +08:00 |
|
Cyrus Leung
|
68ffbca7e4
|
[Chore] Use tokenizer.encode and tokenizer.decode directly (#29851)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 12:30:40 +00:00 |
|
Harry Mellor
|
951445a52d
|
Remove default values from InitVars so that they're not stored (#29859)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 12:16:37 +00:00 |
|
Julien Denize
|
d8c6210eea
|
Add Mistral Large 3 and Ministral 3 (#29757)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
|
2025-12-02 10:29:00 +00:00 |
|
Louie Tsai
|
8bbcf8b6e7
|
[vLLM Benchmark Suite] Add default parameters section and update CPU benchmark cases (#29381)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Signed-off-by: Louie Tsai <louie.tsai@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Li, Jiang <bigpyj64@gmail.com>
|
2025-12-02 09:00:23 +00:00 |
|
Boyuan Feng
|
70fb77b4dc
|
[BugFix] add max-num-batched-token to scheduler hash (#29829)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
|
2025-12-02 08:55:02 +00:00 |
|
杰兮
|
48d15a32aa
|
[CI] Fix Bad_words test for tokenizer encode/decode asymmetry (#28193)
Signed-off-by: zhyajie <yajizhan@amd.com>
Co-authored-by: zhyajie <yajizhan@amd.com>
|
2025-12-02 00:02:12 -08:00 |
|
Boyuan Feng
|
3b221cb661
|
[BugFix] respect VLLM_LOGGING_LEVEL in logger (#29761)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
|
2025-12-02 07:49:16 +00:00 |
|
Wushi Dong
|
0037b5746a
|
[Core] Eliminate redundant is_encoder_decoder lookups (20-40us/step) (#29800)
Signed-off-by: Wushi Dong <dongws@meta.com>
|
2025-12-02 07:08:07 +00:00 |
|
Harry Mellor
|
f5b0846ba0
|
Fix some Transformers nightly tests (#29802)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 07:05:27 +00:00 |
|
Zhang Xiangze
|
13ea39bc09
|
[CPU]Parallelize over tokens in int4 moe (#29600)
Signed-off-by: Zhang Xiangze <Xiangze.Zhang@arm.com>
|
2025-12-02 06:21:39 +00:00 |
|
Shengqi Chen
|
4b612664fd
|
[CI] Renovation of nightly wheel build & generation (take 2) (#29838)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
|
2025-12-01 22:17:10 -08:00 |
|
Cyrus Leung
|
653591d5e7
|
[Chore] Move tokenizer initialization methods (#29793)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 13:33:37 +08:00 |
|
Divakar Verma
|
e2fbfc955e
|
[CI][AMD] spec_decode:eagle skip FLASH_ATTN for deepseek on ROCm (#29827)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-12-02 05:27:46 +00:00 |
|
Divakar Verma
|
a690fb5bd6
|
[CI][ROCm] Fix test_correctness_sliding_window (#29243)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-02 04:53:27 +00:00 |
|
usberkeley
|
81fe3f82af
|
[BugFix] Fix index error in ngram_proposer (#29779)
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
|
2025-12-02 04:48:11 +00:00 |
|