400 Commits

Author SHA1 Message Date
Andrey Talman
2111997f96 [release 2.11] Update to torch 2.11 (#34644) 2026-04-07 18:55:48 -07:00
Wentao Ye
419e73cdfa [Bug] Fix mistral version dependency (#39086)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-04-06 13:31:19 -04:00
Julien Denize
fef56c1855 [Mistral Grammar] Support Grammar Factory (#38150)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
2026-04-06 10:28:51 -04:00
Andreas Karatzas
5875bb2e9c [ROCm][CI] Added back missing common deps (#38937)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-04-03 15:58:57 -07:00
Qiming Zhang
6b4872240f [XPU] bump up xpu-kernel v0.1.5, transpose moe weights (#38342)
Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
Signed-off-by: Qiming Zhang <qiming1.zhang@intel.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
2026-04-03 14:10:02 +00:00
Lucas Wilkinson
cb3935a8fc [FA4] Update flash-attention to latest upstream FA4 (#38690)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2026-04-02 17:02:37 +00:00
Run Yu
07edd551cc [CI/Build] Resolve a dependency deadlock when installing the test dependencies used in CI (#37766)
Signed-off-by: Run Yu <yurun00@gmail.com>
2026-03-31 18:05:14 +00:00
sihao_li
d71a15041f [XPU]move testing dependencies from Dockerfile to xpu-test.in (#38596)
Signed-off-by: sihao.li <sihao.li@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-31 12:49:43 +00:00
Johnny
b4a2f3ac36 [NVIDIA] Bugfix NVFP4 DGX Spark and RTX50 (#38423)
Signed-off-by: johnnynunez <johnnynuca14@gmail.com>
Signed-off-by: Johnny <johnnynuca14@gmail.com>
2026-03-30 09:36:18 -07:00
Andreas Karatzas
db01535e2b [ROCm][CI] Add uv pip compile workflow for rocm-test.txt lockfile (#37930)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-26 12:44:01 -05:00
Ben Browning
1ac2ef2e53 [CI/Docs] Improve aarch64/DGX Spark support for dev setup (#38057)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-25 09:24:42 -07:00
Michael Goin
0f0e03890e [UX] Add flashinfer-cubin as CUDA default dep (#37233)
Signed-off-by: mgoin <mgoin64@gmail.com>
2026-03-24 14:13:08 -07:00
Andreas Karatzas
ffc8531524 [ROCm][CI] Added missing resampy dependency for MM audio tests (#37778)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-22 16:02:41 +08:00
Isotr0py
c7f98b4d0a [Frontend] Remove librosa from audio dependency (#37058)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-21 11:36:15 +08:00
Kunshang Ji
bdf6a0a57b [XPU] bump vllm-xpu-kernels to v0.1.4 (#37641)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-20 15:04:38 +08:00
sihao_li
9dade5da3a [XPU]Unify xpu test dependencies in dockerfile.xpu (#36477)
Signed-off-by: sihao.li <sihao.li@intel.com>
2026-03-19 08:12:07 +08:00
Chauncey
b322b197f1 [Build] Bump python openai version (#32316)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-03-18 18:20:10 +08:00
Brian Dellabetta
fa75204b16 bump compressed-tensors version to 0.14.0.1 (#36988)
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
2026-03-17 15:36:19 -04:00
Flora Feng
384dc7f77b [Refactor] Relocate completion and chat completion tests (#37125)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
2026-03-17 11:31:23 +08:00
SoluMilken
d8f8a7aad2 [Misc] Sync pre-commit to 4.5.1 in workflows and docs (#36675)
Signed-off-by: SoluMilken <ypiheyn.imm02g@g2.nctu.edu.tw>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-16 10:03:21 +00:00
Andreas Karatzas
57a314d155 [CI][Bugfix] Fix 500 errors from priority overflow and TemplateError subclasses in schema fuzz tests (#37127)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-16 05:27:21 +00:00
Hari
a3e2e250f0 [Feature] Add Azure Blob Storage support for RunAI Model Streamer (#34614)
Signed-off-by: hasethuraman <hsethuraman@microsoft.com>
2026-03-15 19:38:21 +08:00
Isotr0py
143e4dccdf [Misc] Add online audio_in_video test (#36775)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-15 00:14:11 -07:00
Russell Bryant
b3debb7e77 [Build] Upgrade xgrammar to get a security fix (#36168)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2026-03-15 03:13:48 +00:00
arlo
8c29042bb9 [Feature] Add InstantTensor weight loader (#36139) 2026-03-14 18:05:23 +01:00
Julien Denize
e42b49bd69 Mistral common v10 (#36971)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: root <root@h200-bar-196-227.slurm-bar-compute.tenant-slurm.svc.cluster.local>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-03-14 07:26:43 -07:00
Dimitrios Bariamis
cc16b24b17 Update Flashinfer to 0.6.6 (#36768)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
2026-03-12 13:19:19 -04:00
Louie Tsai
17852aa503 more models for vLLM Benchmark Suite (#35086)
Signed-off-by: louie-tsai <louie.tsai@intel.com>
2026-03-12 11:36:51 +08:00
typer-J
4184653775 feat: add RISC-V support for CPU backend (v2) (#36578)
Signed-off-by: typer-J <2236066784@qq.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-03-10 21:51:39 -07:00
Kevin H. Luu
82b110d50e [ci] Bound nvidia-cudnn-frontend version (#36719)
Signed-off-by: khluu <khluu000@gmail.com>
2026-03-11 12:17:35 +08:00
Chang Su
507ddbe992 feat(grpc): extract gRPC servicer into smg-grpc-servicer package, add --grpc flag to vllm serve (#36169)
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2026-03-10 03:29:59 -07:00
Kevin H. Luu
aaf5fa9abf [ci] Bound openai dependency to 2.24.0 (#36471)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
2026-03-09 03:43:26 -07:00
Wentao Ye
384425f84e [Dependency] Remove default ray dependency (#36170)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-03-08 20:06:22 -07:00
Andreas Karatzas
807d680337 [ROCm][CI] Fix tool use test stability - disable skinny GEMM, prefix caching, eliminate batch variance (#35553)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-06 15:15:12 +08:00
Andreas Karatzas
639680d220 [ROCm][CI] Adding missing dependencies for Multi-modal models tests (#36177)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-06 12:23:10 +08:00
Cyrus Leung
ead7bde1ab [Bugfix] Make kaldi_native_fbank optional (#35996)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-03-04 06:47:32 -08:00
Kunshang Ji
a8f66cbde8 [XPU] bump vllm-xpu-kernels to v0.1.3 (#35984)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-04 18:23:31 +08:00
AllenDou
c1d963403c [model] support FireRedASR2 (#35727)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-03 19:41:30 -08:00
Nicolò Lucchesi
c212202d93 [Misc] Bound NIXL upper bound version (#35495)
Signed-off-by: NickLucche <nlucches@redhat.com>
2026-03-02 16:57:07 +08:00
Hongxia Yang
f26650d649 [ROCm] add amd-quark package in requirements for rocm to use quantized models (#35658)
Signed-off-by: Hongxia Yang <hongxiay.yang@amd.com>
Co-authored-by: Hongxia Yang <hongxiay.yang@amd.com>
2026-03-02 06:02:43 +00:00
Lucas Wilkinson
8b5014d3dd [Attention] FA4 integration (#32974)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2026-03-01 23:44:57 +00:00
Sage Moore
9e2cabdf9c [ROCm] Update the torch version in rocm_build.txt to use the official 2.10 release (#34387)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2026-02-26 16:28:45 +00:00
Wentao Ye
d24bdd7c4b [CI] Bump mteb version to mteb[bm25s]>=2, <3 for pooling model unit tests (#34961)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-02-21 20:23:24 -08:00
Andreas Karatzas
d403c1da1c [CI] Stabilizing ROCm amd-ci signal and minor name fix in upstream (#35008)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-22 04:01:10 +00:00
Cyrus Leung
965fe45935 [CI/Build] Fix gRPC version mismatch (#35013)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-21 12:14:41 -07:00
BADAOUI Abdennacer
8dc8a99b56 [ROCm] Enable bitsandbytes quantization support on ROCm (#34688)
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
2026-02-21 00:34:55 -08:00
Vlad Tiberiu Mihailescu
e739c29ea4 [CI/Build] Add opentelemetry libs in default vllm build (requirements/common.txt) (#34466)
Signed-off-by: Vlad Mihailescu <vtmihailescu@gmail.com>
2026-02-20 19:54:55 -08:00
Wei Zhao
ea5f903f80 Bump Flashinfer Version and Re-enable DeepSeek NVFP4 AR+Norm Fusion (#34899)
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-02-20 13:37:31 -08:00
Harry Mellor
6ce80f7071 Ensure that MkDocs v2 does not get installed (#34958)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-20 15:38:11 +00:00
Kyle Sayers
64ac1395e8 [Docs] Clean up speculators docs (#34065)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2026-02-18 13:48:11 -08:00