Commit Graph

84 Commits

Author SHA1 Message Date
Satyajith Chilappagari
043e4c4955 Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)
Signed-off-by: Satyajith Chilappagari <satchill@amazon.com>
Co-authored-by: Aaron Dou <yzdou@amazon.com>
Co-authored-by: Shashwat Srijan <sssrijan@amazon.com>
Co-authored-by: Chongming Ni <chongmni@amazon.com>
Co-authored-by: Amulya Ballakur <amulyaab@amazon.com>
Co-authored-by: Patrick Lange <patlange@amazon.com>
Co-authored-by: Elaine Zhao <elaineyz@amazon.com>
Co-authored-by: Lin Lin Pan <tailinpa@amazon.com>
Co-authored-by: Navyadhara Gogineni <navyadha@amazon.com>
Co-authored-by: Yishan McNabb <yishanm@amazon.com>
Co-authored-by: Mrinal Shukla <181322398+mrinalks@users.noreply.github.com>
2025-05-07 00:07:30 -07:00
Yang Wang
6de3e13413 Add logging for torch nightly version (#17669)
Signed-off-by: Yang Wang <elainewy@meta.com>
2025-05-07 00:45:51 +00:00
Harry Mellor
022afbeb4e Fix doc build performance (#17748)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-07 00:36:41 +00:00
Jevin Jiang
621ca2c0ab [TPU] Increase block size and reset block shapes (#16458) 2025-05-06 13:55:04 -04:00
Isotr0py
cc05b90d86 [Doc] Fix broken cuda installation doc rendering (#17654)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-05-05 17:52:40 +00:00
Harry Mellor
d6484ef3c3 Add full API docs and improve the UX of navigating them (#17485)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-03 19:42:43 -07:00
22quinn
d47b605eca Update test requirements to CUDA 12.8 (#17576)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-05-02 21:40:15 -07:00
Liangfu Chen
22c6f6397f [Neuron][Build] Require setuptools >= 77.0.3 for PEP 639 (#17603)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
2025-05-03 02:41:59 +00:00
Yang Wang
b8b0859b5c add more pytorch related tests for torch nightly (#17422)
Signed-off-by: Yang Wang <elainewy@meta.com>
2025-05-02 03:29:59 -07:00
Cyrus Leung
f2e7af9b86 [CI/Build] Remove awscli dependency (#17532)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-01 09:20:54 -07:00
Russell Bryant
7a0a146c54 [Build] Require setuptools >= 77.0.3 for PEP 639 (#17389)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-30 23:25:36 -07:00
Rahul Tuli
200bbf92e8 Bump Compressed Tensors version to 0.9.4 (#17478)
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-04-30 15:24:45 -07:00
Gregory Shtrasberg
584f5fb4c6 [Bugfix][ROCm] Restrict ray version due to a breaking release (#17480)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-04-30 09:59:06 -07:00
Kunshang Ji
ed6cfb90c8 [Hardware][Intel GPU] Upgrade to torch 2.7 (#17444)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Qiming Zhang <qiming1.zhang@intel.com>
2025-04-30 00:03:58 -07:00
Kunshang Ji
6ed9f6047e [Intel GPU] [CI]Fix XPU ci, setuptools >=80.0 have build issue (#17298)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-04-29 22:54:10 -07:00
Huy Do
2c4f59afc3 Update PyTorch to 2.7.0 (#16859) 2025-04-29 19:08:04 -07:00
Aaron Pham
b37685afbb [CI] Uses Python 3.11 for TPU (#17359)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2025-04-29 17:39:16 +00:00
Harry Mellor
4a5e13149a Update docs requirements (#17379)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-29 11:35:47 +00:00
Gregory Shtrasberg
4464109219 [Build][Bugfix] Restrict setuptools version to <80 (#17320)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-04-29 00:17:23 -07:00
Agata Dobrzyniewicz
c48334d405 [Hardware][Intel-Gaudi] Update hpu-extension and update bucketing system for HPU device (#17186)
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
2025-04-26 05:55:14 -07:00
Cyrus Leung
9d98ab5ec6 [Misc] Inline Molmo requirements (#17190)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-25 16:41:44 +00:00
Harry Mellor
0bd7f8fca5 Bump Transformers to 4.51.3 (#17116)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-25 08:34:34 -07:00
Harry Mellor
0422ce109f Add :markdownhelp: to EngineArgs docs so markdown docstrings render properly (#17124)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-24 10:28:45 -07:00
Eyshika Agarwal
47bdee409c Molmo Requirements (#17026)
Signed-off-by: Eyshika Agarwal <eyshikaengineer@gmail.com>
Signed-off-by: eyshika <eyshikaengineer@gmail.com>
2025-04-24 10:08:37 -07:00
Yang Wang
f67e9e9f22 add Dockerfile build vllm against torch nightly (#16936)
Signed-off-by: Yang Wang <elainewy@meta.com>
2025-04-22 19:08:27 -07:00
Isotr0py
83f3c3bd91 [Model] Refactor Phi-4-multimodal to use merged processor and support V1 (#15477)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-19 02:26:11 -07:00
Tarun Kumar
e37073efd7 Add property-based testing for vLLM endpoints using an API defined by an OpenAPI 3.1 schema (#16721)
Signed-off-by: Tarun Kumar <takumar@redhat.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-04-17 21:08:27 -07:00
Nick Hill
05fcd1b430 [V1][Perf] Faster incremental detokenization (#15137)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-04-17 07:45:24 -07:00
Sage Moore
44fa4d556c [ROCM] Bind triton version to 3.2 in requirements-built.txt (#16664)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-04-16 14:05:28 +08:00
Shinichi Hemmi
3badb0213b [Model] Add PLaMo2 (#14323)
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
Signed-off-by: shemmi <shemmi@preferred.jp>
Co-authored-by: Kento Nozawa <nzw0301@preferred.jp>
Co-authored-by: Hiroaki Mikami <mhiroaki@preferred.jp>
Co-authored-by: Calvin Metzger <metzger@preferred.jp>
2025-04-15 19:31:30 -07:00
Taneem Ibrahim
70e7ed841d [BugFix]: Update minimum pyzmq version (#16549)
Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
2025-04-14 20:06:03 -07:00
Siyuan Liu
c64ee87267 [Hardware][TPU] Add torchvision to tpu dependency file (#16616)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
2025-04-14 17:50:46 -04:00
courage17340
b1308b84a3 [Model][VLM] Add Kimi-VL model support (#16387)
Signed-off-by: courage17340 <courage17340@163.com>
2025-04-14 21:41:48 +00:00
Harry Mellor
9883a18859 Fix triton install condition on CPU (#16600)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-14 17:06:01 +00:00
Harry Mellor
51baa9c333 Don't install triton on ppc64le platform (#16470)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-11 10:11:00 +00:00
cyyever
ec1f9c8c91 Update Numba to 0.61.2 (#16376)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-10 07:59:37 +00:00
Roger Wang
0d4d06fe2f [CI][Bugfix] Pin triton version for CPU (#16384)
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-04-10 04:35:00 +00:00
Russell Bryant
fee5b8d37f [Build/CI] Add tracing deps to vllm container image (#15224)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-09 19:14:06 +00:00
Chendi.Xue
566f10a929 [CI]Fix hpu docker and numpy version for CI (#16355)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-04-09 17:52:26 +00:00
Chengji Yao
b1eb4ca152 [TPU] Update PyTorch/XLA (#16288)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-04-09 14:46:32 +08:00
Russell Bryant
cb84e45ac7 [Core] Upgrade to xgrammar 0.1.18, add cache size limit (#16283)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-08 19:13:22 -07:00
Chauncey
102bf967f0 [Model] Add smolvlm support (#16017)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-04-08 19:12:17 -07:00
Harry Mellor
90cb44eb02 Update to transformers==4.51.1 (#16257)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-08 06:53:39 -07:00
Isotr0py
f6b32efb7f [Bugfix] Fix and reorganize broken GGUF tests and bump gguf version (#16194)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-04-08 13:38:13 +08:00
Miles Williams
f94ab12f79 [Misc] Update compressed-tensors to version 0.9.3 (#16196)
Signed-off-by: Miles Williams <42222518+mlsw@users.noreply.github.com>
2025-04-07 19:09:06 -07:00
Nick Hill
7f6d47c1a2 [V1][BugFix] Exit properly if engine core fails during startup (#16137)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-04-07 15:30:15 -07:00
Lu Fang
55dcce91df Upstream Llama4 Support to Main (#16113)
Signed-off-by: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
Signed-off-by: drisspg <drisspguessous@gmail.com>
Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Xiaodong Wang <xdwang@meta.com>
Signed-off-by: Yang Chen <yangche@fb.com>
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Lu Fang <lufang@fb.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-07 08:06:27 -07:00
Chengji Yao
9bde5ba127 [TPU] Update PyTorch/XLA (#16130)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-04-06 18:25:55 +00:00
yarongmu-google
45b1ff7a25 [Misc][Performance] Advance tpu.txt to the most recent nightly torch … (#16024) 2025-04-03 17:32:54 +00:00
Harry Mellor
55acf86bf8 Fix huggingface-cli[hf-xet] -> huggingface-cli[hf_xet] (#15969)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-02 23:37:30 +00:00