Robert Shaw
|
2d9045fce8
|
[TPU][CI] Fix TPUModelRunner Test (#15667)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2025-03-28 00:01:26 -07:00 |
|
Robert Shaw
|
8a49eea74b
|
[CI][TPU] Temporarily Disable Quant Test on TPU (#15649)
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
|
2025-03-27 19:45:05 -07:00 |
|
Nick Hill
|
15dac210f0
|
[V1] AsyncLLM data parallel (#13923)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-03-27 16:14:41 -07:00 |
|
Nicolò Lucchesi
|
4098b72210
|
[Bugfix][TPU][V1] Fix recompilation (#15553)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-03-27 19:15:06 +00:00 |
|
Alexander Matveev
|
9d119a86ae
|
[V1] TPU CI - Fix test_compilation.py (#15570)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
|
2025-03-26 21:51:54 +00:00 |
|
Alexei-V-Ivanov-AMD
|
dd8a29da99
|
Applying some fixes for K8s agents in CI (#15493)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-03-26 20:35:11 +00:00 |
|
Varun Sundar Rabindranath
|
ff38f0a32c
|
[CI/Build] LoRA: Delete long context tests (#15503)
Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2025-03-25 17:18:34 -07:00 |
|
yarongmu-google
|
0a049c7d86
|
[CI/Build] Add tests for the V1 tpu_model_runner. (#14843)
Signed-off-by: Yarong Mu <ymu@google.com>
|
2025-03-25 12:27:16 -04:00 |
|
Thien Tran
|
4f044b1d67
|
[Kernel][CPU] CPU MLA (#14744)
Signed-off-by: Thien Tran <gau.nernst@yahoo.com.sg>
|
2025-03-25 09:34:59 +00:00 |
|
Siyuan Liu
|
23fdab00a8
|
[Hardware][TPU] Skip failed compilation test (#15421)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
|
2025-03-24 23:28:57 +00:00 |
|
Robin
|
d6cd59f122
|
[Frontend] Support tool calling and reasoning parser (#14511)
Signed-off-by: WangErXiao <863579016@qq.com>
|
2025-03-23 14:00:07 -07:00 |
|
youkaichao
|
f68cce8e64
|
[ci/build] fix broken tests in LLM.collective_rpc (#15350)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-03-23 14:49:48 +08:00 |
|
youkaichao
|
09b6a95551
|
[ci/build] update torch nightly version for GH200 (#15135)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-03-23 14:04:13 +08:00 |
|
hijkzzz
|
0661cfef7a
|
Fix v1 supported oracle for worker-cls and worker-extension-cls (#15324)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-03-23 10:23:35 +08:00 |
|
Russell Bryant
|
b877031d80
|
Remove openvino support in favor of external plugin (#15339)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-03-22 14:06:39 -07:00 |
|
Russell Bryant
|
790b79750b
|
[Build/CI] Fix env var typo (#15305)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-03-21 22:28:46 +00:00 |
|
Siyuan Liu
|
b15fd2be2a
|
[Hardware][TPU] Add check for no additional graph compilation during runtime (#14710)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
|
2025-03-21 03:05:28 +00:00 |
|
Chi Zhang
|
086b56824c
|
[ci] feat: make the test_torchrun_example run with tp=2, external_dp=2 (#15172)
Signed-off-by: Chi Zhang <zhangchi.usc1992@bytedance.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-03-21 00:30:04 +08:00 |
|
Kevin H. Luu
|
3d45e3d749
|
[release] Tag vllm-cpu with latest upon new version released (#15193)
|
2025-03-20 01:19:10 -07:00 |
|
Jovan Sardinha
|
70e500cad9
|
Fix broken tests (#14713)
Signed-off-by: JovanSardinha <jovan.sardinha@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-03-20 02:06:49 +00:00 |
|
Kunshang Ji
|
68cf1601d3
|
[CI][Intel GPU] update XPU dockerfile and CI script (#15109)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-03-19 01:29:25 -07:00 |
|
Alexander Matveev
|
72a8639b68
|
[V1] TPU - CI/CD use smaller model (#15054)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
|
2025-03-18 21:39:21 +00:00 |
|
Alexander Matveev
|
18551e820c
|
[V1] TPU - Fix CI/CD runner (#14974)
|
2025-03-17 21:07:07 +00:00 |
|
Aaron Pham
|
c0efdd655b
|
[Fix][Structured Output] using vocab_size to construct matcher (#14868)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2025-03-17 11:42:45 -04:00 |
|
Cyrus Leung
|
6eaf1e5c52
|
[Misc] Add --seed option to offline multi-modal examples (#14934)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-17 03:00:17 -07:00 |
|
Sibi
|
a73e183e36
|
[Misc] Replace os environ to monkeypatch in test suite (#14516)
Signed-off-by: sibi <85477603+t-sibiraj@users.noreply.github.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-03-16 20:35:57 -07:00 |
|
Robert Shaw
|
aecc780dba
|
[V1] Enable Entrypoints Tests (#14903)
|
2025-03-16 17:56:16 -07:00 |
|
Kunshang Ji
|
f58aea002c
|
[CI][Intel GPU] refine intel GPU ci docker build (#14860)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-03-15 11:58:53 +00:00 |
|
Robert Shaw
|
d4d93db2c5
|
[V1] V1 Enablement Oracle (#13726)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2025-03-14 22:02:20 -07:00 |
|
Liangfu Chen
|
9f37422779
|
[Neuron][CI] update docker run command (#14829)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
|
2025-03-14 18:51:35 -07:00 |
|
Richard Liu
|
40677783aa
|
[CI] Add TPU v1 test (#14834)
Signed-off-by: Richard Liu <ricliu@google.com>
|
2025-03-14 17:13:30 -04:00 |
|
Alexei-V-Ivanov-AMD
|
270a5da495
|
Re-enable the AMD Entrypoints Test (#14711)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-03-14 12:18:13 -07:00 |
|
Kevin H. Luu
|
7097b4cc1c
|
[release] Remove log cleanup commands from TPU job (#14838)
|
2025-03-14 11:59:52 -07:00 |
|
Liangfu Chen
|
d3d4956261
|
[Neuron] flatten test parameterization for neuron attention kernels (#14712)
|
2025-03-13 20:46:56 -07:00 |
|
Kevin H. Luu
|
f1f632d9ec
|
[ci] Reduce number of tests in fastcheck (#14782)
|
2025-03-13 20:43:45 -07:00 |
|
Kevin H. Luu
|
ce20124671
|
[release] Add force remove for TPU logs (#14697)
|
2025-03-12 22:35:18 +00:00 |
|
Li, Jiang
|
ff47aab056
|
[CPU] Upgrade CPU backend to torch-2.6 (#13381)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-03-12 10:41:13 +00:00 |
|
Kevin H. Luu
|
9f583e360c
|
[release] Add commands to clean up logs on TPU release node (#14642)
|
2025-03-12 00:14:50 +00:00 |
|
Richard Liu
|
d374f04a33
|
Fix run_tpu_test (#14641)
Signed-off-by: <ricliu@google.com>
Signed-off-by: Richard Liu <ricliu@google.com>
|
2025-03-11 21:14:33 +00:00 |
|
Harry Mellor
|
206e2577fa
|
Move requirements into their own directory (#12547)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-08 16:44:35 +00:00 |
|
Aaron Pham
|
80e9afb5bc
|
[V1][Core] Support for Structured Outputs (#12388)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-03-07 07:19:11 -08:00 |
|
Thomas Parnell
|
8ca2b21c98
|
[CI] Disable spawn when running V1 Test (#14345)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-03-06 21:52:46 +00:00 |
|
youkaichao
|
151b08e0fe
|
[RLHF] use worker_extension_cls for compatibility with V0 and V1 (#14185)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-03-07 00:32:46 +08:00 |
|
Russell Bryant
|
ffad94397d
|
[CI/Build] Use spawn multiprocessing mode for V1 test pipeline (#14243)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-03-05 17:08:02 -08:00 |
|
Sage Moore
|
3e1d223626
|
[ROCm] Disable a few more kernel tests that are broken on ROCm (#14145)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-03-04 23:37:55 +00:00 |
|
TJian
|
848a6438ae
|
[ROCm] Faster Custom Paged Attention kernels (#12348)
|
2025-03-03 09:24:45 -08:00 |
|
Jee Jee Li
|
cc5e8f6db8
|
[Model] Add LoRA support for TransformersModel (#13770)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-03-02 09:17:34 +08:00 |
|
Sage Moore
|
38acae6e97
|
[ROCm] Fix the Kernels, Core, and Prefix Caching AMD CI groups (#13970)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-02-27 20:31:47 +00:00 |
|
Michael Goin
|
ca377cf1b9
|
Use CUDA 12.4 as default for release and nightly wheels (#12098)
|
2025-02-26 19:06:37 -08:00 |
|
Huy Do
|
e7ef74e26e
|
Fix some issues with benchmark data output (#13641)
Signed-off-by: Huy Do <huydhn@gmail.com>
|
2025-02-24 10:23:18 +08:00 |
|