biondizzle/vllm - vllm

Author	SHA1	Message	Date
wliao2	4dfad17ed1	replace cuda_device_count_stateless() to current_platform.device_count() (#37841) Signed-off-by: Liao, Wei <wei.liao@intel.com> Signed-off-by: wliao2 <wei.liao@intel.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2026-03-31 22:32:54 +08:00
Kyle Sayers	d28d86e8a3	[QeRL] Fix online quantized reloading (#38442) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2026-03-29 14:56:41 -06:00
Kyle Sayers	648edcf729	[QeRL] Compose online quantization with quantized reloading (#38032) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2026-03-27 13:22:33 -07:00
Roy Wang	821eb80c0d	[Performance][Model Loader] Skip non-local expert weights during EP model loading (#37136) Signed-off-by: esmeetu <jasonailu87@gmail.com>	2026-03-16 01:33:36 -07:00
Hari	a3e2e250f0	[Feature] Add Azure Blob Storage support for RunAI Model Streamer (#34614) Signed-off-by: hasethuraman <hsethuraman@microsoft.com>	2026-03-15 19:38:21 +08:00
arlo	8c29042bb9	[Feature] Add InstantTensor weight loader (#36139)	2026-03-14 18:05:23 +01:00
Kunshang Ji	53ec16a705	[Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145) Signed-off-by: Kunshang Ji <jikunshang95@gmail.com> Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2026-03-12 07:57:47 -07:00
Kunshang Ji	16d2ad1d38	[Hardware] Replace `torch.cuda.empty_cache` with `torch.accelerator.empty_cache` (#30681) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com> Signed-off-by: Kunshang Ji <jikunshang95@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-04 09:49:47 +00:00
Cyrus Leung	7fcb705b80	[CI/Build] Skip GCS test (#34057) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-07 08:52:38 -08:00
Micah Williamson	6c64c41b4a	[ROCm][CI] Force max_num_seqs=1 on ROCm In test_sharded_state_loader to reduce flakiness (#33277) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2026-01-31 12:28:29 +08:00
Kyle Sayers	f857a03f6b	[QeRL] Layerwise Reloading (#32133) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2026-01-30 08:50:05 -07:00
Cyrus Leung	aafd4d2354	[Chore] Try remove `init_cached_hf_modules` (#31786) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-07 12:34:04 +08:00
Noa Neria	6366c098d7	Validating Runai Model Streamer Integration with S3 Object Storage (#29320) Signed-off-by: Noa Neria <noa@run.ai>	2025-12-04 18:04:43 +08:00
Cyrus Leung	aab0102a26	[V0 deprecation] Remove more V0 references (#29088) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-21 11:56:59 +00:00
TJian	82b05b15e6	[BugFix] [FEAT] Enable fastsafetensors for ROCm platform (#28225) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-11-20 16:34:11 +00:00
Alexis MacAskill	a47d94f18c	Add runai model streamer e2e test for GCS (#28079) Signed-off-by: Alexis MacAskill <amacaskill@google.com>	2025-11-07 03:07:54 +00:00
Zhewen Li	0291fbf65c	[CI/Build] Fix amd model executor test (#27612) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-28 08:58:11 +00:00
Nick Hill	647214f3d5	[V0 Deprecation] Remove V0 executors (#27142) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-10-21 11:09:37 -07:00
iAmir97	7a6c8c3fa1	[Chore] Separate out `vllm.utils.network_utils` (#27164) Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com> Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>	2025-10-19 03:06:32 -07:00
Cyrus Leung	4d4d6bad19	[Chore] Separate out `vllm.utils.importlib` (#27022) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-17 00:48:59 +00:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Harry Mellor	4e256cadc2	Remove all references to `yapf` as it's no longer used (#26251) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 09:18:11 -07:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
pwschuurman	be22bb6f3d	Run:ai model streamer add GCS package support (#24909) Signed-off-by: Peter Schuurman <psch@google.com>	2025-10-01 20:59:13 -07:00
Aaron Pham	6a113d9aed	[V0 Deprecation] Remove `vllm.worker` and update according imports (#25901)	2025-09-29 23:26:11 +00:00
Cyrus Leung	d346ec695e	[CI/Build] Consolidate model loader tests and requirements (#25765) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-26 21:45:20 -07:00
Cyrus Leung	bc9d7b5595	[CI/Build] Split up Distributed Tests (#25572) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-26 14:49:33 +02:00
Harry Mellor	f36355abfd	Move `LoadConfig` from `config/__init__.py` to `config/load.py` (#24566) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-10 06:14:18 -07:00
22quinn	610852a423	[Core] Support model loader plugins (#21067) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-07-24 01:49:44 -07:00

29 Commits