wliao2
|
4dfad17ed1
|
replace cuda_device_count_stateless() to current_platform.device_count() (#37841)
Signed-off-by: Liao, Wei <wei.liao@intel.com>
Signed-off-by: wliao2 <wei.liao@intel.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-03-31 22:32:54 +08:00 |
|
Kyle Sayers
|
d28d86e8a3
|
[QeRL] Fix online quantized reloading (#38442)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
|
2026-03-29 14:56:41 -06:00 |
|
Kyle Sayers
|
648edcf729
|
[QeRL] Compose online quantization with quantized reloading (#38032)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
|
2026-03-27 13:22:33 -07:00 |
|
Roy Wang
|
821eb80c0d
|
[Performance][Model Loader] Skip non-local expert weights during EP model loading (#37136)
Signed-off-by: esmeetu <jasonailu87@gmail.com>
|
2026-03-16 01:33:36 -07:00 |
|
Hari
|
a3e2e250f0
|
[Feature] Add Azure Blob Storage support for RunAI Model Streamer (#34614)
Signed-off-by: hasethuraman <hsethuraman@microsoft.com>
|
2026-03-15 19:38:21 +08:00 |
|
arlo
|
8c29042bb9
|
[Feature] Add InstantTensor weight loader (#36139)
|
2026-03-14 18:05:23 +01:00 |
|
Kunshang Ji
|
53ec16a705
|
[Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145)
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-03-12 07:57:47 -07:00 |
|
Kunshang Ji
|
16d2ad1d38
|
[Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-04 09:49:47 +00:00 |
|
Cyrus Leung
|
7fcb705b80
|
[CI/Build] Skip GCS test (#34057)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-07 08:52:38 -08:00 |
|
Micah Williamson
|
6c64c41b4a
|
[ROCm][CI] Force max_num_seqs=1 on ROCm In test_sharded_state_loader to reduce flakiness (#33277)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-01-31 12:28:29 +08:00 |
|
Kyle Sayers
|
f857a03f6b
|
[QeRL] Layerwise Reloading (#32133)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
|
2026-01-30 08:50:05 -07:00 |
|
Cyrus Leung
|
aafd4d2354
|
[Chore] Try remove init_cached_hf_modules (#31786)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-07 12:34:04 +08:00 |
|
Noa Neria
|
6366c098d7
|
Validating Runai Model Streamer Integration with S3 Object Storage (#29320)
Signed-off-by: Noa Neria <noa@run.ai>
|
2025-12-04 18:04:43 +08:00 |
|
Cyrus Leung
|
aab0102a26
|
[V0 deprecation] Remove more V0 references (#29088)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-21 11:56:59 +00:00 |
|
TJian
|
82b05b15e6
|
[BugFix] [FEAT] Enable fastsafetensors for ROCm platform (#28225)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2025-11-20 16:34:11 +00:00 |
|
Alexis MacAskill
|
a47d94f18c
|
Add runai model streamer e2e test for GCS (#28079)
Signed-off-by: Alexis MacAskill <amacaskill@google.com>
|
2025-11-07 03:07:54 +00:00 |
|
Zhewen Li
|
0291fbf65c
|
[CI/Build] Fix amd model executor test (#27612)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-10-28 08:58:11 +00:00 |
|
Nick Hill
|
647214f3d5
|
[V0 Deprecation] Remove V0 executors (#27142)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-21 11:09:37 -07:00 |
|
iAmir97
|
7a6c8c3fa1
|
[Chore] Separate out vllm.utils.network_utils (#27164)
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
|
2025-10-19 03:06:32 -07:00 |
|
Cyrus Leung
|
4d4d6bad19
|
[Chore] Separate out vllm.utils.importlib (#27022)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-17 00:48:59 +00:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Harry Mellor
|
4e256cadc2
|
Remove all references to yapf as it's no longer used (#26251)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 09:18:11 -07:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
pwschuurman
|
be22bb6f3d
|
Run:ai model streamer add GCS package support (#24909)
Signed-off-by: Peter Schuurman <psch@google.com>
|
2025-10-01 20:59:13 -07:00 |
|
Aaron Pham
|
6a113d9aed
|
[V0 Deprecation] Remove vllm.worker and update according imports (#25901)
|
2025-09-29 23:26:11 +00:00 |
|
Cyrus Leung
|
d346ec695e
|
[CI/Build] Consolidate model loader tests and requirements (#25765)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-26 21:45:20 -07:00 |
|
Cyrus Leung
|
bc9d7b5595
|
[CI/Build] Split up Distributed Tests (#25572)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-26 14:49:33 +02:00 |
|
Harry Mellor
|
f36355abfd
|
Move LoadConfig from config/__init__.py to config/load.py (#24566)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-10 06:14:18 -07:00 |
|
22quinn
|
610852a423
|
[Core] Support model loader plugins (#21067)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-07-24 01:49:44 -07:00 |
|