biondizzle/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Lucas Wilkinson	15d76f74e2	Revert "[Misc] Enable weights loading tracking for quantized models" (#35309 )	2026-02-25 09:20:15 -08:00
Isotr0py	4572a06afe	[Misc] Enable weights loading tracking for quantized models (#35074 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2026-02-24 22:11:03 -08:00
emricksini-h	325ab6b0a8	[Feature] OTEL tracing during loading (#31162 )	2026-02-05 16:59:28 -08:00
weiyu	e7596371a4	[Refactor][TPU] Remove torch_xla path and use tpu-inference (#30808 ) Signed-off-by: Wei-Yu Lin <weiyulin@google.com> Signed-off-by: weiyu <62784299+weiyu0824@users.noreply.github.com>	2026-01-07 16:07:16 +08:00
Isotr0py	f946a8d743	[Chore]: Reorganize model repo operating functions in `transformers_utils` (#29680 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-28 08:46:51 -08:00
Julien Denize	57430fc95c	Default model load/config/tokenizer to `mistral` format if relevant files exist (#28659 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-11-21 13:58:59 -08:00
liangel-02	1d642872a2	[torchao] fix safetensors for sharding (#28169 ) Signed-off-by: Angel Li <liangel@meta.com>	2025-11-19 16:39:45 -08:00
Jerry Zhang	da94c7c0eb	Move online quantization to `model.load_weights` (#26327 ) Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>	2025-11-18 16:52:41 -08:00
Wentao Ye	52efc34ebf	[Log] Optimize Startup Log (#26740 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-10-24 19:27:04 -04:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x \| None` and `Union[x, y]` to `x \| y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Utkarsh Sharma	335b28f7d1	[TPU] Rename tpu_commons to tpu_inference (#26279 ) Signed-off-by: Utkarsh Sharma <utksharma@google.com> Co-authored-by: Utkarsh Sharma <utksharma@google.com> Co-authored-by: Chengji Yao <chengjiyao@google.com>	2025-10-07 23:30:52 -07:00
liangel-02	b32260ab85	[torchao] safetensors integration (#25969 ) Signed-off-by: Angel Li <liangel@meta.com>	2025-10-07 20:12:35 -06:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Jerry Zhang	c31246800c	Support RL online quantization with torchao (#23014 ) Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>	2025-10-01 16:39:29 -07:00
Nicolò Lucchesi	4cf71cc88a	[TPU] Deprecate `xm.mark_step` in favor of ``torch_xla.sync` (#25254 ) Signed-off-by: NickLucche <nlucches@redhat.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>	2025-09-22 10:12:57 +00:00
shengshiqi-google	41329a0ff9	[Core] feat: Add --safetensors-load-strategy flag for faster safetensors loading from Lustre (#24469 ) Signed-off-by: Shiqi Sheng <shengshiqi@google.com> Signed-off-by: shengshiqi-google <160179165+shengshiqi-google@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-10 23:10:01 -07:00
Harry Mellor	f36355abfd	Move `LoadConfig` from `config/__init__.py` to `config/load.py` (#24566 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-10 06:14:18 -07:00
Yang Kaiyong	43d9ad03ba	[Model loader]: support multi-thread model weight loading (#23928 ) Signed-off-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com> Signed-off-by: Simon Mo <simon.mo@hey.com> Co-authored-by: Simon Mo <simon.mo@hey.com>	2025-09-08 18:49:39 +00:00
Li Wang	5e537f45b4	[Bugfix] Fix get_quant_config when using modelscope (#24421 ) Signed-off-by: wangli <wangli858794774@gmail.com>	2025-09-08 11:03:02 +00:00
Didier Durand	02d411fdb2	[Doc]: fix typos in Python comments (#24115 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-09-02 21:14:07 -07:00
Chengji Yao	e9d6a3db69	[TPU] make ptxla not imported when using tpu_commons (#23081 ) Signed-off-by: Chengji Yao <chengjiyao@gmail.com> Signed-off-by: Chengji Yao <chengjiyao@google.com> Co-authored-by: Chengji Yao <chengjiyao@gmail.com>	2025-08-19 11:46:42 +08:00
Ning Xie	adaf2c6d4f	[Bugfix] fix modelscope snapshot_download serialization (#21536 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-07-24 22:44:38 -07:00
22quinn	610852a423	[Core] Support model loader plugins (#21067 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-07-24 01:49:44 -07:00
Woosuk Kwon	4de7146351	[V0 deprecation] Remove V0 HPU backend (#21131 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-17 16:37:36 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
22quinn	9760fd8f6a	[Core] Support inplace model weights loading (#18745 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-06-02 17:38:50 +08:00
Mengqing Cao	6ab681bcbe	[Misc][ModelScope] Change to use runtime VLLM_USE_MODELSCOPE (#18655 ) Signed-off-by: Mengqing Cao <cmq0113@163.com> Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>	2025-05-25 04:51:21 +00:00
Mark McLoughlin	c6b636f9fb	[V1][Spec Decoding] Use model_loader.get_model() to load models (#18273 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-23 02:05:44 +00:00
Harry Mellor	07ad27121f	Update deprecated type hinting in `model_loader` (#18130 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-15 04:00:21 -07:00
Jee Jee Li	822de7fb94	[Misc] Split model loader (#17712 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-05-07 12:42:26 +08:00

30 Commits