biondizzle/vllm

Author	SHA1	Message	Date
Kunshang Ji	cb9574eb85	[XPU][9/N] clean up existing ipex code/doc (#34111 ) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2026-02-11 00:27:15 -08:00
AllenDou	21dfb842d7	[model] support FunASR model (#33247 ) Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com> Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>	2026-02-11 07:37:09 +00:00
Hashem Hashemi	1b3540e6c6	Threshold fix wvSplitk for occasional CI fails (#34013 ) Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>	2026-02-11 03:59:14 +00:00
Cyrus Leung	c9a1923bb4	[Plugin] Simplify IO Processor Plugin interface (#34236 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-10 19:47:39 -08:00
Cyrus Leung	b5dcb372e4	[Misc] Clean up validation logic in input processor (#34144 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-10 19:29:29 -08:00
Richard Zou	e30cedd44b	[torch.compile] Stop doing unnecessary FakeTensorProp in PiecewiseCompileInterpreter (#34093 ) Signed-off-by: Richard Zou <zou3519@gmail.com>	2026-02-10 19:15:40 -08:00
bnellnm	d1481ba783	[MoE Refactor] Introduce MoERunner abstraction and move execution logic from FusedMoE to DefaultMoERunner (#32344 ) Signed-off-by: Bill Nell <bnell@redhat.com>	2026-02-10 19:51:07 -05:00
Ilya Markov	67132945bb	[Perf] Move eplb rebalance algo to async thread (#30888 ) Signed-off-by: ilmarkov <markovilya197@gmail.com> Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>	2026-02-10 22:19:10 +00:00
Gregory Shtrasberg	f0ca0671c7	[Feature] Warn about unrecognized environment variables (#33581 ) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>	2026-02-10 15:45:38 -06:00
Pavani Majety	578977bb5e	[SM100] Resubmit FMHA FP8 prefill for MLA (#31195 ) Signed-off-by: Pavani Majety <pmajety@nvidia.com>	2026-02-10 16:18:43 -05:00
junuxyz	c5a66d1697	[Core][BugFix] Fix PP KV cache sharding memory validation (#33698 ) Signed-off-by: junuxyz <216036880+junuxyz@users.noreply.github.com>	2026-02-10 10:46:24 -05:00
Roberto L. Castro	afdce12c89	[Perf][Kernel] Add faster topKperRow decode kernel for DeepSeek-V3.2 sparse attention (#33680 ) Signed-off-by: LopezCastroRoberto <rocastro@redhat.com> Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-10 10:29:52 -05:00
xuebwang-amd	b129136c7a	[ROCm][Quantization] GPT_OSS in amd-quark format model loading and emulations (#29008 ) Signed-off-by: xuebwang-amd <xuebwang@amd.com> Signed-off-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2026-02-10 10:08:05 -05:00
Fan Yang	a1946570d8	add --insecure arg to the vllm bench to skip TLS (#34026 ) Signed-off-by: Fan Yang <yan9fan@meta.com> Co-authored-by: Fan Yang <yan9fan@meta.com>	2026-02-10 22:23:52 +08:00
Krish Gupta	748625cdaf	[V1][BugFix] Fix EAGLE3 encoder cache miss with disable_chunked_mm_input (#34220 ) Signed-off-by: KrxGu <krishom70@gmail.com>	2026-02-10 13:05:32 +00:00
Harry Mellor	61413973e8	Stop testing for slow tokenizers as they will not exist soon (#34235 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-02-10 12:08:20 +00:00
Chen Zhang	97fa8f6590	[BugFix] Avoid prefix cache hit in the same schedule step for mamba layers (#29387 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2026-02-10 07:41:16 +00:00
wang.yuqi	dab1de9f38	[Frontend][CI] Consolidate instrumentator entrypoints (#34123 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2026-02-10 07:30:19 +00:00
Andrew Xia	9608844f96	[responsesAPI] fix simpleContext streaming output_messages (#34188 ) Signed-off-by: Andrew Xia <axia@meta.com> Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>	2026-02-09 22:53:07 -08:00
Cyrus Leung	ab97bcf662	[CI/Build] Relax `test_mcp_tool_call` (#34204 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-10 05:18:57 +00:00
Roger Wang	8a5e0e2b2b	[Bugfix][Core] Fix CPU memory leak from Request reference cycle in prefix caching (#34183 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2026-02-10 13:03:32 +08:00
Charlie Fu	bb9f97308d	[torch.compile][Fusion] Fix attention fusion pass removing kv_udpate op. (#33945 ) Signed-off-by: charlifu <charlifu@amd.com>	2026-02-09 16:15:43 -05:00
Mohammad Miadh Angkad	d4f123cc48	[Kernel] FlashInfer: switch allreduce fusion to unified API (#33985 ) Signed-off-by: Mohammad Miadh Angkad <176301910+mmangkad@users.noreply.github.com>	2026-02-09 15:43:24 +00:00
JJJYmmm	9562912cea	[MODEL] Adding Support for Qwen3.5 Models (#34110 ) Signed-off-by: JJJYmmm <1650675829@qq.com> Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: wulipc <wulipc@users.noreply.github.com> Co-authored-by: ywang96 <ywang96@users.noreply.github.com> Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2026-02-09 21:12:58 +08:00
Andreas Karatzas	3025b3cebb	[CI] Remove empty image_size_factors for fuyu, glm4_1v, glm_ocr (#34107 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2026-02-09 17:37:04 +08:00
Jee Jee Li	978a37c823	[Model] GLM adaptation (#34124 )	2026-02-09 17:32:52 +08:00
wang.yuqi	22b64948f6	[Frontend][last/5] Make pooling entrypoints request schema consensus. (#31127 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2026-02-09 06:42:38 +00:00
Andrey Talman	f97ca67176	[Release 2.10] Update to Torch 2.10 - final release (#30525 )	2026-02-08 13:51:09 -08:00
Reagan Lee	c4df59ad43	Add embedding input functionality for disabled modalities [remake] (#32493 ) Signed-off-by: Reagan Lee <“reaganjlee@gmail.com”> Signed-off-by: Reagan Lee <reaganjlee@gmail.com> Signed-off-by: Reagan Lee <96998476+reaganjlee@users.noreply.github.com> Co-authored-by: Reagan Lee <“reaganjlee@gmail.com”> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-02-08 04:57:16 -08:00
Nick Hill	a96197f564	[Perf] Simplify DeepseekV32 tokenizer, ensure fast detokenization used (#33855 ) Signed-off-by: Nick Hill <nickhill123@gmail.com>	2026-02-08 07:16:34 +00:00
Cyrus Leung	7fcb705b80	[CI/Build] Skip GCS test (#34057 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-07 08:52:38 -08:00
Hashem Hashemi	ed17f54c8b	Perf tuning and expansion of cases covered for wvSplitKrc (#33493 ) Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>	2026-02-07 05:33:11 -08:00
Jee Jee Li	db4ede9743	[Model] Enable Step3p5ForCausalLM testing (#33755 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2026-02-07 05:25:24 -08:00
Pooya Davoodi	2cb2340f7a	[Frontend]Add support for transcriptions and translations to run_batch (#33934 ) Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2026-02-07 05:24:57 -08:00
Richard Zou	81fe69cae5	[torch.compile] Stop compiling identical artifacts (#34003 ) Signed-off-by: Richard Zou <zou3519@gmail.com>	2026-02-07 05:24:48 -08:00
Cyrus Leung	edb359cce4	[Renderer] Define `render_cmpl` and `render_chat` (#34039 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-07 05:24:40 -08:00
lukec	15a0b9e570	Fix spelling errors (#33978 )	2026-02-06 23:58:50 -08:00
Cyrus Leung	48312e579a	[Misc] Make `PlaceholderRange.get_num_embeds` a method (#34035 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-07 05:30:17 +00:00
Ikenna	906077181b	[Bugfix] Fix QK Norm+RoPE fusion pattern matching on B200+FP8 (#33967 ) Signed-off-by: Ikenna <ikennachifo@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2026-02-07 02:27:33 +00:00
Aaron Hao	89a385d79f	[Feat][RL] Pause and Resume with keep requests for single engine (#32351 ) Signed-off-by: ahao-anyscale <ahao@anyscale.com> Signed-off-by: Aaron Hao <ahao@anyscale.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2026-02-07 00:08:58 +00:00
kourosh hakhamaneshi	4a2d00eafd	[bugfix] [ROCm] Fix premature CUDA initialization in platform detection (#33941 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2026-02-06 16:17:55 -06:00
Sumanth R Hegde	ae2e93f89b	[Fix] Fix `logprobs=0` handling for `/inference/v1/generate` endpoint (#34010 ) Signed-off-by: SumanthRH <sumanthrh99@gmail.com>	2026-02-06 20:33:40 +00:00
Wentao Ye	77c09e1130	[Refactor] Remove align block size logic in `moe_permute` (#33449 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2026-02-06 10:57:06 -08:00
Seiji Eicher	aca5967416	[KV Connector] Add missing method overrides to MultiConnector (#33292 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2026-02-06 12:58:21 -05:00
Cyrus Leung	cd8b405bd0	[Refactor] Consolidate sequence normalization and enc-dec parsing (#33928 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-02-06 15:43:47 +00:00
Andreas Karatzas	350ca72c04	[ROCm][AITER] Fix AITER import regression for explicit backend selection (#33749 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2026-02-06 15:08:16 +00:00
Raushan Turganbay	85ee1d962b	[Bugfix] Fix models and tests for transformers v5 (#33977 ) Signed-off-by: raushan <raushan@huggingface.co> Signed-off-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-02-06 21:47:41 +08:00
Kurt Shuster	2991dd3d22	[Bugfix][Model] Support LoRA on Qwen3 Output Embedding (#29816 ) Signed-off-by: kurt <kurt@thinkingmachines.ai>	2026-02-06 20:25:31 +08:00
Luka Govedič	ac32e66cf9	[torch.compile] Reorganize vllm/compilation and tests/compile (0/N for vLLM IR) (#33731 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: ProExpertProg <luka.govedic@gmail.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2026-02-06 04:19:49 -08:00
Xinyu Chen	e969a169ef	support view_from_cpu_tensor on XPU (#33868 ) Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>	2026-02-06 08:34:20 +00:00

4625 Commits