Commit Graph - vllm - Gitea: Git with a cup of tea

biondizzle/vllm

Commit Graph

42d9a2c4c7 doc: fix bug report Github template formatting (#17486) David Xia 2025-04-30 13:03:20 -04:00
2ac74d098e [doc] add install tips (#17373) Reid 2025-05-01 01:02:41 +08:00
584f5fb4c6 [Bugfix][ROCm] Restrict ray version due to a breaking release (#17480) Gregory Shtrasberg 2025-04-30 12:59:06 -04:00
d586ddc691 [BugFix] Fix authorization of openai_transcription_client.py (#17321) zh Wang 2025-05-01 00:51:05 +08:00
0b7e701dd4 [Docs] Update optimization.md doc (#17482) Michael Goin 2025-04-30 10:34:02 -06:00
947f2f5375 [V1] Allow turning off pickle fallback in vllm.v1.serial_utils (#17427) Russell Bryant 2025-04-30 12:10:54 -04:00
739e03b344 [Bugfix] Fixed mistral tokenizer path when pointing to file (#17457) Pete Savage 2025-04-30 16:08:37 +01:00
da4e7687b5 [Fix] Support passing args to logger (#17425) Aaron Pham 2025-04-30 11:06:58 -04:00
39317cf42b [Docs] Add command for running mypy tests from CI (#17475) Russell Bryant 2025-04-30 11:06:09 -04:00
2990cee95b [Feature] The Qwen3 reasoning parser supports guided decoding (#17466) Chauncey 2025-04-30 22:48:21 +08:00
0be6d05b5e [V1][Metrics] add support for kv event publishing (#16750) Alec 2025-04-30 16:44:45 +02:00
77073c77bc [Core] Prevent side-channel attacks via cache salting (#17045) Marko Rosenmueller 2025-04-30 14:27:21 +02:00
a7d5b016bd [TPU][V1][CI] Update regression test baseline for v6 CI (#17064) Nicolò Lucchesi 2025-04-30 13:03:22 +02:00
d803786731 [V1][Bugfix]: vllm v1 verison metric num_gpu_blocks is None (#15755) rongfu.leng 2025-04-30 18:20:39 +08:00
1534d389af [Misc] Remove deprecated files (#17447) Chauncey 2025-04-30 16:52:19 +08:00
ece5a8b0b6 Make the _apply_rotary_emb compatible with dynamo (#17435) Lu Fang 2025-04-30 00:52:48 -07:00
54072f315f [MODEL ADDITION] Ovis2 Model Addition (#15826) Marco 2025-04-30 09:33:29 +02:00
be633fba0f [Bugfix] Fix AttributeError: 'State' object has no attribute 'engine_client' (#17434) Chauncey 2025-04-30 15:11:04 +08:00
ed6cfb90c8 [Hardware][Intel GPU] Upgrade to torch 2.7 (#17444) Kunshang Ji 2025-04-30 15:03:58 +08:00
6ed9f6047e [Intel GPU] [CI]Fix XPU ci, setuptools >=80.0 have build issue (#17298) Kunshang Ji 2025-04-30 13:54:10 +08:00
a44c4f1d2f Support LoRA for Mistral3 (#17428) Michael Goin 2025-04-29 22:10:30 -06:00
88fcf00dda Fix some speculative decode tests with tl.dot (#17371) Huy Do 2025-04-29 19:41:02 -07:00
d1f569b1b9 Fix call to logger.info_once (#17416) Harry Mellor 2025-04-30 03:39:18 +01:00
13698db634 Improve configs - ModelConfig (#17130) Harry Mellor 2025-04-30 03:38:22 +01:00
2c4f59afc3 Update PyTorch to 2.7.0 (#16859) Huy Do 2025-04-29 19:08:04 -07:00
1c2bc7ead0 Truncation control for embedding models (#14776) Gabriel Marinho 2025-04-29 22:24:57 -03:00
4055130a85 [release] Always git fetch all to get latest tag on TPU release (#17322) Kevin H. Luu 2025-04-29 17:52:11 -07:00
34120f5acd [V1][Feature] Enable Speculative Decoding with Structured Outputs (#14702) Benjamin Chislett 2025-04-29 17:02:10 -07:00
7489ec0bab Remove Bamba 9B from CI (#17407) Harry Mellor 2025-04-29 22:10:31 +01:00
70788bdbdc [V1][Spec Decode] Apply torch.compile & cudagraph to EAGLE (#17211) Bryan Lu 2025-04-29 14:10:00 -07:00
c9c1b59e59 Fix: Python package installation for opentelmetry (#17049) Dilip Gowda Bhagavan 2025-04-30 01:50:24 +05:30
0350809f3a Remove Falcon3 2x7B from CI (#17404) Harry Mellor 2025-04-29 20:52:25 +01:00
a6977dbd15 Simplify (and fix) passing of guided decoding backend options (#17008) Harry Mellor 2025-04-29 20:02:23 +01:00
2fa2a50bf9 [Bugfix] Fix Minicpm-O-int4 GPTQ model inference (#17397) Isotr0py 2025-04-30 02:21:42 +08:00
08e15defa9 [CI/Build] Add retry mechanism for add-apt-repository (#17107) Reid 2025-04-30 01:40:52 +08:00
b37685afbb [CI] Uses Python 3.11 for TPU (#17359) Aaron Pham 2025-04-29 13:39:16 -04:00
792595b59d [TPU][V1][CI] Replace python3 setup.py develop with standard pip install --e on TPU (#17374) Nicolò Lucchesi 2025-04-29 19:36:48 +02:00
0c1c788312 [Doc][Typo] Fixing label in new model requests link in overview.md (#17400) casinca 2025-04-29 19:29:48 +02:00
56d64fbe30 [Docs] Propose a deprecation policy for the project (#17063) Russell Bryant 2025-04-29 13:29:44 -04:00
608968b7c5 Enabling multi-group kernel tests. (#17115) Alexei-V-Ivanov-AMD 2025-04-29 12:27:27 -05:00
06ffc7e1d3 [Misc][ROCm] Exclude cutlass_mla_decode for ROCm build (#17289) TY-AMD 2025-04-30 01:26:42 +08:00
d3cf61b89b fix gemma3 results all zero (#17364) Qiming Zhang 2025-04-29 09:40:25 -07:00
a39203f99e [Bugfix] add qwen3 reasoning-parser fix content is None when disable … (#17369) mofanke 2025-04-30 00:32:40 +08:00
24e6ad3f16 [V1] Remove num_input_tokens from attn_metadata (#17193) Chen Zhang 2025-04-30 00:28:41 +08:00
2ef5d106bb Improve literal dataclass field conversion to argparse argument (#17391) Harry Mellor 2025-04-29 17:25:08 +01:00
0ed27ef66c Fix: Spelling of inference (#17387) a2q1p 2025-04-30 00:23:39 +08:00
900edfa8d4 Transformers backend tweaks (#17365) Harry Mellor 2025-04-29 17:08:03 +01:00
88ad9ec6b2 [Frontend] Support chat_template_kwargs in LLM.chat (#17356) Cyrus Leung 2025-04-29 22:03:35 +08:00
40896bdf3f pre-commit autoupdate (#17380) Harry Mellor 2025-04-29 14:46:55 +01:00
00ee37efa2 [Bugfix] Clean up MiniMax-VL and fix processing (#17354) Cyrus Leung 2025-04-29 20:42:16 +08:00
890f104cdf [Doc] Fix QWen3MOE info (#17381) Jee Jee Li 2025-04-29 20:38:32 +08:00
4a5e13149a Update docs requirements (#17379) Harry Mellor 2025-04-29 12:35:47 +01:00
97cc8729f0 [Model] Ignore rotary embed load for Cohere model (#17319) Ekagra Ranjan 2025-04-29 03:30:40 -04:00
4464109219 [Build][Bugfix] Restrict setuptools version to <80 (#17320) Gregory Shtrasberg 2025-04-29 03:17:23 -04:00
193e78e35d [Fix] Documentation spacing in compilation config help text (#17342) Hyogeun Oh (오효근) 2025-04-29 16:16:17 +09:00
bdb2cddafc [Misc]Use a platform independent interface to obtain the device attributes (#17100) ponix-j 2025-04-29 14:59:13 +08:00
ebb3930d28 [Misc] Move config fields to MultiModalConfig (#17343) Cyrus Leung 2025-04-29 14:37:21 +08:00
cde384cd92 [Model] support MiniMax-VL-01 model (#16328) qscqesze 2025-04-29 12:05:50 +08:00
96e06e3cb7 [Misc] Add a Jinja template to support Mistral3 function calling (#17195) Chauncey 2025-04-29 10:53:44 +08:00
17eb306fcc [Bugfix] Add contiguous call inside rope kernel wrapper (#17091) Zhengyuan Su (苏政渊) 2025-04-29 10:24:07 +08:00
165cb56329 Ignore '<string>' filepath (#17330) Richard Zou 2025-04-28 22:23:29 -04:00
d6da8a8ff2 [Bugfix] Fix numel() downcast in fused_layernorm_dynamic_per_token_quant.cu (#17316) Richard Barnes 2025-04-28 19:23:18 -07:00
b4ac4fa04d [model] make llama4 compatible with pure dense layers (#17315) Lucia Fang 2025-04-28 19:22:22 -07:00
e136000595 [V1][Spec Decode] Make Eagle model arch config driven (#17323) Ekagra Ranjan 2025-04-28 22:22:02 -04:00
86d9fc29cb implement Structural Tag with Guidance backend (#17333) Michał Moskal 2025-04-28 19:21:32 -07:00
506475de5f [Optim] Compute multimodal hash only once per item (#17314) Cyrus Leung 2025-04-29 09:40:35 +08:00
cfe4532093 [Benchmark] Add single turn MTBench to Serving Bench (#17202) Ekagra Ranjan 2025-04-28 19:46:15 -04:00
ba41cc90e8 [Model] Add tuned triton fused_moe configs for Qwen3Moe (#17328) (tag: v0.8.5) Michael Goin 2025-04-28 16:20:24 -06:00
8fc88d63f1 [Model] Add tuned triton fused_moe configs for Qwen3Moe (#17328) Michael Goin 2025-04-28 16:20:24 -06:00
6e74fd4945 Support loading transformers models with named parameters (#16868) Alex Wu 2025-04-28 15:15:58 -07:00
dcbac4cb4b [Model] Qwen3 Dense FP8 Compat Fixes (#17318) Simon Mo 2025-04-28 14:12:01 -07:00
ed2462030f [Bugfix] Fix moe weight losing all extra attrs after process_weights_after_loading. (#16854) Charlie Fu 2025-04-28 16:05:07 -05:00
cc5befbced [BugFix] Fix cascade attention - RuntimeError: scheduler_metadata must have shape (metadata_size) (#17283) Lucas Wilkinson 2025-04-28 16:55:50 -04:00
2c89cd96a8 [Chore] cleanup license indicators in light of SPDX (#17259) Aaron Pham 2025-04-28 15:43:52 -04:00
a0304dc504 [Security] Don't bind tcp zmq socket to all interfaces (#17197) Russell Bryant 2025-04-28 13:08:20 -04:00
c7941cca18 Explicitly explain quant method override ordering and ensure all overrides are ordered (#17256) Harry Mellor 2025-04-28 17:55:31 +01:00
b6dd32aa07 Make name of compressed-tensors quant method consistent across vLLM (#17255) Harry Mellor 2025-04-28 17:28:13 +01:00
f94886946e Improve conversion from dataclass configs to argparse arguments (#17303) Harry Mellor 2025-04-28 17:22:12 +01:00
72dfe4c74f [Docs] Add a security guide (#17230) Russell Bryant 2025-04-28 11:12:17 -04:00
8b464d9660 [Misc] Clean up Qwen2.5-Omni code (#17301) Cyrus Leung 2025-04-28 21:20:45 +08:00
889ebb2638 [Misc] Minor typo/grammar in platforms/interface.py (#17307) Nicolò Lucchesi 2025-04-28 14:45:42 +02:00
3ad986c28b [doc] update wrong model id (#17287) Reid 2025-04-28 19:20:51 +08:00
344e193b7d [Bugfix] Add missing get_language_model to new MLLMs (#17300) Cyrus Leung 2025-04-28 19:09:57 +08:00
fb1c933ade Add missing class docstring for PromptAdapterConfig (#17302) Harry Mellor 2025-04-28 12:06:59 +01:00
72c5b97231 Update tpu_worker.py 's typo (#17288) idouba 2025-04-28 19:01:15 +08:00
fa93cd9f60 [Model] Add Granite Speech Support (#16246) Alex Brooks 2025-04-28 04:05:00 -06:00
aec9674dbe [Core] Remove legacy input mapper/processor from V0 (#15686) Cyrus Leung 2025-04-28 15:38:48 +08:00
7fcc4223dc [Minor][Models] Pass partial_rotary_factor parameter to rope (#17266) Wanrui Dai 2025-04-28 12:28:59 +08:00
8262a3e23b [Misc] Validate stop_token_ids contents (#17268) Nick Hill 2025-04-27 20:54:05 -07:00
f211331c48 [Doc] small fix (#17277) Reid 2025-04-28 11:53:35 +08:00
9053d0b134 [Doc] Fix wrong github link in LMCache examples (#17274) Kuntai Du 2025-04-27 20:09:11 -07:00
cb3f2d8d10 [Bugfix] Fix Mistral3 spatial merge error (#17270) Michael Goin 2025-04-27 20:40:05 -06:00
c12df53b60 [Bugfix] Fix cutlass dispatch for fp8/int8 to properly invoke M<=16 c… (#16751) TherLF 2025-04-28 10:38:42 +08:00
d1aeea7553 [Bugfix] Fix missing ARG in Dockerfile for arm64 platforms (#17261) Lennart K. M. Schulz 2025-04-28 04:38:14 +02:00
d8bccde686 [BugFix] Fix vllm_flash_attn install issues (#17267) Lucas Wilkinson 2025-04-27 20:27:56 -04:00
20e489eaa1 [V1][Spec Decode] Make eagle compatible with prefix caching. (#17137) Lily Liu 2025-04-27 09:29:43 -07:00
4213475ec7 [Metrics] Fix minor inconsistencies in bucket progression (#17262) Cyrus Leung 2025-04-28 00:19:39 +08:00
d92879baf6 [doc] Add feature status legend (#17257) Reid 2025-04-27 23:17:02 +08:00
690fe019f0 [Feature] support sequence parallelism using compilation pass (#16155) cascade 2025-04-27 06:29:35 -07:00
ed7a29d9f8 [NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032) Kaixi Hou 2025-04-27 06:29:21 -07:00
