Michard Hugo
|
25708d317a
|
[Bugfix] Mistral crashes on tool with no description (#21167)
Signed-off-by: HugoMichard <hugo@harfanglab.fr>
|
2025-07-28 08:03:35 -07:00 |
|
Cyrus Leung
|
0e18a5d058
|
[Misc] Reduce logs for model resolution (#21765)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-28 07:59:56 -07:00 |
|
Michael Goin
|
34a20c49b3
|
[Logs] Change flashinfer sampler logs to once (#21759)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-28 06:59:51 -07:00 |
|
Isotr0py
|
31084b3b1f
|
[Bugfix][CI/Build] Update peft version in test requirement (#21729)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-28 06:17:43 -07:00 |
|
wuhang
|
bccc43c033
|
[Bugfix]check health for engine core process exiting unexpectedly (#21728)
Signed-off-by: wuhang <wuhang6@huawei.com>
|
2025-07-28 06:17:31 -07:00 |
|
Harry Mellor
|
1395dd9c28
|
[Docs] Add revision date to rendered docs (#21752)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-28 06:12:46 -07:00 |
|
Keyang Ru
|
9ace2eaf35
|
[Bugfix] Improve JSON extraction in LlamaToolParser (#19024)
Signed-off-by: keru <keyang.ru@oracle.com>
Co-authored-by: keru <keyang.ru@oracle.com>
|
2025-07-28 12:36:58 +00:00 |
|
Anton Vlasjuk
|
656c24f1b5
|
[Ernie 4.5] Name Change for Base 0.3B Model (#21735)
Signed-off-by: vasqu <antonprogamer@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-28 12:22:32 +00:00 |
|
Chauncey
|
63fe3a700f
|
[PD] let p2p nccl toy proxy handle /chat/completions (#21734)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-07-28 11:45:50 +00:00 |
|
Isotr0py
|
0ae970ed15
|
[Bugfix] Fix glm4.1v video_grid_thw tensor shape scheme (#21744)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-28 04:26:49 -07:00 |
|
Li, Jiang
|
65e8466c37
|
[Bugfix] Fix environment variable setting in CPU Dockerfile (#21730)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-28 11:02:39 +00:00 |
|
Jee Jee Li
|
1b769dccf3
|
[Bugfix] Fix Ernie4_5_MoeForCausalLM shared experts (#21717)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-28 11:02:25 +00:00 |
|
rongfu.leng
|
2cc571199b
|
[feature] add log non default args in LLM (#21680)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-07-28 02:21:22 -07:00 |
|
Cyrus Leung
|
a4ed731546
|
[Model] Prioritize Transformers fallback over suffix matching (#21719)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-28 02:15:31 -07:00 |
|
Benji Beck
|
d128d0d554
|
Migrate KeyeImageInputs and KeyeVideoInputs to TensorSchema (#21686)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-28 01:16:35 -07:00 |
|
Asaf Joseph Gardin
|
a6c050286a
|
[v1][mamba] Added mamba_type into MambaSpec (#21715)
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
|
2025-07-28 08:15:55 +00:00 |
|
Lucas Wilkinson
|
139a7f07bd
|
[BugFix] Fix ChunkedLocalAttention when the hybrid kv-cache is disabled (#21707)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-07-28 07:18:47 +00:00 |
|
Ning Xie
|
150d9e6337
|
[Bugfix] fix max-file-size type from str to int (#21675)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-07-28 00:06:52 -07:00 |
|
Cyrus Leung
|
139a97ec56
|
[Bugfix] Fix shape checking for Fuyu (#21709)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-28 00:05:56 -07:00 |
|
rongfu.leng
|
18cc33dd60
|
[bugfix] fix profile impact benchmark results (#21507)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-07-27 22:44:24 -07:00 |
|
Hongsheng Liu
|
7656cf4cf3
|
[Bugfix] [issue-21565] Fix the incompatibility issue with stream and named function calling when Thinking is disabled (#21573)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
|
2025-07-27 22:43:50 -07:00 |
|
Benji Beck
|
3ea57a56d9
|
Migrate Idefics3ImagePixelInputs and Idefics3ImageEmbeddingInputs to … (#21683)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-27 22:37:23 -07:00 |
|
Benji Beck
|
75856bc2cb
|
Migrate GraniteSpeechAudioInputs to TensorSchema (#21682)
Signed-off-by: Benji Beck <benjibeck@meta.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-07-27 22:37:20 -07:00 |
|
Benji Beck
|
304dcdf575
|
Migrate GLMVImagePixelInputs to TensorSchema (#21679)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-27 22:36:11 -07:00 |
|
Benji Beck
|
88e46c7c8d
|
Migrate Glm4vImageInputs, Glm4vVideoInputs to TensorSchema (#21678)
Signed-off-by: Benji Beck <benjibeck@meta.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-07-27 22:36:08 -07:00 |
|
Benji Beck
|
d8937de4c8
|
Migrate Gemma3ImagePixelInputs to TensorSchema (#21676)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-27 22:36:05 -07:00 |
|
TJian
|
e626d286f5
|
[FEAT] [ROCm] [AITER]: Add AITER HIP block quant kernel (#21242)
|
2025-07-28 05:07:06 +00:00 |
|
Shinichi Hemmi
|
c7ffe93d9c
|
[Model] Support TP/PP/mamba2 kernel for PLaMo2 (#19674)
Signed-off-by: Shinichi Hemmi <shemmi@preferred.jp>
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
Co-authored-by: Calvin Metzger <metzger@preferred.jp>
Co-authored-by: Sixue Wang <cecilwang@preferred.jp>
|
2025-07-28 05:00:47 +00:00 |
|
Adeline
|
15a72ac478
|
[V1] Exception Handling when Loading KV Cache from Remote Store (#21534)
Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
|
2025-07-27 20:34:17 -07:00 |
|
Jee Jee Li
|
04ff4be310
|
[Misc] Add fused_moe configs for Qwen3-Coder-480B-A35B-Instruct-FP8 (#21700)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-27 20:12:18 -07:00 |
|
Yuxuan Zhang
|
93269bb43e
|
Fix GLM tool parser (#21668)
Co-authored-by: Chenhui Zhang <zhang.chenhui@outlook.com>
|
2025-07-28 10:46:38 +08:00 |
|
Joachim Studnia
|
82acf2184d
|
Fix typo for limit-mm-per-prompt in docs (#21697)
Signed-off-by: Joachim Studnia <joachim@mistral.ai>
|
2025-07-27 19:45:37 -07:00 |
|
Cyrus Leung
|
86ae693f20
|
[Deprecation][2/N] Replace --task with --runner and --convert (#21470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-27 19:42:40 -07:00 |
|
Alexander Matveev
|
8f605ee309
|
[Attention] Make CutlassMLA the default backend for SM100 (blackwell) (#21626)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-07-27 20:13:00 +00:00 |
|
Ning Xie
|
a9b2a1d704
|
[Misc] Refactor vllm config str (#21666)
|
2025-07-27 09:51:44 -07:00 |
|
Caleb_Du
|
57c22e57f9
|
Fix CUDA permute/unpermute for use with DeepGemm Moe (#17934)
Signed-off-by: Caleb_Du <Caleb_Du@zju.edu.cn>
|
2025-07-27 07:08:00 -07:00 |
|
Wentao Ye
|
bda9d0535f
|
[Refactor] Refactor MOE NVFP4 Code Base: ModelOpt + Compressed Tensor (#21631)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-07-27 05:25:21 -07:00 |
|
Isotr0py
|
3d847a3125
|
[VLM] Add video support for Intern-S1 (#21671)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-27 11:49:43 +00:00 |
|
Benji Beck
|
5f8c9a425e
|
Migrate Florence2ImagePixelInputs to TensorSchema (#21663)
Signed-off-by: Benji Beck <benjibeck@meta.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-07-27 02:43:02 -07:00 |
|
Ning Xie
|
1cbf951ba2
|
[Misc] add default value for file pattern arg (#21659)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-07-27 05:14:51 +00:00 |
|
ZiTian.Zhao
|
a8936e5193
|
Refactor: Remove numpy dependency from LoggingStatLogger (#20529)
Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>
|
2025-07-27 04:06:21 +00:00 |
|
Ye (Charlotte) Qi
|
01a395e9e7
|
[CI/Build][Doc] Clean up more docs that point to old bench scripts (#21667)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-07-27 04:02:12 +00:00 |
|
Huy Do
|
971948b846
|
Handle non-serializable objects in vllm bench (#21665)
|
2025-07-27 03:35:22 +00:00 |
|
Isotr0py
|
eed2f463b2
|
[VLM] Support HF format Phi-4-MM model (#17121)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-07-26 20:07:57 -07:00 |
|
Benji Beck
|
20950b29fb
|
Migrate ChameleonImagePixelInputs to TensorSchema (#21657)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-26 19:34:25 -07:00 |
|
Benji Beck
|
3339cba3ff
|
Migrate FuyuImagePatchInputs to TensorSchema (#21662)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-26 19:34:14 -07:00 |
|
Benji Beck
|
0b8caf9095
|
Migrate DeepseekVL2ImageInputs to TensorSchema (#21658)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-26 19:34:11 -07:00 |
|
Benji Beck
|
ccf27cc4d4
|
Migrate Blip2ImagePixelInputs and Blip2ImageEmbeddingInputs to TensorSchema (#21656)
Signed-off-by: Benji Beck <benjibeck@meta.com>
|
2025-07-27 10:33:52 +08:00 |
|
Jinzhen Lin
|
c657369841
|
support torch.compile for bailing moe (#21664)
|
2025-07-26 23:54:32 +00:00 |
|
Wenchen Lo
|
6c66f28fa5
|
Remove xformers requirement for Mistral-format Pixtral and Mistral3 (#21154)
Signed-off-by: Wenchen Lo <charles761013@gmail.com>
|
2025-07-26 17:20:29 -06:00 |
|