181 Commits

Author SHA1 Message Date
omer-dayan
5e5630a478 [Bugfix] Path join when building local path for S3 clone (#12353)
Signed-off-by: Omer Dayan (SW-GPU) <omer@run.ai>
2025-01-24 11:06:07 +08:00
Cyrus Leung
cd7b6f0857 [VLM] Avoid unnecessary tokenization (#12310)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-22 11:08:31 +00:00
Cyrus Leung
b37d82791e [Model] Upgrade Aria to transformers 4.48 (#12203)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-20 17:58:48 +08:00
Cyrus Leung
630eb5b5ce [Bugfix] Fix multi-modal processors for transformers 4.48 (#12187)
2025-01-18 19:16:34 -08:00
Isotr0py
02798ecabe [Model] Port deepseek-vl2 processor, remove dependency (#12169)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-18 13:59:39 +08:00
Kunshang Ji
54cacf008f [Bugfix] Mistral tokenizer encode accept list of str (#12149)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-01-17 16:47:53 +00:00
Joe Runde
edce722eaa [Bugfix] use right truncation for non-generative tasks (#12050)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2025-01-16 00:31:01 +08:00
Alex Brooks
5340a30d01 Fix Max Token ID for Qwen-VL-Chat (#11980)
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-01-13 08:37:48 +00:00
Isotr0py
f967e51f38 [Model] Initialize support for Deepseek-VL2 models (#11578)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-12 00:17:24 -08:00
Maximilien de Bayser
1fe554bac3 treat do_lower_case in the same way as the sentence-transformers library (#11815)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2025-01-09 11:05:43 +08:00
Cyrus Leung
eed11ebee9 [VLM] Merged multi-modal processors for LLaVA-NeXT-Video and LLaVA-OneVision (#11717)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-04 11:40:53 +00:00
Aurick Qiao
e1a5c2f0a1 [Model] Whisper model implementation (#11280)
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
2025-01-03 16:39:19 +08:00
youkaichao
328841d002 [bugfix] interleaving sliding window for cohere2 model (#11583)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-28 16:55:42 +00:00
Cyrus Leung
101418096f [VLM] Support caching in merged multi-modal processor (#11396)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-27 17:22:48 +00:00
Cyrus Leung
eec906d811 [Misc] Add placeholder module (#11501)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-26 13:12:51 +00:00
omer-dayan
995f56236b [Core] Loading model from S3 using RunAI Model Streamer as optional loader (#10192)
Signed-off-by: OmerD <omer@run.ai>
2024-12-20 16:46:24 +00:00
Cyrus Leung
cdf22afdda [Misc] Clean up and consolidate LRUCache (#11339)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-20 00:59:32 +08:00
Wallas Henrique
8b79f9e107 [Bugfix] Fix guided decoding with tokenizer mode mistral (#11046)
2024-12-17 22:34:08 -08:00
Flávia Béo
250ee65d72 [BUG] Remove token param #10921 (#11022)
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
2024-12-10 17:38:15 +00:00
Xin Yang
01d079fd8e [LoRA] Change lora_tokenizers capacity (#10796)
Signed-off-by: Xin Yang <xyang19@gmail.com>
2024-12-04 17:40:16 +00:00
shunxing12345
1209261e93 [Model] Support telechat2 (#10311)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: xiangw2 <xiangw2@chinatelecom.cn>
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-11-27 11:32:35 +00:00
Shane A
9db713a1dc [Model] Add OLMo November 2024 model (#10503)
2024-11-25 17:26:40 -05:00
zhou fan
b1d920531f [Model]: Add support for Aria model (#10514)
Signed-off-by: xffxff <1247714429@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-11-25 18:10:55 +00:00
Maximilien de Bayser
214efc2c3c Support Cross encoder models (#10400)
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
Co-authored-by: Flavia Beo <flavia.beo@ibm.com>
2024-11-24 18:56:20 -08:00
Cyrus Leung
09dbf9ff16 [Bugfix] Handle conflicts between modern and legacy fields (#10471)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-20 14:45:08 +08:00
Guillaume Calmettes
691a3ec047 [Bugfix] Ensure special tokens are properly filtered out for guided structured output with MistralTokenizer (#10363)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
2024-11-15 14:50:40 +00:00
Patrick von Platen
11cd1ae6ad [Tool parsing] Improve / correct mistral tool parsing (#10333)
2024-11-15 00:42:49 +00:00
youkaichao
73b9083e99 [misc] improve cloudpickle registration and tests (#10202)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-11 00:10:53 +00:00
Krishna Mandal
b09895a618 [Frontend][Core] Override HF config.json via CLI (#5836)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-09 16:19:27 +00:00
Patrick von Platen
0535e5fe6c Fix edge case Mistral tokenizer (#10152)
2024-11-08 15:42:27 +00:00
Flávia Béo
aa9078fa03 Adds method to read the pooling types from model's files (#9506)
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
2024-11-07 08:42:40 +00:00
Cyrus Leung
db7db4aab9 [Misc] Consolidate ModelConfig code related to HF config (#10104)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-07 06:00:21 +00:00
Aaron Pham
21063c11c7 [CI/Build] drop support for Python 3.8 EOL (#8464)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2024-11-06 07:11:55 +00:00
Travis Johnson
2bcbae704c [Bugfix] Fix edge-case crash when using chat with the Mistral Tekken Tokenizer (#10051)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
2024-11-06 04:28:29 +00:00
shanshan wang
54597724f4 [Model] Add support for H2OVL-Mississippi models (#9747)
Signed-off-by: Shanshan Wang <shanshan.wang@h2o.ai>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-11-04 00:15:36 +00:00
Travis Johnson
1dd4cb2935 [Bugfix] Fix edge cases for MistralTokenizer (#9625)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-11-01 10:33:15 -07:00
Joe Runde
67bdf8e523 [Bugfix][Frontend] Guard against bad token ids (#9634)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-10-29 14:13:20 -07:00
tastelikefeet
08600ddc68 Fix the log to correct guide user to install modelscope (#9793)
Signed-off-by: yuze.zyz <yuze.zyz@alibaba-inc.com>
2024-10-29 10:36:59 -07:00
Aurick Qiao
23b899a8e6 [Bugfix] fix detokenizer shallow copy (#5919)
2024-10-22 15:38:12 -07:00
Woosuk Kwon
6c5af09b39 [V1] Implement vLLM V1 [1/N] (#9289)
2024-10-22 01:24:07 -07:00
Travis Johnson
b729901139 [Bugfix]: serialize config by value for --trust-remote-code (#6751)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-10-21 19:46:24 -07:00
sasha0552
337ed76671 [Bugfix] Fix offline mode when using mistral_common (#9457)
2024-10-18 18:12:32 -07:00
Michael Goin
3921a2f29e [Model] Support Pixtral models in the HF Transformers format (#9036)
2024-10-18 13:29:56 -06:00
Cyrus Leung
1bbbcc0b1d [CI/Build] Fix lint errors in mistral tokenizer (#9504)
2024-10-19 00:09:35 +08:00
sasha0552
5e443b594f [Bugfix] Allow prefill of assistant response when using mistral_common (#9446)
2024-10-17 15:06:37 +00:00
Cyrus Leung
7e7eae338d [Misc] Standardize RoPE handling for Qwen2-VL (#9250)
2024-10-16 13:56:17 +08:00
Prashant Gupta
d11b46f3a5 [bugfix] fix f-string for error (#9295)
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
2024-10-11 17:03:48 -07:00
sixgod
6cf1167c1a [Model] Add GLM-4v support and meet vllm==0.6.2 (#9242)
2024-10-11 17:36:13 +00:00
Cyrus Leung
151ef4efd2 [Model] Support NVLM-D and fix QK Norm in InternViT (#9045)
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2024-10-07 11:55:12 +00:00
Chen Zhang
cfadb9c687 [Bugfix] Deprecate registration of custom configs to huggingface (#9083)
2024-10-05 21:56:40 +08:00