Cyrus Leung
|
8d9b6721e7
|
[VLM] Abstract out multi-modal data parsing in merged processor (#11620)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-30 15:01:35 +00:00 |
|
youkaichao
|
b12e87f942
|
[platforms] enable platform plugins (#11602)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-30 20:24:45 +08:00 |
|
youkaichao
|
328841d002
|
[bugfix] interleaving sliding window for cohere2 model (#11583)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-28 16:55:42 +00:00 |
|
Roger Wang
|
b7dcc003dc
|
[Model] Remove hardcoded image tokens ids from Pixtral (#11582)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-12-28 10:54:23 +00:00 |
|
Isotr0py
|
d34be24bb1
|
[Model] Support InternLM2 Reward models (#11571)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-12-28 06:14:10 +00:00 |
|
Jee Jee Li
|
0240402c46
|
[Misc]Add BNB quantization for MolmoForCausalLM (#11551)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-12-27 18:48:24 +00:00 |
|
ErezSC42
|
55509c2114
|
[MODEL] LoRA support for Jamba model (#11209)
Signed-off-by: Erez Schwartz <erezs@ai21.com>
|
2024-12-27 17:58:21 +00:00 |
|
Cyrus Leung
|
101418096f
|
[VLM] Support caching in merged multi-modal processor (#11396)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-27 17:22:48 +00:00 |
|
Jee Jee Li
|
2c9b8ea2b0
|
[Bugfix] Fix TeleChat2ForCausalLM weights mapper (#11546)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-12-27 10:39:15 +00:00 |
|
Mengqing Cao
|
6c6f7fe8a8
|
[Platform] Move model arch check to platform (#11503)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2024-12-27 08:45:25 +00:00 |
|
Simon Mo
|
f49777ba62
|
Deepseek v3 (#11502)
Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: robertgshaw2-neuralmagic <rshaw@neuralmagic.com>
|
2024-12-26 16:09:44 -08:00 |
|
Jee Jee Li
|
f57ee5650d
|
[Model] Modify MolmoForCausalLM MLP (#11510)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-12-26 13:12:05 +00:00 |
|
Cyrus Leung
|
3f3e92e1f2
|
[Model] Automatic conversion of classification and reward models (#11469)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-24 18:22:22 +00:00 |
|
Jee Jee Li
|
196c34b0ac
|
[Misc] Move weights mapper (#11443)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-12-24 13:05:25 +00:00 |
|
Jee Jee Li
|
b1b1038fbd
|
[Bugfix] Fix Qwen2-VL LoRA weight loading (#11430)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-12-24 09:56:10 +00:00 |
|
Michael Goin
|
60fb4f3bcf
|
[Bugfix] Add kv cache scales to gemma2.py (#11269)
|
2024-12-23 19:30:45 +00:00 |
|
Roger Wang
|
c2d1b075ba
|
[Bugfix] Fix issues for Pixtral-Large-Instruct-2411 (#11393)
Signed-off-by: ywang96 <ywang@example.com>
Co-authored-by: ywang96 <ywang@example.com>
|
2024-12-21 10:15:03 +00:00 |
|
Isotr0py
|
e24113a8fe
|
[Model] Refactor Qwen2-VL to use merged multimodal processor (#11258)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-19 16:28:00 +00:00 |
|
Roger Wang
|
7379b3d4b2
|
[V1] Fix multimodal profiling for Molmo (#11325)
Signed-off-by: ywang96 <ywang@example.com>
Co-authored-by: ywang96 <ywang@example.com>
|
2024-12-19 16:27:22 +00:00 |
|
Yehoshua Cohen
|
6c7f881541
|
[Model] Add JambaForSequenceClassification model (#10860)
Signed-off-by: Yehoshua Cohen <yehoshuaco@ai21.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Yehoshua Cohen <yehoshuaco@ai21.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-19 22:48:06 +08:00 |
|
Cyrus Leung
|
a0f7d53beb
|
[Bugfix] Cleanup Pixtral HF code (#11333)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-19 13:22:00 +00:00 |
|
Cyrus Leung
|
6142ef0ada
|
[VLM] Merged multimodal processor for Qwen2-Audio (#11303)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-19 06:14:17 +00:00 |
|
Isotr0py
|
996aa70f00
|
[Bugfix] Fix broken phi3-v mm_processor_kwargs tests (#11263)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-12-18 10:16:40 -08:00 |
|
Roger Wang
|
59c9b6ebeb
|
[V1][VLM] Proper memory profiling for image language models (#11210)
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: ywang96 <ywang@example.com>
|
2024-12-16 22:10:57 -08:00 |
|
Isotr0py
|
d927dbcd88
|
[Model] Refactor Ultravox to use merged input processor (#11198)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-12-16 10:09:53 +00:00 |
|
Jani Monoses
|
bddbbcb132
|
[Model] Support Cohere2ForCausalLM (Cohere R7B) (#11203)
|
2024-12-16 09:56:19 +00:00 |
|
Cyrus Leung
|
96d673e0f8
|
[Bugfix] Fix error handling of unsupported sliding window (#11213)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-15 10:59:42 -07:00 |
|
Cyrus Leung
|
93abf23a64
|
[VLM] Fully dynamic prompt replacement in merged input processor (#11199)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-14 17:52:18 +00:00 |
|
Roger Wang
|
969da7d70b
|
[V1][VLM] Fix edge case bug for InternVL2 (#11165)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-12-13 11:09:30 +00:00 |
|
Cyrus Leung
|
eeec9e3390
|
[Frontend] Separate pooling APIs in offline inference (#11129)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-13 10:40:07 +00:00 |
|
Jani Monoses
|
7cd7409142
|
PaliGemma 2 support (#11142)
|
2024-12-13 07:40:07 +00:00 |
|
youkaichao
|
be39e3cd18
|
[core] clean up cudagraph batchsize padding logic (#10996)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-13 06:57:50 +00:00 |
|
Dipika Sikka
|
3989a79824
|
[Bugfix] Update starcoder2 to remap k/v scale names for kv_cache quantization (#11148)
|
2024-12-13 05:07:20 +00:00 |
|
Pooya Davoodi
|
1efce68605
|
[Bugfix] Use runner_type instead of task in GritLM (#11144)
Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>
|
2024-12-13 04:09:53 +00:00 |
|
Jeff Cook
|
5d712571af
|
[Bugfix] Quick fix to make Pixtral-HF load correctly again after 39e227c7ae. (#11024)
|
2024-12-12 18:09:20 +00:00 |
|
Pooya Davoodi
|
1da8f0e1dd
|
[Model] Add support for embedding model GritLM (#10816)
Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>
|
2024-12-12 06:39:16 +00:00 |
|
B-201
|
2e32f5d28d
|
[Bugfix] Fix Idefics3 fails during multi-image inference (#11080)
Signed-off-by: B-201 <Joy25810@foxmail.com>
|
2024-12-11 01:27:07 -08:00 |
|
Mor Zusman
|
ffa48c9146
|
[Model] PP support for Mamba-like models (#10992)
Signed-off-by: mzusman <mor.zusmann@gmail.com>
|
2024-12-10 21:53:37 -05:00 |
|
Patrick von Platen
|
bc192a2b09
|
[Pixtral] Improve loading (#11040)
|
2024-12-10 06:09:32 +00:00 |
|
Isotr0py
|
d1f6d1c8af
|
[Model] Add has_weight to RMSNorm and re-enable weights loading tracker for Mamba (#10739)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-12-10 10:23:07 +08:00 |
|
Isotr0py
|
a811dd6608
|
[Model] merged input processor for Phi-3-Vision models (#10977)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-12-09 12:55:10 -08:00 |
|
Roger Wang
|
a11f326528
|
[V1] Initial support of multimodal models for V1 re-arch (#10699)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-12-08 12:50:51 +00:00 |
|
Cyrus Leung
|
c889d5888b
|
[Doc] Explicitly state that PP isn't compatible with speculative decoding yet (#10975)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-07 17:20:49 +00:00 |
|
Cyrus Leung
|
39e227c7ae
|
[Model] Update multi-modal processor to support Mantis(LLaVA) model (#10711)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-07 17:10:05 +00:00 |
|
Cyrus Leung
|
bf0e382e16
|
[Model] Composite weight loading for multimodal Qwen2 (#10944)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-07 07:22:52 -07:00 |
|
Cyrus Leung
|
955fa9533a
|
[3/N] Support and implement merged input processor for LLaVA model (#10676)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2024-12-07 00:50:58 -08:00 |
|
Isotr0py
|
10398b4706
|
[Model] Consolidate ViTs attention implementation without mask (#10893)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-12-04 18:11:08 +00:00 |
|
Cyrus Leung
|
3257d449fa
|
[Misc] Remove deprecated names (#10817)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-03 06:52:57 +00:00 |
|
youkaichao
|
dc5ce861bf
|
[torch.compile] remove compilation_context and simplify code (#10838)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-03 06:19:02 +00:00 |
|
zixuanzhang226
|
d746268e92
|
[Model] support bitsandbytes quantization with minicpm model (#10842)
Signed-off-by: Ubuntu <zixuanzhang@bytedance.com>
|
2024-12-03 03:06:41 +00:00 |
|