zifeitong
|
e3dd0692fa
|
[BugFix] Propagate 'trust_remote_code' setting in internvl and minicpmv (#8250)
|
2024-09-25 05:53:43 +00:00 |
|
Alex Brooks
|
8ff7ced996
|
[Model] Expose Phi3v num_crops as a mm_processor_kwarg (#8658)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-24 07:36:46 +00:00 |
|
Peter Salas
|
3f06bae907
|
[Core][Model] Support loading weights by ID within models (#7931)
|
2024-09-24 07:14:15 +00:00 |
|
Jani Monoses
|
f2bd246c17
|
[VLM] Fix paligemma, fuyu and persimmon with transformers 4.45 : use config.text_config.vocab_size (#8707)
|
2024-09-23 14:43:09 +00:00 |
|
Yanyi Liu
|
a79e522984
|
[Model] Support pp for qwen2-vl (#8696)
|
2024-09-23 13:46:59 +00:00 |
|
litianjian
|
5b59532760
|
[Model][VLM] Add LLaVA-Onevision model support (#8486)
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-22 10:51:44 -07:00 |
|
Cyrus Leung
|
06ed2815e2
|
[Model] Refactor BLIP/BLIP-2 to support composite model loading (#8407)
|
2024-09-22 12:24:21 +00:00 |
|
Isotr0py
|
13d88d4137
|
[Bugfix] Refactor composite weight loading logic (#8656)
|
2024-09-22 04:33:27 +00:00 |
|
Divakar Verma
|
9dc7c6c7f3
|
[dbrx] refactor dbrx experts to extend FusedMoe class (#8518)
|
2024-09-21 15:09:39 -06:00 |
|
Cyrus Leung
|
5e85f4f82a
|
[VLM] Use SequenceData.from_token_counts to create dummy data (#8687)
|
2024-09-20 23:28:56 -07:00 |
|
zyddnys
|
0f961b3ce9
|
[Bugfix] Fix incorrect llava next feature size calculation (#8496)
|
2024-09-20 22:48:32 +00:00 |
|
Niklas Muennighoff
|
3b63de9353
|
[Model] Add OLMoE (#7922)
|
2024-09-20 09:31:41 -07:00 |
|
Amit Garg
|
18ae428a0d
|
[Bugfix] Fix Phi3.5 mini and MoE LoRA inference (#8571)
|
2024-09-20 08:54:02 +08:00 |
|
Geun, Lim
|
e18749ff09
|
[Model] Support Solar Model (#8386)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2024-09-18 11:04:00 -06:00 |
|
Aaron Pham
|
9d104b5beb
|
[CI/Build] Update Ruff version (#8469)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-09-18 11:00:56 +00:00 |
|
Cyrus Leung
|
6ffa3f314c
|
[CI/Build] Avoid CUDA initialization (#8534)
|
2024-09-18 10:38:11 +00:00 |
|
Joe Runde
|
98f9713399
|
[Bugfix] Fix TP > 1 for new granite (#8544)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-09-17 23:17:08 +00:00 |
|
sroy745
|
1009e93c5d
|
[Encoder decoder] Add cuda graph support during decoding for encoder-decoder models (#7631)
|
2024-09-17 07:35:01 -07:00 |
|
Chris
|
3724d5f6b5
|
[Bugfix][Model] Fix Python 3.8 compatibility in Pixtral model by updating type annotations (#8490)
|
2024-09-15 04:20:05 +00:00 |
|
ywfang
|
8a0cf1ddc3
|
[Model] support minicpm3 (#8297)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-14 14:50:26 +00:00 |
|
Jee Jee Li
|
06311e2956
|
[Misc] Skip loading extra bias for Qwen2-VL GPTQ-Int8 (#8442)
|
2024-09-13 07:58:28 +00:00 |
|
Wenxiang
|
a480939e8e
|
[Bugfix] Fix weight loading issue by rename variable. (#8293)
|
2024-09-12 19:25:00 -04:00 |
|
Patrick von Platen
|
d31174a4e1
|
[Hotfix][Pixtral] Fix multiple images bugs (#8415)
|
2024-09-12 15:21:51 -07:00 |
|
Roger Wang
|
c16369455f
|
[Hotfix][Core][VLM] Disable chunked prefill by default and prefix caching for multimodal models (#8425)
|
2024-09-12 14:06:51 -07:00 |
|
Alex Brooks
|
c6202daeed
|
[Model] Support multiple images for qwen-vl (#8247)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-12 10:10:54 -07:00 |
|
Isotr0py
|
e56bf27741
|
[Bugfix] Fix InternVL2 inference with various num_patches (#8375)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-12 10:10:35 -07:00 |
|
Blueyo0
|
1bf2dd9df0
|
[Gemma2] add bitsandbytes support for Gemma2 (#8338)
|
2024-09-11 21:53:12 -07:00 |
|
Patrick von Platen
|
d394787e52
|
Pixtral (#8377)
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2024-09-11 14:41:55 -07:00 |
|
bnellnm
|
73202dbe77
|
[Kernel][Misc] register ops to prevent graph breaks (#6917)
Co-authored-by: Sage Moore <sage@neuralmagic.com>
|
2024-09-11 12:52:19 -07:00 |
|
Yang Fan
|
3b7fea770f
|
[Model][VLM] Add Qwen2-VL model support (#7905)
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-11 09:31:19 -07:00 |
|
Yangshen⚡Deng
|
6a512a00df
|
[model] Support for Llava-Next-Video model (#7559)
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-09-10 22:21:36 -07:00 |
|
Isotr0py
|
1230263e16
|
[Bugfix] Fix InternVL2 vision embeddings process with pipeline parallel (#8299)
|
2024-09-11 10:11:01 +08:00 |
|
Jee Jee Li
|
e497b8aeff
|
[Misc] Skip loading extra bias for Qwen2-MOE GPTQ models (#8329)
|
2024-09-10 20:59:19 -04:00 |
|
Cyrus Leung
|
da1a844e61
|
[Bugfix] Fix missing post_layernorm in CLIP (#8155)
|
2024-09-10 08:22:50 +00:00 |
|
Dipika Sikka
|
6cd5e5b07e
|
[Misc] Fused MoE Marlin support for GPTQ (#8217)
|
2024-09-09 23:02:52 -04:00 |
|
Vladislav Kruglikov
|
f9b4a2d415
|
[Bugfix] Correct adapter usage for cohere and jamba (#8292)
|
2024-09-09 11:20:46 -07:00 |
|
Isotr0py
|
36bf8150cc
|
[Model][VLM] Decouple weight loading logic for Paligemma (#8269)
|
2024-09-07 17:45:44 +00:00 |
|
Isotr0py
|
e807125936
|
[Model][VLM] Support multi-images inputs for InternVL2 models (#8201)
|
2024-09-07 16:38:23 +08:00 |
|
Cyrus Leung
|
2f707fcb35
|
[Model] Multi-input support for LLaVA (#8238)
|
2024-09-07 02:57:24 +00:00 |
|
Patrick von Platen
|
29f49cd6e3
|
[Model] Allow loading from original Mistral format (#8168)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2024-09-06 17:02:05 -06:00 |
|
Alex Brooks
|
9da25a88aa
|
[MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) (#8029)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-05 12:48:10 +00:00 |
|
manikandan.tm@zucisystems.com
|
8685ba1a1e
|
Inclusion of InternVLChatModel In PP_SUPPORTED_MODELS(Pipeline Parallelism) (#7860)
|
2024-09-05 11:33:37 +00:00 |
|
wnma
|
d3311562fb
|
[Bugfix] remove post_layernorm in siglip (#8106)
|
2024-09-04 18:55:37 +08:00 |
|
Peter Salas
|
2be8ec6e71
|
[Model] Add Ultravox support for multiple audio chunks (#7963)
|
2024-09-04 04:38:21 +00:00 |
|
Isotr0py
|
ec266536b7
|
[Bugfix][VLM] Add fallback to SDPA for ViT model running on CPU backend (#8061)
|
2024-09-03 21:37:52 +08:00 |
|
Isotr0py
|
dd2a6a82e3
|
[Bugfix] Fix internlm2 tensor parallel inference (#8055)
|
2024-09-02 23:48:56 +08:00 |
|
Shawn Tan
|
f8d60145b4
|
[Model] Add Granite model (#7436)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2024-09-01 18:37:18 -07:00 |
|
Roger Wang
|
5b86b19954
|
[Misc] Optional installation of audio related packages (#8063)
|
2024-09-01 14:46:57 -07:00 |
|
Cyrus Leung
|
d05f0a9db2
|
[Bugfix] Fix import error in Phi-3.5-MoE (#8052)
|
2024-08-30 22:26:55 -07:00 |
|
Wenxiang
|
1248e8506a
|
[Model] Adding support for MSFT Phi-3.5-MoE (#7729)
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Zeqi Lin <zelin@microsoft.com>
Co-authored-by: Zeqi Lin <Zeqi.Lin@microsoft.com>
|
2024-08-30 13:42:57 -06:00 |
|