sroy745
|
a78dd3303e
|
[Encoder Decoder] Add flash_attn kernel support for encoder-decoder models (#9559)
|
2024-11-01 23:22:49 -07:00 |
|
Peter Salas
|
6c0b7f548d
|
[Core][VLM] Add precise multi-modal placeholder tracking (#8346)
Signed-off-by: Peter Salas <peter@fixie.ai>
|
2024-11-01 16:21:10 -07:00 |
|
Pavani Majety
|
598b6d7b07
|
[Bugfix/Core] Flashinfer k_scale and v_scale (#9861)
|
2024-11-01 12:15:05 -07:00 |
|
Travis Johnson
|
1dd4cb2935
|
[Bugfix] Fix edge cases for MistralTokenizer (#9625)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
|
2024-11-01 10:33:15 -07:00 |
|
Cyrus Leung
|
ba0d892074
|
[Frontend] Use a proper chat template for VLM2Vec (#9912)
|
2024-11-01 14:09:07 +00:00 |
|
Michael Goin
|
30a2e80742
|
[CI/Build] Add Model Tests for PixtralHF (#9813)
|
2024-11-01 07:55:29 -06:00 |
|
Cyrus Leung
|
06386a64dd
|
[Frontend] Chat-based Embeddings API (#9759)
|
2024-11-01 08:13:35 +00:00 |
|
Yongzao
|
2b5bf20988
|
[torch.compile] Adding torch compile annotations to some models (#9876)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-11-01 00:25:47 -07:00 |
|
youkaichao
|
566cd27797
|
[torch.compile] rework test plans (#9866)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-31 22:20:17 -07:00 |
|
youkaichao
|
96e0c9cbbd
|
[torch.compile] directly register custom op (#9896)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-31 21:56:09 -07:00 |
|
Joe Runde
|
031a7995f3
|
[Bugfix][Frontend] Reject guided decoding in multistep mode (#9892)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-11-01 01:09:46 +00:00 |
|
Mor Zusman
|
9fb12f7848
|
[BugFix][Kernel] Fix Illegal memory access in causal_conv1d in H100 (#9838)
Signed-off-by: mzusman <mor.zusmann@gmail.com>
|
2024-10-31 20:06:25 +00:00 |
|
sasha0552
|
55650c83a0
|
[Bugfix] Fix illegal memory access error with chunked prefill, prefix caching, block manager v2 and xformers enabled together (#9532)
Signed-off-by: sasha0552 <admin@sasha0552.org>
|
2024-10-31 11:46:36 -07:00 |
|
Alex Brooks
|
16b8f7a86f
|
[CI/Build] Add Model Tests for Qwen2-VL (#9846)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-10-31 09:10:52 -07:00 |
|
Guillaume Calmettes
|
abbfb6134d
|
[Misc][OpenAI] deprecate max_tokens in favor of new max_completion_tokens field for chat completion endpoint (#9837)
|
2024-10-30 18:15:56 -07:00 |
|
youkaichao
|
64384bbcdf
|
[torch.compile] upgrade tests (#9858)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-30 16:34:22 -07:00 |
|
Yongzao
|
00d91c8a2c
|
[CI/Build] Simplify exception trace in api server tests (#9787)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-30 14:52:05 -07:00 |
|
Joe Runde
|
3b3f1e7436
|
[Bugfix][core] replace heartbeat with pid check (#9818)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-10-30 09:34:07 -07:00 |
|
Elfie Guo
|
9ff4511e43
|
[Misc] Add chunked-prefill support on FlashInfer. (#9781)
|
2024-10-30 09:33:53 -07:00 |
|
Alex Brooks
|
cc98f1e079
|
[CI/Build] VLM Test Consolidation (#9372)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2024-10-30 09:32:17 -07:00 |
|
youkaichao
|
ff5ed6e1bc
|
[torch.compile] rework compile control with piecewise cudagraph (#9715)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-29 23:03:49 -07:00 |
|
Will Eaton
|
882a1ad0de
|
[Model] tool calling support for ibm-granite/granite-20b-functioncalling (#8339)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com>
|
2024-10-29 15:07:37 -07:00 |
|
Joe Runde
|
67bdf8e523
|
[Bugfix][Frontend] Guard against bad token ids (#9634)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-10-29 14:13:20 -07:00 |
|
Michael Goin
|
ab6f981671
|
[CI][Bugfix] Skip chameleon for transformers 4.46.1 (#9808)
|
2024-10-29 11:12:43 -07:00 |
|
wangshuai09
|
622b7ab955
|
[Hardware] using current_platform.seed_everything (#9785)
Signed-off-by: wangshuai09 <391746016@qq.com>
|
2024-10-29 14:47:44 +00:00 |
|
Zhong Qishuai
|
ef7865b4f9
|
[Frontend] re-enable multi-modality input in the new beam search implementation (#9427)
Signed-off-by: Qishuai <Ferdinandzhong@gmail.com>
|
2024-10-29 11:49:47 +00:00 |
|
litianjian
|
5f8d8075f9
|
[Model][VLM] Add multi-video support for LLaVA-Onevision (#8905)
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-10-28 18:04:10 +00:00 |
|
youkaichao
|
32176fee73
|
[torch.compile] support moe models (#9632)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-27 21:58:04 -07:00 |
|
wangshuai09
|
4e2d95e372
|
[Hardware][ROCM] using current_platform.is_rocm (#9642)
Signed-off-by: wangshuai09 <391746016@qq.com>
|
2024-10-28 04:07:00 +00:00 |
|
madt2709
|
34a9941620
|
[Bugfix] Fix load config when using bools (#9533)
|
2024-10-27 13:46:41 -04:00 |
|
bnellnm
|
3cb07a36a2
|
[Misc] Upgrade to pytorch 2.5 (#9588)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-27 09:44:24 +00:00 |
|
kakao-kevin-us
|
6650e6a930
|
[Model] Add classification Task with Qwen2ForSequenceClassification (#9704)
Signed-off-by: Kevin-Yang <ykcha9@gmail.com>
Co-authored-by: Kevin-Yang <ykcha9@gmail.com>
|
2024-10-26 17:53:35 +00:00 |
|
Vasiliy Alekseev
|
07e981fdf4
|
[Frontend] Bad words sampling parameter (#9717)
Signed-off-by: Vasily Alexeev <alvasian@yandex.ru>
|
2024-10-26 16:29:38 +00:00 |
|
Mengqing Cao
|
5cbdccd151
|
[Hardware][openvino] is_openvino --> current_platform.is_openvino (#9716)
|
2024-10-26 10:59:06 +00:00 |
|
Kevin H. Luu
|
9f7b4ba865
|
[ci/Build] Skip Chameleon for transformers 4.46.0 on broadcast test #9675 (#9676)
|
2024-10-24 20:59:00 -07:00 |
|
Charlie Fu
|
59449095ab
|
[Performance][Kernel] Fused_moe Performance Improvement (#9384)
Signed-off-by: charlifu <charlifu@amd.com>
|
2024-10-24 15:37:52 -07:00 |
|
Alex Brooks
|
722d46edb9
|
[Model] Compute Llava Next Max Tokens / Dummy Data From Gridpoints (#9650)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2024-10-24 10:42:24 -07:00 |
|
Cyrus Leung
|
c866e0079d
|
[CI/Build] Fix VLM test failures when using transformers v4.46 (#9666)
|
2024-10-25 01:40:40 +08:00 |
|
Yongzao
|
d27cfbf791
|
[torch.compile] Adding torch compile annotations to some models (#9641)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-24 09:31:42 -07:00 |
|
Jee Jee Li
|
295a061fb3
|
[Kernel] add kernel for FATReLU (#9610)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-10-24 16:18:27 +08:00 |
|
Yongzao
|
8a02cd045a
|
[torch.compile] Adding torch compile annotations to some models (#9639)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-24 00:54:57 -07:00 |
|
youkaichao
|
4fdc581f9e
|
[core] simplify seq group code (#9569)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2024-10-24 00:16:44 -07:00 |
|
Cyrus Leung
|
836e8ef6ee
|
[Bugfix] Fix PP for ChatGLM and Molmo (#9422)
|
2024-10-24 06:12:05 +00:00 |
|
Vinay R Damodaran
|
33bab41060
|
[Bugfix]: Make chat content text allow type content (#9358)
Signed-off-by: Vinay Damodaran <vrdn@hey.com>
|
2024-10-24 05:05:49 +00:00 |
|
Yunfei Chu
|
fc6c274626
|
[Model] Add Qwen2-Audio model support (#9248)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-10-23 17:54:22 +00:00 |
|
Alex Brooks
|
150b779081
|
[Frontend] Enable Online Multi-image Support for MLlama (#9393)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-10-23 17:28:57 +00:00 |
|
Alex Brooks
|
31a08f5bd2
|
[Model] Add min_pixels / max_pixels to Qwen2VL as mm_processor_kwargs (#9612)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2024-10-23 14:05:18 +00:00 |
|
Isotr0py
|
3ff57ebfca
|
[Model] Initialize Florence-2 language backbone support (#9555)
|
2024-10-23 10:42:47 +00:00 |
|
Cyrus Leung
|
831540cf04
|
[Model] Support E5-V (#9576)
|
2024-10-23 11:35:29 +08:00 |
|
yulei
|
b17046e298
|
[BugFix] Fix metrics error for --num-scheduler-steps > 1 (#8234)
|
2024-10-22 15:43:03 -07:00 |
|