Author | Commit | Message | Date
Michael Goin | ab6f981671 | [CI][Bugfix] Skip chameleon for transformers 4.46.1 (#9808) | 2024-10-29 11:12:43 -07:00
wangshuai09 | 622b7ab955 | [Hardware] using current_platform.seed_everything (#9785) | 2024-10-29 14:47:44 +00:00
    Signed-off-by: wangshuai09 <391746016@qq.com>
Zhong Qishuai | ef7865b4f9 | [Frontend] re-enable multi-modality input in the new beam search implementation (#9427) | 2024-10-29 11:49:47 +00:00
    Signed-off-by: Qishuai <Ferdinandzhong@gmail.com>
litianjian | 5f8d8075f9 | [Model][VLM] Add multi-video support for LLaVA-Onevision (#8905) | 2024-10-28 18:04:10 +00:00
    Co-authored-by: litianjian <litianjian@bytedance.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
youkaichao | 32176fee73 | [torch.compile] support moe models (#9632) | 2024-10-27 21:58:04 -07:00
    Signed-off-by: youkaichao <youkaichao@gmail.com>
wangshuai09 | 4e2d95e372 | [Hardware][ROCM] using current_platform.is_rocm (#9642) | 2024-10-28 04:07:00 +00:00
    Signed-off-by: wangshuai09 <391746016@qq.com>
madt2709 | 34a9941620 | [Bugfix] Fix load config when using bools (#9533) | 2024-10-27 13:46:41 -04:00
bnellnm | 3cb07a36a2 | [Misc] Upgrade to pytorch 2.5 (#9588) | 2024-10-27 09:44:24 +00:00
    Signed-off-by: Bill Nell <bill@neuralmagic.com>
    Signed-off-by: youkaichao <youkaichao@gmail.com>
    Co-authored-by: youkaichao <youkaichao@gmail.com>
kakao-kevin-us | 6650e6a930 | [Model] Add classification Task with Qwen2ForSequenceClassification (#9704) | 2024-10-26 17:53:35 +00:00
    Signed-off-by: Kevin-Yang <ykcha9@gmail.com>
    Co-authored-by: Kevin-Yang <ykcha9@gmail.com>
Vasiliy Alekseev | 07e981fdf4 | [Frontend] Bad words sampling parameter (#9717) | 2024-10-26 16:29:38 +00:00
    Signed-off-by: Vasily Alexeev <alvasian@yandex.ru>
Mengqing Cao | 5cbdccd151 | [Hardware][openvino] is_openvino --> current_platform.is_openvino (#9716) | 2024-10-26 10:59:06 +00:00
Kevin H. Luu | 9f7b4ba865 | [ci/Build] Skip Chameleon for transformers 4.46.0 on broadcast test #9675 (#9676) | 2024-10-24 20:59:00 -07:00
Charlie Fu | 59449095ab | [Performance][Kernel] Fused_moe Performance Improvement (#9384) | 2024-10-24 15:37:52 -07:00
    Signed-off-by: charlifu <charlifu@amd.com>
Alex Brooks | 722d46edb9 | [Model] Compute Llava Next Max Tokens / Dummy Data From Gridpoints (#9650) | 2024-10-24 10:42:24 -07:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Cyrus Leung | c866e0079d | [CI/Build] Fix VLM test failures when using transformers v4.46 (#9666) | 2024-10-25 01:40:40 +08:00
Yongzao | d27cfbf791 | [torch.compile] Adding torch compile annotations to some models (#9641) | 2024-10-24 09:31:42 -07:00
    Signed-off-by: youkaichao <youkaichao@gmail.com>
    Co-authored-by: youkaichao <youkaichao@gmail.com>
Jee Jee Li | 295a061fb3 | [Kernel] add kernel for FATReLU (#9610) | 2024-10-24 16:18:27 +08:00
    Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Yongzao | 8a02cd045a | [torch.compile] Adding torch compile annotations to some models (#9639) | 2024-10-24 00:54:57 -07:00
    Signed-off-by: youkaichao <youkaichao@gmail.com>
    Co-authored-by: youkaichao <youkaichao@gmail.com>
youkaichao | 4fdc581f9e | [core] simplify seq group code (#9569) | 2024-10-24 00:16:44 -07:00
    Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Cyrus Leung | 836e8ef6ee | [Bugfix] Fix PP for ChatGLM and Molmo (#9422) | 2024-10-24 06:12:05 +00:00
Vinay R Damodaran | 33bab41060 | [Bugfix]: Make chat content text allow type content (#9358) | 2024-10-24 05:05:49 +00:00
    Signed-off-by: Vinay Damodaran <vrdn@hey.com>
Yunfei Chu | fc6c274626 | [Model] Add Qwen2-Audio model support (#9248) | 2024-10-23 17:54:22 +00:00
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Alex Brooks | 150b779081 | [Frontend] Enable Online Multi-image Support for MLlama (#9393) | 2024-10-23 17:28:57 +00:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Alex Brooks | 31a08f5bd2 | [Model] Add min_pixels / max_pixels to Qwen2VL as mm_processor_kwargs (#9612) | 2024-10-23 14:05:18 +00:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Isotr0py | 3ff57ebfca | [Model] Initialize Florence-2 language backbone support (#9555) | 2024-10-23 10:42:47 +00:00
Cyrus Leung | 831540cf04 | [Model] Support E5-V (#9576) | 2024-10-23 11:35:29 +08:00
yulei | b17046e298 | [BugFix] Fix metrics error for --num-scheduler-steps > 1 (#8234) | 2024-10-22 15:43:03 -07:00
Ronen Schaffer | cd5601ac37 | [BugFix] Prevent exporting duplicate OpenTelemetry spans (#9017) | 2024-10-22 11:11:53 -07:00
Isotr0py | bb392ea2d2 | [Model][VLM] Initialize support for Mono-InternVL model (#9528) | 2024-10-22 16:01:46 +00:00
Jee Jee Li | a48e3ec052 | [CI/Build][LoRA] Temporarily fix long context failure issue (#9579) | 2024-10-22 11:32:51 +00:00
wangshuai09 | 3ddbe25502 | [Hardware][CPU] using current_platform.is_cpu (#9536) | 2024-10-22 00:50:43 -07:00
Wallas Henrique | c0292211ce | [CI/Build] Replaced some models on tests for smaller ones (#9570) | 2024-10-22 04:52:14 +00:00
    Signed-off-by: Wallas Santos <wallashss@ibm.com>
Cyrus Leung | f085995a7b | [CI/Build] Remove unnecessary fork_new_process (#9484) | 2024-10-21 19:47:29 -07:00
Travis Johnson | b729901139 | [Bugfix]: serialize config by value for --trust-remote-code (#6751) | 2024-10-21 19:46:24 -07:00
    Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
youkaichao | 76a5e13270 | [core] move parallel sampling out from vllm core (#9302) | 2024-10-22 00:31:44 +00:00
Joe Runde | ef7faad1b8 | 🐛 Fixup more test failures from memory profiling (#9563) | 2024-10-21 17:10:56 -07:00
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Wallas Henrique | 711f3a7806 | [Frontend] Don't log duplicate error stacktrace for every request in the batch (#9023) | 2024-10-21 14:49:41 -07:00
    Signed-off-by: Wallas Santos <wallashss@ibm.com>
Dhia Eddine Rhaiem | f6b97293aa | [Model] FalconMamba Support (#9325) | 2024-10-21 12:50:16 -04:00
Cyrus Leung | 696b01af8f | [CI/Build] Split up decoder-only LM tests (#9488) | 2024-10-20 21:27:50 -07:00
    Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Chen Zhang | 4fa3e33349 | [Kernel] Support sliding window in flash attention backend (#9403) | 2024-10-20 10:57:52 -07:00
Chen Zhang | 5b59fe0f08 | [Bugfix] Pass json-schema to GuidedDecodingParams and make test stronger (#9530) | 2024-10-20 00:05:02 +00:00
Yue Zhang | c5eea3c8ba | [Frontend] Support simpler image input format (#9478) | 2024-10-18 23:17:07 -07:00
Joe Runde | 380e18639f | 🐛 fix torch memory profiling (#9516) | 2024-10-18 21:25:19 -04:00
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
sasha0552 | 337ed76671 | [Bugfix] Fix offline mode when using mistral_common (#9457) | 2024-10-18 18:12:32 -07:00
Cody Yu | d11bf435a0 | [MISC] Consolidate cleanup() and refactor offline_inference_with_prefix.py (#9510) | 2024-10-18 14:30:55 -07:00
Cyrus Leung | 051eaf6db3 | [Model] Add user-configurable task for models that support both generation and embedding (#9424) | 2024-10-18 11:31:58 -07:00
tomeras91 | d2b1bf55ec | [Frontend][Feature] Add jamba tool parser (#9154) | 2024-10-18 10:27:48 +00:00
Joe Runde | de4008e2ab | [Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352) | 2024-10-17 22:47:27 -04:00
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Robert Shaw | 343f8e0905 | Support BERTModel (first encoder-only embedding model) (#9056) | 2024-10-17 23:21:01 +00:00
    Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
    Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
    Co-authored-by: Andrew Feldman <afeldman@neuralmagic.com>
    Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
    Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
    Co-authored-by: laishzh <laishengzhang@gmail.com>
    Co-authored-by: Max de Bayser <maxdebayser@gmail.com>
    Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
    Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
bnellnm | eca2c5f7c0 | [Bugfix] Fix support for dimension like integers and ScalarType (#9299) | 2024-10-17 19:08:34 +00:00