tastelikefeet
|
08600ddc68
|
Fix the log to correctly guide users to install modelscope (#9793)
Signed-off-by: yuze.zyz <yuze.zyz@alibaba-inc.com>
|
2024-10-29 10:36:59 -07:00 |
|
科英
|
74fc2d77ae
|
[Misc] Add metrics for request queue time, forward time, and execute time (#9659)
|
2024-10-29 10:32:56 -07:00 |
|
wangshuai09
|
622b7ab955
|
[Hardware] using current_platform.seed_everything (#9785)
Signed-off-by: wangshuai09 <391746016@qq.com>
|
2024-10-29 14:47:44 +00:00 |
|
Isotr0py
|
09500f7dde
|
[Model] Add BNB quantization support for Mllama (#9720)
|
2024-10-29 08:20:02 -04:00 |
|
Zhong Qishuai
|
ef7865b4f9
|
[Frontend] re-enable multi-modality input in the new beam search implementation (#9427)
Signed-off-by: Qishuai Ferdinandzhong@gmail.com
|
2024-10-29 11:49:47 +00:00 |
|
Cyrus Leung
|
eae3d48181
|
[Bugfix] Use temporary directory in registry (#9721)
|
2024-10-28 22:08:20 -07:00 |
|
Cyrus Leung
|
e74f2d448c
|
[Doc] Specify async engine args in docs (#9726)
|
2024-10-28 22:07:57 -07:00 |
|
Jee Jee Li
|
7a4df5f200
|
[Model][LoRA]LoRA support added for Qwen (#9622)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-10-29 04:14:07 +00:00 |
|
Russell Bryant
|
c5d7fb9ddc
|
[Doc] fix third-party model example (#9771)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-10-28 19:39:21 -07:00 |
|
youkaichao
|
76ed5340f0
|
[torch.compile] add deepseek v2 compile (#9775)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-28 14:35:17 -07:00 |
|
youkaichao
|
97b61bfae6
|
[misc] avoid circular import (#9765)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-28 20:51:23 +00:00 |
|
Yongzao
|
aa0addb397
|
Adding "torch compile" annotations to moe models (#9758)
|
2024-10-28 13:49:56 -07:00 |
|
litianjian
|
5f8d8075f9
|
[Model][VLM] Add multi-video support for LLaVA-Onevision (#8905)
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-10-28 18:04:10 +00:00 |
|
Russell Bryant
|
8b0e4f2ad7
|
[CI/Build] Adopt Mergify for auto-labeling PRs (#9259)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-10-28 09:38:09 -07:00 |
|
Yan Ma
|
2adb4409e0
|
[Bugfix] Fix ray instance detect issue (#9439)
|
2024-10-28 07:13:03 +00:00 |
|
Robert Shaw
|
feb92fbe4a
|
Fix beam search eos (#9627)
|
2024-10-28 06:59:37 +00:00 |
|
youkaichao
|
32176fee73
|
[torch.compile] support moe models (#9632)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-27 21:58:04 -07:00 |
|
wangshuai09
|
4e2d95e372
|
[Hardware][ROCM] using current_platform.is_rocm (#9642)
Signed-off-by: wangshuai09 <391746016@qq.com>
|
2024-10-28 04:07:00 +00:00 |
|
madt2709
|
34a9941620
|
[Bugfix] Fix load config when using bools (#9533)
|
2024-10-27 13:46:41 -04:00 |
|
Harry Mellor
|
e130c40e4e
|
Fix cache management in "Close inactive issues and PRs" actions workflow (#9734)
|
2024-10-27 10:30:03 -07:00 |
|
bnellnm
|
3cb07a36a2
|
[Misc] Upgrade to pytorch 2.5 (#9588)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-27 09:44:24 +00:00 |
|
youkaichao
|
8549c82660
|
[core] cudagraph output with tensor weak reference (#9724)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-27 00:19:28 -07:00 |
|
科英
|
67a6882da4
|
[Misc] SpecDecodeWorker supports profiling (#9719)
Signed-off-by: Abatom <abatom@163.com>
|
2024-10-27 04:18:03 +00:00 |
|
kakao-kevin-us
|
6650e6a930
|
[Model] Add classification Task with Qwen2ForSequenceClassification (#9704)
Signed-off-by: Kevin-Yang <ykcha9@gmail.com>
Co-authored-by: Kevin-Yang <ykcha9@gmail.com>
|
2024-10-26 17:53:35 +00:00 |
|
Vasiliy Alekseev
|
07e981fdf4
|
[Frontend] Bad words sampling parameter (#9717)
Signed-off-by: Vasily Alexeev <alvasian@yandex.ru>
|
2024-10-26 16:29:38 +00:00 |
|
ErkinSagiroglu
|
55137e8ee3
|
Fix: MI100 Support By Bypassing Custom Paged Attention (#9560)
|
2024-10-26 12:12:57 +00:00 |
|
Mengqing Cao
|
5cbdccd151
|
[Hardware][openvino] is_openvino --> current_platform.is_openvino (#9716)
|
2024-10-26 10:59:06 +00:00 |
|
Sam Stoelinga
|
067e77f9a8
|
[Bugfix] Streaming continuous_usage_stats default to False (#9709)
Signed-off-by: Sam Stoelinga <sammiestoel@gmail.com>
|
2024-10-26 05:05:47 +00:00 |
|
Travis Johnson
|
6567e13724
|
[Bugfix] Fix crash with llama 3.2 vision models and guided decoding (#9631)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: pavlo-ruban <pavlo.ruban@servicenow.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2024-10-25 15:42:56 -07:00 |
|
Rafael Vasquez
|
228cfbd03f
|
[Doc] Improve quickstart documentation (#9256)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-10-25 14:32:10 -07:00 |
|
Michael Goin
|
ca0d92227e
|
[Bugfix] Fix compressed_tensors_moe bad config.strategy (#9677)
|
2024-10-25 12:40:33 -07:00 |
|
Woosuk Kwon
|
9645b9f646
|
[V1] Support sliding window attention (#9679)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-10-24 22:20:37 -07:00 |
|
Will Johnson
|
a6f3721861
|
[Model] add a lora module for granite 3.0 MoE models (#9673)
|
2024-10-24 22:00:17 -07:00 |
|
Kevin H. Luu
|
9f7b4ba865
|
[ci/Build] Skip Chameleon for transformers 4.46.0 on broadcast test #9675 (#9676)
|
2024-10-24 20:59:00 -07:00 |
|
Michael Goin
|
c91ed47c43
|
[Bugfix] Remove xformers requirement for Pixtral (#9597)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2024-10-24 15:38:05 -07:00 |
|
Charlie Fu
|
59449095ab
|
[Performance][Kernel] Fused_moe Performance Improvement (#9384)
Signed-off-by: charlifu <charlifu@amd.com>
|
2024-10-24 15:37:52 -07:00 |
|
Michael Goin
|
e26d37a185
|
[Log][Bugfix] Fix default value check for image_url.detail (#9663)
|
2024-10-24 10:44:38 -07:00 |
|
Alex Brooks
|
722d46edb9
|
[Model] Compute Llava Next Max Tokens / Dummy Data From Gridpoints (#9650)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2024-10-24 10:42:24 -07:00 |
|
Cyrus Leung
|
c866e0079d
|
[CI/Build] Fix VLM test failures when using transformers v4.46 (#9666)
|
2024-10-25 01:40:40 +08:00 |
|
Yongzao
|
d27cfbf791
|
[torch.compile] Adding torch compile annotations to some models (#9641)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-24 09:31:42 -07:00 |
|
Harry Mellor
|
de662d32b5
|
Increase operation per run limit for "Close inactive issues and PRs" workflow (#9661)
Signed-off-by: Harry Mellor <hej.mellor@gmail.com>
|
2024-10-24 12:17:45 -04:00 |
|
litianjian
|
f58454968f
|
[Bugfix]Disable the post_norm layer of the vision encoder for LLaVA models (#9653)
|
2024-10-24 07:52:07 -07:00 |
|
Cyrus Leung
|
b979143d5b
|
[Doc] Move additional tips/notes to the top (#9647)
|
2024-10-24 09:43:59 +00:00 |
|
Yongzao
|
ad6f78053e
|
[torch.compile] expanding support and fix allgather compilation (#9637)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-24 01:32:15 -07:00 |
|
Jee Jee Li
|
295a061fb3
|
[Kernel] add kernel for FATReLU (#9610)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-10-24 16:18:27 +08:00 |
|
Yongzao
|
8a02cd045a
|
[torch.compile] Adding torch compile annotations to some models (#9639)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-10-24 00:54:57 -07:00 |
|
youkaichao
|
4fdc581f9e
|
[core] simplify seq group code (#9569)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2024-10-24 00:16:44 -07:00 |
|
Woosuk Kwon
|
3770071eb4
|
[V1][Bugfix] Clean up requests when aborted (#9629)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-10-23 23:33:22 -07:00 |
|
Cyrus Leung
|
836e8ef6ee
|
[Bugfix] Fix PP for ChatGLM and Molmo (#9422)
|
2024-10-24 06:12:05 +00:00 |
|
Yan Ma
|
056a68c7db
|
[XPU] avoid triton import for xpu (#9440)
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-10-24 05:14:00 +00:00 |
|