Joe Runde | 9b9cef3145 | [Bugfix] Backport request id validation to v0 (#11036) | 2024-12-10 16:38:23 +00:00
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

Tyler Michael Smith | 28b3a1c7e5 | [V1] Multiprocessing Tensor Parallel Support for v1 (#9856) | 2024-12-10 06:28:14 +00:00
    Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

Roger Wang | c690357928 | [V1] Fix Detokenizer loading in AsyncLLM (#10997) | 2024-12-09 16:27:10 +00:00
    Signed-off-by: Roger Wang <ywang@roblox.com>

Xin Yang | 01d079fd8e | [LoRA] Change lora_tokenizers capacity (#10796) | 2024-12-04 17:40:16 +00:00
    Signed-off-by: Xin Yang <xyang19@gmail.com>

Cyrus Leung | d2f058e76c | [Misc] Rename embedding classes to pooling (#10801) | 2024-12-01 14:36:51 +08:00
    Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Ricky Xu | 519e8e4182 | [v1] EngineArgs for better config handling for v1 (#10382) | 2024-11-25 21:09:43 -08:00
    Signed-off-by: rickyx <rickyx@anyscale.com>

Zhonghua Deng | d345f409b7 | [V1] EngineCore supports profiling (#10564) | 2024-11-22 17:16:15 -08:00
    Signed-off-by: Abatom <abzhonghua@gmail.com>

Cyrus Leung | 0b8bb86bf1 | [1/N] Initial prototype for multi-modal processor (#10044) | 2024-11-13 12:39:03 +00:00
    Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Robert Shaw | 6ace6fba2c | [V1] AsyncLLM Implementation (#9826) | 2024-11-11 23:05:38 +00:00
    Signed-off-by: Nick Hill <nickhill@us.ibm.com>
    Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
    Signed-off-by: Nick Hill <nhill@redhat.com>
    Co-authored-by: Nick Hill <nickhill@us.ibm.com>
    Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
    Co-authored-by: Nick Hill <nhill@redhat.com>
    Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>