| Author | Commit | Message | Signed-off-by | Date |
|---|---|---|---|---|
| Woosuk Kwon | 86ac7bcf84 | [Model Runner V2] Support pooling models (#35120) | Woosuk Kwon <woosuk@inferact.ai> | 2026-02-27 18:03:01 -08:00 |
| Nick Hill | 40b2f1c3d9 | [Model Runner V2] Minor CPU optimizations (#34856) | Nick Hill <nickhill123@gmail.com> | 2026-02-19 16:05:37 -08:00 |
| Nick Hill | e535d90deb | [ModelRunner V2] Misc minor simplifications and optimizations (#33467) | Nick Hill <nickhill123@gmail.com> | 2026-02-01 22:17:14 +00:00 |
| Nick Hill | 6bf3b46d78 | [ModelRunner V2] Misc code simplification and cleanup (#33266) | Nick Hill <nickhill123@gmail.com> | 2026-01-28 14:41:23 -08:00 |
| Woosuk Kwon | d471b2aff0 | [Model Runner V2] Support num NaNs in logits (#30187) | Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-12-09 10:00:49 -08:00 |
| Woosuk Kwon | ae0ce1be27 | [Model Runner V2][BugFix] Keep reference to GPU tensors in AsyncOutput (#29623) | Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-11-27 12:38:53 -08:00 |
| Woosuk Kwon | 7f12c82fa6 | [Model Runner V2] Change bookkeeping logic in preparation for spec decoding (#29194) | Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-11-23 09:42:52 -08:00 |
| Woosuk Kwon | 1bed891f72 | [Chore] Fix pre-commit error after #25266 (#29190) | | 2025-11-21 10:21:40 -08:00 |
| Woosuk Kwon | 30b44a1598 | GPU Model Runner V2 (#25266) | Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-11-21 08:20:55 -08:00 |