7 Commits

Author SHA1 Message Date
wang.yuqi
66c079ae83 [Frontend][4/n] Improve pooling entrypoints | pooling. (#39153)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-04-09 10:09:45 +00:00
Hyeonki Hong
25f2b55319 [Frontend] feat: add streaming support for token generation endpoint (#37171)
Signed-off-by: Hyeonki Hong <hyeonki.hong@moreh.io>
2026-04-03 10:20:32 +00:00
Sergey Zinchenko
5a2d420c17 [Bugfix] Use dedicated MM processor cache in /tokenize to prevent sender-cache pollution (#38545)
Signed-off-by: Sergey Zinchenko <sergey.zinchenko.rnd@gmail.com>
2026-04-01 21:14:49 -07:00
Cyrus Leung
ba2f0acc2d [Misc] Reorganize inputs (#35182)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-03-25 10:22:54 -07:00
Flora Feng
6050b93bed [Refactor] Move serve entrypoint tests under tests/entrypoints/serve/ (#37595)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
2026-03-20 02:10:47 -07:00
Flora Feng
e2d1c8b5e8 [Refactor] Relocate entrypoint tests to match serving code structure (#37593)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
2026-03-20 05:31:23 +00:00
Flora Feng
b21d384304 [Refactor] Relocate endpoint tests to mirror serving code directory structure (#37504)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
2026-03-19 07:19:36 +00:00