Commit Graph

32 Commits

Author SHA1 Message Date
Harry Mellor
f7967577f5 Remove requirement to use --hf-overrides for DeepseekVLV2ForCausalLM (#35203)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-24 22:00:06 -08:00
junuxyz
c61a98f529 [CI][BugFix] ShellCheck cleanup to remove baseline and preserve runtime behavior (#34514)
Signed-off-by: junuxyz <216036880+junuxyz@users.noreply.github.com>
2026-02-17 12:22:56 +00:00
liranschour
8322d4e47f Enable Cross layers KV cache layout at NIXL Connector V2 (#33339)
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: liranschour <liranschour@users.noreply.github.com>
Co-authored-by: Or Ozeri <or@ozery.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
2026-02-05 02:17:02 -08:00
Or Ozeri
2e8de86777 Revert "Enable Cross layers KV cache layout at NIXL Connector (#30207)" (#33241)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
Co-authored-by: Kevin H. Luu <khluu000@gmail.com>
2026-01-28 04:36:00 -08:00
Matt
c517d8c934 [Hardware][AMD][CI][Bugfix] Fix regressions from deprecated env vars (#32837)
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
2026-01-23 00:59:15 +08:00
liranschour
64e3d67ac0 Enable Cross layers KV cache layout at NIXL Connector (#30207)
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: liranschour <liranschour@users.noreply.github.com>
Co-authored-by: Or Ozeri <or@ozery.com>
2026-01-22 10:12:58 +00:00
Nicolò Lucchesi
ab1af6aa3e [CI][NIXL] Split DPEP tests (#31491)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-12-30 07:26:12 -05:00
Nicolò Lucchesi
bc3700e0cd [NIXL] Support P tensor-parallel-size > D tensor-parallel-size (#27274)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-12-18 11:53:30 +08:00
Matthew Bonanni
7eb6cb6c18 [Attention] Update tests to remove deprecated env vars (#30563)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-12-17 09:49:59 -08:00
Ming Yang
60d17251c9 [Disagg] Support large batch size in proxy server and update NixlConnector doc for DP (#28782)
Signed-off-by: Ming Yang <minos.future@gmail.com>
2025-12-09 00:01:08 +00:00
Chendi.Xue
c3e2978620 [NIXL] fix cpu PD after physical <> logical block_size PR (#28904)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-11-18 14:03:23 -05:00
Nicolò Lucchesi
f226a3f0c1 [CI][NIXL] Change default block_size for tests (#28927)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-11-18 09:22:30 -08:00
Chendi.Xue
c9e665852a [NIXL] heterogeneous block_size support (#26759)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
2025-11-14 21:51:32 -08:00
Kuntai Du
86dca07d9b [Hybrid allocator + kv connector] revert connector test changes related to hybrid allocator (#28011)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
2025-11-05 10:36:31 +00:00
Kuntai Du
b853540388 [Core][Hybrid allocator + kv connector 1/n] Enable hybrid allocator + KV cache connector (#25712)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
2025-10-24 23:34:18 -07:00
Nicolò Lucchesi
bfa59be8f1 [CI] Nixl integration tests DP-EP (#27199)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-10-22 11:17:48 +08:00
Nicolò Lucchesi
2ba60ec7fe [CI] Nixl integration tests (#27010)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-10-17 07:13:31 -07:00
Chendi.Xue
7e6edb1469 [NIXL][HeteroTP] Enable KV transfer from HND prefill to NHD decode (#26556)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-10-14 09:46:05 +00:00
Chendi.Xue
9bb38130cb [Bugfix] Fix GPU_ID issue in test script (#26442)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-10-12 11:39:05 +00:00
Cyrus Leung
1e4ecca1d0 [V0 Deprecation] Remove VLLM_USE_V1 from tests (#26341)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-07 15:42:31 +00:00
Harry Mellor
6c04638214 Fix per file ruff ignores related to line length (#26262)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-06 05:12:40 +00:00
Harry Mellor
d6953beb91 Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Chenxi Yang
d0d138bc55 [Nixl][P/D] Add cuda2cpu support (HD->DH transfer) (#24690)
Signed-off-by: Chenxi Yang <cxyang@fb.com>
Co-authored-by: Chenxi Yang <cxyang@fb.com>
2025-09-29 14:31:51 +00:00
Peter Pan
da5e7e4329 [Docs] NixlConnector quickstart guide (#24249)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <peter.pan@daocloud.io>
Signed-off-by: Nicolò Lucchesi<nicolo.lucchesi@gmail.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
2025-09-23 14:23:22 +00:00
Abirdcfly
0d7db16a92 [PD] add test for chat completions endpoint (#21925)
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2025-08-03 19:57:03 -07:00
Roger Wang
067c34a155 docs: remove deprecated disable-log-requests flag (#22113)
Signed-off-by: Roger Wang <hey@rogerw.me>
2025-08-02 00:19:48 -07:00
Juncheng Gu
6066284914 [P/D] Support CPU Transfer in NixlConnector (#18293)
Signed-off-by: Juncheng Gu <juncgu@gmail.com>
Signed-off-by: Richard Liu <ricliu@google.com>
Co-authored-by: Richard Liu <39319471+richardsliu@users.noreply.github.com>
Co-authored-by: Richard Liu <ricliu@google.com>
2025-07-24 17:58:42 +01:00
lkchen
4734704b30 [PD] let toy proxy handle /chat/completions (#19730)
Signed-off-by: Linkun <github@lkchen.net>
2025-06-25 15:17:45 -04:00
Nicolò Lucchesi
b2fac67130 [P/D] Heterogeneous TP (#18833)
Signed-off-by: nicklucche <nlucches@redhat.com>
2025-06-04 23:25:34 +00:00
Simon Mo
02f0c7b220 [Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
rasmith
46791e1b4b [AMD] [P/D] Compute num gpus for ROCm correctly in run_accuracy_test.sh (#18568)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
2025-05-22 18:45:35 -07:00
Robert Shaw
d19110204c [P/D] NIXL Integration (#17751)
Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Brent Salisbury <bsalisbu@redhat.com>
2025-05-12 09:46:16 -07:00