[Core][Hybrid allocator + kv connector 1/n] Enable hybrid allocator + KV cache connector (#25712)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
@@ -91,6 +91,9 @@ def create_vllm_config(
         max_num_batched_tokens=max_num_batched_tokens,
         max_model_len=max_model_len,
         enable_chunked_prefill=enable_chunked_prefill,
+        # Disable hybrid KV cache manager for testing
+        # Should be removed after we support hybrid KV cache manager-based testing.
+        disable_hybrid_kv_cache_manager=True,
     )
     model_config = ModelConfig(
         model=model,