[Doc] Compatibility matrix for mutual exclusive features (#8512)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
This commit is contained in:
@@ -22,6 +22,8 @@ If you frequently encounter preemptions from the vLLM engine, consider the follo
|
||||
|
||||
You can also monitor the number of preemption requests through Prometheus metrics exposed by the vLLM. Additionally, you can log the cumulative number of preemption requests by setting disable_log_stats=False.
|
||||
|
||||
.. _chunked-prefill:
|
||||
|
||||
Chunked Prefill
|
||||
---------------
|
||||
vLLM supports an experimental feature chunked prefill. Chunked prefill allows to chunk large prefills into smaller chunks and batch them together with decode requests.
|
||||
|
||||
Reference in New Issue
Block a user