[Doc] Update docs on handling OOM (#15357)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Author: Cyrus Leung
Date: 2025-03-25 05:29:34 +08:00
Committed by: GitHub
Parent: 3eb08ed9b1
Commit: 6dd55af6c9

6 changed files with 24 additions and 9 deletions

@@ -2,7 +2,12 @@
 # Engine Arguments
 
-Below, you can find an explanation of every engine argument for vLLM:
+Engine arguments control the behavior of the vLLM engine.
+
+- For [offline inference](#offline-inference), they are part of the arguments to `LLM` class.
+- For [online serving](#openai-compatible-server), they are part of the arguments to `vllm serve`.
+
+Below, you can find an explanation of every engine argument:
 
 <!--- pyml disable-num-lines 7 no-space-in-emphasis -->
 ```{eval-rst}
@@ -15,7 +20,7 @@ Below, you can find an explanation of every engine argument for vLLM:
 ## Async Engine Arguments
 
-Below are the additional arguments related to the asynchronous engine:
+Additional arguments are available to the asynchronous engine which is used for online serving:
 
 <!--- pyml disable-num-lines 7 no-space-in-emphasis -->
 ```{eval-rst}
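The diff above distinguishes the two entry points for engine arguments: keyword arguments to the `LLM` class for offline inference, and CLI flags to `vllm serve` for online serving. As a minimal sketch of the relationship between the two spellings (the argument names used below, such as `gpu_memory_utilization`, are illustrative examples, and this helper is not part of vLLM itself):

```python
def kwarg_to_cli_flag(name: str) -> str:
    """Map a Python engine-argument name to its dashed CLI flag form.

    Engine arguments share a name across both entry points; the CLI
    spelling replaces underscores with dashes and adds a `--` prefix.
    """
    return "--" + name.replace("_", "-")


# Example argument names, shown in both spellings.
for kwarg in ["gpu_memory_utilization", "max_model_len", "tensor_parallel_size"]:
    print(f"LLM({kwarg}=...)  <->  vllm serve {kwarg_to_cli_flag(kwarg)} ...")
```

For instance, `LLM(gpu_memory_utilization=0.8)` in an offline script corresponds to `vllm serve ... --gpu-memory-utilization 0.8` when serving online.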