[Doc] Update more docs with respect to V1 (#29188)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-23 10:58:48 +08:00
parent 3ed767ec06
commit 389aa1b2eb
6 changed files with 89 additions and 100 deletions
--- a/docs/configuration/conserving_memory.md
+++ b/docs/configuration/conserving_memory.md
@@ -49,9 +49,6 @@ llm = LLM(model="adept/fuyu-8b", max_model_len=2048, max_num_seqs=2)

 By default, we optimize model inference using CUDA graphs which take up extra memory in the GPU.

-!!! warning
-    CUDA graph capture takes up more memory in V1 than in V0.
-
 You can adjust `compilation_config` to achieve a better balance between inference speed and memory usage:

 ??? code