[doc] Fold long code blocks to improve readability (#19926)

Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-23 13:24:23 +08:00
parent 493c275352
commit f17aec0d63
50 changed files with 3455 additions and 3180 deletions
--- a/docs/serving/integrations/langchain.md
+++ b/docs/serving/integrations/langchain.md
@@ -13,19 +13,21 @@ pip install langchain langchain_community -q

 To run inference on a single or multiple GPUs, use `VLLM` class from `langchain`.

-```python
-from langchain_community.llms import VLLM
+??? Code

-llm = VLLM(model="mosaicml/mpt-7b",
-           trust_remote_code=True,  # mandatory for hf models
-           max_new_tokens=128,
-           top_k=10,
-           top_p=0.95,
-           temperature=0.8,
-           # tensor_parallel_size=... # for distributed inference
-)
+    ```python
+    from langchain_community.llms import VLLM

-print(llm("What is the capital of France ?"))
-```
+    llm = VLLM(model="mosaicml/mpt-7b",
+            trust_remote_code=True,  # mandatory for hf models
+            max_new_tokens=128,
+            top_k=10,
+            top_p=0.95,
+            temperature=0.8,
+            # tensor_parallel_size=... # for distributed inference
+    )
+
+    print(llm("What is the capital of France ?"))
+    ```

 Please refer to this [Tutorial](https://python.langchain.com/docs/integrations/llms/vllm) for more details.
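
The hunk above wraps a fenced block in a collapsible `??? Code` admonition and indents it by four spaces. A minimal Python sketch of that transformation follows. `fold_code_block` is a hypothetical helper written for illustration; it is an assumption, not the tooling used in this commit, and it only handles a single, already-isolated fenced block.

```python
def fold_code_block(block: str, title: str = "Code") -> str:
    """Wrap one fenced Markdown block in a collapsed `??? <title>` admonition.

    Hypothetical helper: assumes `block` is a single fenced code block.
    """
    lines = block.strip("\n").split("\n")
    # Indent non-empty lines by four spaces; keep blank lines empty so no
    # trailing whitespace is introduced.
    body = [("    " + line) if line.strip() else "" for line in lines]
    # The admonition header must be followed by a blank line.
    return "\n".join([f"??? {title}", ""] + body)


fence = "`" * 3  # built indirectly so this example nests cleanly in Markdown
original = "\n".join([
    fence + "python",
    "from langchain_community.llms import VLLM",
    "",
    'llm = VLLM(model="mosaicml/mpt-7b")',
    fence,
])

folded = fold_code_block(original)
print(folded)
```

The sample text is only treated as a string, so running the sketch does not require `langchain_community` to be installed.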