[Doc][3/N] Reorganize Serving section (#11766)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-07 11:20:01 +08:00
parent d93d2d74fd
commit 8ceffbf315
40 changed files with 248 additions and 133 deletions
--- a/docs/source/serving/integrations/langchain.md
+++ b/docs/source/serving/integrations/langchain.md
@@ -0,0 +1,30 @@
+(serving-langchain)=
+
+# LangChain
+
+vLLM is also available via [LangChain](https://github.com/langchain-ai/langchain).
+
+To install LangChain, run:
+
+```console
+$ pip install langchain langchain_community -q
+```
+
+To run inference on one or more GPUs, use the `VLLM` class from `langchain_community`:
+
+```python
+from langchain_community.llms import VLLM
+
+llm = VLLM(model="mosaicml/mpt-7b",
+           trust_remote_code=True,  # needed for models with custom code on the Hugging Face Hub
+           max_new_tokens=128,
+           top_k=10,
+           top_p=0.95,
+           temperature=0.8,
+           # tensor_parallel_size=... # for distributed inference
+)
+
+print(llm.invoke("What is the capital of France?"))
+```
+
+For more details, see the [LangChain vLLM tutorial](https://python.langchain.com/docs/integrations/llms/vllm).