Remove unnecessary explicit title anchors and use relative links instead (#20620)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
This commit is contained in:
@@ -33,7 +33,7 @@ Quantized models take less memory at the cost of lower precision.
|
||||
Statically quantized models can be downloaded from HF Hub (some popular ones are available at [Red Hat AI](https://huggingface.co/RedHatAI))
|
||||
and used directly without extra configuration.
|
||||
|
||||
Dynamic quantization is also supported via the `quantization` option -- see [here][quantization-index] for more details.
|
||||
Dynamic quantization is also supported via the `quantization` option -- see [here](../features/quantization/README.md) for more details.
|
||||
|
||||
## Context length and batch size
|
||||
|
||||
|
||||
@@ -1,12 +1,11 @@
|
||||
---
|
||||
title: Engine Arguments
|
||||
---
|
||||
[](){ #engine-args }
|
||||
|
||||
Engine arguments control the behavior of the vLLM engine.
|
||||
|
||||
- For [offline inference][offline-inference], they are part of the arguments to [LLM][vllm.LLM] class.
|
||||
- For [online serving][serving-openai-compatible-server], they are part of the arguments to `vllm serve`.
|
||||
- For [offline inference](../serving/offline_inference.md), they are part of the arguments to [LLM][vllm.LLM] class.
|
||||
- For [online serving](../serving/openai_compatible_server.md), they are part of the arguments to `vllm serve`.
|
||||
|
||||
You can look at [EngineArgs][vllm.engine.arg_utils.EngineArgs] and [AsyncEngineArgs][vllm.engine.arg_utils.AsyncEngineArgs] to see the available engine arguments.
|
||||
|
||||
|
||||
@@ -20,4 +20,4 @@ model = LLM(
|
||||
)
|
||||
```
|
||||
|
||||
Our [list of supported models][supported-models] shows the model architectures that are recognized by vLLM.
|
||||
Our [list of supported models](../models/supported_models.md) shows the model architectures that are recognized by vLLM.
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
---
|
||||
title: Server Arguments
|
||||
---
|
||||
[](){ #serve-args }
|
||||
|
||||
The `vllm serve` command is used to launch the OpenAI-compatible server.
|
||||
|
||||
@@ -13,7 +12,7 @@ To see the available CLI arguments, run `vllm serve --help`!
|
||||
## Configuration file
|
||||
|
||||
You can load CLI arguments via a [YAML](https://yaml.org/) config file.
|
||||
The argument names must be the long form of those outlined [above][serve-args].
|
||||
The argument names must be the long form of those outlined [above](serve_args.md).
|
||||
|
||||
For example:
|
||||
|
||||
|
||||
Reference in New Issue
Block a user