[Docs] Switch to better markdown linting pre-commit hook (#21851)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -5,6 +5,6 @@ This example contains scripts that demonstrate disaggregated prefill in the offl
 ## Files
 
 - `run.sh` - A helper script that will run `prefill_example.py` and `decode_example.py` sequentially.
-  - Make sure you are in the `examples/offline_inference/disaggregated-prefill-v1` directory before running `run.sh`.
+    - Make sure you are in the `examples/offline_inference/disaggregated-prefill-v1` directory before running `run.sh`.
 - `prefill_example.py` - A script which performs prefill only, saving the KV state to the `local_storage` directory and the prompts to `output.txt`.
 - `decode_example.py` - A script which performs decode only, loading the KV state from the `local_storage` directory and the prompts from `output.txt`.
@@ -19,9 +19,9 @@ We currently support `/v1/chat/completions`, `/v1/embeddings`, and `/v1/score` e
 ## Pre-requisites
 
 * The examples in this document use `meta-llama/Meta-Llama-3-8B-Instruct`.
-    - Create a [user access token](https://huggingface.co/docs/hub/en/security-tokens)
-    - Install the token on your machine (Run `huggingface-cli login`).
-    - Get access to the gated model by [visiting the model card](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) and agreeing to the terms and conditions.
+    * Create a [user access token](https://huggingface.co/docs/hub/en/security-tokens)
+    * Install the token on your machine (Run `huggingface-cli login`).
+    * Get access to the gated model by [visiting the model card](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) and agreeing to the terms and conditions.
 
 ## Example 1: Running with a local file
 
@@ -105,7 +105,7 @@ To integrate with cloud blob storage, we recommend using presigned urls.
 
 * [Create an S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html).
 * The `awscli` package (Run `pip install awscli`) to configure your credentials and interactively use s3.
-    - [Configure your credentials](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html).
+    * [Configure your credentials](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html).
 * The `boto3` python package (Run `pip install boto3`) to generate presigned urls.
 
 ### Step 1: Upload your input script
@@ -28,16 +28,20 @@ to run disaggregated prefill and benchmark the performance.
 ### Components
 
 #### Server Scripts
+
 - `disagg_prefill_lmcache_v1/disagg_vllm_launcher.sh` - Launches individual vLLM servers for prefill/decode, and also launches the proxy server.
 - `disagg_prefill_lmcache_v1/disagg_proxy_server.py` - FastAPI proxy server that coordinates between prefiller and decoder
 - `disagg_prefill_lmcache_v1/disagg_example_nixl.sh` - Main script to run the example
 
 #### Configuration
+
 - `disagg_prefill_lmcache_v1/configs/lmcache-prefiller-config.yaml` - Configuration for prefiller server
 - `disagg_prefill_lmcache_v1/configs/lmcache-decoder-config.yaml` - Configuration for decoder server
 
 #### Log Files
+
 The main script generates several log files:
+
 - `prefiller.log` - Logs from the prefill server
 - `decoder.log` - Logs from the decode server
 - `proxy.log` - Logs from the proxy server
@@ -8,11 +8,11 @@ of logging configurations that range from simple-and-inflexible to
 more-complex-and-more-flexible.
 
 - No vLLM logging (simple and inflexible)
-  - Set `VLLM_CONFIGURE_LOGGING=0` (leaving `VLLM_LOGGING_CONFIG_PATH` unset)
+    - Set `VLLM_CONFIGURE_LOGGING=0` (leaving `VLLM_LOGGING_CONFIG_PATH` unset)
 - vLLM's default logging configuration (simple and inflexible)
-  - Leave `VLLM_CONFIGURE_LOGGING` unset or set `VLLM_CONFIGURE_LOGGING=1`
+    - Leave `VLLM_CONFIGURE_LOGGING` unset or set `VLLM_CONFIGURE_LOGGING=1`
 - Fine-grained custom logging configuration (more complex, more flexible)
-  - Leave `VLLM_CONFIGURE_LOGGING` unset or set `VLLM_CONFIGURE_LOGGING=1` and
+    - Leave `VLLM_CONFIGURE_LOGGING` unset or set `VLLM_CONFIGURE_LOGGING=1` and
     set `VLLM_LOGGING_CONFIG_PATH=<path-to-logging-config.json>`
 
 ## Logging Configuration Environment Variables
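Note on the logging hunk above: the third option (fine-grained custom logging) can be sketched as follows. This is a minimal illustration, not part of the commit. It assumes the JSON file follows the standard library's `logging.config.dictConfig` schema; the file name `logging_config.json`, the formatter and handler names, and the helper functions `write_logging_config` and `logging_env` are hypothetical and chosen only for this example.

```python
import json
import logging.config
import os
import tempfile

# Hypothetical minimal configuration in the stdlib dictConfig schema.
# The formatter/handler names ("plain", "console") are illustrative only.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "plain",
            "level": "INFO",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        "vllm": {"handlers": ["console"], "level": "INFO", "propagate": False},
    },
}


def write_logging_config(directory: str) -> str:
    """Validate the config with the stdlib, then write it as JSON."""
    # dictConfig raises ValueError if the dictionary is malformed.
    logging.config.dictConfig(LOGGING_CONFIG)
    path = os.path.join(directory, "logging_config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(LOGGING_CONFIG, f, indent=2)
    return path


def logging_env(config_path: str) -> dict:
    """Environment variables for the 'fine-grained custom logging' option."""
    return {
        "VLLM_CONFIGURE_LOGGING": "1",
        "VLLM_LOGGING_CONFIG_PATH": config_path,
    }


if __name__ == "__main__":
    config_path = write_logging_config(tempfile.mkdtemp())
    # Set these in the environment of the process that will start vLLM.
    os.environ.update(logging_env(config_path))
    print(config_path)
```

Validating with `dictConfig` before writing the file catches schema mistakes early, rather than at server start.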