FastAPI-based working frontend (#10)
README.md (41 lines changed)
@@ -8,9 +8,46 @@ pip install flash-attn # This may take up to 10 mins.
pip install -e .
```

-## Run
+## Test simple server

```bash
ray start --head
-python server.py [--tensor-parallel-size <N>]
+python simple_server.py
```

To list all arguments accepted by `simple_server.py`, run:
```bash
python simple_server.py --help
```

## FastAPI server

Install the following additional dependencies:
```bash
pip install fastapi uvicorn
```

To start the server:
```bash
ray start --head
python -m cacheflow.http_frontend.fastapi_frontend
```

To test the server, run the command-line test client:
```bash
python -m cacheflow.http_frontend.test_cli_client
```
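
The test client can only connect once the frontend is accepting connections, and this README does not say how long startup takes or which address the server binds to. As a minimal sketch, assuming you know the host and port your server listens on (both are placeholders here, not values taken from this project), a small readiness check could look like this:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made before `timeout` seconds.
    `host` and `port` are caller-supplied assumptions, not project defaults.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port(host, port)` before running `python -m cacheflow.http_frontend.test_cli_client` avoids a connection error when the client is started too early. It only confirms that the port is open, not that the model has finished loading.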

## Gradio web server

Install the following additional dependencies:
```bash
pip install gradio
```

Start the FastAPI server, then launch the Gradio web server in a separate terminal:
```bash
python -m cacheflow.http_frontend.fastapi_frontend
# In another terminal
python -m cacheflow.http_frontend.gradio_webserver
```
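
If juggling two terminals is inconvenient, both processes can be started from one script. The following is a rough sketch, not part of this commit: the module names are copied from the commands above, while `module_command` and `run_together` are hypothetical helpers introduced only for illustration. It starts the processes back to back without waiting for the first to become ready, which may or may not be enough for your setup.

```python
import subprocess
import sys
from typing import List, Sequence

# Module names taken from the commands in this README.
FRONTEND_MODULES = (
    "cacheflow.http_frontend.fastapi_frontend",
    "cacheflow.http_frontend.gradio_webserver",
)


def module_command(module: str) -> List[str]:
    """Build the argv equivalent of `python -m <module>` for this interpreter."""
    return [sys.executable, "-m", module]


def run_together(commands: Sequence[Sequence[str]]) -> int:
    """Start every command, wait for the last one, then stop the others.

    Returns the exit code of the last command (hypothetical helper).
    """
    procs = [subprocess.Popen(list(cmd)) for cmd in commands]
    try:
        return procs[-1].wait()
    finally:
        # Shut down the earlier processes once the last one has exited.
        for proc in procs[:-1]:
            proc.terminate()
            proc.wait()
```

With Ray already started, `run_together([module_command(m) for m in FRONTEND_MODULES])` would launch the FastAPI frontend and the Gradio web server together and stop the frontend when the Gradio process exits.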