FastAPI-based working frontend (#10)

Zhuohan Li
2023-03-29 14:48:56 +08:00
committed by GitHub
parent d359cda5fa
commit 721fa3df15
15 changed files with 536 additions and 146 deletions

@@ -8,9 +8,46 @@ pip install flash-attn # This may take up to 10 mins.
pip install -e .
```
## Run
## Test simple server
```bash
ray start --head
python server.py [--tensor-parallel-size <N>]
python simple_server.py
```
To list all arguments accepted by `simple_server.py`, run:
```bash
python simple_server.py --help
```
## FastAPI server
Install the following additional dependencies:
```bash
pip install fastapi uvicorn
```
To start the server:
```bash
ray start --head
python -m cacheflow.http_frontend.fastapi_frontend
```
To test the server:
```bash
python -m cacheflow.http_frontend.test_cli_client
```
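For scripting against the running server without the bundled test client, a request can also be sent from Python's standard library. The following is only a minimal sketch under stated assumptions: the URL (host, port, and path) and the `prompt` JSON field are hypothetical placeholders, not taken from `fastapi_frontend`, so check that module for the actual route and request schema before using it.

```python
import json
import urllib.request

# ASSUMPTION: host, port, path, and the "prompt" field are placeholders.
# They are not taken from the cacheflow source; adjust them to match
# the route and schema defined in cacheflow.http_frontend.fastapi_frontend.
SERVER_URL = "http://localhost:8000/generate"


def build_request(prompt: str, url: str = SERVER_URL) -> urllib.request.Request:
    """Build a JSON POST request that carries the prompt."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(prompt: str, url: str = SERVER_URL, timeout: float = 60.0) -> str:
    """Send the prompt and return the raw response body as text."""
    request = build_request(prompt, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server started as above, calling `send("Hello, my name is")` would return whatever body the server responds with; the sketch makes no assumption about the response format and returns it undecoded beyond UTF-8.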
## Gradio web server
Install the following additional dependencies:
```bash
pip install gradio
```
To start the FastAPI server and then the Gradio web server:
```bash
python -m cacheflow.http_frontend.fastapi_frontend
# In another terminal
python -m cacheflow.http_frontend.gradio_webserver
```