[CI/Build] Auto-fix Markdown files (#12941)
@@ -7,10 +7,8 @@
- Models: llama-3.1 8B, llama-3 70B, mixtral 8x7B.
- Evaluation metrics: end-to-end latency (mean, median, p99).
{latency_tests_markdown_table}
## Throughput tests
- Input length: 200 prompts randomly sampled from the ShareGPT dataset (with a fixed random seed).
@@ -19,10 +17,8 @@
- Models: llama-3.1 8B, llama-3 70B, mixtral 8x7B.
- Evaluation metrics: throughput.
{throughput_tests_markdown_table}
## Serving tests
- Input length: 200 prompts randomly sampled from the ShareGPT dataset (with a fixed random seed).
@@ -33,13 +29,11 @@
- We also added a speculative decoding test for llama-3 70B, under QPS 2.
- Evaluation metrics: throughput, TTFT (time to first token; mean, median, and p99), ITL (inter-token latency; mean, median, and p99).
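
As a rough illustration of the serving metrics listed above, the sketch below derives TTFT and ITL from per-token arrival timestamps and summarizes them as mean, median, and p99. The function names `ttft_and_itl` and `summarize`, the timestamps, and the nearest-rank percentile rule are all assumptions made for this example; the benchmark scripts may compute these values differently.

```python
import math
import statistics


def ttft_and_itl(request_start, token_times):
    """Return (TTFT, list of ITLs) for one request.

    request_start: time the request was sent, in seconds.
    token_times: arrival time of each generated token, in ascending order.
    """
    if not token_times:
        raise ValueError("no tokens were generated")
    # TTFT: delay between sending the request and receiving the first token.
    ttft = token_times[0] - request_start
    # ITL: gap between each pair of consecutive tokens.
    itls = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    return ttft, itls


def summarize(values):
    """Mean, median, and p99 of a non-empty list (nearest-rank p99, an assumption)."""
    ordered = sorted(values)
    rank = math.ceil(0.99 * len(ordered))  # 1-based nearest rank
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
    }


# Hypothetical timestamps for a single request (not measured data).
ttft, itls = ttft_and_itl(0.0, [0.5, 0.75, 1.0, 1.5])
itl_summary = summarize(itls)
print(ttft, itl_summary)
```

With only a handful of samples, p99 collapses to the maximum, so a percentile like this is only meaningful over many requests.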
{serving_tests_markdown_table}
## JSON version of the benchmarking tables
This section contains the data of the markdown tables above in JSON format.
You can load the benchmarking tables into pandas dataframes as follows:
```python
@@ -54,9 +48,9 @@ serving_results = pd.DataFrame.from_dict(benchmarking_results["serving"])
```
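
Because the diff elides most of the loading snippet, here is a minimal self-contained sketch of the same idea: parse the JSON string, then build one DataFrame per table. Only the `"serving"` key and the `pd.DataFrame.from_dict` call appear in the visible context; the `"latency"` and `"throughput"` keys, every column name, and all values below are placeholders assumed for illustration.

```python
import json

import pandas as pd

# Hypothetical stand-in for the JSON string published with the results.
# Keys other than "serving", all column names, and all values are made up.
benchmarking_results_json = """
{
  "latency": {"test_name": ["example_a"], "mean_latency_ms": [12.5]},
  "throughput": {"test_name": ["example_a"], "tokens_per_s": [100.0]},
  "serving": {"test_name": ["example_a", "example_b"], "mean_ttft_ms": [30.0, 45.0]}
}
"""

benchmarking_results = json.loads(benchmarking_results_json)

# One DataFrame per table; each value maps a column name to its list of cells.
tables = {
    name: pd.DataFrame.from_dict(data)
    for name, data in benchmarking_results.items()
}

serving_results = tables["serving"]
# Example query: list the serving tests from slowest to fastest mean TTFT.
print(serving_results.sort_values("mean_ttft_ms", ascending=False))
```

Keeping the tables in a dictionary keyed by table name avoids hard-coding one variable per table if the set of tables in the JSON changes.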
The JSON string for all benchmarking tables:
```json
{benchmarking_results_in_json_string}
```
You can also check the raw experiment data in the Artifact tab of the Buildkite page.