diff --git a/docs/features/spec_decode/README.md b/docs/features/spec_decode/README.md
index 0d19ef839..0cc77ad4b 100644
--- a/docs/features/spec_decode/README.md
+++ b/docs/features/spec_decode/README.md
@@ -1,10 +1,5 @@
 # Speculative Decoding
 
-!!! warning
-    Please note that speculative decoding in vLLM is not yet optimized and does
-    not usually yield inter-token latency reductions for all prompt datasets or sampling parameters.
-    The work to optimize it is ongoing and can be followed here:
-
 !!! warning
     Currently, speculative decoding in vLLM is not compatible with pipeline parallelism.