[Doc] neuron documentation update (#8671)

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
2024-09-20 15:04:37 -07:00
parent b4e4eda92e
commit 7c8566aa4f
2 changed files with 3 additions and 3 deletions
--- a/docs/source/getting_started/neuron-installation.rst
+++ b/docs/source/getting_started/neuron-installation.rst
@@ -3,8 +3,8 @@
 Installation with Neuron
 ========================

-vLLM 0.3.3 onwards supports model inferencing and serving on AWS Trainium/Inferentia with Neuron SDK.
-At the moment Paged Attention is not supported in Neuron SDK, but naive continuous batching is supported in transformers-neuronx.
+vLLM 0.3.3 onwards supports model inferencing and serving on AWS Trainium/Inferentia with Neuron SDK with continuous batching.
+Paged Attention and Chunked Prefill are currently in development and will be available soon.
 Data types currently supported in Neuron SDK are FP16 and BF16.

 Requirements