[Hardware][TPU] Support MoE with Pallas GMM kernel (#6457)

2024-07-16 09:56:28 -07:00
parent 9f4ccec761
commit c467dff24f
5 changed files with 89 additions and 8 deletions
--- a/docs/source/getting_started/tpu-installation.rst
+++ b/docs/source/getting_started/tpu-installation.rst
@@ -56,7 +56,7 @@ First, install the dependencies:
    $ pip uninstall torch torch-xla -y

    $ # Install PyTorch and PyTorch XLA.
-    $ export DATE="+20240601"
+    $ export DATE="+20240713"
    $ pip install https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch-nightly${DATE}-cp310-cp310-linux_x86_64.whl
    $ pip install https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-nightly${DATE}-cp310-cp310-linux_x86_64.whl

@@ -85,7 +85,7 @@ Next, build vLLM from source. This will only take a few seconds:
        ImportError: libopenblas.so.0: cannot open shared object file: No such file or directory


-    You can install OpenBLAS with the following command:
+    Please install OpenBLAS with the following command:

    .. code-block:: console