[Hardware][TPU] Support MoE with Pallas GMM kernel (#6457)

Woosuk Kwon
2024-07-16 09:56:28 -07:00
committed by GitHub
parent 9f4ccec761
commit c467dff24f
5 changed files with 89 additions and 8 deletions

@@ -56,7 +56,7 @@ First, install the dependencies:
$ pip uninstall torch torch-xla -y
$ # Install PyTorch and PyTorch XLA.
-$ export DATE="+20240601"
+$ export DATE="+20240713"
$ pip install https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch-nightly${DATE}-cp310-cp310-linux_x86_64.whl
$ pip install https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-nightly${DATE}-cp310-cp310-linux_x86_64.whl
@@ -85,7 +85,7 @@ Next, build vLLM from source. This will only take a few seconds:
ImportError: libopenblas.so.0: cannot open shared object file: No such file or directory
-You can install OpenBLAS with the following command:
+Please install OpenBLAS with the following command:
.. code-block:: console