[Kernel][RFC] Refactor the punica kernel based on Triton (#5036)

Jee Jee Li
2024-08-01 08:12:24 +08:00
committed by GitHub
parent 7eb0cb4a14
commit 7ecee34321
47 changed files with 3177 additions and 4366 deletions


@@ -66,7 +66,6 @@ You can also build and install vLLM from source:
 $ git clone https://github.com/vllm-project/vllm.git
 $ cd vllm
-$ # export VLLM_INSTALL_PUNICA_KERNELS=1 # optionally build for multi-LoRA capability
 $ pip install -e . # This may take 5-10 minutes.
 .. tip::