[Kernel] Full Tensor Parallelism for LoRA Layers (#3524)

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Austin Veselka
2024-04-27 02:03:48 -05:00
committed by GitHub
parent 18d23f642a
commit eefeb16464
19 changed files with 686 additions and 111 deletions

@@ -2,3 +2,4 @@
 #include "bgmv_impl.cuh"
 FOR_BGMV_WIDE_NARROW(INST_BGMV_TWOSIDE, nv_half, float, nv_half)
+FOR_INST_BGMV_WIDE_NARROW(INST_BGMV_ONESIDE, nv_half, float, nv_half)