[Kernel] Expand MoE weight loading + Add Fused Marlin MoE Kernel (#7527)

Co-authored-by: ElizaWszola <eliza@neuralmagic.com>
Dipika Sikka
2024-08-21 19:17:10 -04:00
committed by GitHub
parent 5844017285
commit 8678a69ab5
15 changed files with 2375 additions and 85 deletions

csrc/moe/marlin_moe_ops.cu (1740 lines changed) Normal file

File diff suppressed because it is too large