[Kernel] Enable 8-bit weights in Fused Marlin MoE (#8032)
Co-authored-by: Dipika <dipikasikka1@gmail.com>
@@ -1,3 +1,4 @@
 compressed-tensors, nm-testing/Mixtral-8x7B-Instruct-v0.1-W4A16-quantized, main
 compressed-tensors, nm-testing/Mixtral-8x7B-Instruct-v0.1-W4A16-channel-quantized, main
+compressed-tensors, nm-testing/Mixtral-8x7B-Instruct-v0.1-W8A16-quantized, main
 gptq_marlin, TheBloke/Mixtral-8x7B-v0.1-GPTQ, main
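Each line in this weight-loading test config follows the pattern `quantization_method, model_stub, revision`; the added entry covers the W8A16 (8-bit weight, 16-bit activation) Mixtral checkpoint that exercises the new 8-bit Fused Marlin MoE path. A minimal sketch of loading that entry with vLLM's Python API; the `tensor_parallel_size` value and prompt are illustrative assumptions, not part of this commit:

```python
# Sketch: load the newly listed W8A16 Mixtral checkpoint with vLLM.
# tensor_parallel_size is an assumption; Mixtral-8x7B typically
# needs several GPUs, so adjust for your hardware.
from vllm import LLM, SamplingParams

llm = LLM(
    model="nm-testing/Mixtral-8x7B-Instruct-v0.1-W8A16-quantized",
    revision="main",
    tensor_parallel_size=4,
)
outputs = llm.generate(["Hello, world!"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```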
tests/weight_loading/run_model_weight_loading_test.sh: Normal file → Executable file (0 lines changed)
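The mode change marks the weight-loading test runner as executable, so CI and developers can invoke `./tests/weight_loading/run_model_weight_loading_test.sh` directly instead of prefixing it with `bash`; the script's contents are unchanged.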