2 Commits

SHA1 Message Date
cbfc5a9afb Update nvfp4_experts_only to use dequantized BF16 model 2026-05-07 16:34:37 +00:00
6008cf128d Add model_opt_nvfp4_experts_only.py 2026-05-07 15:16:08 +00:00
    Quantizes only MoE expert weights to NVFP4, leaving attention untouched.
    Includes comments documenting all available NVFP4 strategies.
    Copy to model_opt_nvfp4_<strategy>.py for each new strategy.
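
The experts-only selection described in commit 6008cf128d can be pictured as a filter over weight names. The following is a minimal sketch, not the contents of model_opt_nvfp4_experts_only.py: the pattern `*.experts.*`, the example weight names, and the helpers `should_quantize` and `strategy_filename` are all hypothetical, and real module names depend on the model architecture and the quantization library in use.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern: assumes MoE expert weights carry ".experts." in their names.
EXPERT_PATTERNS = ["*.experts.*"]


def should_quantize(weight_name: str) -> bool:
    """Return True only for weights that match an expert pattern.

    Anything else (attention, embeddings, norms) is left in its original dtype.
    """
    return any(fnmatchcase(weight_name, pattern) for pattern in EXPERT_PATTERNS)


def strategy_filename(strategy: str) -> str:
    """Build a per-strategy script name following the commit's naming convention."""
    return f"model_opt_nvfp4_{strategy}.py"


if __name__ == "__main__":
    # Illustrative weight names only; they are not taken from the repository.
    names = [
        "model.layers.0.self_attn.q_proj.weight",
        "model.layers.0.mlp.experts.3.down_proj.weight",
    ]
    for name in names:
        print(name, "-> quantize" if should_quantize(name) else "-> keep")
    print(strategy_filename("experts_only"))
```

A new strategy under this scheme would change only the pattern list and be saved under the file name returned by `strategy_filename`.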