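Nibble index 0 vs 8 count ratio = 0.996, so the weights are FP4, NOT INT4: in FP4, index 8 encodes -0.0 and occurs about as often as +0.0 (index 0), whereas in INT4 index 8 would encode -8 and be rare. FP4 dequant is an E2M1 LUT lookup × E8M0 scale (MXFP4 microscaling). Also adds model_opt_nvfp4_full.py for full-model NVFP4 quantization.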
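
A minimal sketch of the dequant path and the nibble-count check described above, not the code in this change. The E2M1 value table and the E8M0 bias of 127 follow the MX format definition; the function names and the nibble packing order (low nibble first) are assumptions made for illustration and may differ from the actual checkpoint layout.

```python
import math

# E2M1 code -> value. Bit 3 is the sign, so codes 8..15 mirror codes 0..7,
# and code 8 is -0.0 rather than INT4's -8.
E2M1_LUT = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
            -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0]


def e8m0_to_float(scale_byte):
    """Decode an E8M0 scale: an 8-bit exponent, bias 127, no sign or mantissa."""
    if scale_byte == 0xFF:  # reserved NaN encoding
        return math.nan
    return math.ldexp(1.0, scale_byte - 127)


def dequant_mxfp4_block(packed, scale_byte):
    """Dequantize packed FP4 nibbles that share one E8M0 scale.

    Assumption: the low nibble of each byte holds the earlier element.
    """
    scale = e8m0_to_float(scale_byte)
    out = []
    for byte in packed:
        out.append(E2M1_LUT[byte & 0x0F] * scale)
        out.append(E2M1_LUT[byte >> 4] * scale)
    return out


def nibble_histogram(packed):
    """Count occurrences of each 4-bit code (0..15) in a packed buffer."""
    counts = [0] * 16
    for byte in packed:
        counts[byte & 0x0F] += 1
        counts[byte >> 4] += 1
    return counts
```

On a real weight buffer, `counts[0] / counts[8]` from `nibble_histogram` close to 1 is the FP4 signature noted above; for INT4 data, code 8 would be far less frequent than code 0.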