deepseek-v4-quant

Author	SHA1	Message	Date
biondizzle	116933dcf6	Fix: skip .cuda() when low_memory_mode; switch default to nvfp4	2026-05-07 03:06:33 +00:00
biondizzle	b8bdd00d19	Lower GPU max_memory to 100GiB, add CPU-only fallback for low_memory_mode	2026-05-07 02:49:24 +00:00
biondizzle	717151b98c	Add CPU offloading and max_memory caps for FP8 model loading	2026-05-07 02:40:48 +00:00
biondizzle	aff12c6951	Fix forward_loop: pass as callable, not via create_forward_loop	2026-05-07 02:08:09 +00:00
biondizzle	492e44c0f6	Fix dataloader API: max_sample_length not seq_len, proper create_forward_loop	2026-05-07 02:04:54 +00:00
biondizzle	b32bb2e84d	NVIDIA Model Optimizer branch: nvfp4_experts_only PTQ for DeepSeek V4 Pro	2026-05-07 00:11:31 +00:00