biondizzle
0 Followers · 0 Following · Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-11 02:01:54 +00:00
653e2d7a50
vLLM NVFP4 serving: full end-to-end pipeline working
db16be8e5d
S11: Fixed substr mapping, stacking, suffix, and o_a_proj - loads weights but attention forward uses FP8 einsum incompatible with NVFP4
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-10 16:14:34 +00:00
6fd03a0aa0
vLLM serving: patched deepseek_v4.py, disabled mega_moe, updated docs
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-10 09:33:50 +00:00
d88793dee6
Add vllm weight mapper patch and docker-compose
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-10 08:59:32 +00:00
30608e3834
Config patches: document modelopt↔vllm gaps with NVIDIA reference
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-10 08:23:13 +00:00
0d74b97fb2
Config patches doc + compress_ratios runtime patch in serve script
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-10 07:54:35 +00:00
f65d4ab99f
Run 11 SUCCESS: 881GB NVFP4 exported, add vLLM serve script
biondizzle pushed to master at biondizzle/vllm-with-media-support · 2026-05-10 04:02:31 +00:00
4eb98fe467
Add soundfile - vllm audio needs both av and soundfile
biondizzle pushed to master at biondizzle/vllm-with-media-support · 2026-05-10 03:55:41 +00:00
2a01870564
Fix: tag latest for master/main/null branch builds
biondizzle pushed to master at biondizzle/vllm-with-media-support · 2026-05-10 03:35:51 +00:00
261d8e58fe
Fix: install av directly instead of vllm[audio] with --no-deps
biondizzle pushed to master at biondizzle/vllm-with-media-support · 2026-05-10 03:30:06 +00:00
39c76cef64
Add Jenkinsfile
biondizzle created branch master in biondizzle/vllm-with-media-support · 2026-05-10 03:25:13 +00:00
biondizzle pushed to master at biondizzle/vllm-with-media-support · 2026-05-10 03:25:13 +00:00
1c60fd9738
tweax
biondizzle created repository biondizzle/vllm-with-media-support · 2026-05-10 03:16:34 +00:00
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-09 23:00:20 +00:00
eb80bd6f80
README + memory: Run 10 result (export crash in get_weight_scaling_factor), Run 11 running
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-09 22:51:00 +00:00
07cd50e823
8 patches covering full export chain — no more whack-a-mole
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-09 22:43:50 +00:00
efc111a11f
Add Patch 4+5: get_weight_scaling_factor and get_weight_scaling_factor_2 CPU safety
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-09 16:09:11 +00:00
ce9056d259
README overhaul: reflect current architecture (hf_main, run history through Run 10)
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-09 15:58:38 +00:00
5a72da7193
Fix: apply hf_ptq __main__ post-parse conversions (dataset split, calib_size int list)
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-09 15:00:26 +00:00
8612914169
Update run history: Runs 7-8, Run 9 running on a300302
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant · 2026-05-09 14:57:30 +00:00
a300302486
Fix: use hf_ptq.py arg names (--pyt_ckpt_path, --qformat, --inference_tensor_parallel)