nvfp4-megamoe-kernel/vllm
biondizzle e1fcfc4f01 Add CuTeDSL warmup + CUDA sync after JIT compilation
CuTeDSL cute.compile corrupts GPU memory. Add warmup forward +
torch.cuda.synchronize() + health check after finalize_weights,
matching the MoE runner pattern.
2026-05-19 01:11:44 +00:00
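
The commit message describes a warmup-then-verify step after JIT compilation. A minimal sketch of that pattern follows; it is not the code from this commit. The names `warmup_then_check`, `forward`, `sample`, and `synchronize` are hypothetical. The sketch assumes the forward call returns a flat sequence of floats, and that real code would pass `torch.cuda.synchronize` as the `synchronize` callable and use a tensor-level finiteness check instead.

```python
import math
from typing import Callable, Sequence


def warmup_then_check(
    forward: Callable[[Sequence[float]], Sequence[float]],
    sample: Sequence[float],
    synchronize: Callable[[], None],
) -> Sequence[float]:
    """Run one throwaway forward pass, wait for the device, then sanity-check the result.

    All parameter names are illustrative assumptions, not part of any real API.
    """
    # The first call triggers any lazy (JIT) compilation inside `forward`.
    out = forward(sample)

    # Block until queued device work has finished, so that a failure surfaces
    # here rather than at some unrelated later call.
    synchronize()

    # Health check: reject NaN or Inf values in the warmup output.
    if not all(math.isfinite(v) for v in out):
        raise RuntimeError("warmup output contains NaN or Inf")
    return out
```

Running the check once, right after weights are finalized, keeps a bad compilation from going unnoticed until the first real request.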