Files
nvfp4-megamoe-kernel/tests
biondizzle ba68212fa7 Add CUDA graph readiness detector (Section A of GETTING_CUDAGRAPH_READY.md)
- Grep for Section B sync patterns in hot path files
- Method 1: run decode forward with torch.cuda.set_sync_debug_mode('error')
- Method 2: attempt CUDA graph capture of L0 decode step
- Full model load + prefill + warmup before detection
- Results saved to /tmp/cuda_graph_readiness_results.json
2026-06-03 16:34:15 +00:00
..
2026-05-31 20:11:37 +00:00