| Commit | Message | Date |
|---|---|---|
| bedcfc4dab | Pipeline test: use max_num_tokens=8192 matching vLLM | 2026-05-17 23:04:44 +00:00 |
| 72628fb689 | Full pipeline test: runner vs BF16 reference | 2026-05-17 21:29:16 +00:00 |
| 2796bd81e8 | Fix: scatter FP4 as uint8 (float4 doesn't support index_put) | 2026-05-17 21:28:04 +00:00 |
| 364f8372bb | Fix FP4 buffer shapes: D//2 for packed dimensions | 2026-05-17 21:26:46 +00:00 |
| 5e4d674736 | Test fix: quantize slot_hidden, scatter FP4, pass slot_x_sf | 2026-05-17 21:25:58 +00:00 |
| 4d0b6d889d | Set runner weights before _ensure_stacked | 2026-05-17 21:22:50 +00:00 |
| b7acac5e4e | Call _ensure_stacked() before using runner buffers | 2026-05-17 21:22:30 +00:00 |
| 1acf01fc1a | Fix token_indices: repeat each token ID top_k times, not arange | 2026-05-17 21:22:11 +00:00 |
| a478ca4746 | Debug: trace runner logic step by step, test L1 GEMM | 2026-05-17 21:21:45 +00:00 |
| a100bd11c1 | Simplify pipeline test: BF16 ref + bridge ref + full runner | 2026-05-17 21:20:41 +00:00 |
| 6eade5e7f8 | Fix: gs values are floats not tensors | 2026-05-17 21:19:47 +00:00 |
| b05a38a9bd | Test stages 1-2 first: sort + L1 GEMM | 2026-05-17 21:19:23 +00:00 |
| 9728604ea1 | Pipeline test: stage-by-stage with BF16 reference comparison | 2026-05-17 21:19:17 +00:00 |
| 7fff5fd39b | Fix: correct intermediate_size=3072, weight key prefix, dequantize shapes | 2026-05-17 21:18:20 +00:00 |
| 4ef345773d | Rewrite pipeline test: load real weights, step-by-step vs BF16 reference | 2026-05-17 21:17:18 +00:00 |
| b43541afdd | Fix test path setup | 2026-05-17 21:00:00 +00:00 |
| 490ddfa294 | Pipeline test: use synthetic weights at 256x512 (JIT at 7168x18432 hangs for hours) | 2026-05-17 20:58:06 +00:00 |
| c1bb551446 | Fix weight loading: skip already-loaded experts correctly | 2026-05-17 18:15:51 +00:00 |
| 955d7533f2 | Use system Python for pipeline test (CuTeDSL in system site-packages) | 2026-05-17 18:13:42 +00:00 |
| 925e390b93 | Fix import: use direct import from vllm/ subdirectory | 2026-05-17 18:12:53 +00:00 |
| cd6144b832 | Fix imports: all functions are in cutedsl.bridge, not separate modules | 2026-05-17 18:11:03 +00:00 |
| 5e63a0d8a3 | Rewrite pipeline test: use raw checkpoint weights, compare runner vs dynamic-gs reference | 2026-05-17 18:10:05 +00:00 |
| e51eafe288 | Rewrite pipeline test: compare runner vs reference with real weights, step-by-step | 2026-05-17 18:08:33 +00:00 |
| e38d60a6e8 | Add pipeline test with real model weights, add swiglu_limit to reference moe_pipeline | 2026-05-17 18:07:44 +00:00 |