biondizzle
edc8e7ee8d
KV-1/KV-2: Mixed FP8+BF16 compressed KV (DeepSeek V4 paper format)
Architecture matches paper: 'BF16 for RoPE dims, FP8 for remaining dims'
- Non-RoPE dims (448 of 512): FP8_E4M3 storage → dequant to BF16 for FMHA
- RoPE dims (64 of 512): BF16 storage (RoPE applied directly, no conversion)
- Indexer keys: FP8_E4M3 (ihd=128, no RoPE)
- SWA: BF16 (unchanged)
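
As a quick check on the split above, the per-token payload of one 512-dim compressed entry can be worked out directly. This is only arithmetic on the dimensions stated in this message; the FP8 scale overhead is left out because its granularity is not stated here.

```python
# Per-token payload of one 512-dim compressed KV entry (FP8 scales excluded,
# since this commit message does not state their granularity).
HEAD_DIM = 512
ROPE_DIMS = 64
NOPE_DIMS = HEAD_DIM - ROPE_DIMS          # 448 non-RoPE dims

FP8_BYTES = 1                             # FP8_E4M3: one byte per element
BF16_BYTES = 2                            # BF16: two bytes per element

mixed_bytes = NOPE_DIMS * FP8_BYTES + ROPE_DIMS * BF16_BYTES   # 448 + 128
bf16_bytes = HEAD_DIM * BF16_BYTES                             # all-BF16 baseline
savings = 1 - mixed_bytes / bf16_bytes

print(mixed_bytes, bf16_bytes, savings)   # 576 1024 0.4375
```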
Pipeline:
Compressor → FP32 → split → [nope: FP32→FP8] + [rope: FP32→BF16→RoPE]
Gather: [nope: FP8→BF16] + [rope: BF16] → concat → FMHA
Non-RoPE data is quantized FP32→FP8 directly, with no BF16 intermediate.
RoPE data stays in BF16 once RoPE has been applied, with no FP32 intermediate afterwards.
BF16 is the final format consumed by FMHA (no further conversion).
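
The write and gather paths above can be emulated in plain Python to see where each rounding step happens. This is a minimal sketch, not the kernels in this commit: the helper names (`to_bf16`, `to_e4m3`, `apply_rope_bf16`, `compress_store`, `gather`), the one-scale-per-token choice, the interleaved RoPE pairing, and the base of 10000 are all assumptions made for illustration. Python floats stand in for the storage types, with each value rounded to what the real type could represent.

```python
import math
import random
import struct

NOPE, ROPE = 448, 64
E4M3_MAX = 448.0  # largest finite FP8_E4M3 value

def to_bf16(x: float) -> float:
    """Round to the nearest BF16 value (ties to even); finite inputs only."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def to_e4m3(x: float) -> float:
    """Round to the nearest FP8_E4M3 value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), E4M3_MAX)
    exp = max(math.frexp(a)[1] - 1, -6)   # -6 = smallest normal exponent
    step = 2.0 ** (exp - 3)               # 3 mantissa bits
    return sign * min(round(a / step) * step, E4M3_MAX)

def apply_rope_bf16(x, pos, base=10000.0):
    """Rotate interleaved pairs (assumed layout); round results to BF16."""
    out = []
    for i in range(0, len(x), 2):
        theta = pos * base ** (-i / len(x))
        c, s = math.cos(theta), math.sin(theta)
        out.append(to_bf16(x[i] * c - x[i + 1] * s))
        out.append(to_bf16(x[i] * s + x[i + 1] * c))
    return out

def compress_store(vec_fp32, pos):
    """Write path: split, then nope FP32->FP8 and rope FP32->BF16->RoPE."""
    nope, rope = vec_fp32[:NOPE], vec_fp32[NOPE:]
    amax = max(abs(v) for v in nope) or 1.0
    scale = amax / E4M3_MAX               # assumed: one scale per token
    nope_fp8 = [to_e4m3(v / scale) for v in nope]
    rope_bf16 = apply_rope_bf16([to_bf16(v) for v in rope], pos)
    return nope_fp8, scale, rope_bf16

def gather(nope_fp8, scale, rope_bf16):
    """Read path: nope FP8->BF16 dequant, rope passed through, then concat."""
    return [to_bf16(q * scale) for q in nope_fp8] + rope_bf16

random.seed(0)
vec = [random.uniform(-4.0, 4.0) for _ in range(NOPE + ROPE)]
nope_fp8, scale, rope_bf16 = compress_store(vec, pos=3)
kv = gather(nope_fp8, scale, rope_bf16)
max_err = max(abs(a - b) for a, b in zip(kv[:NOPE], vec[:NOPE]))
print(len(kv), max_err)
```

The only lossy step for the non-RoPE dims is the single FP32→FP8 rounding; the dequant to BF16 happens once, at gather time.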
KVCache rewritten:
- comp_nope_fp8/scale: FP8 storage (plus scales) for non-RoPE dims
- comp_rope_bf16: BF16 storage for RoPE dims
- comp_nope_selective/all: FP8→BF16 dequant
- comp_rope_selective/all: BF16 gather
- set_compressed_mixed: write mixed format
- set_indexer_keys_fp8: write FP8 indexer keys
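
The relationship between the storage fields and the gather helpers listed above might look roughly like the toy class below. It is a hypothetical sketch: the class name, the method signatures, the `comp_nope_scale` field name, and the per-token scale are assumptions, plain Python lists stand in for device tensors, and the indexer-key path is omitted.

```python
from dataclasses import dataclass, field

@dataclass
class MixedKVCacheSketch:
    """Toy stand-in for the mixed-format cache; floats emulate FP8/BF16."""
    comp_nope_fp8: list = field(default_factory=list)    # per token: quantized non-RoPE values
    comp_nope_scale: list = field(default_factory=list)  # per token: dequant scale (assumed granularity)
    comp_rope_bf16: list = field(default_factory=list)   # per token: RoPE-applied values

    def set_compressed_mixed(self, nope_q, scale, rope):
        """Append one token in mixed format."""
        self.comp_nope_fp8.append(list(nope_q))
        self.comp_nope_scale.append(scale)
        self.comp_rope_bf16.append(list(rope))

    def comp_nope_selective(self, idx):
        """Dequantize the non-RoPE part of the selected tokens."""
        return [[q * self.comp_nope_scale[i] for q in self.comp_nope_fp8[i]]
                for i in idx]

    def comp_nope_all(self):
        return self.comp_nope_selective(range(len(self.comp_nope_fp8)))

    def comp_rope_selective(self, idx):
        """Gather the RoPE part of the selected tokens; no conversion needed."""
        return [list(self.comp_rope_bf16[i]) for i in idx]

    def comp_rope_all(self):
        return self.comp_rope_selective(range(len(self.comp_rope_bf16)))

    def gather(self, idx):
        """Concatenate [nope | rope] per selected token, as FMHA would consume it."""
        idx = list(idx)
        return [n + r for n, r in zip(self.comp_nope_selective(idx),
                                      self.comp_rope_selective(idx))]

# Tiny usage example with 2 non-RoPE dims and 2 RoPE dims per token.
cache = MixedKVCacheSketch()
cache.set_compressed_mixed([1.0, -2.0], 0.5, [0.25, 0.75])
cache.set_compressed_mixed([4.0, 8.0], 0.25, [1.5, -1.5])
selected = cache.gather([1])
everything = [n + r for n, r in zip(cache.comp_nope_all(), cache.comp_rope_all())]
print(selected)     # [[1.0, 2.0, 1.5, -1.5]]
print(everything)   # [[0.5, -1.0, 0.25, 0.75], [1.0, 2.0, 1.5, -1.5]]
```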
2026-06-02 10:08:43 +00:00