vllm/docs/serving at e5b807607c8493155e6eccd665772d4c19b2114e - vllm

leo-cf-tian 2754231ba3 [Kernel] Add FlashInfer MoE A2A Kernel (#36022)

Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
Signed-off-by: Leo Tian <lctian@nvidia.com>
Co-authored-by: wzhao18 <wzhao18.sz@gmail.com>
Co-authored-by: Stefano Castagnetta <scastagnetta@nvidia.com>
Co-authored-by: root <root@lyris0267.lyris.clusters.nvidia.com>

2026-03-15 23:45:32 -07:00

integrations

[Frontend] Exclude anthropic billing header to avoid prefix cache miss (#36829)

2026-03-12 01:20:34 +00:00

context_parallel_deployment.md

[Doc]: fixing multiple typos in diverse files (#33256)

2026-01-29 16:52:03 +08:00

data_parallel_deployment.md

[Docs] Clarify Expert Parallel behavior for attention and MoE layers (#30615)

2025-12-13 08:37:59 -09:00

distributed_troubleshooting.md

[Docs] Replace all explicit anchors with real links (#27087)