vllm/docs/serving at c6f722b93e8e795065751172812ee6a5540e5901 - vllm

Files

Vedant V Jhaveri 2e56975657 Generative Scoring (#34539)

Signed-off-by: Vedant Jhaveri <vjhaveri@linkedin.com>
Co-authored-by: Vedant Jhaveri <vjhaveri@linkedin.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

2026-03-31 16:02:11 -07:00

integrations

[Frontend] Exclude anthropic billing header to avoid prefix cache miss (#36829)

2026-03-12 01:20:34 +00:00

context_parallel_deployment.md

[Doc]: fixing multiple typos in diverse files (#33256)

2026-01-29 16:52:03 +08:00

data_parallel_deployment.md

[Docs] Clarify Expert Parallel behavior for attention and MoE layers (#30615)

2025-12-13 08:37:59 -09:00

distributed_troubleshooting.md

[Docs] Replace all explicit anchors with real links (#27087)

2025-10-17 02:22:06 -07:00

expert_parallel_deployment.md

[MoE Refactor] Rename "naive" all2all backend (#36294)

2026-03-19 15:50:34 -04:00

offline_inference.md

[Docs] Reorganize pooling docs. (#35592)

2026-03-19 11:25:47 +00:00

openai_compatible_server.md

Generative Scoring (#34539)

2026-03-31 16:02:11 -07:00

parallelism_scaling.md

[Dependency] Remove default ray dependency (#36170)

2026-03-08 20:06:22 -07:00