vllm/tests/v1/cudagraph at ec38a7368df7a14331e5f99f5de899df13f3b954 - vllm

Files

Morrison Turnansky 0838b52e2e [Frontend][torch.compile] CompilationConfig Overhaul (#20283): Set up -O infrastructure (#26847)

Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: adabeyta <aabeyta@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: adabeyta <aabeyta@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2025-11-27 01:55:58 -08:00

__init__.py

[Core] Allow full cudagraph with separate attention routines and orthogonal to compilation, add support for FA2 and FlashInfer (#20059)

2025-08-15 10:01:39 -04:00

test_cudagraph_dispatch.py

[Core] Refactor padding logic and pad for CUDA graphs before attention metadata building (#28579)

2025-11-26 14:07:13 -05:00

test_cudagraph_mode.py

[Frontend][torch.compile] CompilationConfig Overhaul (#20283): Set up -O infrastructure (#26847)

2025-11-27 01:55:58 -08:00