[MISC] cudagraph_capture_sizes related improvements (#26016)

Signed-off-by: fhl <2410591650@qq.com>
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
fhl2000
2025-10-24 20:11:05 +08:00
committed by GitHub
parent 435be10db9
commit 284cc92275
14 changed files with 303 additions and 110 deletions

@@ -185,7 +185,7 @@ class Mxfp4MoEMethod(FusedMoEMethodBase):
         self.moe = moe
         self.mxfp4_backend = get_mxfp4_backend()
         self.max_capture_size = (
-            get_current_vllm_config().compilation_config.max_capture_size
+            get_current_vllm_config().compilation_config.max_cudagraph_capture_size
         )
         assert self.mxfp4_backend != Mxfp4Backend.NONE, (