vllm/tests/compile/fusions_e2e at cbe7d1809649b1a8a954eb155b52d418a5554c4b - vllm

Luka Govedič 40bb175027 [vLLM IR] 1/N Implement IR skeleton and rms_norm op (#33825)

Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
Signed-off-by: chzhang <chaojun.zhang@intel.com>
Signed-off-by: Luka Govedic <luka.govedic@gmail.com>
Co-authored-by: Xinyu Chen <xinyu1.chen@intel.com>
Co-authored-by: Chaojun Zhang <chaojun.zhang@intel.com>
Co-authored-by: Luka Govedič <ProExpertProg@h100-01.nemg-001.lab.rdu2.dc.redhat.com>

2026-03-31 22:15:05 -04:00

__init__.py

[CI][torch.compile] Reduce e2e fusion test time (#33293)

2026-02-04 19:09:03 -05:00

common.py

[ROCm] [CI] Add new fusion test cases that are relevant to vLLM IR Ops (#34307)

2026-03-03 06:24:21 -08:00

conftest.py

[torch.compile] Refactor Attention Quant Fusion Pass and Remove Boilerplate (#37373)

2026-03-31 14:15:50 -04:00

models.py

[Perf] Eliminate padding and slicing op for GPT-OSS with Flashinfer MXFP4 MXFP8 MoE (#30647)