biondizzle/vllm
Files: vllm/tests/compile/fusions_e2e (at commit 738d0a281fab2e151a67b370c26b4e4360362f8f)
Latest commit: 296839a1b0 by elvischenv
[Perf] Eliminate padding and slicing op for GPT-OSS with Flashinfer MXFP4 MXFP8 MoE (#30647)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2026-03-18 15:01:26 +00:00
| File | Last commit | Date |
|------|-------------|------|
| __init__.py | [CI][torch.compile] Reduce e2e fusion test time (#33293) | 2026-02-04 19:09:03 -05:00 |
| common.py | [ROCm] [CI] Add new fusion test cases that are relevant to vLLM IR Ops (#34307) | 2026-03-03 06:24:21 -08:00 |
| conftest.py | [Perf] Eliminate padding and slicing op for GPT-OSS with Flashinfer MXFP4 MXFP8 MoE (#30647) | 2026-03-18 15:01:26 +00:00 |
| models.py | [Perf] Eliminate padding and slicing op for GPT-OSS with Flashinfer MXFP4 MXFP8 MoE (#30647) | 2026-03-18 15:01:26 +00:00 |
| test_tp1_quant.py | [torch.compile] Add support for non-contiguous fused RMSNorm + group quant (#36551) | 2026-03-11 10:56:55 -07:00 |
| test_tp2_ar_rms.py | [Perf] Eliminate padding and slicing op for GPT-OSS with Flashinfer MXFP4 MXFP8 MoE (#30647) | 2026-03-18 15:01:26 +00:00 |
| test_tp2_async_tp.py | [ROCm] [CI] Add new fusion test cases that are relevant to vLLM IR Ops (#34307) | 2026-03-03 06:24:21 -08:00 |