biondizzle/vllm
3d232dbd19f9c9d782a47b579f4a3c5a2f996499
vllm/tests/compile/piecewise
Wentao Ye  5c3fbfe46b  [Feature] Full Cuda Graph Support for Cutlass MLA and 6% E2E Throughput Improvement (#22763)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-08-15 06:27:30 +00:00
__init__.py  [torch.compile] rework compile control with piecewise cudagraph (#9715)  2024-10-29 23:03:49 -07:00
test_full_cudagraph.py  [Feature] Full Cuda Graph Support for Cutlass MLA and 6% E2E Throughput Improvement (#22763)  2025-08-15 06:27:30 +00:00
test_multiple_graphs.py  Add test case for compiling multiple graphs (#21044)  2025-07-23 11:00:47 -07:00
test_simple.py  [CUDA] Enable full cudagraph for FlashMLA (#18581)  2025-06-13 18:12:26 +00:00
test_toy_llama.py  [CUDA] Enable full cudagraph for FlashMLA (#18581)  2025-06-13 18:12:26 +00:00