vllm/benchmarks/kernels at d588cd24061011f76da721c89f9e2171a2b2c4c8 - vllm

Files

Cyrus Leung 6c117cff7d [Frontend] Pass API server count to each process (#23717)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

2025-09-20 01:15:19 +08:00

deepgemm

…

bench_block_fp8_gemm.py

…

bench_fp8_gemm.py

…

bench_int8_gemm.py

…

bench_nvfp4_gemm.py

…

bench_per_token_quant_fp8.py

…

benchmark_activation.py

…

benchmark_bitblas.py

…

benchmark_cutlass_fp4_moe.py

…

benchmark_device_communicators.py

…

benchmark_grouped_gemm_cutlass.py

…

benchmark_layernorm.py

…

benchmark_lora.py

…

benchmark_machete.py

…

benchmark_marlin.py

…

benchmark_moe_align_block_size.py

…

benchmark_moe_permute_unpermute.py

…

benchmark_moe.py

…

benchmark_mrope.py

…

benchmark_paged_attention.py

…

benchmark_per_token_group_quant.py

…

benchmark_polynorm.py

…

benchmark_quant.py

…

benchmark_reshape_and_cache_flash.py

…

benchmark_rmsnorm.py

…

benchmark_rope.py

…

benchmark_shapes.py

…

benchmark_silu_mul_fp8_quant.py

…

benchmark_trtllm_decode_attention.py

…

benchmark_trtllm_prefill_attention.py

…

benchmark_w8a8_block_fp8.py

[Frontend] Pass API server count to each process (#23717)

2025-09-20 01:15:19 +08:00

graph_machete_bench.py

…

requirements.txt

…

utils.py

…

weight_shapes.py

…