biondizzle/vllm
3b3b778d4af545a30290275d3154bb0e514d2dcc
vllm/tests/kernels
Wentao Ye 42d440c22b [Perf] Use Triton instead of Torch for DeepGEMM Per Token Group Quant (#20841)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-07-12 19:38:45 -07:00
..
attention | [Core] Add Flashinfer TRTLLM Backend for Flashinfer decode path (SM100). (#19825) | 2025-07-11 09:23:23 +00:00
core | …
mamba | [Kernel] Triton implementation of causal-conv1d for Mamba-based models (#18218) | 2025-07-09 12:53:55 -07:00
moe | [Perf] Use Triton instead of Torch for DeepGEMM Per Token Group Quant (#20841) | 2025-07-12 19:38:45 -07:00
quantization | [Perf] Use Triton instead of Torch for DeepGEMM Per Token Group Quant (#20841) | 2025-07-12 19:38:45 -07:00
__init__.py | …
allclose_default.py | …
quant_utils.py | …
test_apply_repetition_penalties.py | …
test_cutlass_mla_decode.py | …
test_flex_attention.py | …
test_fused_quant_activation.py | …
test_triton_flash_attention.py | …
utils.py | [Misc] Add unit tests for MoE ModularKernel combinations + Profiling utility (#20449) | 2025-07-11 07:51:46 -07:00