biondizzle / vllm
Files at commit 83609ca91d42c8847d1b4c272b011a0b6c27319e
Path: vllm/model_executor/layers/quantization/compressed_tensors
Latest commit: e9b92dcd89 by bnellnm
[Kernels] Overlap shared experts with send/recv (#23273)
Signed-off-by: Bill Nell <bnell@redhat.com>
2025-09-03 12:35:18 -04:00
Name                        Last commit message                                                             Last commit date
schemes/                    [quantization] use channel scales for w4a8 + misc fixes (#23570)                2025-08-26 18:23:23 -07:00
transform/                  fix some typos (#24071)                                                         2025-09-02 20:44:50 -07:00
__init__.py                 [Kernel] Initial Activation Quantization Support (#4525)                        2024-05-23 21:29:18 +00:00
compressed_tensors_moe.py   [Kernels] Overlap shared experts with send/recv (#23273)                        2025-09-03 12:35:18 -04:00
compressed_tensors.py       [Bugfix] Fix transform_config parsing in Compressed Tensors (#23945)            2025-09-02 13:54:10 -04:00
triton_scaled_mm.py         [AMD][Kernel][BugFix] fix test_rocm_compressed_tensors_w8a8 for rocm (#19509)   2025-06-12 07:14:24 +00:00
utils.py                    [Doc]: fix typos in Python comments (#24093)                                    2025-09-02 21:05:45 -07:00