biondizzle/vllm
vllm/vllm/model_executor/layers/quantization/compressed_tensors @ d272415e57c95da63c798c22c7d87cc5c0cda21f
Latest commit: d272415e57 by Dipika Sikka, 2025-08-27 05:00:21 +00:00
[Quantization] Expand compressed-tensors MoE matching logic to support NFP4 + FP8 MoEs (#22674)
Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>
Signed-off-by: Dipika <dipikasikka1@gmail.com>
Name | Last commit | Date
schemes | [quantization] use channel scales for w4a8 + misc fixes (#23570) | 2025-08-26 18:23:23 -07:00
__init__.py | [Kernel] Initial Activation Quantization Support (#4525) | 2024-05-23 21:29:18 +00:00
compressed_tensors_moe.py | [Quantization] Expand compressed-tensors MoE matching logic to support NFP4 + FP8 MoEs (#22674) | 2025-08-27 05:00:21 +00:00
compressed_tensors.py | [Quantization] Expand compressed-tensors MoE matching logic to support NFP4 + FP8 MoEs (#22674) | 2025-08-27 05:00:21 +00:00
triton_scaled_mm.py | [AMD][Kernel][BugFix] fix test_rocm_compressed_tensors_w8a8 for rocm (#19509) | 2025-06-12 07:14:24 +00:00
utils.py | [Quantization] Add compressed-tensors NVFP4 support (#18312) | 2025-06-08 09:05:55 -04:00