biondizzle/vllm
54bd9a03c4b2da0fd0b0e17b0552bbb0d517a081
vllm/docs/source/quantization
Woosuk Kwon e20233d361 Revert "[Doc] Update supported_hardware.rst (#7276)" (#7467)
2024-08-13 01:37:08 -07:00
File | Last commit | Date
auto_awq.rst | [Docs] Update the AWQ documentation to highlight performance issue (#1883) | 2023-12-02 15:52:47 -08:00
bnb.rst | [bitsandbytes]: support read bnb pre-quantized model (#5753) | 2024-07-23 23:45:09 +00:00
fp8_e4m3_kvcache.rst | [Core/Bugfix] Add FP8 K/V Scale and dtype conversion for prefix/prefill Triton Kernel (#7208) | 2024-08-12 22:47:41 +00:00
fp8_e5m2_kvcache.rst | [Core/Bugfix] Add FP8 K/V Scale and dtype conversion for prefix/prefill Triton Kernel (#7208) | 2024-08-12 22:47:41 +00:00
fp8.rst | [Kernel] Expand FP8 support to Ampere GPUs using FP8 Marlin (#5975) | 2024-07-03 17:38:00 +00:00
supported_hardware.rst | Revert "[Doc] Update supported_hardware.rst (#7276)" (#7467) | 2024-08-13 01:37:08 -07:00