biondizzle/vllm
c09dade2a263b6f684d2fbf390c9c1c64761e953
vllm/vllm/model_executor
Michael Goin c09dade2a2 [Misc][Breaking] Change FP8 checkpoint format from act_scale -> input_scale (#5353)
2024-06-08 13:54:05 -04:00
guided_decoding | [Frontend][Core] Update Outlines Integration from FSM to Guide (#4109) | 2024-06-05 16:49:12 -07:00
layers | [Misc][Breaking] Change FP8 checkpoint format from act_scale -> input_scale (#5353) | 2024-06-08 13:54:05 -04:00
model_loader | [Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776) | 2024-06-01 14:51:10 -06:00
models | [Misc][Breaking] Change FP8 checkpoint format from act_scale -> input_scale (#5353) | 2024-06-08 13:54:05 -04:00
__init__.py | [Core] Refactor Attention Take 2 (#3462) | 2024-03-25 04:39:33 +00:00
custom_op.py | [Misc] Add CustomOp interface for device portability (#5255) | 2024-06-05 09:18:19 -07:00
pooling_metadata.py | [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) | 2024-05-11 11:30:37 -07:00
sampling_metadata.py | [Core] Avoid copying prompt/output tokens if no penalties are used (#5289) | 2024-06-06 18:12:00 -07:00
utils.py | [Hardware][Neuron] Refactor neuron support (#3471) | 2024-03-22 01:22:17 +00:00