vllm/vllm/model_executor at c09dade2a263b6f684d2fbf390c9c1c64761e953

biondizzle/vllm

Michael Goin c09dade2a2 [Misc][Breaking] Change FP8 checkpoint format from act_scale -> input_scale (#5353)

2024-06-08 13:54:05 -04:00

guided_decoding

[Frontend][Core] Update Outlines Integration from FSM to Guide (#4109)

2024-06-05 16:49:12 -07:00

[Misc][Breaking] Change FP8 checkpoint format from act_scale -> input_scale (#5353)

2024-06-08 13:54:05 -04:00

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

[Misc][Breaking] Change FP8 checkpoint format from act_scale -> input_scale (#5353)

2024-06-08 13:54:05 -04:00

__init__.py

[Core] Refactor Attention Take 2 (#3462)

2024-03-25 04:39:33 +00:00

custom_op.py

[Misc] Add CustomOp interface for device portability (#5255)

2024-06-05 09:18:19 -07:00

pooling_metadata.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

sampling_metadata.py

[Core] Avoid copying prompt/output tokens if no penalties are used (#5289)

2024-06-06 18:12:00 -07:00

utils.py

[Hardware][Neuron] Refactor neuron support (#3471)

2024-03-22 01:22:17 +00:00