vllm/vllm/model_executor at bb00f66e19acdf6cb614683ab74f777ed3932eee

biondizzle/vllm

Woosuk Kwon bb00f66e19 Use quantization_config in hf config (#1695)

2023-11-17 16:23:49 -08:00

Support Min P Sampler (#1642)

2023-11-17 16:20:49 -08:00

Support Microsoft Phi 1.5 (#1664)

2023-11-16 14:28:39 -08:00

TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622)

2023-11-15 22:50:41 -08:00

__init__.py

[Quality] Add code formatter and linter (#326)

2023-07-03 11:31:55 -07:00

input_metadata.py

Delay GPU->CPU sync in sampling (#1337)

2023-10-30 09:01:34 -07:00

model_loader.py

Use quantization_config in hf config (#1695)

2023-11-17 16:23:49 -08:00

utils.py

TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622)

2023-11-15 22:50:41 -08:00

weight_utils.py

Use quantization_config in hf config (#1695)

2023-11-17 16:23:49 -08:00