biondizzle / vllm
vllm / model_executor at commit a921d8be9dd8b98795b4d8076f3af4f48dc3d24d

Latest commit: 7d761fe3c1 by Zhuohan Li, [FIX] Fix the case when input_is_parallel=False for ScaledActivation (#1737), 2023-11-20 23:56:48 -08:00
File              | Last commit                                                                                                                          | Date
layers            | [FIX] Fix the case when input_is_parallel=False for ScaledActivation (#1737)                                                         | 2023-11-20 23:56:48 -08:00
models            | [BugFix] Fix TP support for AWQ (#1731)                                                                                              | 2023-11-20 21:42:45 -08:00
parallel_utils    | TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622) | 2023-11-15 22:50:41 -08:00
__init__.py       | [Quality] Add code formatter and linter (#326)                                                                                       | 2023-07-03 11:31:55 -07:00
input_metadata.py | Delay GPU->CPU sync in sampling (#1337)                                                                                              | 2023-10-30 09:01:34 -07:00
model_loader.py   | Migrate linter from pylint to ruff (#1665)                                                                                           | 2023-11-20 11:58:01 -08:00
utils.py          | TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622) | 2023-11-15 22:50:41 -08:00
weight_utils.py   | [BugFix] Fix a bug in loading safetensors (#1732)                                                                                    | 2023-11-20 15:51:18 -08:00