Update bnb.md with example for OpenAI (#11718)

Alberto Ferrer
2025-01-04 00:29:02 -06:00
committed by GitHub
parent 9c93636d84
commit d1d49397e7


@@ -37,3 +37,10 @@ model_id = "huggyllama/llama-7b"
llm = LLM(model=model_id, dtype=torch.bfloat16, trust_remote_code=True, \
quantization="bitsandbytes", load_format="bitsandbytes")
```
## OpenAI Compatible Server
When serving a 4-bit model, append the following arguments to your server launch command:
```console
--quantization bitsandbytes --load-format bitsandbytes
```
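
As a rough illustration of where these flags go, the sketch below assembles a full launch command in Python. It assumes the server is started with the `vllm serve` entry point and reuses the `huggyllama/llama-7b` model ID from the example above; the `build_serve_command` helper and the `extra_args` parameter are hypothetical names introduced only for this example.

```python
import shlex

# Flags taken from this section; everything else below is illustrative.
BNB_FLAGS = ["--quantization", "bitsandbytes", "--load-format", "bitsandbytes"]


def build_serve_command(model_id, extra_args=()):
    """Return an argv list with the bitsandbytes flags appended last.

    Assumption: the server is launched via `vllm serve <model>`.
    """
    return ["vllm", "serve", model_id, *extra_args, *BNB_FLAGS]


# Model ID reused from the offline example; substitute your own 4-bit model.
cmd = build_serve_command("huggyllama/llama-7b")

# Print a shell-ready command line for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` to start the server, or the printed line can be pasted into a shell. Keeping the flags in one constant makes it harder to forget one of the pair.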