[Frontend] Add backend-specific options for guided decoding (#13505)

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Joe Runde
2025-02-20 13:07:58 -07:00
committed by GitHub
parent 6a417b8600
commit bfbc0b32c6
8 changed files with 123 additions and 42 deletions

@@ -16,7 +16,7 @@ The following parameters are supported, which must be added as extra parameters:
- `guided_json`: the output will follow the JSON schema.
- `guided_grammar`: the output will follow the context free grammar.
- `guided_whitespace_pattern`: used to override the default whitespace pattern for guided json decoding.
-- `guided_decoding_backend`: used to select the guided decoding backend to use.
+- `guided_decoding_backend`: used to select the guided decoding backend to use. Additional backend-specific options can be supplied in a comma-separated list following a colon after the backend name. For example, `"xgrammar:no-fallback"` will not allow vLLM to fall back to a different backend on error.
You can see the complete list of supported parameters on the [OpenAI-Compatible Server](#openai-compatible-server) page.
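
The `backend:option1,option2` format described in the changed line can be illustrated with a small sketch. The helper `split_backend_spec` and the `extra_params` dictionary are hypothetical names introduced here for illustration only; this is not vLLM's own parsing code, just one plausible reading of the format under the assumption that options are plain comma-separated tokens after the first colon.

```python
def split_backend_spec(spec: str) -> tuple[str, list[str]]:
    """Split 'name:opt1,opt2' into ('name', ['opt1', 'opt2']).

    Hypothetical helper: mirrors the documented string format,
    not vLLM's actual implementation.
    """
    # Everything before the first colon is the backend name.
    name, _, raw_options = spec.partition(":")
    # Options are comma-separated; empty entries are ignored.
    options = [opt.strip() for opt in raw_options.split(",") if opt.strip()]
    return name.strip(), options


# Assumed shape of the extra parameters a client might attach to a request.
extra_params = {
    "guided_json": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
    },
    "guided_decoding_backend": "xgrammar:no-fallback",
}

backend, options = split_backend_spec(extra_params["guided_decoding_backend"])
print(backend, options)
```

A backend name with no colon yields an empty option list in this sketch, so a plain `"xgrammar"` value keeps working unchanged.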