vllm/vllm/entrypoints/openai
Commit: 7b86e7c9cd6541abdf5d083b0a8a98ee667a91d1

Latest commit: 654bc5ca49 Support for guided decoding for offline LLM (#6878)
Author: Yihuan Bu
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Date: 2024-08-04 03:12:09 +00:00

Name                     Last commit                                                         Date
rpc/                     [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)  2024-08-02 18:27:28 -07:00
__init__.py              Change the name to vLLM (#150)                                      2023-06-17 03:07:40 -07:00
api_server.py            [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)  2024-08-02 18:27:28 -07:00
cli_args.py              [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)  2024-08-02 18:27:28 -07:00
logits_processors.py     [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)  2024-08-02 18:27:28 -07:00
protocol.py              Support for guided decoding for offline LLM (#6878)                 2024-08-04 03:12:09 +00:00
run_batch.py             [Frontend] Refactor prompt processing (#4028)                       2024-07-22 10:13:53 -07:00
serving_chat.py          [Frontend] Factor out chat message parsing (#7055)                  2024-08-02 21:31:27 -07:00
serving_completion.py    [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)  2024-08-02 18:27:28 -07:00
serving_embedding.py     [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)  2024-08-02 18:27:28 -07:00
serving_engine.py        [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)  2024-08-02 18:27:28 -07:00
serving_tokenization.py  [Frontend] Factor out chat message parsing (#7055)                  2024-08-02 21:31:27 -07:00