61 Commits

Author SHA1 Message Date
Joe Runde
bfbc0b32c6 [Frontend] Add backend-specific options for guided decoding (#13505)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2025-02-20 15:07:58 -05:00
Lu Fang
6224a9f620 Support logit_bias in v1 Sampler (#13079) 2025-02-14 04:34:59 -08:00
Russell Bryant
e489ad7a21 [Misc] Add SPDX-License-Identifier headers to python source files (#12628)
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**

commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:18:24 2025 -0500

    Add SPDX license headers to python source files
    
    This commit adds SPDX license headers to python source files as
    recommended to the project by the Linux Foundation. These headers
    provide a concise way that is both human and machine readable for
    communicating license information for each source file. It helps
    avoid any ambiguity about the license of the code and can also be
    easily used by tools to help manage license compliance.
    
    The Linux Foundation runs license scans against the codebase to help
    ensure we are in compliance with the licenses of the code we use,
    including dependencies. Having these headers in place helps that tool
    do its job.
    
    More information can be found on the SPDX site:
    
    - https://spdx.dev/learn/handling-license-info/
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:36:32 2025 -0500

    Check for SPDX headers using pre-commit
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

---------

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
Michael Goin
74fa1d123c [Bugfix] Fix OpenAI parallel sampling when using xgrammar (#11637)
Signed-off-by: mgoin <michael@neuralmagic.com>
2024-12-31 03:43:54 +00:00
jianzheng
8db957ee3a [bugfix] fixed parameter “n” when set parameter “bestof” > 1 (#10854)
Signed-off-by: jianzheng <57654625+o2363286@users.noreply.github.com>
2024-12-04 08:48:22 +00:00
Joe Runde
031a7995f3 [Bugfix][Frontend] Reject guided decoding in multistep mode (#9892)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-11-01 01:09:46 +00:00
Zhong Qishuai
ef7865b4f9 [Frontend] re-enable multi-modality input in the new beam search implementation (#9427)
Signed-off-by: Qishuai Ferdinandzhong@gmail.com
2024-10-29 11:49:47 +00:00
Vasiliy Alekseev
07e981fdf4 [Frontend] Bad words sampling parameter (#9717)
Signed-off-by: Vasily Alexeev <alvasian@yandex.ru>
2024-10-26 16:29:38 +00:00
Nick Hill
1325872ec8 [Frontend] Avoid creating guided decoding LogitsProcessor unnecessarily (#9521) 2024-10-18 20:21:01 -07:00
youkaichao
cbc2ef5529 [misc] hide best_of from engine (#9261)
Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>
2024-10-10 21:30:44 -07:00
Travis Johnson
480b7f40cf [Misc] Improve validation errors around best_of and n (#9167)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
2024-10-09 04:54:48 +00:00
youkaichao
18b296fdb2 [core] remove beam search from the core (#9105) 2024-10-07 05:47:04 +00:00
Brendan Wong
168cab6bbf [Frontend] API support for beam search (#9087)
Co-authored-by: youkaichao <youkaichao@126.com>
2024-10-05 23:39:03 -07:00
Joe Runde
062c89e7c9 [Frontend][Core] Move guided decoding params into sampling params (#8252)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
2024-10-01 09:34:25 +08:00
youkaichao
1e7d5c01f5 [misc] soft drop beam search (#8763) 2024-09-24 15:48:39 -07:00
saumya-saran
b28298f2f4 [Bugfix] Validate SamplingParam n is an int (#8548) 2024-09-20 12:46:02 -07:00
Nick Hill
551ce01078 [Core] Add engine option to return only deltas or final output (#7381) 2024-09-12 12:02:00 -07:00
Cyrus Leung
baaedfdb2d [mypy] Enable following imports for entrypoints (#7248)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Fei <dfdfcai4@gmail.com>
2024-08-20 23:28:21 -07:00
SangBin Cho
ff7ec82c4d [Core] Optimize SPMD architecture with delta + serialization optimization (#7109) 2024-08-18 17:57:20 -07:00
Chang Su
c134a46402 Fix empty output when temp is too low (#2937)
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2024-08-14 05:31:44 +00:00
Atilla Akkuş
7b261092de [BUGFIX]: top_k is expected to be an integer. (#7227) 2024-08-07 00:32:16 -07:00
Peng Guanwen
db9e5708a9 [Core] Reduce unnecessary compute when logprobs=None (#6532) 2024-07-29 16:47:31 +00:00
Woosuk Kwon
bdf5fd1386 [Misc] Remove deprecation warning for beam search (#6659) 2024-07-23 00:21:58 +00:00
Simon Mo
32c9d7f765 Report usage for beam search (#6404) 2024-07-14 19:37:35 -07:00
Woosuk Kwon
eeceadaecc [Misc] Add deprecation warning for beam search (#6402) 2024-07-13 11:52:22 -07:00
Nick Hill
365791ff81 [BugFix] Fix min_tokens behaviour for multiple eos tokens (#5849) 2024-06-27 11:31:11 -07:00
Elisei Smirnov
e3470f8753 [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985)
Co-authored-by: Elisei Smirnov <el.smirnov@innopolis.university>
2024-05-23 22:04:24 +00:00
sasha0552
69909126a7 [Bugfix] Use random seed if seed is -1 (#4531) 2024-05-01 10:41:17 -07:00
Li, Jiang
dd1a50a8bc [Bugfix][Minor] Make ignore_eos effective (#4468) 2024-04-30 16:33:33 -07:00
Nick Hill
81661da7b2 [BugFix] Fix min_tokens when eos_token_id is None (#4389)
Co-authored-by: DefTruth <31974251+deftruth@users.noreply.github.com>
2024-04-27 09:52:46 -07:00
Simon Mo
a134ef6f5e Support eos_token_id from generation_config.json (#4182) 2024-04-19 04:13:36 +00:00
SangBin Cho
09473ee41c [mypy] Add mypy type annotation part 1 (#4006) 2024-04-12 14:35:50 -07:00
Nick Hill
e46a60aa4c [BugFix] Fix handling of stop strings and stop token ids (#3672) 2024-04-11 15:34:12 -07:00
Thomas Parnell
1d7c940d74 Add option to completion API to truncate prompt tokens (#3144) 2024-04-05 10:15:42 -07:00
Matthias Gerstgrasser
aabe8f40f2 [Core] [Frontend] Make detokenization optional (#3749)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
2024-04-03 21:52:18 -07:00
Travis Johnson
c13ad1b7bd feat: implement the min_tokens sampling parameter (#3124)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
2024-03-25 10:14:26 -07:00
Zhuohan Li
2f8844ba08 Re-enable the 80 char line width limit (#3305) 2024-03-10 19:49:14 -07:00
Nick Hill
29a8d6a554 [Fix] Don't deep-copy LogitsProcessors when copying SamplingParams (#3099) 2024-02-29 19:20:42 +00:00
Nick Hill
7d2dcce175 Support per-request seed (#2514) 2024-02-21 11:47:00 -08:00
Nikola Borisov
3209b49033 [Bugfix] fix crash if max_tokens=None (#2570) 2024-01-23 22:38:55 -08:00
Roy
9140561059 [Minor] Fix typo and remove unused code (#2305) 2024-01-02 19:23:15 -08:00
Yunfeng Bai
c06170cc8e Add a flag to include stop string in output text (#1976) 2023-12-15 00:45:58 -08:00
Roy
60dc62dc9e add custom server params (#1868) 2023-12-03 12:59:18 -08:00
Jerry
f86bd6190a Fix the typo in SamplingParams' docstring (#1886) 2023-12-01 02:06:36 -08:00
ljss
de23687d16 Fix repetition penalty aligned with huggingface (#1577) 2023-11-22 14:41:44 -08:00
ljss
4cea74c73b Set top_p=0 and top_k=-1 in greedy sampling (#1748) 2023-11-22 12:51:09 -08:00
陈序
094f716bf2 Add stop_token_ids in SamplingParams.__repr__ (#1745) 2023-11-21 20:13:53 -08:00
Roy
e87557b069 Support Min P Sampler (#1642) 2023-11-17 16:20:49 -08:00
Noam Gat
555bdcc5a3 Added logits processor API to sampling params (#1469) 2023-11-03 14:12:15 -07:00
Dan Lord
7013a80170 Add support for spaces_between_special_tokens 2023-10-30 16:52:56 -07:00