biondizzle/vllm
vllm/tests/spec_decode/e2e at commit d4201e06d5ec3384e06b20816ad6b2f1d1fb1441
Latest commit: d4201e06d5 by Thomas Parnell, 2024-07-18 19:22:08 -07:00
[Bugfix] Make spec. decode respect per-request seed. (#6034)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [Speculative decoding 7/9] Speculative decoding end-to-end correctness tests. (#3951) | 2024-04-23 08:02:36 +00:00 |
| `conftest.py` | [Bugfix] Make spec. decode respect per-request seed. (#6034) | 2024-07-18 19:22:08 -07:00 |
| `test_compatibility.py` | [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840) | 2024-05-16 00:53:51 -07:00 |
| `test_integration_dist_tp2.py` | [Speculative Decoding] MLPSpeculator Tensor Parallel support (1/2) (#6050) | 2024-07-02 07:20:29 -07:00 |
| `test_integration_dist_tp4.py` | [Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414) | 2024-06-25 09:56:06 +00:00 |
| `test_integration.py` | [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840) | 2024-05-16 00:53:51 -07:00 |
| `test_logprobs.py` | [Speculative decoding] Support target-model logprobs (#4378) | 2024-05-03 15:52:01 -07:00 |
| `test_medusa_correctness.py` | [Speculative Decoding] Medusa Implementation with Top-1 proposer (#4978) | 2024-07-09 18:34:02 -07:00 |
| `test_mlp_correctness.py` | [CORE] Quantized lm-head Framework (#4442) | 2024-07-02 22:25:17 +00:00 |
| `test_multistep_correctness.py` | [Misc] Log spec decode metrics (#6454) | 2024-07-16 20:37:10 +00:00 |
| `test_ngram_correctness.py` | [Dynamic Spec Decoding] Minor fix for disabling speculative decoding (#5000) | 2024-05-25 10:00:14 -07:00 |
| `test_seed.py` | [Bugfix] Make spec. decode respect per-request seed. (#6034) | 2024-07-18 19:22:08 -07:00 |