Woosuk Kwon | 4338cc4750 | [Tokenizer] Add an option to specify tokenizer (#284) | 2023-06-28 09:46:58 -07:00
Jishnu Ray Chowdhury | bdd6b4c8bc | Add LLM.set_tokenizer (#283) | 2023-06-28 00:28:29 -07:00
twaka | 4026a049d3 | expand coverage of gpt2 model loading (#271) | 2023-06-27 06:27:41 -07:00
Woosuk Kwon | 526df28fb2 | [BugFix] Fix a bug in counting running sequences (#266) | 2023-06-26 13:09:02 -07:00
Zhuohan Li | 0b7db411b5 | [Bug] Fix the OOM condition for CPU cache (#260) | 2023-06-26 11:16:13 -07:00
BasicCoder | 471a7a4566 | Compatible with Decapoda Research llama hf version (#251) | 2023-06-26 09:23:57 -07:00
metacryptom | 0603379863 | fix wrong using getattr to get dict value (#232) | 2023-06-24 22:00:24 -07:00
Michael Feil | 298695b766 | GPTBigCode (StarCoder, SantaCoder Support) (#209) | 2023-06-23 01:49:27 +08:00
Zhuohan Li | 83658c8ace | Bump up version to 0.1.1 (#204) | 2023-06-22 15:33:32 +08:00
Zhuohan Li | 1d24ccb96c | [Fix] Better error message when there is OOM during cache initialization (#203) | 2023-06-22 15:30:06 +08:00
Woosuk Kwon | 14f0b39cda | [Bugfix] Fix a bug in RequestOutput.finished (#202) | 2023-06-22 00:17:24 -07:00
Zhuohan Li | 2e0d314384 | fix-ray (#193) | 2023-06-22 00:21:41 +08:00
Woosuk Kwon | 67d96c29fb | Use slow tokenizer for open llama models (#168) | 2023-06-20 14:19:47 +08:00
Woosuk Kwon | 7e2a913c64 | [Minor] Fix CompletionOutput.__repr__ (#157) | 2023-06-18 19:58:25 -07:00
Woosuk Kwon | 3f92038b99 | Add comments on swap space (#154) | 2023-06-18 11:39:35 -07:00
Zhuohan Li | bf5f121c02 | Reduce GPU memory utilization to make sure OOM doesn't happen (#153) | 2023-06-18 17:33:50 +08:00
Zhuohan Li | bec7b2dc26 | Add quickstart guide (#148) | 2023-06-18 01:26:12 +08:00
Woosuk Kwon | 0b98ba15c7 | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00