biondizzle/vllm
4efbac6d3593ed35fd5b6ccb3958bd96b2c9b4da
vllm/vllm/engine
Woosuk Kwon  a463c333dd  Use CuPy for CUDA graphs (#2811)  2024-02-13 11:32:06 -08:00
File                 Last commit                                                     Date
__init__.py          Change the name to vLLM (#150)                                  2023-06-17 03:07:40 -07:00
arg_utils.py         Remove hardcoded device="cuda" to support more devices (#2503)  2024-02-01 15:46:39 -08:00
async_llm_engine.py  fix some bugs (#2689)                                           2024-01-31 10:09:23 -08:00
llm_engine.py        Use CuPy for CUDA graphs (#2811)                                2024-02-13 11:32:06 -08:00
metrics.py           Refactor Prometheus and Add Request Level Metrics (#2316)       2024-01-31 14:58:07 -08:00
ray_utils.py         [Ray] Integration compiled DAG off by default (#2471)           2024-02-08 09:57:25 -08:00