youkaichao
|
3aec49e56f
|
[ci/build] update nightly torch for gh200 test (#12270)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-21 23:03:17 +08:00 |
|
Cyrus Leung
|
6cd40a5bfe
|
[Doc][4/N] Reorganize API Reference (#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 21:34:44 +08:00 |
|
Harry Mellor
|
aba8d6ee00
|
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 13:09:53 +00:00 |
|
Nathan Azrak
|
68d37809b9
|
[Misc] Minimum requirements for SageMaker compatibility (#11576)
|
2025-01-02 15:59:25 -08:00 |
|
Rafael Vasquez
|
32aa2059ad
|
[Docs] Convert rST to MyST (Markdown) (#11145)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-12-23 22:35:38 +00:00 |
|
Jiaxin Shan
|
47a0b615b4
|
Add ray[default] to wget to run distributed inference out of box (#11265)
Signed-off-by: Jiaxin Shan <seedjeffwan@gmail.com>
|
2024-12-20 13:54:55 -08:00 |
|
omer-dayan
|
995f56236b
|
[Core] Loading model from S3 using RunAI Model Streamer as optional loader (#10192)
Signed-off-by: OmerD <omer@run.ai>
|
2024-12-20 16:46:24 +00:00 |
|
youkaichao
|
7801f56ed7
|
[ci][gh200] dockerfile clean up (#11351)
Signed-off-by: drikster80 <ed.sealing@gmail.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: drikster80 <ed.sealing@gmail.com>
Co-authored-by: cenzhiyao <2523403608@qq.com>
|
2024-12-19 18:13:06 -08:00 |
|
cennn
|
b3b1526f03
|
WIP: [CI/Build] simplify Dockerfile build for ARM64 / GH200 (#11212)
Signed-off-by: drikster80 <ed.sealing@gmail.com>
Co-authored-by: drikster80 <ed.sealing@gmail.com>
|
2024-12-16 09:20:49 +00:00 |
|
Jee Jee Li
|
15859f2357
|
[[Misc]Upgrade bitsandbytes to the latest version 0.45.0 (#11201)
|
2024-12-15 03:03:06 +00:00 |
|
youkaichao
|
334d64d1e8
|
[ci] add vllm_test_utils (#10659)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-26 00:20:04 -08:00 |
|
Michael Goin
|
aea6ad629f
|
Add hf_transfer to testing image (#10096)
|
2024-11-08 08:35:25 +00:00 |
|
Joe Runde
|
d58268c56a
|
[V1] Make v1 more testable (#9888)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-11-06 11:57:35 -08:00 |
|
Nikita Furin
|
1b73ab2a1f
|
[CI/Build] Quoting around > (#9956)
|
2024-11-02 12:50:28 -07:00 |
|
Daniele
|
a2c71c5405
|
[CI/Build] remove .github from .dockerignore, add dirty repo check (#9375)
Create Release / Create Release (push) Has been cancelled
Create Release / Build Wheel (11.8, ubuntu-20.04, 3.10, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (11.8, ubuntu-20.04, 3.11, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (11.8, ubuntu-20.04, 3.12, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (11.8, ubuntu-20.04, 3.8, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (11.8, ubuntu-20.04, 3.9, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (12.1, ubuntu-20.04, 3.10, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (12.1, ubuntu-20.04, 3.11, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (12.1, ubuntu-20.04, 3.12, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (12.1, ubuntu-20.04, 3.8, 2.4.0) (push) Has been cancelled
Create Release / Build Wheel (12.1, ubuntu-20.04, 3.9, 2.4.0) (push) Has been cancelled
|
2024-10-17 10:25:06 -07:00 |
|
Daniele
|
203ab8f80f
|
[CI/Build] setuptools-scm fixes (#8900)
|
2024-10-14 11:34:47 -07:00 |
|
Michael Goin
|
d5fbb8706d
|
[CI/Build] Update Dockerfile install+deploy image to ubuntu 22.04 (#9130)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-10-09 12:51:47 -06:00 |
|
Peter Pan
|
cfba685bd4
|
[CI/Build] Add examples folder into Docker image so that we can leverage the templates*.jinja when serving models (#8758)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
|
2024-10-08 09:37:34 -07:00 |
|
Michael Goin
|
520db4dbc1
|
[Docs] Add README to the build docker image (#8825)
|
2024-09-26 11:02:52 -07:00 |
|
Tyler Michael Smith
|
f70bccac75
|
[Build/CI] Upgrade to gcc 10 in the base build Docker image (#8814)
|
2024-09-26 10:07:18 -07:00 |
|
Jee Jee Li
|
c6f2485c82
|
[[Misc]] Add extra deps for openai server image (#8792)
|
2024-09-25 09:35:23 -07:00 |
|
Daniele
|
ee5f34b1c2
|
[CI/Build] use setuptools-scm to set __version__ (#4738)
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-09-23 09:44:26 -07:00 |
|
Luka Govedič
|
71c60491f2
|
[Kernel] Build flash-attn from source (#8245)
|
2024-09-20 23:27:10 -07:00 |
|
Joe Runde
|
cca61642e0
|
[Bugfix] Fix 3.12 builds on main (#8510)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-09-17 00:01:45 +00:00 |
|
Simon Mo
|
5ce45eb54d
|
[misc] small qol fixes for release process (#8517)
|
2024-09-16 15:11:27 -07:00 |
|
Yangshen⚡Deng
|
6a512a00df
|
[model] Support for Llava-Next-Video model (#7559)
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-09-10 22:21:36 -07:00 |
|
Joe Runde
|
cfe712bf1a
|
[CI/Build] Use python 3.12 in cuda image (#8133)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-09-07 13:03:16 -07:00 |
|
Rui Qiao
|
de80783b69
|
[Misc] Use ray[adag] dependency instead of cuda (#7938)
|
2024-09-06 09:18:35 -07:00 |
|
TimWang
|
ccd7207191
|
chore: Update check-wheel-size.py to read MAX_SIZE_MB from env (#8103)
|
2024-09-03 23:17:05 -07:00 |
|
Lily Liu
|
e6a26ed037
|
[SpecDecode][Kernel] Flashinfer Rejection Sampling (#7244)
|
2024-09-01 21:23:29 -07:00 |
|
Mor Zusman
|
fdd9daafa3
|
[Kernel/Model] Migrate mamba_ssm and causal_conv1d kernels to vLLM (#7651)
|
2024-08-28 15:06:52 -07:00 |
|
Kevin H. Luu
|
666ad0aa16
|
[ci] Cleanup & refactor Dockerfile to pass different Python versions and sccache bucket via build args (#7705)
Signed-off-by: kevin <kevin@anyscale.com>
|
2024-08-22 20:10:55 +00:00 |
|
Peng Guanwen
|
f710fb5265
|
[Core] Use flashinfer sampling kernel when available (#7137)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2024-08-19 03:24:03 +00:00 |
|
Lily Liu
|
ec2affa8ae
|
[Kernel] Flashinfer correctness fix for v0.1.3 (#7319)
|
2024-08-12 07:59:17 +00:00 |
|
Rui Qiao
|
05308891e2
|
[Core] Pipeline parallel with Ray ADAG (#6837)
Support pipeline-parallelism with Ray accelerated DAG.
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
|
2024-08-02 13:55:40 -07:00 |
|
Sage Moore
|
7e0861bd0b
|
[CI/Build] Update PyTorch to 2.4.0 (#6951)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2024-08-01 11:11:24 -07:00 |
|
Jee Jee Li
|
7ecee34321
|
[Kernel][RFC] Refactor the punica kernel based on Triton (#5036)
|
2024-07-31 17:12:24 -07:00 |
|
youkaichao
|
5a96ee52a3
|
[ci][build] add back vim in docker (#6661)
|
2024-07-22 16:26:29 -07:00 |
|
Kevin H. Luu
|
69d5ae38dc
|
[ci] Use different sccache bucket for CUDA 11.8 wheel build (#6656)
Signed-off-by: kevin <kevin@anyscale.com>
|
2024-07-22 14:20:41 -07:00 |
|
youkaichao
|
e81522e879
|
[build] add ib in image for out-of-the-box infiniband support (#6599)
[build] add ib so that multi-node support with infiniband can be supported out-of-the-box (#6599)
|
2024-07-19 17:16:57 -07:00 |
|
Tyler Michael Smith
|
1689219ebf
|
[CI/Build] Build on Ubuntu 20.04 instead of 22.04 (#6517)
|
2024-07-18 17:29:25 -07:00 |
|
Pernekhan Utemuratov
|
a63a4c6341
|
[Misc] Use 0.0.9 version for flashinfer (#6447)
Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>
|
2024-07-15 10:10:26 -07:00 |
|
Robert Shaw
|
a754dc2cb9
|
[CI/Build] Cross python wheel (#6394)
|
2024-07-14 18:54:46 -07:00 |
|
youkaichao
|
ccd3c04571
|
[ci][build] fix commit id (#6420)
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2024-07-14 22:16:21 +08:00 |
|
Simon Mo
|
4f0e0ea131
|
Add FlashInfer to default Dockerfile (#6172)
|
2024-07-08 13:38:03 -07:00 |
|
Simon Mo
|
bc96d5c330
|
Move release wheel env var to Dockerfile instead (#6163)
|
2024-07-05 17:19:53 -07:00 |
|
Mor Zusman
|
9d6a8daa87
|
[Model] Jamba support (#4115)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: Erez Schwartz <erezs@ai21.com>
Co-authored-by: Mor Zusman <morz@ai21.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
Co-authored-by: Tomer Asida <tomera@ai21.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
|
2024-07-02 23:11:29 +00:00 |
|
zhyncs
|
f1e72cc19a
|
[BugFix] exclude version 1.15.0 for modelscope (#5668)
|
2024-06-21 13:15:48 -06:00 |
|
Kevin H. Luu
|
19091efc44
|
[ci] Setup Release pipeline and build release wheels with cache (#5610)
Signed-off-by: kevin <kevin@anyscale.com>
|
2024-06-18 11:00:36 -07:00 |
|
Antoni Baum
|
a8fda4f661
|
Seperate dev requirements into lint and test (#5474)
|
2024-06-13 11:22:41 -07:00 |
|