Commit Graph

35 Commits

436214bb72 Use PyPI triton wheel instead of building (QEMU segfaults)
Triton 3.6.0 ships an official aarch64 wheel on PyPI;
building Triton from source segfaults under QEMU emulation.
2026-04-02 23:58:20 +00:00
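The switch above amounts to installing the published wheel rather than compiling it; a minimal sketch, with the version pin taken from the commit message (the command is illustrative, not copied from the repo):

```shell
# Install the prebuilt aarch64 Triton wheel from PyPI instead of
# building from source, which segfaults under QEMU emulation here.
pip install "triton==3.6.0"
```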
e5445512aa Reduce MAX_JOBS by half to reduce QEMU memory pressure
- xformers: 6 -> 3
- flash-attention: 8 -> 4
- vllm: 8 -> 4

Testing if lower parallelism helps avoid segfaults under emulation
2026-04-02 23:44:11 +00:00
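For torch-extension builds such as xformers, flash-attention, and vllm, the knob being halved is the MAX_JOBS environment variable, which caps the number of parallel compile jobs the build spawns. A minimal sketch of the new setting:

```shell
# Halved parallelism per the commit: flash-attention/vllm 8 -> 4
# (xformers would use 3). Fewer concurrent compilers means less
# resident memory under QEMU emulation.
export MAX_JOBS=4
echo "MAX_JOBS=$MAX_JOBS"
```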
4f94431af6 Revert CC/CXX to full paths, keep QEMU_CPU=max 2026-04-02 22:50:23 +00:00
866c9d9db8 Add QEMU_CPU=max for better emulation compatibility during cross-compilation 2026-04-02 22:47:53 +00:00
2ed1b1e2dd Fix: use CC=gcc CXX=g++ instead of full paths for QEMU compatibility 2026-04-02 22:47:22 +00:00
14467bef70 Fix: add --no-build-isolation to pip wheel for flash-attention
Without this flag, pip runs the build in an isolated environment
that doesn't have access to torch in the venv.
2026-04-02 20:55:32 +00:00
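The fix can be sketched as below; the venv path and wheel directory are assumptions for illustration, not taken from the repo:

```shell
# With default build isolation, pip compiles flash-attn in a throwaway
# environment that lacks the venv's torch, so its setup.py fails to
# import torch. --no-build-isolation builds against the active env.
. .venv/bin/activate                                  # hypothetical venv path
pip wheel --no-build-isolation --wheel-dir wheels/ flash-attn
```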
82b2ceacd5 Update build history and fix pip command docs 2026-04-02 20:24:26 +00:00
8f870921f8 Fix: use 'pip wheel' instead of 'uv pip wheel' (uv has no wheel subcommand) 2026-04-02 20:22:11 +00:00
9da93ec625 Fix setuptools pin and flash-attention build for GH200
- Pin setuptools>=77.0.3,<81.0.0 for LMCache compatibility
- Use 'uv pip wheel' instead of 'pip3 wheel' for flash-attention (torch is in venv)
- Add CLAWMINE.md with build pipeline documentation
2026-04-02 20:19:39 +00:00
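The pin in the first bullet would look like this as a requirements fragment (file name assumed for illustration):

```text
# requirements.txt fragment: keep setuptools in the range LMCache accepts
setuptools>=77.0.3,<81.0.0
```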
5fa395825a Updated to vLLM v0.11.1rc3 (Rajesh Shashi Kumar) 2025-10-23 18:16:57 +00:00
0814f059f5 Updated to v0.11.1rc3 (Rajesh Shashi Kumar) 2025-10-23 18:11:41 +00:00
3c4796ed55 Updated for CUDA 13 (Rajesh Shashi Kumar) 2025-10-21 19:21:13 +00:00
ebcdb4ab50 Updates for PyTorch 2.9, CUDA13 (Rajesh Shashi Kumar) 2025-10-20 20:16:06 +00:00
02430037ea Updated for v0.11.0 (Rajesh Shashi Kumar) 2025-10-16 01:08:21 +00:00
31f4489d1f Update README.md (Rajesh Shashi Kumar) 2025-09-24 01:43:49 -05:00
201bbf5379 v0.10.2 cleanup (Rajesh Shashi Kumar) 2025-09-24 06:14:16 +00:00
fc321295f1 Updated for vllm v0.10.2 (Rajesh Shashi Kumar) 2025-09-24 05:52:11 +00:00
daf345024b Updated for v0.10.0 (Rajesh Shashi Kumar) 2025-08-20 21:02:46 +00:00
23267e4bf5 v0.9.1+ vLLM with FlashInfer (Rajesh Shashi Kumar) 2025-06-25 20:03:20 +00:00
64ab367973 v0.9.1 (Rajesh Shashi Kumar) 2025-06-24 23:33:46 +00:00
3d7f1ed454 vllm 0.9.0.1 (Rajesh Shashi Kumar) 2025-06-18 21:49:59 +00:00
713775c491 Updates for vllm 0.9.0.1 (Rajesh Shashi Kumar) 2025-06-04 15:28:22 +00:00
c36ff9ee0e Updated (Rajesh Shashi Kumar) 2025-06-04 04:47:47 +00:00
3d115911aa 0.9.0.1 (Rajesh Shashi Kumar) 2025-06-04 03:22:03 +00:00
3ea7d34e83 Merge branch 'main' of https://github.com/rajesh-s/containers (Rajesh Shashi Kumar) 2025-06-03 22:34:34 +00:00
d30802ef41 Updated for vllm 0.9.0.1 (Rajesh Shashi Kumar) 2025-06-03 22:34:21 +00:00
b4ae9077ae Create native_build.sh (Rajesh Shashi Kumar) 2025-05-29 14:34:50 -05:00
87c6773c8f v0.8.4 (Rajesh Shashi Kumar) 2025-05-27 20:34:14 +00:00
e205f17e2e Added nsys (Rajesh Shashi Kumar) 2025-04-17 18:46:25 +00:00
256272732d Fixed numpy version (Rajesh Shashi Kumar) 2025-04-07 18:38:30 +00:00
4d0dc5d06f numpy version fix (Rajesh Shashi Kumar) 2025-04-03 20:35:56 +00:00
75e33490bd Working version with vLLM+LMCache (Rajesh Shashi Kumar) 2025-04-01 23:34:16 +00:00
c63afb3d35 Working version with vLLM+LMCache (Rajesh Shashi Kumar) 2025-04-01 23:33:43 +00:00
57ceca8b4f vllm docker 0.8.1 with lmcache (Ubuntu) 2025-04-01 20:44:21 +00:00
9f2769285a Initial commit (Rajesh Shashi Kumar) 2025-03-28 14:31:17 -05:00