vllm/vllm/platforms at e23564cb703916efef20d80fd1c32dd76dee0979

biondizzle/vllm

bnellnm f9c069c85e Modularize fused experts and integrate PPLX kernels (#15956)

2025-05-14 13:11:54 -07:00

__init__.py

Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)

2025-05-07 00:07:30 -07:00

cpu.py

[Misc] Auto fallback to float16 for pre-Ampere GPUs when detected bfloat16 config (#17265)

2025-05-09 17:16:12 +00:00

cuda.py

Modularize fused experts and integrate PPLX kernels (#15956)

2025-05-14 13:11:54 -07:00

hpu.py

[Hardware][Intel-Gaudi] Multi-step scheduling implementation for HPU (#12779)

2025-04-11 07:38:36 -07:00

interface.py

Update deprecated type hinting in platform, plugins, triton_utils, vllm_flash_attn (#18129)

2025-05-14 05:28:16 -07:00

neuron.py

Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)

2025-05-07 00:07:30 -07:00

rocm.py

Update deprecated type hinting in platform, plugins, triton_utils, vllm_flash_attn (#18129)

2025-05-14 05:28:16 -07:00

tpu.py

Update deprecated type hinting in platform, plugins, triton_utils, vllm_flash_attn (#18129)

2025-05-14 05:28:16 -07:00

xpu.py

[Hardware] add platform-specific request validation api (#16291)

2025-04-09 12:50:01 -07:00