vllm/examples/offline_inference at 432cf22a6a1f800cf64e79a706639dbe163fbc18 - vllm

Files

Nick Hill 15dac210f0 [V1] AsyncLLM data parallel (#13923)

Signed-off-by: Nick Hill <nhill@redhat.com>

2025-03-27 16:14:41 -07:00

basic

[Misc] improve example script output (#15528)

2025-03-26 10:12:47 +00:00

openai

…

profiling_tpu

…

audio_language.py

…

chat_with_tools.py

…

cpu_offload_lmcache.py

…

data_parallel.py

[V1] AsyncLLM data parallel (#13923)

2025-03-27 16:14:41 -07:00

disaggregated_prefill_lmcache.py

…

disaggregated_prefill.py

…

distributed.py

…

eagle.py

…

encoder_decoder_multimodal.py

…

encoder_decoder.py

…

llm_engine_example.py

…

lora_with_quantization_inference.py

[Misc] Clean up the BitsAndBytes arguments (#15140)

2025-03-20 19:17:12 -07:00

mistral-small.py

[Doc] Update Mistral Small 3.1/Pixtral example (#15184)

2025-03-20 04:46:06 +00:00

mlpspeculator.py

[V1][Usage] Refactor speculative decoding configuration and tests (#14434)

2025-03-22 19:28:10 -10:00

multilora_inference.py

…

neuron_int8_quantization.py

…

neuron.py

…

prefix_caching.py

…

prithvi_geospatial_mae.py

…

profiling.py

…

reproduciblity.py

Add an example for reproducibility (#15262)

2025-03-20 19:55:47 -07:00

rlhf_colocate.py

…

rlhf_utils.py

…

rlhf.py

…

save_sharded_state.py

…

simple_profiling.py

…

structured_outputs.py

…

torchrun_example.py

…

tpu.py

[TPU] [V1] fix cases when max_num_reqs is set smaller than MIN_NUM_SEQS (#15583)

2025-03-26 22:46:26 -07:00

vision_language_embedding.py

…

vision_language_multi_image.py

…

vision_language.py

[Misc] Clean up MiniCPM-V/O code (#15337)

2025-03-25 10:22:52 +00:00