biondizzle/vllm
432cf22a6a1f800cf64e79a706639dbe163fbc18
vllm/examples/offline_inference
Nick Hill 15dac210f0 [V1] AsyncLLM data parallel (#13923)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-03-27 16:14:41 -07:00
basic
[Misc] improve example script output (#15528)
2025-03-26 10:12:47 +00:00
openai
…
profiling_tpu
…
audio_language.py
…
chat_with_tools.py
…
cpu_offload_lmcache.py
…
data_parallel.py
[V1] AsyncLLM data parallel (#13923)
2025-03-27 16:14:41 -07:00
disaggregated_prefill_lmcache.py
…
disaggregated_prefill.py
…
distributed.py
…
eagle.py
…
encoder_decoder_multimodal.py
…
encoder_decoder.py
…
llm_engine_example.py
…
lora_with_quantization_inference.py
[Misc] Clean up the BitsAndBytes arguments (#15140)
2025-03-20 19:17:12 -07:00
mistral-small.py
[Doc] Update Mistral Small 3.1/Pixtral example (#15184)
2025-03-20 04:46:06 +00:00
mlpspeculator.py
[V1][Usage] Refactor speculative decoding configuration and tests (#14434)
2025-03-22 19:28:10 -10:00
multilora_inference.py
…
neuron_int8_quantization.py
…
neuron.py
…
prefix_caching.py
…
prithvi_geospatial_mae.py
…
profiling.py
…
reproduciblity.py
Add an example for reproducibility (#15262)
2025-03-20 19:55:47 -07:00
rlhf_colocate.py
…
rlhf_utils.py
…
rlhf.py
…
save_sharded_state.py
…
simple_profiling.py
…
structured_outputs.py
…
torchrun_example.py
…
tpu.py
[TPU] [V1] fix cases when max_num_reqs is set smaller than MIN_NUM_SEQS (#15583)
2025-03-26 22:46:26 -07:00
vision_language_embedding.py
…
vision_language_multi_image.py
…
vision_language.py
[Misc] Clean up MiniCPM-V/O code (#15337)
2025-03-25 10:22:52 +00:00