[CI/Build][AMD] Use float16 in test_reset_prefix_cache_e2e to avoid accuracy issues (#29997)

Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
This commit is contained in:
rasmith
2025-12-05 02:42:25 -06:00
committed by GitHub
parent 6038b1b04b
commit feecba09af

View File

@@ -21,6 +21,7 @@ def test_reset_prefix_cache_e2e(monkeypatch):
max_num_batched_tokens=32,
max_model_len=2048,
compilation_config={"mode": 0},
dtype="float16",
)
engine = LLMEngine.from_engine_args(engine_args)
sampling_params = SamplingParams(