[Model][Spec Decode] Nemotron-H MTP and Mamba Speculative Decoding Support (#33726)
Signed-off-by: Shahar Mor <smor@nvidia.com>
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Shahar Mor <smor@nvidia.com>
Co-authored-by: Roi Koren <roik@nvidia.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
committed by GitHub
parent a9e15e040d
commit f5972a872f
@@ -1200,6 +1200,11 @@ _SPECULATIVE_DECODING_EXAMPLE_MODELS = {
         },
         is_available_online=False,
     ),
+    "NemotronHMTPModel": _HfExamplesInfo(
+        "nvidia/Nemotron-Super-Placeholder",
+        speculative_model="nvidia/Nemotron-Super-Placeholder",
+        is_available_online=False,
+    ),
 }
 
 _TRANSFORMERS_BACKEND_MODELS = {
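
For readers unfamiliar with this kind of registry entry, the following is a minimal, self-contained sketch of what the added lines appear to encode. `ExampleInfo`, its `default` field, `SPEC_DECODE_EXAMPLES`, and `runnable_online` are hypothetical stand-ins for illustration, not the project's actual `_HfExamplesInfo` class or test helpers; only the names `speculative_model` and `is_available_online` come from the diff. The sketch assumes that an entry flagged `is_available_online=False` is skipped by any test that needs to download the checkpoint, which would fit the placeholder model name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExampleInfo:
    """Hypothetical stand-in for the registry's example-info record."""

    default: str  # checkpoint used for the target model (assumed field name)
    speculative_model: Optional[str] = None  # checkpoint used for drafting
    is_available_online: bool = True  # False: checkpoint cannot be downloaded


# Mirrors the entry added in this commit; the checkpoint name is a placeholder.
SPEC_DECODE_EXAMPLES = {
    "NemotronHMTPModel": ExampleInfo(
        "nvidia/Nemotron-Super-Placeholder",
        speculative_model="nvidia/Nemotron-Super-Placeholder",
        is_available_online=False,
    ),
}


def runnable_online(registry):
    """Return the architecture names whose checkpoints are marked downloadable."""
    return [arch for arch, info in registry.items() if info.is_available_online]


print(runnable_online(SPEC_DECODE_EXAMPLES))  # prints []
```

Under these assumptions the new architecture is registered but filtered out of download-dependent tests. The target and speculative checkpoints are the same string, as in the diff.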