12 Commits

Author SHA1 Message Date
youkaichao
f89d18ff74 [6/N] pass whole config to inner model (#10205)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-11 06:41:46 +00:00
youkaichao
1a95f10ee7 [5/N] pass the whole config to model (#9983)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-09 14:17:28 +08:00
Joe Runde
d58268c56a [V1] Make v1 more testable (#9888)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-11-06 11:57:35 -08:00
Michael Goin
399c798608 Remove ScaledActivation for AWQ (#10057)
Signed-off-by: mgoin <michael@neuralmagic.com>
2024-11-06 14:27:06 +00:00
Aaron Pham
21063c11c7 [CI/Build] drop support for Python 3.8 EOL (#8464)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2024-11-06 07:11:55 +00:00
Yongzao
d27cfbf791 [torch.compile] Adding torch compile annotations to some models (#9641)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-10-24 09:31:42 -07:00
Murali Andoorveedu
0f6d7a9a34 [Models] Add remaining model PP support (#7168)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-04 10:56:58 +08:00
Isotr0py
bc4eb65b54 [Bugfix] Fix Fuyu tensor parallel inference (#8986)
2024-10-01 17:51:41 +08:00
Jani Monoses
f2bd246c17 [VLM] Fix paligemma, fuyu and persimmon with transformers 4.45 : use config.text_config.vocab_size (#8707)
2024-09-23 14:43:09 +00:00
afeldman-nm
428dd1445e [Core] Logprobs support in Multi-step (#7652)
2024-08-29 19:19:08 -07:00
Cyrus Leung
7025b11d94 [Bugfix] Fix weight loading for Chameleon when TP>1 (#7410)
2024-08-13 05:33:41 +00:00
Isotr0py
540c0368b1 [Model] Initialize Fuyu-8B support (#3924)
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-07-14 05:27:14 +00:00