Wentao Ye
|
1f400c58b8
|
[CI] Add batch invariant test to ci (#27842)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-11-21 09:20:33 -07:00 |
|
Michael Goin
|
986ab5db63
|
[CI Bugfix] Fix Kernels DeepGEMM Test (H100) (#29106)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-11-20 16:42:33 -08:00 |
|
Alexander Matveev
|
3aaa94ac99
|
[Performance] Reduce DeepGEMM N dim restriction from 128 to 64 multiplier (#28687)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-11-19 15:47:13 -08:00 |
|
Shu Wang
|
613abb50d5
|
[MoE] Nvfp4 Masked Gemm: Add flashinfer grouped_gemm_nt_masked (#25990)
Signed-off-by: Shu Wang. <shuw@nvidia.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-11-19 13:29:06 -08:00 |
|
Copilot
|
61728cd1df
|
Re-enable FlashInfer for Llama4 on Blackwell in e2e fusion tests (#28966)
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-11-19 13:32:19 -05:00 |
|
Harry Mellor
|
a8b70304d6
|
Update rope_scaling to rope_parameters in preparation for Transformers v5 (#28542)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-19 09:06:36 -08:00 |
|
Yanan Cao
|
2c8b9182b5
|
[CI] Reorganize compile tests so new tests are automatically included in CI (#28625)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
|
2025-11-19 06:13:50 -08:00 |
|
Nick Hill
|
637f292196
|
[CI] Fix broken pipeline (#28781)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-11-15 08:44:14 -08:00 |
|
Angela Yi
|
f36292dbee
|
[compile] Enable sequence parallelism matching w/o custom ops enabled (#27126)
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: ProExpertProg <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
|
2025-11-15 11:46:12 +00:00 |
|
Yanan Cao
|
262d263f6c
|
[Bugfix] Eliminate tuple inputs to submodules in graph partitioning (#28533)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
|
2025-11-13 15:09:05 -05:00 |
|
Nick Hill
|
8832fff972
|
[BugFix] Fix mm_encoder_attn_backend arg type checking (#28599)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-11-13 03:06:03 +00:00 |
|
Harry Mellor
|
51c599f0ec
|
Skip models that cannot currently init on Transformers v5 (#28471)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 23:43:57 +00:00 |
|
Harry Mellor
|
a742134cc5
|
Remove deprecated fields from CompilationConfig (#27593)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 16:10:28 +00:00 |
|
Huamin Li
|
c748355e0d
|
[CI] Introduce autorun_on_main feature (#27836)
Signed-off-by: Huamin Li <3ericli@gmail.com>
|
2025-11-12 08:51:19 +00:00 |
|
zhrrr
|
68c09efc37
|
[Kernel][Perf] fuse QK Norm and RoPE into one cuda kernel for Qwen Model (#27165)
Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>
|
2025-11-11 12:00:31 -05:00 |
|
usberkeley
|
3143eb23fc
|
[BugFix] Add test_outputs.py to CI pipeline (#28466)
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-11 16:01:30 +00:00 |
|
Matthew Bonanni
|
b30dfa03c5
|
[Attention] Refactor CUDA attention backend selection logic (#24794)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-11-11 07:40:44 -05:00 |
|
Adrian Abeyta
|
a5a790eea6
|
[Bugfix] Ensure calculated KV scales are applied in attention. (#27232)
Signed-off-by: adabeyta <aabeyta@redhat.com>
|
2025-11-10 23:42:37 +00:00 |
|
Ilya Markov
|
d17ecc6b19
|
[PERF] Allreduce fusion. Support torch native matching. Tuning of the thresholds (#24248)
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2025-11-10 18:33:11 -05:00 |
|
Zhewen Li
|
a65a934ebe
|
[CI/Build] Temporary fix to LM Eval Small Models (#28324)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-11-09 21:08:38 +00:00 |
|
Copilot
|
a736e5ff77
|
[CI] Reduce Blackwell Fusion test runtime by filtering tests and only run all tests in nightly (#28074)
|
2025-11-07 15:58:16 +08:00 |
|
Alexis MacAskill
|
a47d94f18c
|
Add runai model streamer e2e test for GCS (#28079)
Signed-off-by: Alexis MacAskill <amacaskill@google.com>
|
2025-11-07 03:07:54 +00:00 |
|
gmagogsfm
|
bde5039325
|
[CI] Add compile/test_multimodal_compile.py to CI (#28151)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-06 05:41:47 +00:00 |
|
Samuel Shen
|
40db194446
|
[CI]: Add LMCacheConnector Unit Tests (#27852)
Signed-off-by: Samuel Shen <slshen@uchciago.edu>
Co-authored-by: Samuel Shen <slshen@uchciago.edu>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>
|
2025-11-05 09:45:57 -08:00 |
|
Ilya Markov
|
e50c454672
|
[BugFix] Support EP/DP + EPLB with MTP (#25311)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
|
2025-11-05 15:22:17 +00:00 |
|
Matthew Bonanni
|
01baefe674
|
Add TP parameter to attention tests (#27683)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-11-03 13:04:40 -08:00 |
|
Lucas Wilkinson
|
4bc400f47e
|
[CI/Testing] Add basic single node dual batch overlap test (#27235)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-11-03 17:00:46 +00:00 |
|
Matthew Bonanni
|
f29aeb5a25
|
Add FLASHINFER_MLA to test_mla_backends and add B200 CI run (#27663)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-10-31 11:12:19 -07:00 |
|
Jee Jee Li
|
0384aa7150
|
[CI/Build] Add gpt-oss LoRA test (#27870)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-31 22:17:21 +08:00 |
|
Wentao Ye
|
2bf0bcc1fc
|
[CI Test] Add Scheduled Integration Test (#27765)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-30 17:29:26 -07:00 |
|
Huamin Li
|
5be1bed790
|
[CI/Build]Add eval config for Qwen3-235B-A22B-Instruct-2507-FP8 (#27113)
Signed-off-by: Huamin Li <3ericli@gmail.com>
|
2025-10-30 07:50:56 +00:00 |
|
22quinn
|
f7a6682872
|
[CI/Build] Test torchrun with 8 cards (#27548)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-10-29 10:26:06 -07:00 |
|
bnellnm
|
1891cf605a
|
[Bugfix] Fix modular kernel tests (#27707)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2025-10-29 16:14:33 +08:00 |
|
Cyrus Leung
|
4fb8771cc0
|
[CI/Build] Move pre-commit only scripts to tools/pre_commit (#27657)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-29 08:04:33 +00:00 |
|
Mohammad Miadh Angkad
|
a8c02fb5bf
|
[Bugfix][CI] Fix v1 attention backend tests and add CI coverage (#26597)
Signed-off-by: Mohammad Miadh Angkad <MAngkad.BSDSBA2027@aim.edu>
Signed-off-by: Mohammad Miadh Angkad <mangkad.bsdsba2027@aim.edu>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-10-28 11:42:05 -04:00 |
|
Cyrus Leung
|
55cba4a05c
|
[CI/Build] Update causal-conv1d installation (#27529)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-26 22:14:22 +08:00 |
|
Cyrus Leung
|
c7abff2990
|
Revert "[CI/Build] Use CPU for mm processing test on CI (#27522)" (#27531)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-26 04:44:27 -07:00 |
|
Isotr0py
|
d63cd9ff10
|
[CI/Build] Use CPU for mm processing test on CI (#27522)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-26 13:09:18 +08:00 |
|
Jiangyun Zhu
|
29c9cb8007
|
[CI] Add tests for cudagraph (#27391)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2025-10-25 02:37:33 +00:00 |
|
Huy Do
|
becb7de40b
|
Update PyTorch to 2.9.0+cu129 (#24994)
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-21 17:20:18 -04:00 |
|
Chen Wu
|
5f6cbf60d6
|
[Feature][Kernel]FusedMoE LoRA (#21229)
Signed-off-by: wuchen <cntryroa@gmail.com>
Signed-off-by: banjuede <lmklhc@163.com>
Signed-off-by: Chen Wu <cntryroa@gmail.com>
Signed-off-by: Danielle Robinson <dmmaddix@amazon.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: bk-201 <joy25810@foxmail.com>
Co-authored-by: wuchen <wuchen@zetyun.com>
Co-authored-by: Nathan Van Gheem <vangheem@gmail.com>
Co-authored-by: banjuede <lmklhc@163.com>
Co-authored-by: Danielle Robinson <dmmaddix@amazon.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: bk-201 <joy25810@foxmail.com>
|
2025-10-21 03:01:37 +00:00 |
|
Lunwen He
|
0eb8f2b880
|
create is_in_the_same_node on cpu (#26832)
Co-authored-by: Lunwen He <lunwenh@meta.com>
|
2025-10-21 02:04:14 +00:00 |
|
Tova Movshovitz
|
83e760c57d
|
[V1][Metrics][Plugin] Add plugin support for custom StatLoggerBase implementations (#22456)
Signed-off-by: tovam <tovam@pliops.com>
|
2025-10-18 15:12:46 -07:00 |
|
Nicolò Lucchesi
|
99722d5f0e
|
[CI] Remove forbidden slash (#27112)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-10-17 09:38:00 -07:00 |
|
Nicolò Lucchesi
|
2ba60ec7fe
|
[CI] Nixl integration tests (#27010)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-10-17 07:13:31 -07:00 |
|
Luka Govedič
|
bd7157a071
|
[torch.compile] Enable attention and allreduce fusion without custom ops enabled (#24604)
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-17 08:10:23 -06:00 |
|
Michael Goin
|
f8a0acbdbe
|
[CI] Enable Blackwell Llama4 MoE tests (#26731)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-15 21:02:57 -06:00 |
|
Zhewen Li
|
f3c378ffa7
|
[CI/Build] Add Qwen2.5-VL-7B-Instruct ChartQA Accuracy Tests in CI (#21810)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
Co-authored-by: Ye (Charlotte) Qi <ye.charlotte.qi@gmail.com>
|
2025-10-15 08:09:56 +00:00 |
|
Michael Goin
|
7e0ef4084a
|
[CI Failure] Fix torchao dep failure for Quantization Test (#26824)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-14 16:41:43 -07:00 |
|
Zhengxu Chen
|
eef921f45e
|
AOT Compilation for torch.compile (Bundled) (#24274)
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
|
2025-10-10 19:02:11 -04:00 |
|