[Bugfix] Correctly call cudaProfilerStop in benchmarks script (#14183)

Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Brayden Zhong
2025-03-06 19:42:49 -05:00
committed by GitHub
parent ad60bbb2b2
commit c34eeec58d
6 changed files with 5 additions and 6 deletions

@@ -153,7 +153,6 @@ def ref_group_gemm(ref_out: torch.Tensor, input: torch.Tensor,
result = torch.nn.functional.linear(x, w)
result *= scaling
out_list.append(result)
-torch.cat(out_list, dim=0)
cat_result = torch.cat(out_list, dim=0)