Calling make_rmem_tensor(tTMEM_LOADcO.shape) creates a 1D tensor, which doesn't match the paired atom layout. The working pattern instead uses a 2D register tensor with sub-tile composition (tTMrO_i_ = tTMrO[None, i] + composition).