composable_kernel
[CK_TILE] Change output accum tensor layout of fmha fwd split-kv & combine kernels
Use the same tensor layout for o_acc and o.