about mask check 16

Open lw921014 opened this issue 3 years ago • 3 comments

Describe the Bug

For the mask op here https://github.com/NVIDIA/apex/blob/master/apex/contrib/csrc/fmha/src/fmha/mask.h#L54, if we use the SM80 m16n8k16 tensor core instruction, should the column offset be changed as follows?

col = warp_n * 32 + tid;

Minimal Steps/Code to Reproduce the Bug

Expected Behavior

Environment

lw921014, Jul 25 '22 06:07

Hello, every warp computes a 16x16 tile, so this column offset should be ok.
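To make the reasoning concrete, here is a minimal host-side sketch (not code from apex) that enumerates the columns each lane would own. It assumes the 16x16 warp tile is built from two m16n8k16 MMAs placed side by side along N, and it uses the standard m16n8k16 accumulator layout, where each lane holds two adjacent columns starting at `(lane % 4) * 2`. The helper names `first_col`, `covered_columns`, and `is_contiguous` are hypothetical and exist only for this illustration.

```cpp
#include <set>

// Hypothetical helper, not part of apex: the first column a lane owns inside
// the CTA tile, given the column stride between neighbouring warps.
int first_col(int warp_n, int lane, int warp_stride) {
    const int tid = (lane % 4) * 2;  // two adjacent columns per lane within a quad
    return warp_n * warp_stride + tid;
}

// Collect every column touched by `warps_n` warps, assuming each warp issues
// two m16n8k16 MMAs side by side (n = 8 each), i.e. a 16-column warp tile.
std::set<int> covered_columns(int warps_n, int warp_stride) {
    std::set<int> cols;
    for (int warp_n = 0; warp_n < warps_n; ++warp_n) {
        for (int lane = 0; lane < 32; ++lane) {
            const int base = first_col(warp_n, lane, warp_stride);
            for (int mma = 0; mma < 2; ++mma) {   // two n = 8 MMAs per warp (assumption)
                for (int j = 0; j < 2; ++j) {     // two adjacent columns per lane
                    cols.insert(base + mma * 8 + j);
                }
            }
        }
    }
    return cols;
}

// True when the covered columns are exactly 0 .. warps_n * 16 - 1 with no gaps.
bool is_contiguous(const std::set<int>& cols, int warps_n) {
    if (static_cast<int>(cols.size()) != warps_n * 16) return false;
    return *cols.begin() == 0 && *cols.rbegin() == warps_n * 16 - 1;
}
```

Under these assumptions, `is_contiguous(covered_columns(4, 16), 4)` holds, because four warps with a stride of 16 cover columns 0 to 63 without gaps. With a stride of 32 the same four warps would cover 0 to 15, 32 to 47, and so on, leaving holes in between, which is why `warp_n * 16 + tid` matches a 16-column warp tile.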

yjk21, Jul 25 '22 12:07

@lw921014 can we close this if that answered your question, or is there something else regarding this we can help with?

yjk21, Jul 29 '22 09:07

> Hello, every warp computes a 16x16 tile, so this column offset should be ok.

I got it. Thanks a lot.

I have another question. According to here, does the current implementation only support head size = 64? What about head size = 32 or 16?

lw921014, Aug 02 '22 10:08