Results 30 comments of Charlie_Tang

You need to set FLAG manually, corresponding to your device model.

I am not sure about BPE; I tested it with Mandarin text, so it works well at the char level. It seems KenLM does not work at the BPE level. Check out this: https://github.com/NVIDIA/NeMo/issues/3221
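For reference, a minimal sketch of the char-level setup I mean, assuming a KenLM model trained on space-separated characters (the model path and helper name below are hypothetical placeholders):

```python
def to_char_level(text: str) -> str:
    """Split a Mandarin sentence into space-separated characters,
    the token format a char-level KenLM model expects."""
    return " ".join(ch for ch in text if not ch.isspace())

sentence = "我爱自然语言处理"
char_tokens = to_char_level(sentence)
print(char_tokens)  # 我 爱 自 然 语 言 处 理

# With the kenlm Python package installed, scoring would look like:
#   import kenlm
#   model = kenlm.Model("char_lm.arpa")   # hypothetical char-level LM
#   score = model.score(char_tokens, bos=True, eos=True)
```

The same pipeline breaks at the BPE level because the LM's n-gram vocabulary no longer matches the subword units the ASR decoder emits.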

@b-flo that is what I mentioned. How do you think we should merge it into your part?

@b-flo just relax, I haven't totally finished it yet. I will let you know when I do.

baseline: accum-grad=2 1 with 4 GPUs, dev: 24.1/53.0, test: 25.6/56.5; sliding window multi-head attention: accum-grad=2 1 with 4 GPUs, dev: 26.3/56.5, test: 26.4/57.8. @b-flo I still could not reproduce your result, but anyway, sliding...

@pzelasko Do you have the paper for this streaming attention decoder?