FasterTransformer

Add support for head_dim > 1024 for fp16, no whitespace change

[Open] zrphercule opened this issue 4 years ago · 6 comments

Thanks to @842974287's implementation, this PR adds support for head_dim > 1024 for fp16 in add_QKV_bias_rebuild_padding and add_bias_input_layernorm.

Regarding the comments in https://github.com/842974287/FasterTransformer/commit/dacb3ceed52d6cdb59f10adc6fa02f615da9084a:

  1. When word_per_block != 1, the grid size computed by dim3 grid(m * half_k / block.x / word_per_block * 3); uses truncating integer division, so a remainder can be left over; the elements in that remainder are not covered by any block, which might cause problems.
  2. This diff now also contains the implementation of add_bias_input_layernorm for head_dim > 1024.

Please let me know if this PR is ready to merge, or whether we need to modify it. Thanks!

— zrphercule, Oct 08 '21

cc @byshiue

— zrphercule, Oct 08 '21

I cannot compile the code successfully. Even after I fix the issue, I get wrong results when I run with hidden_dim > 1024. How do you verify the correctness?

— byshiue, Oct 09 '21

> I cannot compile the code successfully. Even after I fix the issue, I get wrong results when I run with hidden_dim > 1024. How do you verify the correctness?

Thanks for your reply!

We have some unit tests to verify its correctness internally, but I haven't tested this part of the code in an open-source environment. What would you suggest for testing it in open source?

Also, we will work on fixing https://github.com/NVIDIA/FasterTransformer/pull/104 and merge it soon as well :)

— zrphercule, Oct 09 '21

Here is a simple unit test. You can add some cases with hidden_dimension > 1024 to it.

The request in #104 is supported in the next beta version.

— byshiue, Oct 10 '21

> next beta version

Great, thanks! When will this beta version become a stable official release, or is it already stable enough to be imported as a third-party library?

— zrphercule, Oct 11 '21

For your request and the BERT model, it should be stable. We release it as a beta version because:

  1. We may break the API again soon.
  2. We have not yet updated all the guides, but the BERT guide should be up to date.

— byshiue, Oct 12 '21