Liwen Fan comments

Results: 18 comments of Liwen Fan

baseline

I am confused about the difference between "Branch" and "Single" in Table 5. What do they mean?

baseline

Hi @aishawn, it is the same model. I cannot reproduce what the paper claims.

baseline

Hi @SZULH, Q1: PRs are welcome. Q2: I am not sure. I just feel that if you are working on your own papers, this paper is probably not a good starting point...

baseline

@rocketbear Glad to hear! Can you share your code? I only get r@1=0.565321 on the P3 single branch.

RFC: Sparse Domain Isolation for Supporting large-scale Sparse Weights Training.

Is this RFC related to the recently proposed paper "DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications" by Google? https://arxiv.org/pdf/2004.08366.pdf

Does cuBERT support sequence tagging output?

Currently, we only support the standard BERT graph and standard BERT outputs. In your case, I guess the most relevant output type is `model.get_sequence_output()`, with size [batch_size * sequence_length * hidden_size]....

CuBERT not utilizing all threads with multi-cpu

What CPU do you use? Do you run cuBERT inside Docker with a limited CPU quota? Does the caller have many threads and call cuBERT concurrently? Could you provide the running...

[Bug]: HRfix tends to fail when used with inpainting models

I have the same problem. Any fix?