Jiezhong Qiu

Results 10 comments of Jiezhong Qiu

Hey @LYF14020510036 and @veophi, have you solved the problem?

@zhuhm1996 Could you share your hyperparameters for row 2, especially for RDT-B and RDT-M5K? We can only reach 77 on RDT-B and 49 on RDT-M5K. Thanks!

Thanks for pointing this out. Yes, the original GAT paper does not use the tanh activation, but we found that it helps our training a little.

Hi @UnitedMarsupials and @pgiri, I also ran into this problem. I tried using [threading.Event](https://docs.python.org/3/library/threading.html#event-objects) to solve it.

```python
def job_callback(job):
    # dump the job.result to file
    ...
    global...
```
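For readers who want to see the `threading.Event` idea end to end, here is a minimal sketch. It is not the truncated snippet above, and it does not use any real scheduler API: the `Job` class, `fake_scheduler`, `run_jobs`, and `EXPECTED_JOBS` are all hypothetical names invented for illustration. The only assumption is that a callback is invoked from another thread once per finished job.

```python
import threading

# Hypothetical stand-in for a job object; only `id` and `result` are assumed.
class Job:
    def __init__(self, job_id, result=None):
        self.id = job_id
        self.result = result

EXPECTED_JOBS = 3                 # hypothetical number of submitted jobs
results = {}                      # filled in by the callback
lock = threading.Lock()           # protects `results` across threads
all_done = threading.Event()      # set once every expected job has reported

def job_callback(job):
    # Record the result, then signal the main thread when the last job arrives.
    with lock:
        results[job.id] = job.result
        if len(results) == EXPECTED_JOBS:
            all_done.set()

def fake_scheduler(job_id):
    # Simulates a scheduler thread finishing a job and invoking the callback.
    job_callback(Job(job_id, result=job_id * job_id))

def run_jobs():
    threads = [threading.Thread(target=fake_scheduler, args=(i,))
               for i in range(EXPECTED_JOBS)]
    for t in threads:
        t.start()
    # Block until the callback has seen every job; the timeout guards against hangs.
    finished = all_done.wait(timeout=5.0)
    for t in threads:
        t.join()
    return finished, dict(results)

if __name__ == "__main__":
    print(run_jobs())
```

The main thread blocks on `all_done.wait()` instead of polling, and the lock keeps the shared dictionary consistent when several callbacks fire at once.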

I agree with @UnitedMarsupials. Another problem is that the argument passed to the callback function is a copy of the job object (see line 1780 of the code below), which...

@zhuwenxi It would be really helpful if you could release the modified PyTorch code. Thanks!

Here is another recent work on MoE: "DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning", https://arxiv.org/abs/2106.03760. The idea is to activate all experts at the...

> cblas_axpy(...)------>cblas_axpy(d,factor_new,X+v_d,1,Y+u_d,1);
>
> The factor can be computed in advance and then used as function input to avoid the additional computational overhead caused by each multiplication.

Actually we can't...

Thanks for your attention. We have updated the script for the blog dataset in our latest pull request, https://github.com/xptree/LightNE/pull/6.

The code was compiled and run with g++ 6.5.0 and 5.4.0. Would you mind trying a different version of g++ to compile the code? Thanks!