Jiezhong Qiu comments

Results: 10 comments of Jiezhong Qiu

result of paper

Hey @LYF14020510036 and @veophi, have you solved the problem?

Reproduce Issues

@zhuhm1996 Could you share your hyperparameters for row 2, especially for RDT-B and RDT-M5K? We can only achieve 77 on RDT-B and 49 on RDT-M5K. Thanks!

tanh while calculating attention scores

Thanks for pointing this out. Yes, the original GAT paper does not use the tanh activation, but we found that it helps our training a little.
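To make the difference concrete, here is a minimal pure-Python sketch of attention weights over one node's neighbors, with and without tanh. It is not the repository's code: the function names are hypothetical, `raw_scores` stands in for the unnormalized GAT scores, and applying tanh after the LeakyReLU and before the softmax is an assumption about where the activation sits.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU as used for GAT attention logits (slope 0.2 in the GAT paper)."""
    return x if x > 0 else slope * x


def attention_weights(raw_scores, use_tanh=False):
    """Softmax over one node's neighbor scores.

    raw_scores: hypothetical unnormalized scores, one per neighbor.
    use_tanh: if True, squash the scores into (-1, 1) before the softmax.
              The exact placement of tanh is an assumption of this sketch.
    """
    scores = [leaky_relu(s) for s in raw_scores]
    if use_tanh:
        scores = [math.tanh(s) for s in scores]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Because tanh bounds every score, the softmax in this sketch can never become as sharply peaked as it can with unbounded scores, which may be one reason such a change affects training.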

cluster.wait() returns before call-back is done

Hi @UnitedMarsupials and @pgiri, I also ran into this problem. I tried using [threading.Event](https://docs.python.org/3/library/threading.html#event-objects) to work around it. ```python def job_callback(job): # dump the job.result to file ... global...
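The general pattern behind this workaround can be sketched with the standard library alone. This is not the code from the comment above (which is truncated) and it does not call the cluster library at all: plain threads stand in for jobs, and `CallbackTracker` and `run_demo` are hypothetical names. The idea is that the last callback to finish sets a `threading.Event`, and the main thread waits on that event instead of relying on the scheduler's own wait.

```python
import threading
import time


class CallbackTracker:
    """Counts finished callbacks and sets an Event when the last one ends."""

    def __init__(self, expected):
        self._remaining = expected
        self._lock = threading.Lock()
        self.all_done = threading.Event()
        self.results = []
        if expected == 0:
            self.all_done.set()  # nothing to wait for

    def job_callback(self, result):
        time.sleep(0.01)  # stand-in for slow work such as dumping to a file
        with self._lock:
            self.results.append(result)
            self._remaining -= 1
            if self._remaining == 0:
                self.all_done.set()  # signal only after the last callback


def run_demo(values, timeout=5.0):
    """Run one fake job per value and block until every callback has finished."""
    tracker = CallbackTracker(len(values))
    workers = [
        threading.Thread(target=tracker.job_callback, args=(v * v,))
        for v in values
    ]
    for worker in workers:
        worker.start()
    # Event.wait returns True if the flag was set before the timeout expired.
    finished = tracker.all_done.wait(timeout)
    return finished, sorted(tracker.results)
```

Calling `run_demo([1, 2, 3])` blocks until all three callbacks have stored their result. Passing a timeout to `wait` keeps the caller from hanging forever if a callback never runs.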

Job's ID should be specifiable at creation-time

I agree with @UnitedMarsupials. Another problem is that the argument passed to the callback function is a copy of the job object (see line 1780 of the code below), which...

How to expose the "register_pre_hook()" interface?

@zhuwenxi It would be really helpful if you could release the modified PyTorch code. Thanks!

Adding Expert Prototyping to FastMoE

Here is another recent work on MoE: "DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning" (https://arxiv.org/abs/2106.03760). The idea is to activate all experts at the...

Implement GBBS-based SPMMD

> cblas_axpy(...)------>cblas_axpy(d,factor_new,X+v_d,1,Y+u_d,1);
>
> The factor can be computed in advance and then used as function input to avoid the additional computational overhead caused by each multiplication.

Actually we can't...

Cannot reproduce the result reported in the paper

Thanks for your attention. We have updated the script for the blog dataset in our latest pull request: https://github.com/xptree/LightNE/pull/6.

Meeting problems in compiling

The code was compiled and run with g++ 6.5.0 and 5.4.0. Would you mind using another version of g++ to compile the code? Thanks!