Shaoyang Xu
Hi, I read your code and roughly understand the idea: if both docs and words are one-hot vectors, then x (the feature matrix) is an identity matrix, which is why the first GCN layer is set to featureless=True, i.e. x does not need to participate in the computation; the second layer's x is no longer an identity matrix but the first layer's activation (output), so there x does participate. Is the reason for this setup that x could in fact be a dense matrix from the start, e.g. words represented by pre-trained word vectors and docs by their own vector representation, such as the average of all word vectors in the document, or ...? My questions: 1. Is my understanding above correct? 2. Have you tried pre-trained word embeddings, and how did they perform?
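For context, the identity-feature shortcut the question describes can be checked with a minimal NumPy sketch (the adjacency and weight matrices below are random placeholders, not the repo's actual data): when X is the identity, A @ X @ W equals A @ W, so a `featureless` layer can skip the feature multiplication, while a dense X (e.g. pre-trained embeddings) would have to participate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, hidden = 5, 4

# Placeholder normalized adjacency (random symmetric matrix for illustration)
A = rng.random((n_nodes, n_nodes))
A = (A + A.T) / 2
W = rng.random((n_nodes, hidden))

# One-hot features -> X is the identity, so A @ X @ W == A @ W,
# which is exactly what featureless=True exploits.
X = np.eye(n_nodes)
full = A @ X @ W          # standard GCN propagation A X W
featureless = A @ W       # the featureless shortcut
assert np.allclose(full, featureless)

# If X were instead dense pre-trained embeddings (hypothetical shapes),
# the same layer would be A @ X_dense @ W_dense and cannot be skipped.
emb_dim = 3
X_dense = rng.random((n_nodes, emb_dim))
W_dense = rng.random((emb_dim, hidden))
out = A @ X_dense @ W_dense
print(out.shape)  # (5, 4)
```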
2021-01-18 03:07:40.679672: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-01-18 03:07:40.701768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: pciBusID: 0000:02:00.0 name: Tesla M40 24GB computeCapability: 5.2 coreClock: 1.112GHz coreCount: 24...
Hi, dear ziyi~ I found that in your code the BERT output weights are not tied to (set equal to) the input embeddings, as can be seen [here](https://github.com/neulab/awesome-align/blob/5f150d45bbe51e167daf0a84abebaeb07c3323d1/awesome_align/modeling.py#L374) (In detail,...
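For readers unfamiliar with what "tied" means here: in a weight-tied masked-LM head, the output decoder reuses the input embedding matrix (transposed) instead of learning a separate projection. A minimal NumPy sketch with made-up shapes (not the actual awesome-align code):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, hidden = 10, 4

# Input embedding table E: one row per vocabulary item.
E = rng.random((vocab, hidden))

# Tied output head: logits are computed against E itself
# (H @ E.T), so no separate decoder matrix is learned.
H = rng.random((3, hidden))   # hidden states for 3 tokens
b = np.zeros(vocab)           # per-token output bias
logits = H @ E.T + b
print(logits.shape)  # (3, 10): one score per vocab item per token
```

An untied head would instead allocate its own `(hidden, vocab)` decoder matrix, which is the difference the issue is asking about.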
Hello, I found that the grad matrix ([grad = weight.grad](https://github.com/varun19299/rigl-reproducibility/blob/97443beac90e03f899652943594695e5152c2b09/sparselearning/funcs/grow.py#L86)) has many non-zero elements even though the corresponding values in the weight matrix are zero. I want to ask why this happens (as a beginner...
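As a general observation (not specific to this repo's code), a zero weight can still receive a non-zero gradient, because the gradient depends on the inputs and the error, not on the weight's current value; it measures how the loss would change if the weight were re-activated, which is exactly why RigL-style growth uses it to pick which zero weights to grow. A tiny hand-computed example:

```python
# For the scalar loss L(w) = (w*x - y)**2, the derivative is
# dL/dw = 2*(w*x - y)*x. At a pruned weight w = 0 this becomes
# -2*y*x, which is generally non-zero even though w itself is zero.
def grad(w, x, y):
    """Gradient of (w*x - y)**2 with respect to w."""
    return 2 * (w * x - y) * x

g = grad(0.0, x=1.5, y=2.0)
print(g)  # -6.0: the pruned weight still has a non-zero gradient
```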
Hello authors, your experiment results on harmfulness classification ([harmless_llama2.ipynb](https://github.com/andyzoujm/representation-engineering/blob/main/examples/harmless_harmful/harmless_llama2.ipynb)) show that Llama-2-13b-chat achieves near-100% accuracy, even in the lower layers. I have tried more models: Llama-2-{7,70}b-chat, llama-2-7b, bloomz-{560m,1b1,1b7,3b,7b1}, bloom-7b1, all...