DingDing

Results 32 comments of DingDing

You can use a machine that can both reach huggingface datasets and scp files to the target server, e.g. your local machine. Run the Python code

```python
from datasets import load_dataset

mydataset = load_dataset("glue", "mrpc")
mydataset.save_to_disk("YOURPATH/glue.mrpc")  # the directory doesn't have to be called glue.mrpc; any name works
```

Then in a terminal

```bash
scp -r YOURPATH/glue.mrpc USERNAME@IP:THE_ABSOLUTE_PATH_TO_SAVE_YOUR_DATASET
```

Afterwards, on the server, run the Python code ```python from datasets...
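The comment above is cut off at the last step; as a hedged completion, the server-side load presumably uses `datasets.load_from_disk` pointed at the copied directory (the path below is a placeholder, not from the original comment):

```python
from datasets import load_from_disk

# Load the dataset that was scp'd to the server; the path is hypothetical
mydataset = load_from_disk("THE_ABSOLUTE_PATH_TO_SAVE_YOUR_DATASET/glue.mrpc")
print(mydataset)
```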

Zhihu: https://zhuanlan.zhihu.com/p/55334148

1. When debugging, don't set the batch size and hidden size to values that divide each other evenly. Some people set both batch size and hidden size to 32, so a dimension mix-up goes unnoticed. To avoid this, set them to 33 and 32 during the debug phase (the same goes for other dimensions: they should not divide each other). We require "not divisible" rather than merely "not equal" because PyTorch has automatic broadcasting.
2. If you get a flood of CUDA errors, e.g. ![1021656954422_ pic](https://user-images.githubusercontent.com/32740627/177328411-b2db2a0f-f66b-49ca-aa52-7bcf9899c14d.jpg) it is usually a list index out of bounds. You can set the environment variable CUDA_LAUNCH_BLOCKING=1 to debug; many posts online cover the details.
3. First check that the input is correct. For NLP tasks, you can convert the ids back to tokens and check whether the sentence looks the way you intended (see the sketch after this list).
4. Check whether the model gradients are normal, as follows:
```python
# For each module: (name, sum of absolute gradients, sum of parameters);
# skip parameters whose grad is still None (e.g. frozen modules)
[(name, parameter.grad.abs().sum(), parameter.sum())
 for name, parameter in model.named_parameters()
 if parameter.grad is not None]
```
In each tuple, the first element is the module name, the second is the sum of the gradient's absolute values, and the third is the sum of the parameters. If a module that should have gradients has an absolute-gradient sum of zero, something is wrong; if the parameter sum does not change between two printouts, that may also indicate a problem, e.g. the learning rate may be too small...
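For point 3, a minimal sketch of restoring ids to tokens with a huggingface tokenizer (the tokenizer name and the input sentence are made-up placeholders, not from the original comment):

```python
from transformers import AutoTokenizer

# Hypothetical tokenizer and input, just to illustrate the round trip
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
input_ids = tokenizer("a sample sentence", return_tensors="pt")["input_ids"]

# Convert the ids back to tokens / text and eyeball the result
print(tokenizer.convert_ids_to_tokens(input_ids[0]))
print(tokenizer.decode(input_ids[0]))
```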

1 million nodes may be too large for struc2vec. For me, an 80,000-node graph with 24 threads hadn't generated embeddings after 5 hours of training. And the algorithm is...

But I am wondering: is there an upper bound on the number of threads we can use? i.e., can the algorithm parallelize well with a large number of threads? Have the...

I also noticed this difference. In fact, in the TensorFlow version from the original paper, train_adj and val_adj are different. So I think this implementation detail might have been missed by...

Thank you for reminding me of the self-loops.

What are your torch and huggingface versions? I cannot replicate this problem. In my case, the log is:
```python
>>> import torch
>>> from transformers import LlamaTokenizerFast, LlamaForCausalLM
>>> model_path =...
```
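The transcript above is truncated; as a hedged sketch, the rest of the reproduction attempt presumably continues along these lines (the checkpoint path is a placeholder, not from the original log):

```python
import torch
from transformers import LlamaTokenizerFast, LlamaForCausalLM

# Hypothetical local checkpoint path
model_path = "/path/to/llama/checkpoint"

tokenizer = LlamaTokenizerFast.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
```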

Thanks for the information; we will check whether the 4.38.2 version update breaks the code.