Chudong Tian

Results: 9 issues by Chudong Tian

Hello, I found that, compared with native StyleGAN2, your code specifically removes the label. I want to ask whether you have tried adding the label for training,...

When the network is disconnected, the server reconnects after ten minutes, but select() always returns a connection timeout. The select() call is in CURLReadStreamBase::FillBuffer() in s3_filesys.cc. I changed select() to...
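The timeout-and-retry pattern around select() described above can be sketched as follows; this is a minimal Python illustration of the idea, not the actual C++ code in s3_filesys.cc, and the timeout and retry limit are made-up values:

```python
# Hypothetical sketch: retry select() a bounded number of times on timeout
# instead of failing on the first one after a reconnect.
import select
import socket

def read_with_retry(sock, timeout=5.0, max_retries=3):
    """Wait for the socket to become readable, retrying on select() timeout."""
    for attempt in range(max_retries):
        readable, _, _ = select.select([sock], [], [], timeout)
        if readable:
            # data (or EOF) is available; read up to 4 KiB
            return sock.recv(4096)
    raise TimeoutError("select() timed out %d times" % max_retries)
```

The same structure applies in the C++ code: loop over select(), and treat a timeout as a retryable condition rather than a fatal error.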

1. The finetune weight file needs to be changed to F_weight_file instead of T_weight_file; 2. The learning rate for finetuning needs to be changed to 0.0001, otherwise training does not converge, so after switching the weight file to F_weight_file, inference quality is very poor; 3. In testing, after obtaining xywh, run NMS once instead of the original averaging operation. After fixing the above 3 issues, results return to normal.
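The third fix above (running NMS over the xywh boxes instead of averaging them) can be sketched like this; the function name, per-box scores, and IoU threshold are illustrative assumptions, not taken from the original repository:

```python
# Hypothetical sketch of greedy NMS over (x, y, w, h) boxes.
import numpy as np

def nms_xywh(boxes, scores, iou_thresh=0.5):
    """Greedy NMS; boxes is (N, 4) xywh, scores is (N,). Returns kept indices."""
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 0] + boxes[:, 2]
    y2 = boxes[:, 1] + boxes[:, 3]
    areas = boxes[:, 2] * boxes[:, 3]
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # intersection of the top box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # drop boxes that overlap the kept box too much
        order = order[1:][iou <= iou_thresh]
    return keep
```

Unlike averaging, this keeps the highest-scoring box in each overlapping cluster and discards near-duplicates.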

When I use `from revChatGPT.revChatGPT import Chatbot` there is an error: `139 (interrupted by signal 11: SIGSEGV)` @acheong08

Hello, I am training GPT-2 from scratch, but I found that processing the openwebtext data is too slow, and our GPU server can't connect to the Internet. It's taken...
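One common way to speed up slow per-document preprocessing like this is to fan the work out across a process pool. A minimal sketch, where tokenize() is a whitespace-split stand-in for the real (slow) tokenizer and the worker count is illustrative:

```python
# Hypothetical sketch: parallelize per-document preprocessing with a
# process pool. tokenize() stands in for the real tokenization step.
from multiprocessing import Pool

def tokenize(doc):
    # placeholder for the actual (CPU-bound) tokenizer call
    return doc.split()

def preprocess(docs, workers=4):
    # map the documents across worker processes in parallel
    with Pool(workers) as pool:
        return pool.map(tokenize, docs)
```

Since the server has no Internet access, the dataset would have to be downloaded elsewhere and copied over first; the parallelization only addresses the processing speed.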

After finetuning GPT-3 1.3B with multi-GPU tensor parallelism, how can I run inference on a single GPU? Single-GPU inference raises the error below, but multi-GPU inference works fine: RuntimeError: DistributedGPT3Pipeline: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1169, invalid usage, NCCL version 21.0.3 ncclInvalidUsage: This usually reflects invalid usage of NCCL library (such as too many async ops,...

Training finishes in about 5 hours, but the loss is always 0. Is that normal? {'loss': 0.0, 'learning_rate': 1.9230769230769234e-07, 'epoch': 0.99} {'loss': 0.0, 'learning_rate': 1.730769230769231e-07, 'epoch': 0.99} {'loss': 0.0, 'learning_rate': 1.5384615384615387e-07, 'epoch': 0.99} {'loss': 0.0, 'learning_rate': 1.3461538461538464e-07, 'epoch': 0.99} {'loss': 0.0, 'learning_rate': 1.153846153846154e-07, 'epoch':...

I am continuing training from the 10B model, but the loss only dropped from 11 to 5. Generally speaking, what should the final converged loss be? I used 120k texts, with an average text length of 5000. Training parameters: gpus=8, max length=1024, batchsize=8, gradient accumulation=2, lr=7e-6, total iters=5000, roughly 5 epochs. @jeffra @samyam @tjruwase @WrRan
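The "roughly 5 epochs" figure above is consistent with these settings under one assumption: that batchsize=8 is per GPU and each text contributes one (truncated) training sample. A back-of-envelope check:

```python
# Hypothetical sanity check of the epoch count from the settings above,
# assuming batchsize=8 is per-GPU and each text yields one sample.
gpus, batch_per_gpu, grad_accum = 8, 8, 2
iters, num_texts = 5000, 120_000

samples_per_iter = gpus * batch_per_gpu * grad_accum  # 128 samples per optimizer step
epochs = iters * samples_per_iter / num_texts
print(round(epochs, 2))  # ≈ 5.33, matching "roughly 5 epochs"
```

Note that with max length=1024 and an average text length of 5000, most of each text is discarded unless the texts are chunked before training.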

Thanks for your model! I have a task that requires generating a foreground from a background image, but your model currently cannot control the position well and cannot keep harmony...