Results 4 comments of kenny

I found the reason the GPU load is so low: the dataloader processes data so slowly that the GPU is always waiting.

You can try this: rewrite the __getitem__ function in dataset.py so that it only fetches items and does no data processing, and move the processing operations, such as word segmentation,... into the preprocess functions.
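A minimal sketch of that idea, assuming a map-style dataset that fits in memory. The names PreprocessedDataset and segment are hypothetical and not taken from dataset.py; segment is only a stand-in for the real word segmentation step. In a real project the class would typically subclass torch.utils.data.Dataset, which uses the same __len__ and __getitem__ protocol.

```python
class PreprocessedDataset:
    """Map-style dataset that does all heavy processing once, up front."""

    def __init__(self, raw_texts, preprocess):
        # Run the expensive step (e.g. word segmentation) a single time here,
        # instead of on every __getitem__ call during training.
        self.samples = [preprocess(text) for text in raw_texts]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Only fetch the already processed item; no processing happens here.
        return self.samples[index]


def segment(text):
    # Hypothetical stand-in for a real word segmentation function.
    return text.lower().split()


dataset = PreprocessedDataset(["GPU load is low", "The dataloader is slow"], segment)
```

With the processing done ahead of time, each __getitem__ call is a plain list lookup, so the dataloader workers are less likely to be the bottleneck.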

There is a problem with torch.stack: the computation logic of this operator differs from PyTorch's. Replace it with unsqueeze and concat.
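As a rough sketch of that replacement, written against PyTorch itself since the comment does not name the framework where the operator misbehaves: the helper stack_via_cat is a hypothetical name, and torch.cat is used here (torch.concat is an alias for it in recent PyTorch versions).

```python
import torch


def stack_via_cat(tensors, dim=0):
    # Add a new axis of size 1 to each tensor at `dim`, then concatenate
    # along that axis. For same-shaped inputs this matches torch.stack.
    return torch.cat([t.unsqueeze(dim) for t in tensors], dim=dim)


a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])
out = stack_via_cat([a, b])  # shape (2, 3)
```

The same pattern should carry over to other frameworks that expose equivalent unsqueeze and concat operators, though the exact function names there may differ.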