LingxiaoShawn
https://github.com/THUDM/SwissArmyTransformer/blob/7ed825c5eb07e98d3408c6ddbfcd6e37db1d51c7/examples/cogview/pretrain_gpt2.py#L105 I'm not sure whether the CogView tokenizer is configured correctly. By the way, nice library! Thank you.
Dear authors, thank you for the amazing work and code. Could I ask for the GRAN baseline? I saw that you have the function to call it, but the core...
## 🚀 Feature Request The current StreamingTextDataset truncates the text/tokens to max_seq_len directly and throws away all remaining text/tokens. Is it possible to support truncating the text/tokens to...
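
To make the behavior described in this request concrete, here is a minimal sketch in plain Python. It is not the library's code: `truncate_and_discard` is a hypothetical helper, and the token list is made up for illustration. It only shows what "truncate to max_seq_len and throw away the rest" means for a single sample.

```python
from typing import List, Tuple


def truncate_and_discard(tokens: List[int], max_seq_len: int) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: keep the first max_seq_len tokens, report what is lost."""
    kept = tokens[:max_seq_len]      # what the model would see
    dropped = tokens[max_seq_len:]   # what is thrown away under direct truncation
    return kept, dropped


# Made-up sample of 10 token ids with a context length of 4.
tokens = list(range(10))
kept, dropped = truncate_and_discard(tokens, max_seq_len=4)
print(len(kept), len(dropped))
```

With a long document and a small max_seq_len, most tokens end up in `dropped`, which is the loss this request is about.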