Shilei Liu
The path separator is "\\" on Windows and "/" on Linux, so this line is invalid on Windows: ``` det_names = pd.Series(imlist).apply(lambda x: "{}/det_{}".format(args.det,x.split("/")[-1])) ``` and I modified...
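A minimal sketch of a separator-independent way to build the same name, using only the standard library. The helper name `det_name` and the sample paths are hypothetical, and this is not necessarily the modification referred to above.

```python
import os


def det_name(det_dir, image_path):
    """Return '<det_dir><sep>det_<file name>' for the current platform.

    `det_name` is a hypothetical helper, not part of the original script.
    """
    # os.path.basename splits on the separator(s) of the running platform,
    # so no "/" is hard-coded when extracting the file name.
    file_name = os.path.basename(image_path)
    # os.path.join inserts the platform's own separator.
    return os.path.join(det_dir, "det_" + file_name)


if __name__ == "__main__":
    print(det_name("det", os.path.join("imgs", "dog.jpg")))
```

In the original expression this could be applied as `pd.Series(imlist).apply(lambda x: det_name(args.det, x))`, assuming `imlist` holds paths written with the separator of the platform the script runs on.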
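If it also included simplified/traditional Chinese conversion, plus some content from "Talking and Laughing" (谈笑风生) and "Inspecting the Second Hospital" (视察二院) of the 蛤三篇 memes, that would be excellent.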
I first extract the contexts from `test.refs.txt` (6000 lines) ```bash cat test.refs.txt | cut -f 1 > test.source ``` and then extract the multi-reference files (using up to 15 per sample) ```bash...
### 🐛 Describe the bug As mentioned in #2569, I tried to use `SaveCheckpointHook` to save the checkpoint of the Titans GPT model in hybrid parallel training. However, only the model state...
Update
Adapt to Python 3
How to allow the merging of consecutive newline tokens `\n` when training a byte-level BPE tokenizer?
Hello, I'm currently working on training a byte-level BPE tokenizer using the Huggingface tokenizers library. I've created a simple training script, a sample corpus, and provided the output produced by...
I am tuning hyper-parameters on two different compute clusters. Since the number of GPUs on these clusters varies, I need to use gradient accumulation (GA) to ensure that the total...
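A minimal sketch of the arithmetic, assuming the goal is to keep the total (effective) batch size identical on both clusters. The function name, parameter names, and example numbers are hypothetical and do not come from the original post.

```python
def accumulation_steps(target_global_batch, per_gpu_batch, num_gpus):
    """Return the number of gradient accumulation steps needed so that
    per_gpu_batch * num_gpus * steps == target_global_batch.

    Hypothetical helper: all names here are illustrative assumptions.
    """
    samples_per_step = per_gpu_batch * num_gpus
    if target_global_batch % samples_per_step != 0:
        raise ValueError(
            "target_global_batch must be a multiple of per_gpu_batch * num_gpus"
        )
    return target_global_batch // samples_per_step


if __name__ == "__main__":
    # Same effective batch of 256 on an 8-GPU and a 4-GPU cluster.
    print(accumulation_steps(256, per_gpu_batch=8, num_gpus=8))
    print(accumulation_steps(256, per_gpu_batch=8, num_gpus=4))
```

Under this assumption, halving the GPU count doubles the number of accumulation steps, so the number of samples per optimizer update stays the same.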