
scripts/prepare_tulu_data.py ERROR

Open Maxhyl opened this issue 2 years ago • 0 comments

❓ The question

I have downloaded the v1_6 dataset. When I run

python scripts/prepare_tulu_data.py ./v1_6/train --tokenizer="./tokenizers/allenai_eleuther-ai-gpt-neox-20b-pii-special.json"

it fails with:

CRITICAL [olmo.util:156, rank=0] Uncaught InvalidConfigName: Bad characters from black list '<>:/|?*' found in './v1_6/train'. They could create issues when creating a directory for this config on Windows filesystem.

How can I resolve this error? And how does the data need to be processed before I can use it for fine-tuning and training?
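
For reference, the message suggests that './v1_6/train' is being validated as a config name rather than used as a filesystem path, and that the '/' in it is on the blacklist. The snippet below is only a minimal sketch of that kind of check, with the blacklist copied from the error message; `find_bad_chars` is a hypothetical helper written for illustration and is not the actual code of OLMo or of any library it calls.

```python
# Blacklist copied verbatim from the error message in this issue.
BAD_CHARS = "<>:/|?*"


def find_bad_chars(name):
    """Return the blacklisted characters that occur in `name`, in blacklist order.

    Hypothetical helper: it only illustrates the kind of check the error
    message describes, not the real validation code.
    """
    return [c for c in BAD_CHARS if c in name]


print(find_bad_chars("./v1_6/train"))  # prints ['/']
print(find_bad_chars("v1_6_train"))    # prints []
```

Under this reading, any argument containing a path separator would be rejected if it ends up being treated as a config name, which would explain the failure regardless of whether the directory exists.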

Maxhyl, Mar 01 '24 02:03