blldd

Results 6 comments of blldd

Because I don't want negative numbers in the embedding, how can I normalize the values to lie between 0 and 1? Thank you :)
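
One common way to do this is min-max scaling, sketched below in plain Python. This is only an illustration, not the method of the library being discussed: the function name `min_max_normalize` and the sample vector are hypothetical, and it assumes each embedding is a flat list of floats. Note that rescaling this way changes dot products and cosine similarities between vectors, so it may not suit every use case.

```python
def min_max_normalize(vector):
    """Rescale a list of floats linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(vector), max(vector)
    span = hi - lo
    if span == 0:
        # All values are identical; return zeros to avoid dividing by zero.
        return [0.0 for _ in vector]
    return [(v - lo) / span for v in vector]


# Hypothetical embedding containing a negative component.
embedding = [-1.0, 0.0, 1.0]
scaled = min_max_normalize(embedding)
print(scaled)
```

After scaling, every component falls in the closed interval [0, 1], and the relative ordering of components within the vector is preserved.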

Thank you for your explanation and suggestion:)

> Env: Ubuntu 18.04.5 LTS, Python 3.9.16, deepspeed 0.9.0+c8fc9c5f, transformers 4.28.0.dev0

> @blldd is 15GB only for deepspeed initialization, or is it the peak memory consumption during training? During training, a lot of memory is consumed by activation/compute and others. 15GB...

training_scripts:
deepspeed main.py \
    --data_path Dahoas/rm-static \
    --data_split 2,4,4 \
    --model_name_or_path /data1/opt-iml-max-1.3b \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --max_seq_len 512 \
    --learning_rate 1e-3 \
    --weight_decay 0.1 \
    --num_train_epochs 2...

Hello lucidrains, can you share your training script and data preparation code to make it easier to try? Thanks in advance.