Jiacheng Ye
I got the same error. Is it caused by the deepspeed version?
Same issue for me. I'm using 4*A100 80G on openwebtext, and I changed batch_size 12 -> 24 and gradient_accumulation_steps 5*8 -> 5*4.
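If these are nanoGPT-style openwebtext settings (an assumption on my part; the exact accumulation semantics depend on the repo), the adjustment keeps the number of sequences per optimizer step unchanged when going from 8 to 4 GPUs. A quick sanity check:

```python
# Hedged sketch: verify the adjusted config consumes the same number of
# sequences per optimizer step. Assumes gradient_accumulation_steps is the
# total across all GPUs (the helper name is made up for illustration).
def sequences_per_step(batch_size: int, grad_accum_steps: int) -> int:
    """Sequences consumed per optimizer step across all GPUs."""
    return batch_size * grad_accum_steps

old = sequences_per_step(batch_size=12, grad_accum_steps=5 * 8)  # original 8-GPU config
new = sequences_per_step(batch_size=24, grad_accum_steps=5 * 4)  # adjusted 4-GPU config
print(old, new)  # 480 480
```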
I use fp16; disabling flash attention works for me.
Btw, this is the system info:
```
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
```
Hi, it's weird, as the settings in run_epr.sh are the same as those in the paper. Could you check whether you can obtain similar results to the paper for other...
I've figured out solutions to the questions above. With the default parameters in the codebase, I got 26.15 with BM25. However, EPR performs even worse (22.9) after training the BERT-based retriever....
Hi, here is the full list of commands:
```
#!/bin/bash
#SBATCH --job-name=epr_mtop-null_v4
#SBATCH --output=outputs/epr_mtop-null_v4/out.txt
#SBATCH --error=outputs/epr_mtop-null_v4/out.txt
#SBATCH --partition=NLP
#SBATCH --time=12000
#SBATCH --quotatype=reserved
#SBATCH --gres=gpu:2

srun python find_bm25.py output_path=$PWD/data/bm25_mtop-null_a_train.json \
    dataset_split=train...
```
I got 49.17 after training for 120 epochs on mtop; it's still weird... 😂
Hi Ohad, do you have any updates? :)
Hi, sorry for the late reply. It seems to be a data loader issue. You can check whether the qnli dataset was correctly downloaded from huggingface.
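One quick way to check is to look for cached files for the dataset on disk. This is a sketch that assumes the `datasets` library's default cache location under `~/.cache/huggingface/datasets` (adjust if `HF_DATASETS_CACHE` is set); the helper name is hypothetical:

```python
# Hedged sketch: check whether any cached files for a dataset (e.g. qnli)
# exist under the Hugging Face datasets cache directory. If nothing is found,
# the download likely failed and the dataset should be re-fetched.
from pathlib import Path

def dataset_cached(cache_dir: Path, name: str) -> bool:
    """Return True if any file or directory under cache_dir mentions `name`."""
    return any(cache_dir.glob(f"**/*{name}*"))

default_cache = Path.home() / ".cache" / "huggingface" / "datasets"
print(dataset_cached(default_cache, "qnli"))
```

If this prints `False`, re-downloading the dataset (e.g. deleting the partial cache entry and loading it again) is worth trying before digging further into the data loader.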