Default Tutorial Not Working - Can't download MRPC data

Open • Jadiker opened this issue 4 years ago • 2 comments

When running the "prepare for training" tutorial code in Google Colab, I get the following error:

**** Model output directory: gs://albert_glue_tutorial/albert-tfhub/models/MRPC *****
Cloning into 'download_glue_repo'...
remote: Enumerating objects: 24, done.
remote: Total 24 (delta 0), reused 0 (delta 0), pack-reused 24
Unpacking objects: 100% (24/24), done.
Processing MRPC...
Traceback (most recent call last):
  File "download_glue_repo/download_glue_data.py", line 150, in <module>
    sys.exit(main(sys.argv[1:]))
  File "download_glue_repo/download_glue_data.py", line 142, in main
    format_mrpc(args.data_dir, args.path_to_mrpc)
  File "download_glue_repo/download_glue_data.py", line 65, in format_mrpc
    URLLIB.urlretrieve(MRPC_TRAIN, mrpc_train_file)
NameError: name 'URLLIB' is not defined
***** Task data directory: glue_data *****

I followed the instructions in the Colab as written: I set up storage, filled in the parameters, set the runtime to TPU, and then clicked "Run all". How can I download the GLUE data needed for MRPC?
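For context, the traceback shows the script calling URLLIB.urlretrieve while no name URLLIB exists at that point. The script's source is not shown here, so the following is only a sketch of one way such an alias could be bound on Python 3; it is an assumption about the cause, not the actual fix that was applied to the script.

```python
# Assumption: the download script expects a module-level alias called URLLIB.
# On Python 3, urlretrieve lives in urllib.request, so binding the alias
# unconditionally makes URLLIB.urlretrieve resolvable.
import urllib.request as URLLIB

# urlretrieve(url, filename) downloads a URL to a local file.
# Here we only confirm the attribute exists; nothing is downloaded.
print(callable(URLLIB.urlretrieve))
```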

Jadiker • Feb 06 '22 02:02

Looks like this issue has been noted here

Jadiker • Feb 06 '22 03:02

The issue was fixed by doing the following.

  1. Click "Show Code" on the code cell where parameters (Bucket, Task, and Albert_Model) are filled in
  2. If you've already run the script once, you'll need to delete the download_glue_repo folder. This can be done by adding the line !rm -rf download_glue_repo right after the # Download glue data. comment
  3. Instead of cloning the broken repo, clone the fixed repo, which can be found here and was mentioned here. To do this, change the !git clone line to !git clone https://gist.github.com/fef1601580f269eca73bf26a198595f3.git download_glue_repo
  4. Rerun everything. This time, the dataset should be downloaded correctly.
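
Steps 2 and 3 can also be sketched as a small Python helper instead of the two shell lines. The function name reclone_glue_repo and its run parameter are hypothetical and exist only for this illustration; the gist URL and the folder name come from the steps above.

```python
import shutil
import subprocess

# URL of the fixed gist, taken from step 3 above.
FIXED_REPO_URL = "https://gist.github.com/fef1601580f269eca73bf26a198595f3.git"


def reclone_glue_repo(dest="download_glue_repo", url=FIXED_REPO_URL,
                      run=subprocess.run):
    """Delete any stale checkout, then clone the fixed download script.

    `run` is injectable (hypothetical parameter) so the helper can be
    exercised without network access.
    """
    # Equivalent of `!rm -rf download_glue_repo`; no error if the folder is missing.
    shutil.rmtree(dest, ignore_errors=True)
    # Equivalent of `!git clone <url> download_glue_repo`.
    cmd = ["git", "clone", url, dest]
    run(cmd, check=True)
    return cmd
```

Calling reclone_glue_repo() in the Colab cell before the download step should have the same effect as the edited shell lines, assuming git is available in the runtime.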

Jadiker • Feb 06 '22 03:02