UnicodeDecodeError
Hi, I get the following error at main.py, line 124, in get_featurizer (featdata = json.load(f)):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
It happens when I run either of these commands:
python3 main.py --config configs/config-custom.yaml
python3 main.py --config configs/config-joblight.yaml
I think there might be a typo in the download_imdb_workload.sh script, specifically on line 13:
wget -O imdb_data.json https://www.dropbox.com/s/o8m1fthow6zn1kg/imdb-unique-plans-sqls.tar.gz?dl=1
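It seems unusual that a .tar.gz file is being saved as a .json file. It would also explain the error: gzip files begin with the bytes 0x1f 0x8b, so json.load hits 0x8b at position 1 when it tries to read the archive as UTF-8 text. I believe the line should be corrected to: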
wget -O imdb_data.json https://www.dropbox.com/s/nxtt17s4gdt21r5/imdb_data.json?dl=1
The replacement line comes from download_imdb_uniqueplans.sh, and I think this correction reflects the intended behavior. However, this is just my own interpretation, and it may not be accurate.
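
In case it helps anyone confirm the diagnosis, here is a minimal Python sketch that checks whether a downloaded file is actually gzip-compressed by looking at its first two bytes. The helper name looks_like_gzip is my own and is not part of the repository, and the location of imdb_data.json depends on where the script saved it.

```python
GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic number.

    Hypothetical helper for illustration; not part of the repository.
    """
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

If looks_like_gzip("imdb_data.json") returns True for the file that download_imdb_workload.sh produced, the file is a gzip archive saved under a .json name, which is consistent with the 0x8b byte at position 1 in the error message.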