scout
numpy genfromtxt encoding error
Reading the EIA 'rsmeqp.txt' file into 'mseg_techdata.py' with numpy genfromtxt (numpy v1.14.2) raises the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 961: invalid start byte
The error goes away when the file's encoding is explicitly specified as 'latin1' via the encoding parameter, which was added in numpy 1.14. However, it's unclear why the error occurs in the first place, and converting the file to utf-8 with the following Terminal command does not fix the problem:
iconv --from-code=ISO-8859-1 --to-code=UTF-8 current_file.csv
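One possible reason the command appears to have no effect: without an output option or a shell redirect, iconv writes the converted text to standard output and leaves the input file unchanged. The sketch below does the equivalent re-encoding in Python and writes the result to a new file. It assumes the source bytes are Windows-1252 (in that code page byte 0x96 is an en dash, whereas in ISO-8859-1 it is an unprintable control character); that assumption has not been verified against 'rsmeqp.txt'. The function name and the file names in the usage comment are hypothetical.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Decode src with source_encoding and write the text to dst as UTF-8.

    source_encoding is an assumption about the input file, not a known fact.
    Note that cp1252 leaves a few byte values undefined, so decoding can
    raise UnicodeDecodeError; "latin1" accepts every byte value.
    """
    raw = Path(src).read_bytes()              # read bytes, no implicit decoding
    text = raw.decode(source_encoding)        # interpret bytes in the assumed encoding
    Path(dst).write_bytes(text.encode("utf-8"))  # write UTF-8 without newline translation
    return text


# Hypothetical usage:
# reencode_to_utf8("rsmeqp.txt", "rsmeqp_utf8.txt")
```

Writing to a separate file keeps the original available for comparison if the assumed source encoding turns out to be wrong.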
Ideally, we'd be able to read this file as utf-8 without any errors from genfromtxt.
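To narrow down why the decode fails, it may help to locate the offending byte directly. The sketch below uses only the standard library: it reports the offset and value of the first byte that is not valid UTF-8 in a bytes object and shows how that byte reads under two single-byte encodings. The helper name and the sample bytes are hypothetical stand-ins, not content from 'rsmeqp.txt'.

```python
def first_invalid_utf8(data):
    """Return (offset, byte_value) of the first byte that fails UTF-8 decoding.

    Returns None when data is valid UTF-8. The offset is relative to the
    bytes object passed in.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Hypothetical sample containing the byte from the error message.
sample = b"cost 10\x9620"

hit = first_invalid_utf8(sample)
if hit is not None:
    offset, value = hit
    print(f"invalid byte 0x{value:02x} at offset {offset}")

# latin1 maps every byte to a code point, so decoding never fails;
# cp1252 maps 0x96 to an en dash (U+2013).
print(repr(sample.decode("latin1")))
print(repr(sample.decode("cp1252")))
```

For the real file, the same helper can be applied to `open("rsmeqp.txt", "rb").read()`. If the byte turns out to be an en dash or similar punctuation, passing `encoding="latin1"` (or `"cp1252"`) to `numpy.genfromtxt` remains a reasonable way to read the file until it is re-saved as utf-8.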