numpy genfromtxt encoding error

Open jtlangevin opened this issue 7 years ago • 0 comments

Reading the EIA 'rsmeqp.txt' file into 'mseg_techdata.py' via numpy genfromtxt (numpy v1.14.2) raises the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 961: invalid start byte

Explicitly specifying the file's encoding as 'latin1' via genfromtxt's encoding parameter (added in numpy 1.14) fixes the problem. However, it's unclear why the error occurs in the first place, and converting the file to UTF-8 with the following Terminal command does not fix it:

iconv --from-code=ISO-8859-1 --to-code=UTF-8 current_file.csv
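One possible explanation, not confirmed in this issue: as written, the iconv command prints the converted text to standard output and leaves the input file unchanged unless the output is redirected to a new file. A minimal Python sketch of the same conversion that writes an actual output file is below. The function name `reencode_to_utf8` and its parameters are hypothetical, and the 'latin1' default simply mirrors the ISO-8859-1 source encoding used in the command above.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="latin1"):
    """Decode the bytes of `src` with `source_encoding` and write them to `dst` as UTF-8.

    Hypothetical helper for illustration; `src` is left untouched.
    """
    # Read raw bytes so no default encoding or newline translation is applied.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

Byte 0x96 is a control character in ISO-8859-1 but an en dash in Windows-1252, so passing `source_encoding="cp1252"` may be the better choice if the file was produced on Windows; that is an assumption about the file's origin.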

Ideally, we'd be able to read this file as UTF-8 without any errors from genfromtxt.
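For reference, the 'latin1' workaround described above can be sketched as follows. The sample bytes, column names, and tab delimiter are assumptions made for illustration, since the layout of 'rsmeqp.txt' is not shown here; only the `encoding` argument reflects the fix reported in this issue.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for 'rsmeqp.txt': tab-delimited, with byte 0x96
# in the first data row (the byte named in the traceback).
raw = b"name\tvalue\nheat pump \x96 air\t1.5\nfurnace\t2.0\n"

with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # The encoding argument requires numpy >= 1.14. 'latin1' maps every
    # byte value to a code point, so decoding cannot raise UnicodeDecodeError.
    data = np.genfromtxt(path, delimiter="\t", names=True, dtype=None,
                         encoding="latin1")
finally:
    os.remove(path)

print(data["value"])
```

Reading with 'latin1' avoids the exception but does not establish what character byte 0x96 was meant to be, so any text fields containing it should be checked after loading.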

jtlangevin, Mar 27 '18 16:03