UnicodeEncodeError when calling create_report

Open vitamins opened this issue 5 years ago • 6 comments

To Reproduce:

>>> import pandas as pd
>>> from dataprep.eda import create_report
>>> df = pd.read_csv("https://www.openml.org/data/get_csv/1595261/phpMawTba", na_values=[' ?'])
>>> report = create_report(df)
Report has been created!: 100%|████████████████████████████████████████████████████████| 73/73 [00:12<00:00,  5.84it/s]
>>> report.show_browser()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python38\lib\site-packages\dataprep\eda\create_report\io.py", line 74, in show_browser
    file.write(self.report)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 957167: character maps to <undefined>
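
The traceback suggests that the report HTML is written to a file opened with the platform default encoding (cp1252 here), which has no mapping for U+FFFD. A minimal sketch of that failure and of one possible workaround, writing with an explicit UTF-8 encoding, is below. The `html` string and the `write_html_utf8` helper are hypothetical stand-ins, not part of dataprep, and this is not necessarily how the library itself addresses the problem.

```python
import tempfile
from pathlib import Path

# Stand-in for the report HTML (assumption); U+FFFD is the character
# named in the traceback.
html = "<html><body>caf\ufffd</body></html>"

# cp1252 has no mapping for U+FFFD, so encoding the text fails with
# the same UnicodeEncodeError that the traceback shows.
try:
    html.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False


def write_html_utf8(text: str, path: Path) -> None:
    """Hypothetical helper: write text using an explicit UTF-8 encoding."""
    with open(path, "w", encoding="utf-8") as file:
        file.write(text)


# Write to a temporary directory and read the file back to confirm
# that the content survives the round trip.
out_path = Path(tempfile.mkdtemp()) / "report.html"
write_html_utf8(html, out_path)
round_trip = out_path.read_text(encoding="utf-8")

print(cp1252_ok, round_trip == html)
```

With a real report, the same helper could presumably be given the HTML string that the traceback refers to as `self.report`, and the resulting file opened manually in a browser. That attribute name is taken from the traceback for version 0.2.11 and may differ in other versions.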

Desktop:

  • OS: Windows 10
  • Platform: Windows Powershell
  • Platform Version [e.g. 1.0]
  • Python Version Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)] on win32
  • Dataprep Version: dataprep-0.2.11-py3-none-any.whl

vitamins avatar Aug 19 '20 00:08 vitamins

Hey, just wondering if there is an ETA on this. Thanks a lot.

Garett-MacGowan avatar Nov 03 '20 19:11 Garett-MacGowan

Hi @vitamins and @Garett-MacGowan, thanks for the bug report. We currently focus mainly on the Jupyter Notebook environment. The terminal and Windows PowerShell have not been tested. We will try to fix this ASAP. Until then, you could try dataprep in a notebook to avoid the issue.

jinglinpeng avatar Nov 04 '20 17:11 jinglinpeng

I have the same issue in JupyterLab on Windows 10 (Anaconda Distribution).

ssenathi avatar Nov 26 '20 07:11 ssenathi

Same as in PowerShell, using the sample code from the documentation.

ssenathi avatar Nov 26 '20 07:11 ssenathi

Hi @vitamins, @Garett-MacGowan and @ssenathi. Thanks for reporting! We have fixed the encoding issue on Windows, and the fix will be released in the next version. For now, you could try the develop branch with pip install git+https://github.com/sfu-db/dataprep.git@develop.

Terminal support is still being fixed. For now, please try the notebook :)

jinglinpeng avatar Nov 28 '20 11:11 jinglinpeng

Hello, how can I install it with conda but without git installed? Thanks

bacoco avatar Dec 07 '20 10:12 bacoco