UnicodeEncodeError when calling create_report
To Reproduce:
>>> import pandas as pd
>>> from dataprep.eda import create_report
>>> df = pd.read_csv("https://www.openml.org/data/get_csv/1595261/phpMawTba", na_values=[' ?'])
>>> report = create_report(df)
Report has been created!: 100%|████████████████████████████████████████████████████████| 73/73 [00:12<00:00, 5.84it/s]
>>> report.show_browser()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python38\lib\site-packages\dataprep\eda\create_report\io.py", line 74, in show_browser
    file.write(self.report)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 957167: character maps to <undefined>
Desktop:
- OS: Windows 10
- Platform: Windows Powershell
- Platform Version [e.g. 1.0]
- Python Version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)] on win32
- Dataprep Version: dataprep-0.2.11-py3-none-any.whl
Hey, just wondering if there is an ETA on this. Thanks a lot.
Hi @vitamins and @Garett-MacGowan, thanks for the bug report. We currently focus mainly on the Jupyter Notebook environment; the terminal and Windows PowerShell have not been tested. We will try to fix this ASAP. In the meantime, you could try dataprep in a notebook to avoid the issue.
I have the same issue in JupyterLab on Windows 10 (Anaconda distribution).

It is the same error as in PowerShell, using the sample code from the documentation.

Hi @vitamins, @Garett-MacGowan and @ssenathi. Thanks for reporting! We have fixed the encoding issue on Windows, and the fix will be released in the next version. For now, you could try the develop branch with: pip install git+https://github.com/sfu-db/dataprep.git@develop
Terminal support is still being fixed. For now, please try a notebook :)
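
For anyone who wants to see what the traceback is complaining about, the error can be reproduced outside dataprep. The sketch below is not the actual dataprep patch. It assumes the report is a plain HTML string containing U+FFFD (the `html` value and the `write_html` helper are hypothetical) and shows that writing it with the cp1252 codec, the Windows default seen in the traceback, fails, while passing `encoding="utf-8"` explicitly works.

```python
import os
import tempfile

# Hypothetical stand-in for the HTML string held by the report object.
# U+FFFD is the replacement character named in the traceback.
html = "<html><body>value: \ufffd</body></html>"


def write_html(path, text, encoding="utf-8"):
    """Write text to path with an explicit encoding instead of the OS default."""
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


path = os.path.join(tempfile.mkdtemp(), "report.html")

# cp1252 has no mapping for U+FFFD, so this raises UnicodeEncodeError.
try:
    write_html(path, html, encoding="cp1252")
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True

# UTF-8 can encode every Unicode code point, so this write succeeds.
write_html(path, html, encoding="utf-8")

with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

If the saved file should declare its encoding to the browser, the HTML would also need a matching `<meta charset="utf-8">` tag; whether the generated report includes one is not something this thread confirms.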
Hello, how can I install it with conda but without git installed? Thanks.