
UnicodeDecodeError

Open abedhammoud opened this issue 1 year ago • 2 comments

I am getting a UnicodeDecodeError when running the hook with this pre-commit configuration:

-   repo: https://github.com/numpy/numpydoc
    rev: v1.7.0
    hooks:
      - id: numpydoc-validation
        exclude: (test|docs|labs|alant-st|apps)/.*

I am not really sure if it is my code that is causing this error, or the hook itself.

C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Lib\site-packages\numpydoc\docscrape.py:456: UserWarning: Unknown section Return
  warn(msg)
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Scripts\validate-docstrings.EXE\__main__.py", line 7, in <module>
  File "C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Lib\site-packages\numpydoc\hooks\validate_docstrings.py", line 400, in main
    findings.extend(process_file(file, config_options))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Lib\site-packages\numpydoc\hooks\validate_docstrings.py", line 336, in process_file
    module_node = ast.parse(file.read(), filepath)
                            ^^^^^^^^^^^
  File "C:\Users\abed\miniconda3\envs\adev\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 7689: character maps to <undefined>
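
The last frame of the traceback can be reproduced in isolation. The sketch below is only an illustration: it assumes the offending byte comes from a UTF-8 encoded right double quotation mark (U+201D), which is a guess, since the file that triggered the error is not shown. Byte 0x9d has no mapping in Python's cp1252 codec, so decoding UTF-8 text that contains it fails in the same way.

```python
# Assumption: the offending character is U+201D (right double quotation mark);
# the actual file contents are not shown in the report.
source = 'x = "\u201d"\n'
raw = source.encode("utf-8")  # U+201D becomes the bytes E2 80 9D

try:
    # This mirrors what a text-mode open() does implicitly when the
    # locale encoding is cp1252 and no encoding argument is passed.
    raw.decode("cp1252")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    bad_byte = raw[exc.start]  # the byte the codec could not map

# Decoding the same bytes as UTF-8 round-trips cleanly.
decoded = raw.decode("utf-8")
```

If this assumption holds, the docstrings themselves are fine and the failure comes only from the encoding used to read the file.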

abedhammoud avatar Apr 27 '24 05:04 abedhammoud

Trying a fix in #550!

larsoner avatar Apr 29 '24 13:04 larsoner

Based on our tests I'm not sure if this is expected or not :thinking:

IIRC the default system encoding on Windows is cp1252, so it makes sense that a UTF-8 file would be a problem there. Maybe we need to provide an option somewhere to specify the encoding, or update the pre-commit docs to explain how to set it for the project / all systems? It seems like you can hack in env vars, so we could maybe use PYTHONUTF8 or something, but that's not pretty :slightly_frowning_face:

larsoner avatar Apr 29 '24 13:04 larsoner
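
One way to avoid depending on the system encoding is to open source files without relying on the locale default. The sketch below is not necessarily what #550 does; `parse_source` is a hypothetical helper name used only for illustration. It relies on the standard library's `tokenize.open`, which detects a PEP 263 coding cookie or BOM and otherwise falls back to UTF-8.

```python
import ast
import os
import tempfile
import tokenize


def parse_source(filepath):
    """Parse a Python file without using the locale's default encoding.

    Hypothetical helper for illustration; not part of numpydoc.
    """
    # tokenize.open detects the encoding from a coding cookie or BOM
    # and defaults to UTF-8, independent of the system locale.
    with tokenize.open(filepath) as file:
        return ast.parse(file.read(), filepath)


# Demo on a throwaway UTF-8 file whose docstring contains curly quotes.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write('def f():\n    """Return \u201cx\u201d."""\n')
    module_node = parse_source(path)
    names = [n.name for n in module_node.body if isinstance(n, ast.FunctionDef)]
```

Passing `encoding="utf-8"` to `open()` would be a simpler variant if coding cookies do not need to be honoured. Setting `PYTHONUTF8=1` in the environment changes the default for the whole interpreter instead, which is the less tidy option mentioned above.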