UnicodeDecodeError: 'cp950' codec can't decode byte 0xe2 in position 2072: illegal multibyte sequence

Open Adamchanadam opened this issue 1 year ago • 1 comments

Environment: Windows 11 (Conda) with Python 3.11.8 and praisonAI 0.0.17.

After installing with:

pip install praisonai

I ran the initialisation command:

praisonai --init create a movie script about dog in moon

It returned the error below. I think it is a UTF-8 issue:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Scripts\praisonai.exe\__main__.py", line 4, in <module>
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\praisonai\__init__.py", line 1, in <module>
    from .cli import PraisonAI
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\praisonai\cli.py", line 11, in <module>
    import gradio as gr
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\gradio\__init__.py", line 3, in <module>
    import gradio._simple_templates
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\gradio\_simple_templates\__init__.py", line 1, in <module>
    from .simpledropdown import SimpleDropdown
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\gradio\_simple_templates\simpledropdown.py", line 6, in <module>
    from gradio.components.base import FormComponent
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\gradio\components\__init__.py", line 40, in <module>
    from gradio.components.multimodal_textbox import MultimodalTextbox
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\gradio\components\multimodal_textbox.py", line 34, in <module>
    class MultimodalTextbox(FormComponent):
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\gradio\component_meta.py", line 198, in __new__
    create_or_modify_pyi(component_class, name, events)
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\site-packages\gradio\component_meta.py", line 92, in create_or_modify_pyi
    source_code = source_file.read_text()
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\adam\anaconda3\envs\adamlab2_env\Lib\pathlib.py", line 1059, in read_text
    return f.read()
           ^^^^^^^^
UnicodeDecodeError: 'cp950' codec can't decode byte 0xe2 in position 2072: illegal multibyte sequence
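
The failing call in the traceback is read_text() without an encoding argument, so the bytes are decoded with the locale default. A minimal sketch of the same failure, runnable on any platform, assuming the offending bytes are UTF-8 typographic punctuation (the sample string and the try_decode helper are made up for illustration):

```python
import locale

# A UTF-8 string containing a typographic apostrophe (U+2019),
# which encodes to the bytes E2 80 99. The sample text is hypothetical.
utf8_bytes = "can\u2019t".encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        bad = data[exc.start]
        return f"{type(exc).__name__}: byte 0x{bad:02x} at position {exc.start}"


# Encoding that open() and Path.read_text() use when none is given
print("locale default:", locale.getpreferredencoding(False))
print("as utf-8:", try_decode(utf8_bytes, "utf-8"))
print("as cp950:", try_decode(utf8_bytes, "cp950"))
```

If the first line printed on the affected machine is cp950, every read_text() call without an explicit encoding decodes files the second way rather than the first.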

Adamchanadam avatar Mar 26 '24 21:03 Adamchanadam

The problem is solved by adding encoding="utf-8" to the read_text() calls in the create_or_modify_pyi() function in .\gradio\component_meta.py.

Four modifications are needed:

  • 1 x: change source_file.read_text() to source_file.read_text(encoding="utf-8").
  • 3 x: change pyi_file.read_text() to pyi_file.read_text(encoding='utf-8').

def create_or_modify_pyi(
    component_class: type, class_name: str, events: list[str | EventListener]
):
    source_file = Path(inspect.getfile(component_class))

    source_code = source_file.read_text(encoding="utf-8") # <---- add utf-8 here

    current_impl, lineno = extract_class_source_code(source_code, class_name)

    if not (current_impl and lineno):
        raise ValueError("Couldn't find class source code")

    new_interface = create_pyi(current_impl, events)

    pyi_file = source_file.with_suffix(".pyi")
    if not pyi_file.exists():
        last_empty_line_before_class = -1
        lines = source_code.splitlines()
        for i, line in enumerate(lines):
            if line in ["", " "]:
                last_empty_line_before_class = i
            if i >= lineno:
                break
        lines = (
            lines[:last_empty_line_before_class]
            + ["from gradio.events import Dependency"]
            + lines[last_empty_line_before_class:]
        )
        with no_raise_exception():
            pyi_file.write_text("\n".join(lines))
    current_interface, _ = extract_class_source_code(pyi_file.read_text(encoding='utf-8'), class_name) # <---- add utf-8 here
    if not current_interface:
        with no_raise_exception():
            with open(str(pyi_file), mode="a") as f:
                f.write(new_interface)
    else:
        contents = pyi_file.read_text(encoding='utf-8') # <---- add utf-8 here
        contents = contents.replace(current_interface, new_interface.strip())
        current_contents = pyi_file.read_text(encoding='utf-8') # <---- add utf-8 here
        if current_contents != contents:
            with no_raise_exception():
                pyi_file.write_text(contents)
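
One thing the function above leaves unchanged is the write side: write_text() and open(..., mode="a") still use the locale default, so a .pyi file could be written in cp950 and then read back as UTF-8. The sketch below illustrates that mismatch with a temporary file; the roundtrip helper and the sample text are hypothetical, and this is not a claim about what Gradio ships.

```python
import tempfile
from pathlib import Path
from typing import Optional


def roundtrip(text: str, write_encoding: str, read_encoding: str) -> Optional[str]:
    """Write text with one encoding and read it back with another.

    Returns the text that was read, or None if decoding failed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.pyi"  # hypothetical file name
        path.write_text(text, encoding=write_encoding)
        try:
            return path.read_text(encoding=read_encoding)
        except UnicodeDecodeError:
            return None


sample = "# 中文\n"

# Same encoding on both sides: the text survives the round trip
print(roundtrip(sample, "utf-8", "utf-8") == sample)

# Written as cp950 (what a bare write_text() would do on the affected
# system), then read as UTF-8: the read fails
print(roundtrip(sample, "cp950", "utf-8"))
```

Passing encoding="utf-8" to the write calls as well would keep both sides consistent.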

Adamchanadam avatar Mar 26 '24 21:03 Adamchanadam

@claude review this request and propose a fix

MervinPraison avatar May 30 '25 11:05 MervinPraison

Claude finished @MervinPraison's task


Analysis: Unicode Decode Error in Gradio Import

Todo List:

  • [x] Read repository guidelines (CLAUDE.md) - No CLAUDE.md found
  • [x] Review the error and proposed solution
  • [x] Analyze if this is a PraisonAI or upstream gradio issue
  • [x] Propose appropriate fix strategy
  • [x] Provide recommendations for resolution

Issue Analysis:

Root Cause: This is a third-party dependency issue in Gradio's component_meta.py file. Path.read_text() called without an encoding argument falls back to the locale's preferred encoding (cp950 on Traditional Chinese Windows) instead of UTF-8, so reading a UTF-8 encoded Python file that contains non-ASCII characters raises a decode error.

User's Solution Review: Technically Correct. The proposed fix to add encoding="utf-8" to the 4 read_text() calls is the correct solution. This explicitly specifies UTF-8 encoding instead of relying on system defaults.

Recommended Fix Strategy:

Option 1: Upstream Fix (Recommended)

  • This is a Gradio library bug that should be reported upstream
  • The issue affects Windows users with non-UTF-8 system encodings
  • Gradio should handle UTF-8 encoding explicitly in their component_meta.py

Option 2: PraisonAI Workaround

Since this is a third-party issue blocking PraisonAI users, consider adding a post-installation patch:

# In praisonai/setup/post_install.py
import importlib.util
import logging
from pathlib import Path


def patch_gradio_encoding():
    """Patch gradio component_meta.py for Windows encoding issues."""
    try:
        # Locate gradio without importing it: on affected systems the
        # import itself raises the UnicodeDecodeError.
        spec = importlib.util.find_spec("gradio")
        if spec is None or not spec.submodule_search_locations:
            return
        package_dir = Path(list(spec.submodule_search_locations)[0])
        gradio_file = package_dir / "component_meta.py"

        # Read the file and apply the encoding fixes
        content = gradio_file.read_text(encoding="utf-8")
        patched = content.replace(
            "source_file.read_text()",
            'source_file.read_text(encoding="utf-8")',
        ).replace(
            "pyi_file.read_text()",
            'pyi_file.read_text(encoding="utf-8")',
        )

        # Write back only when something changed, so reruns are harmless
        if patched != content:
            gradio_file.write_text(patched, encoding="utf-8")
            logging.info("Applied gradio encoding patch for Windows compatibility")

    except Exception as e:
        logging.warning(f"Could not apply gradio encoding patch: {e}")
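
Before or after applying such a patch, it may help to check which read_text() calls in a file still lack an explicit encoding. A small sketch, assuming a simple line-based regex scan is good enough for this file (the helper name is hypothetical, and the scan would miss calls split across lines):

```python
import re

# Matches read_text() called with no arguments at all
_BARE_READ = re.compile(r"\.read_text\(\s*\)")


def bare_read_text_lines(source: str) -> list[int]:
    """Return 1-based line numbers where read_text() has no arguments."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if _BARE_READ.search(line)
    ]


example = (
    "a = source_file.read_text()\n"
    'b = pyi_file.read_text(encoding="utf-8")\n'
    "c = pyi_file.read_text( )\n"
)
print(bare_read_text_lines(example))
```

An empty list for component_meta.py would suggest that every read already names its encoding.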

Final Recommendations:

  1. Document the workaround in PraisonAI's troubleshooting docs for Windows users
  2. Report this upstream to Gradio's GitHub repository
  3. Consider implementing the post-install patch as a temporary solution
  4. Pin gradio version until the upstream fix is available
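  5. Add an option in setup scripts to enable Python's UTF-8 mode (PYTHONUTF8=1); PYTHONIOENCODING=utf-8 only changes the standard streams, not the default encoding used by read_text()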

The user's analysis and solution are spot-on. This is a Windows-specific issue that affects any Python package importing Gradio on systems with non-UTF-8 default encodings.
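
A workaround that avoids editing installed files is Python's UTF-8 mode (the PYTHONUTF8=1 environment variable or the -X utf8 option, available since Python 3.7), which makes UTF-8 the default for open() and Path.read_text(). The sketch below starts a child interpreter with the variable set and reports the default it would use; the helper name is hypothetical.

```python
import os
import subprocess
import sys


def child_default_encoding(utf8_mode: bool) -> str:
    """Report the default text encoding of a child Python interpreter."""
    env = dict(os.environ)
    # "1" enables UTF-8 mode, "0" disables it explicitly
    env["PYTHONUTF8"] = "1" if utf8_mode else "0"
    code = "import locale; print(locale.getpreferredencoding(False))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print("UTF-8 mode off:", child_default_encoding(False))
print("UTF-8 mode on: ", child_default_encoding(True))
```

Setting PYTHONUTF8=1 in the shell before running praisonai should have the same effect on that process, although that has not been verified against this particular report.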


claude[bot] avatar May 30 '25 11:05 claude[bot]

@Adamchanadam Can you please upgrade to the latest version, test again, and let me know if this is still an issue? Thanks for your contribution.

MervinPraison avatar May 30 '25 12:05 MervinPraison