python-hosts

Opening the hosts file fails when it contains GBK-encoded characters.

Open ricardolee opened this issue 1 year ago • 4 comments

The Windows system is a Chinese-language version. Can this be detected, so that the file is opened using GBK encoding?

ricardolee, Apr 15 '24 04:04

Related: https://github.com/dotnet/runtime/issues/67229. On the .NET 6 platform, the file's encoding gets changed from UTF-8 to UTF-8 with BOM.

ricardolee, Apr 16 '24 14:04

I've spent some time looking at this and have a simple solution that works in Python 3, but supporting Python 2 as well adds a lot of complexity.
I need to drop Python 2 support at some point, so I will create a separate "3" release in the coming weeks/months that will incorporate this change.
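
For readers following along, here is a minimal Python 3 sketch of one way to read a hosts file that may be UTF-8, UTF-8 with BOM, or GBK. It is not the maintainer's fix, which is not shown in this thread. The helper name `read_hosts_text` and the default encoding list are assumptions made for illustration.

```python
def read_hosts_text(path, encodings=("utf-8-sig", "gbk")):
    """Hypothetical helper: decode a hosts file by trying encodings in order.

    "utf-8-sig" decodes plain UTF-8 and also strips a leading BOM if present.
    "gbk" is an assumed fallback for Chinese-language Windows systems.
    """
    # Read raw bytes once so each candidate encoding sees the same data.
    with open(path, "rb") as handle:
        raw = handle.read()

    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This encoding does not fit the bytes; try the next one.
            continue

    raise ValueError(
        "could not decode {0} with any of {1}".format(path, list(encodings))
    )
```

Strict UTF-8 is tried first because GBK-encoded Chinese text is very unlikely to be valid UTF-8, so a successful UTF-8 decode is a reasonable signal. The order matters: putting a permissive single-byte encoding earlier in the list would hide GBK content as mojibake.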

jonhadfield, Apr 21 '24 15:04

Currently, I check for a UTF-8 BOM before using the library and convert the file to plain UTF-8 if necessary. I believe this issue is caused by other software. Of course, it would be better if the library were compatible with UTF-8 with BOM.

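import codecs
import io
import logging

logger = logging.getLogger(__name__)


def change_hosts_encoding_to_utf8(host_path: str) -> bool:
    """
    Convert a hosts file from UTF-8 with BOM to plain UTF-8.

    Returns True if the file was rewritten, False otherwise.
    """
    # Read the raw bytes so the BOM can be detected before decoding.
    with open(host_path, 'rb') as hosts:
        raw = hosts.read()
    if not raw.startswith(codecs.BOM_UTF8):
        return False

    # 'utf-8-sig' strips the leading BOM while decoding.
    with io.open(host_path, "r", encoding='utf-8-sig') as hosts:
        data = hosts.read()
    if not data:
        return False

    # Write the text back as UTF-8 without a BOM.
    with io.open(host_path, "w", encoding='utf-8') as hosts:
        hosts.write(data)
    logger.info("convert hosts file utf-8-sig to utf-8 success")
    return True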

ricardolee, May 21 '24 05:05