
UniversalDetector.reset() does not reset the detector

Open laurielounge opened this issue 4 years ago • 0 comments

OS/Arch

$ python -c 'import platform;print(platform.uname())'

uname_result(system='Linux', node='testserver.mimeanalytics.com', release='4.18.0-240.10.1.el8_3.x86_64', version='#1 SMP Mon Jan 18 17:05:51 UTC 2021', machine='x86_64', processor='x86_64')

Python version

$ python --version

Python 3.6.8

cChardet version

$ python -c 'import cchardet;print(cchardet.__version__)'

2.1.7

What is the problem?

Given ud = cchardet.UniversalDetector(), calling ud.reset() does not reset the values of ud.done or ud.result once the first file has had its encoding detected.

Expected behavior

ud.done == False
ud.result == None

Actual behavior

ud.done == True
ud.result == (the last result)

Steps to reproduce the behavior

#!/usr/bin/env python3
import cchardet

files = ['file1', 'file2', 'file3']
ud = cchardet.UniversalDetector()
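for file in files:
    ud.reset()  # expected to clear done/result left over from the previous file
    print(f'Before: {ud.done}, {ud.result}')
    with open(file, 'rb') as ifh:
        for line in ifh:  # iterate lazily instead of loading every line with readlines()
            ud.feed(line)
            if ud.done:
                break
    ud.close()
    print(f'After: {ud.done}, {ud.result}')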
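
A possible workaround, not confirmed by the maintainers, is to avoid reset() and create a new detector for each file. The sketch below assumes only the calls already used in the script above (UniversalDetector(), feed(), done, close(), result). The helper name detect_encoding and its make_detector and chunk_size parameters are hypothetical, introduced here for illustration.

```python
def detect_encoding(path, make_detector=None, chunk_size=64 * 1024):
    """Detect the encoding of one file using a detector created just for it.

    make_detector is a hypothetical hook: any zero-argument callable that
    returns an object with feed(), close(), done and result. When omitted,
    cchardet.UniversalDetector is used.
    """
    if make_detector is None:
        # Imported lazily so the helper can also be tried with a stand-in detector.
        import cchardet
        make_detector = cchardet.UniversalDetector

    detector = make_detector()  # fresh state, so nothing depends on reset()
    with open(path, 'rb') as ifh:
        # Read fixed-size chunks until EOF or until the detector is confident.
        for chunk in iter(lambda: ifh.read(chunk_size), b''):
            detector.feed(chunk)
            if detector.done:
                break
    detector.close()
    return detector.result


# Usage with the file names from the script above:
# for file in ['file1', 'file2', 'file3']:
#     print(file, detect_encoding(file))
```

This costs one detector allocation per file, which seems acceptable when the alternative is carrying a stale result over from the previous file.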

laurielounge, Feb 28 '21 22:02