UTF-unknown
Configure which encodingDetectors are enabled, and their options
To prevent false positives or results that aren't usable.
- Example 1: prefer UTF-8 over ASCII
- Example 2: disable non-Western encodings
- Example 3: detect UTF-16 without BOM
Idea: users could
- configure the list of probers (there is a default list)
- set the weight and/or ordering of probers
- set options of a prober, e.g. if 60% of the 2-byte pairs contain a 0 value, treat the input as UTF-16 without BOM
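
To make the idea concrete, here is a rough, language-neutral sketch in Python. It is not the UTF-unknown API: `ProberConfig`, `detect`, `DEFAULT_PROBERS`, the prober functions and the `zero_ratio` option are all hypothetical names, and reading "60% 0 values on 2 bytes" as "at least 60% of the 2-byte pairs have a zero byte in the same position" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# All names here are hypothetical illustrations, not part of UTF-unknown.


def utf16_no_bom_prober(data: bytes, zero_ratio: float = 0.6) -> Optional[str]:
    """Guess UTF-16 without BOM from the share of 2-byte pairs holding a zero byte."""
    pairs = len(data) // 2
    if pairs == 0:
        return None
    # Mostly-Latin text in UTF-16LE has the zero in the second byte of each pair,
    # in UTF-16BE in the first byte.
    second_zero = sum(1 for i in range(pairs) if data[2 * i + 1] == 0)
    first_zero = sum(1 for i in range(pairs) if data[2 * i] == 0)
    if second_zero / pairs >= zero_ratio and second_zero >= first_zero:
        return "utf-16-le"
    if first_zero / pairs >= zero_ratio:
        return "utf-16-be"
    return None


def utf8_prober(data: bytes) -> Optional[str]:
    """Report UTF-8 if the bytes decode cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "utf-8"


def ascii_prober(data: bytes) -> Optional[str]:
    """Report ASCII if every byte is below 0x80."""
    return "ascii" if all(b < 0x80 for b in data) else None


@dataclass
class ProberConfig:
    prober: Callable[..., Optional[str]]
    weight: float = 1.0          # higher weight wins
    enabled: bool = True         # disabled probers are skipped
    options: dict = field(default_factory=dict)  # passed to the prober as keyword arguments


def detect(data: bytes, configs: list) -> Optional[str]:
    """Run the enabled probers; return the result with the highest weight.

    On equal weights the prober listed first wins, so ordering matters too.
    """
    best, best_weight = None, float("-inf")
    for cfg in configs:
        if not cfg.enabled:
            continue
        result = cfg.prober(data, **cfg.options)
        if result is not None and cfg.weight > best_weight:
            best, best_weight = result, cfg.weight
    return best


# A possible default list that users could replace or reorder.
DEFAULT_PROBERS = [
    ProberConfig(utf16_no_bom_prober, weight=3.0, options={"zero_ratio": 0.6}),
    ProberConfig(utf8_prober, weight=2.0),   # preferred over ASCII
    ProberConfig(ascii_prober, weight=1.0),
]
```

With this list, `detect(b"hello", DEFAULT_PROBERS)` returns `"utf-8"` rather than `"ascii"` (Example 1), and `detect("hello".encode("utf-16-le"), DEFAULT_PROBERS)` returns `"utf-16-le"` (Example 3). Disabling a group of encodings (Example 2) would amount to setting `enabled=False` on the matching entries or leaving them out of the list.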
Related #186