
Configure enabled state and options of encodingDetectors

Open 304NotModified opened this issue 6 months ago • 2 comments

To prevent false positives or results that aren't usable.

  • Example 1: prefer UTF-8 over ASCII
  • Example 2: disable non-Western encodings
  • Example 3: detect UTF-16 without BOM

Idea: users could

  • configure list of probers, there is a default list
  • set the weight and/or ordering of probers
  • set options of a prober, e.g. 60% zero values across 2-byte units counts as UTF-16 without BOM
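
To make the idea concrete, here is a rough sketch in Python (not the library's own language, and not its API). Every name in it (`ProberConfig`, `detect`, `utf16_no_bom_prober`, `zero_ratio`) is hypothetical, and the reading of "60% 0 values on 2 bytes" as "60% of the bytes at one offset of each 2-byte unit are zero" is an assumption. It only shows how a configurable prober list, per-prober weights, ordering, and per-prober options could fit together.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Result = Optional[Tuple[str, float]]  # (encoding name, confidence) or None


def utf16_no_bom_prober(data: bytes, zero_ratio: float = 0.6) -> Result:
    """Guess UTF-16 without BOM from the share of zero bytes per offset."""
    if len(data) < 2:
        return None
    even, odd = data[0::2], data[1::2]
    even_zero = even.count(0) / len(even)
    odd_zero = odd.count(0) / len(odd)
    # Mostly-Latin text in UTF-16LE has its zero bytes at odd offsets,
    # in UTF-16BE at even offsets.
    if odd_zero >= zero_ratio and odd_zero > even_zero:
        return ("utf-16-le", odd_zero)
    if even_zero >= zero_ratio and even_zero > odd_zero:
        return ("utf-16-be", even_zero)
    return None


def utf8_prober(data: bytes) -> Result:
    """Report UTF-8 when the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return ("utf-8", 1.0)


def ascii_prober(data: bytes) -> Result:
    """Report ASCII when every byte is below 0x80."""
    return ("ascii", 1.0) if all(b < 0x80 for b in data) else None


@dataclass
class ProberConfig:
    prober: Callable[..., Result]
    weight: float = 1.0                      # scales the prober's confidence
    options: Dict[str, float] = field(default_factory=dict)


# Hypothetical default list; on equal scores the earlier entry wins.
DEFAULT_PROBERS: List[ProberConfig] = [
    ProberConfig(utf16_no_bom_prober),
    ProberConfig(utf8_prober),
    ProberConfig(ascii_prober),
]


def detect(data: bytes, probers: Optional[List[ProberConfig]] = None) -> Result:
    """Run the configured probers and return the best weighted result."""
    probers = DEFAULT_PROBERS if probers is None else probers
    best: Result = None
    for cfg in probers:
        result = cfg.prober(data, **cfg.options)
        if result is None:
            continue
        name, confidence = result
        score = confidence * cfg.weight
        if best is None or score > best[1]:
            best = (name, score)
    return best
```

With this shape, Example 1 falls out of ordering (UTF-8 is listed before ASCII, so it wins ties), Example 2 amounts to passing a shorter list, and Example 3 is the `zero_ratio` option, e.g. `detect(data, [ProberConfig(utf16_no_bom_prober, options={"zero_ratio": 0.5})])`.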

Related #186

304NotModified, Aug 09 '25 13:08