Filenames inside zip archives are not UTF-8-encoded

Open Maelan opened this issue 4 years ago • 0 comments

Versions:

  • Simple File Manager v 6.9.3 Pro
  • Android 7.1.1
  • for testing: unzip 6.0-14 (latest version at the time of writing) on a Linux desktop

Steps to reproduce:

  1. In Simple File Manager, select some files whose filenames contain non-ASCII characters (even better, non-latin1 characters; I tested with filenames containing both latin1 and Cyrillic letters).
  2. Zip them from the context menu.
  3. Transfer the zip file to a desktop computer by any means.
  4. Run unzip -l the_zip_file.zip from the desktop computer and observe broken filenames.

On Linux at least, the standard unzip program cannot decode the filenames produced by Simple File Manager, even though unzip does support UTF-8: when I compress the same files with the standard zip program (3.0-9, latest version), unzip extracts them with correct filenames. None of the following unzip options helps: -U, -UU, -^, -a (the last one is not expected to affect filenames anyway). My desktop’s locale is set to UTF-8.

Simple File Manager must nevertheless use some valid encoding that covers at least Latin-1 and Cyrillic letters, because 7z (17.04) manages to detect it and decode the filenames.
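
For illustration only, here is a minimal Python 3 sketch of what broken filenames can look like when UTF-8 bytes are read as CP437, the legacy zip filename encoding. That Simple File Manager writes unflagged UTF-8 is an unverified assumption, and the function names and the sample filename are made up:

```python
def misread_as_cp437(name):
    """Encode a name as UTF-8, then decode those bytes as CP437."""
    return name.encode("utf-8").decode("cp437")


def recover_from_cp437(garbled):
    """Undo the misreading: back to the raw bytes, then decode as UTF-8."""
    return garbled.encode("cp437").decode("utf-8")


if __name__ == "__main__":
    original = "café_Привет.txt"  # hypothetical filename, not from the report
    garbled = misread_as_cp437(original)
    print(garbled)                      # box-drawing and symbol characters
    print(recover_from_cp437(garbled))  # the original name again
```

Pure ASCII names pass through unchanged, which would be consistent with the problem only showing up for non-ASCII filenames.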

Thus I suspect that either Simple File Manager does not encode filenames in UTF-8 within zip archives, or that it does not signal the encoding properly. Now, with my limited knowledge, I believe that the zip format has poor support for filename encoding, but:

  1. my zip-to-unzip experiment demonstrates that there is at least a way to reliably signal UTF-8.
  2. according to that StackExchange answer, “the format specification says that names are supposed to be either UTF-8 or cp437”; CP437 is obviously inapplicable, so a portable zip must use UTF-8.
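
In the zip format, UTF-8 names are signalled by bit 11 of each entry's general-purpose flag field (the "language encoding flag"). A minimal sketch for checking that bit on the desktop, assuming Python 3 and its standard zipfile module; the helper names and the sample filenames are hypothetical:

```python
import io
import zipfile

UTF8_FLAG = 0x800  # bit 11 of the general-purpose flags: "names are UTF-8"


def utf8_flag_report(zip_source):
    """Return (name, flag_is_set) for every entry in the archive.

    zip_source may be a path or a seekable file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return [(info.filename, bool(info.flag_bits & UTF8_FLAG))
                for info in archive.infolist()]


def build_sample_archive():
    """Build a small in-memory zip with one ASCII and one non-ASCII name."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("plain.txt", b"ascii name")
        archive.writestr("café_Привет.txt", b"non-ascii name")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Python's own writer sets the flag only for names that are not pure ASCII.
    for name, flagged in utf8_flag_report(build_sample_archive()):
        print(f"{flagged!s:5}  {name}")
```

Calling utf8_flag_report("the_zip_file.zip") on an archive made by Simple File Manager would at least show whether the flag is set. If it is not, zipfile decodes the names as CP437, so they will look garbled in the report as well.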

That’s all I can say. Thank you for this software!

Maelan, Sep 25 '21 18:09