
[BUG] - Encoding error on downloading Arabic letters

Open jhoetter opened this issue 3 years ago • 2 comments

Describe the bug: When downloading data that contains Arabic letters (see the data file provided in the context on Discord), the exported text comes out as escaped Unicode code points instead of readable characters:

\u064a\u0635\u0628\u0631 \u0627\u0644\u0645\u064a\u0645\u062a\u0647\u0627 \u0648\u0627\u0644\u0644\u0647 \u064a\u0631\u062d\u0645 \u0645\u0631\u064a\u0645 \u0648\u064a\u0633\u0643\u0646\u0647\u0627 \u0641\u0633\u064a\u062d \u062c\u0646\u0627\u062a\u0647"

To Reproduce: Upload said data file and export the data in the UI.

Expected behavior: The data should be downloaded in the correct encoding.

Screenshots -

Desktop (please complete the following information):

  • OS: macOS
  • Browser: chrome
  • Version: 1.2.0

Additional context: Asked on Discord by abdessamad

jhoetter • Sep 15 '22 16:09

This seems to be common encoding behavior for the pandas / JSON export: non-ASCII characters are escaped by default. Using the file option should do the trick. https://stackoverflow.com/a/39612316/19801189
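For illustration, here is a minimal sketch of the behavior using only Python's standard `json` module, which escapes non-ASCII characters by default in the same way. It is not the refinery export code. The record and the file name are hypothetical, and the sample word is the first word of the escaped output from the report. pandas exposes a comparable switch, `force_ascii`, on `DataFrame.to_json`; whether the export uses it is an assumption here.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample record; the word is taken from the escaped output in the report.
records = [{"text": "\u064a\u0635\u0628\u0631"}]

# Default behavior: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(records)

# ensure_ascii=False keeps the characters as they are. The file should then be
# opened with an explicit UTF-8 encoding so the result does not depend on the platform default.
export_path = Path(tempfile.mkdtemp()) / "export.json"  # hypothetical file name
with open(export_path, "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False)

readable = export_path.read_text(encoding="utf-8")
```

Both `escaped` and `readable` are valid JSON for the same records; only the second one shows the Arabic text directly when the file is opened in an editor.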

JWittmeyer • Sep 19 '22 09:09

Awesome, thanks for the suggestion. Will look into this :)

jhoetter • Sep 19 '22 09:09

After reading a bit on Stack Overflow about the benefits of using the byte decode, it seems that the UTF-8 output should be usable by all readers. Since we have now switched to a different export procedure with this issue, I see no problem in using the UTF-8 version for record export.
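As a rough check of the claim that readers can handle the UTF-8 version, the sketch below (standard library only, with a hypothetical sample record) parses both the escaped and the unescaped form and compares the results. It only covers Python's own JSON parser, so it is an indication, not proof for every reader.

```python
import json

# Hypothetical sample record; the word is taken from the escaped output in the report.
record = {"text": "\u064a\u0635\u0628\u0631"}

# Escaped form (default) and UTF-8 form of the same record, both as bytes on disk would be.
escaped_bytes = json.dumps(record).encode("utf-8")
utf8_bytes = json.dumps(record, ensure_ascii=False).encode("utf-8")

# A JSON parser yields the same data from either form.
same_data = json.loads(escaped_bytes) == json.loads(utf8_bytes)

# The UTF-8 form is also smaller for Arabic text: 2 bytes per letter instead of 6.
size_saved = len(escaped_bytes) - len(utf8_bytes)
```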

Will be included in the next release.

JWittmeyer • Oct 27 '22 06:10