
Mixed RTL and LTR content is hard to read in text inputs

Open bugzilla-to-github opened this issue 7 years ago • 20 comments

This issue was created automatically by a script.

Bug 1500333

Bug Reporter: Mahtab Alam [:alamM] <[email protected]> CC: @amire80, @flodolo, @guerojeff, @mathjazz, [email protected], [email protected], [email protected] See also: https://bugzilla.mozilla.org/show_bug.cgi?id=1602426

Created attachment 9018524 Screenshot (61).png

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0

Steps to reproduce:

To insert the XML tag and the external argument unchanged, I clicked on them, but they got mixed up with one another.

Actual results:

Both got mixed with one another.

Expected results:

They should have remained separate.

bugzilla-to-github avatar Oct 18 '18 17:10 bugzilla-to-github

Comment Author: @flodolo

Sorry but I don't understand what the bug is about.

Is it an issue with Pontoon? Is it an error with translation? If it's Pontoon, we need to move this bug, and you should provide a bit more information on what you did, and what the expected behavior was.

bugzilla-to-github avatar Oct 18 '18 17:10 bugzilla-to-github

Comment Author: Mahtab Alam [:alamM] <[email protected]>

(In reply to Francesco Lodolo [:flod] from comment #1)

Sorry but I don't understand what the bug is about.

Is it an issue with Pontoon? Is it an error with translation? If it's Pontoon, we need to move this bug, and you should provide a bit more information on what you did, and what the expected behavior was.

Yes! This is an issue with Pontoon, not a translation error. In the screenshot I attached, you can see that the XML tag and external argument are separate in the original string, but in the translated string and the translation panel they got mixed up.

bugzilla-to-github avatar Oct 18 '18 18:10 bugzilla-to-github

Comment Author: @flodolo

I'm looking at the string in a text editor, and it seems correct to me? https://pontoon.mozilla.org/ur/firefox/browser/browser/preferences/preferences.ftl/?search=extension-controlled-privacy-containers&string=178407

ایک ایکسٹینشن , {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔

bugzilla-to-github avatar Oct 18 '18 19:10 bugzilla-to-github

Comment Author: Mahtab Alam [:alamM] <[email protected]>

(In reply to Francesco Lodolo [:flod] from comment #3)

I'm looking at the string in a text editor, and it seems correct to me? https://pontoon.mozilla.org/ur/firefox/browser/browser/preferences/preferences.ftl/?search=extension-controlled-privacy-containers&string=178407

ایک ایکسٹینشن , {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔

Putting it in a text editor corrects only the XML tag and external argument, but the Urdu translation gets mismatched.

bugzilla-to-github avatar Oct 18 '18 19:10 bugzilla-to-github

Comment Author: @mathjazz

Steps to reproduce:

  1. Type "a" in the textarea.
  2. Click on the XML placeable: "<img data-l10n-name="icon"/>".

You get this in the textarea: </"a<img data-l10n-name="icon

Mahtab, thanks for the report! What would be the expected value in the textarea after you insert the placeable?

bugzilla-to-github avatar Oct 18 '18 19:10 bugzilla-to-github

Comment Author: Mahtab Alam [:alamM] <[email protected]>

The expected value should be <img data-l10n-name="icon"/> a or a <img data-l10n-name="icon"/> depending upon the context.

bugzilla-to-github avatar Oct 18 '18 19:10 bugzilla-to-github

Comment Author: @mathjazz

Thanks!

I'm looking at how this works for Hebrew (another RTL locale), which has an approved translation in Pontoon: https://pontoon.mozilla.org/he/firefox/browser/browser/preferences/preferences.ftl/?string=178407

Which means it's also in the file: https://hg.mozilla.org/l10n-central/he/file/2c05277ec42e/browser/browser/preferences/preferences.ftl#l91

According to Comment 6, the XML tag in the file output (LTR) seems correct.

So I suspect the problem is that the string contains both, the RTL and LTR content and we force

I wonder what we can even do about this. Flagging Amir with an NI; he's been helping us with RTL issues in the past (see bug #1190566 for example).

bugzilla-to-github avatar Oct 18 '18 19:10 bugzilla-to-github

Comment Author: Mohammed Yaseen Khan [:foxt7ot] <[email protected]>

Thanks Mahtab for raising the issue.

Yes Matjaz, your hunch is correct. The issue arises because the string contains both RTL and LTR characters. This issue existed in Pootle as well, and I suspect it affects other RTL locales too.
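The result from the reproduction steps above (`</"a<img data-l10n-name="icon`) can be examined with a minimal Python sketch. It uses the standard `unicodedata.bidirectional` function to list the bidi class of each character; the helper names `bidi_classes` and `trailing_non_strong` are made up for illustration and are not part of Pontoon. The trailing `"/>` has no strong direction of its own, so in a right-to-left textarea it would be expected to follow the paragraph direction and be mirrored, which appears consistent with what was reported.

```python
import unicodedata

# Bidi classes that the Unicode Bidirectional Algorithm treats as "strong".
STRONG = {"L", "R", "AL"}


def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


def trailing_non_strong(text):
    """Return the run of characters after the last strongly typed character."""
    for i in range(len(text) - 1, -1, -1):
        if unicodedata.bidirectional(text[i]) in STRONG:
            return text[i + 1:]
    return text  # no strong character at all


sample = 'a<img data-l10n-name="icon"/>'
print(bidi_classes(sample))
print(trailing_non_strong(sample))  # prints: "/>
```

Appending a left-to-right mark (U+200E), which is itself a strong LTR character, leaves no trailing direction-less run: `trailing_non_strong(sample + "\u200e")` returns an empty string.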

bugzilla-to-github avatar Oct 18 '18 22:10 bugzilla-to-github

Comment Author: @amire80

Sorry, noticed it only now.

Unfortunately, I cannot think of any way to fix this easily. It's a major inherent problem with how RTL languages work. Mixing RTL text with any kind of left-to-right code, including XML, is always a disaster. This is why translating into RTL languages in text files is so awful: in translation files every single line has some LTR text in it, so everything is jumbled. Using any web-based translation solution such as Pontoon makes it much better, because it separates the translation from the source string and from the LTR string key. However, it doesn't fix this problem completely because some code or markup is quite often embedded in the string itself, as it is in this example.

The ways to fix such things are:

  • Make Pontoon have super-smart input boxes that are not just plain text, but that are able to truly separate code from text. It would be super-cool, but probably very complicated to make.
  • Create aliases in RTL languages for XML element and attribute names. If it's done, then in Hebrew it would look like this:הרחבה" בשם <תמונה נתונים-תרגום-שם="סמל"/> {$שם} דורשת שימוש במגירת לשוניות." In theory, it would solve the problem, but it may introduce other problems, and it's a bit of a bottomless pit.
  • The most realistic solution is to have a policy that strongly encourages developers to avoid any kind of code or markup in translatable strings, unless it's really, really needed. It would be good for translators into all languages and not only into RTL ones, because it will make it easier for non-developers to translate. (For many people who grew up with the 1990s web, HTML and similar things are natural, but it's not true for everyone. There are people who could be great translators, but who have a hard time with markup languages, and reducing this problem may increase volunteers' participation.)

bugzilla-to-github avatar Dec 06 '18 17:12 bugzilla-to-github

Comment Author: @mathjazz

Thanks for a very valuable input, Amir!

I'm lowering the priority until we find a meaningful way forward.

bugzilla-to-github avatar Dec 07 '18 05:12 bugzilla-to-github

Comment Author: @amire80

(In reply to Matjaz Horvat [:mathjazz] from comment #10)

Thanks for a very valuable input, Amir!

Sure, happy to help any time. Sorry it took so long.

I'm lowering the priority until we find a meaningful way forward.

The most realistic way, as I mentioned at the end of my comment, is not so much in the area of feature development, but in the area of policies and practices for writing, reviewing, and maintaining code: strongly encourage developers to move as much code and markup out of translatable strings as possible.

bugzilla-to-github avatar Dec 07 '18 16:12 bugzilla-to-github

Comment Author: Safa Alfulaij <[email protected]>

It might help to add a "Raw mode", as Pootle did. There, everything is broken up and forced LTR so you can check tags and other stuff easily. Link: https://github.com/translate/pootle/issues/3941 As I see it, there is no other way of fixing it.
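As a rough illustration of what "forced LTR" could mean, here is a minimal Python sketch that wraps each line in the Unicode LEFT-TO-RIGHT OVERRIDE (U+202D) and POP DIRECTIONAL FORMATTING (U+202C) characters, so that every character is laid out left to right in logical order. This is an assumption about one possible approach, not a description of how Pootle implemented its raw mode; `raw_view` and `strip_raw` are hypothetical names.

```python
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE: treat following characters as strong LTR
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING: end the override


def raw_view(string):
    """Wrap every line in an LTR override for display purposes only."""
    return "\n".join(f"{LRO}{line}{PDF}" for line in string.split("\n"))


def strip_raw(string):
    """Remove the override characters again before saving the translation.

    Note: this also removes any LRO/PDF characters the translator typed.
    """
    return string.replace(LRO, "").replace(PDF, "")


source = 'a <img data-l10n-name="icon"/>'
assert strip_raw(raw_view(source)) == source
```

In a web editor, the same effect would more likely come from styling the field with CSS `direction: ltr` and `unicode-bidi: bidi-override` than from inserting characters into the text.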

bugzilla-to-github avatar Dec 08 '19 17:12 bugzilla-to-github

Comment Author: Safa Alfulaij <[email protected]>

Created attachment 9114558 Urdu (ur) · Firefox Updated bidi algorithm.png

This is how I see it. Yes it has a problem, but not a big one.

Tbh, eliminating markup from text strings is a bad idea, absolutely bad. You create different parts, making translation much harder. Developers need to provide proper context, and mistakes occur. I believe that translators who translate applications must have at least a bit of knowledge of technical aspects like variables, placeholders, plurals, and so on.

Attached file: Screenshot_2019-12-09-Urdu-(ur)-·-Firefox.png (image/png, 53428 bytes) Description: Urdu (ur) · Firefox Updated bidi algorithm.png

bugzilla-to-github avatar Dec 08 '19 22:12 bugzilla-to-github

Comment Author: @guerojeff

(In reply to Amir Aharoni from comment #11)

(In reply to Matjaz Horvat [:mathjazz] from comment #10)

Thanks for a very valuable input, Amir!

Sure, happy to help any time. Sorry it took so long.

I'm lowering the priority until we find a meaningful way forward.

The most realistic way, as I mentioned at the end of my comment, is not so much in the area of feature development, but in the area of policies and practices for writing, reviewing, and maintaining code: strongly encourage developers to move as much code and markup out of translatable strings as possible.

Thanks Amir, but in our experience it often takes more effort and cost to change developer behavior. We already ask developers to be aware of how much code they're including in strings, but strict enforcement is not something we have the resources for.

Your first suggestion is consistent with how the majority of other computer-assisted translation tools handle code/tagged elements. They're condensed in the string automatically and the user has to expand them manually if they want to see or manipulate their content. Here's a good example: https://docs.sdl.com/LiveContent/content/en-US/SDL%20Trados%20Studio%20Help-v4/GUID-C6676C93-2EEF-4945-9438-905F05EF268E
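The condensing behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expression is a simplified, hypothetical pattern for XML-like tags and Fluent-style `{$variable}` placeables, the `[1]`, `[2]` token format is invented, and none of it reflects how Trados or Pontoon actually work.

```python
import re

# Simplified, hypothetical pattern: XML-like tags and {$variable} placeables.
CODE = re.compile(r"</?[A-Za-z][^<>]*>|\{\s*\$[\w-]+\s*\}")


def condense(string):
    """Replace each code element with a short numbered token.

    Returns the condensed text and a mapping from token to original code.
    """
    mapping = {}

    def replace(match):
        token = f"[{len(mapping) + 1}]"
        mapping[token] = match.group()
        return token

    return CODE.sub(replace, string), mapping


def expand(condensed, mapping):
    """Restore the original code elements from their tokens."""
    for token, original in mapping.items():
        condensed = condensed.replace(token, original)
    return condensed


text, codes = condense('a <img data-l10n-name="icon"/> {$name}')
print(text)   # prints: a [1] [2]
print(codes)
```

A real editor would render each token as an atomic widget rather than as literal text, which also avoids clashes with strings that already contain something like `[1]`.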

bugzilla-to-github avatar Dec 08 '19 23:12 bugzilla-to-github

Comment Author: Md Shahbaz Alam [:shahbaz17] <[email protected]>

Hello Everyone,

After much thought, we are thinking of using the following approach in Urdu.

Pontoon: https://pontoon.mozilla.org/ur

Writing a Hamza (ء) at the end of an LTR segment to convert it to RTL (for the editor), since adding a Hamza won't change the meaning of the sentence; it is a symbol that doesn't add meaning to a word.

What we'd expect is a tool that looks for the Hamza (ء) and makes it a hidden element, so that, in theory, it will be in the DOM but not shown to the end user.

bugzilla-to-github avatar Jun 12 '20 00:06 bugzilla-to-github

Comment Author: Md Shahbaz Alam [:shahbaz17] <[email protected]>

Created attachment 9156283 Without Hamza Approach, a sentence would look like this in textarea.

Attached file: Without Hamza.jpeg (image/jpeg, 44518 bytes) Description: Without Hamza Approach, a sentence would look like this in textarea.

bugzilla-to-github avatar Jun 12 '20 00:06 bugzilla-to-github

Comment Author: Md Shahbaz Alam [:shahbaz17] <[email protected]>

Created attachment 9156285 Hamza Approach, this will be displayed correctly in textarea.

Attached file: Hamza Approach.jpeg (image/jpeg, 46511 bytes) Description: Hamza Approach, this will be displayed correctly in textarea.

bugzilla-to-github avatar Jun 12 '20 00:06 bugzilla-to-github

Comment Author: Safa Alfulaij <[email protected]>

Why not use an RLM? Assign it to a keyboard shortcut and you're good to go! The Hamza approach is not the proper way to "solve" this issue. RLM/LRM and all the other marks are there for this exact thing.

bugzilla-to-github avatar Jun 12 '20 01:06 bugzilla-to-github

Comment Author: Md Shahbaz Alam [:shahbaz17] <[email protected]>

Hello Safa,

Thanks for the reference.

I agree this is not a proper approach.

But using either the left-to-right mark (&lrm; or &#8206;, U+200E) or the right-to-left mark (&rlm; or &#8207;, U+200F) in the Pontoon editor doesn't work unless the editor is designed to handle it this way.

bugzilla-to-github avatar Jun 12 '20 01:06 bugzilla-to-github

Comment Author: Safa Alfulaij <[email protected]>

Hi!

I'm not sure what you mean. Pontoon doesn't need to do it itself; one can just insert it. We did that many times in Arabic translations. In Windows there are shortcuts for ZWJ/ZWNJ/LRM/RLM in the Arabic layout (Ctrl+Shift+[1-4]). In Linux you can just customize the keyboard layout, or use helper tools (Character Map, etc.).
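For illustration, inserting the marks by hand amounts to something like the following Python sketch, which surrounds each LTR code element with a left-to-right mark (U+200E) so that the brackets and quotes at its edges sit between two strong LTR characters. The choice of LRM on both sides, the simplified pattern, and the name `isolate_ltr` are all assumptions made for this example; whether the result renders well in Pontoon's editor is not confirmed anywhere in this thread.

```python
import re

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, strongly LTR character

# Simplified, hypothetical pattern: XML-like tags and {$variable} placeables.
CODE = re.compile(r"</?[A-Za-z][^<>]*>|\{\s*\$[\w-]+\s*\}")


def isolate_ltr(string):
    """Surround each code element with LRM characters.

    Not idempotent: calling it twice adds a second pair of marks.
    """
    return CODE.sub(lambda m: f"{LRM}{m.group()}{LRM}", string)


urdu = "ایک ایکسٹینشن , {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔"
marked = isolate_ltr(urdu)
# The visible text is unchanged; only two invisible marks were added.
assert marked.replace(LRM, "") == urdu
assert marked.count(LRM) == 2
```

Note that the marks become part of the saved translation, just as they do when typed from the keyboard.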

Ideally, Pontoon would implement this: https://bugzilla.mozilla.org/show_bug.cgi?id=1372861

bugzilla-to-github avatar Jun 12 '20 04:06 bugzilla-to-github