Word document translation issues

Open micche78 opened this issue 6 years ago • 3 comments

myname.docx myname.it.docx

Translating the partially formatted phrase "My name is Michael" results in two issues:

  1. the phrase is split into three parts because "is" is in bold, so the meaning is lost
  2. "is" is translated to the verb definition instead of a literal translation

micche78 avatar Jul 08 '19 14:07 micche78

Indeed, the system does not deal well with in-sentence markup. This is an artefact of OpenXML, which does not keep sentences contiguous if there is inline markup. Options are:

  1. Replace the OpenXML logic with HTML. That is ugly, because it would require a good OpenXML<>HTML converter. Office is one, but then Office would be required for all Office documents. OR

  2. Go a bit deeper in simplifying OpenXML. That would come at the expense of losing the inline markup.

In short: Not a golden solution in sight.
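
To illustrate option 2, here is a minimal sketch in Python. It is not the code used by Document Translator. It assumes a simple WordprocessingML paragraph whose runs (`w:r`) are direct children of the paragraph, and it ignores hyperlinks, tabs, line breaks and fields. The helper names `run_texts` and `flatten_runs` are hypothetical. It shows the fragments that result from the bold "is", then merges all runs into one so the whole sentence can be sent for translation, at the cost of losing the inline markup.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)

# "My name is Michael" with "is" in bold: the paragraph holds three runs.
PARAGRAPH_XML = (
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t xml:space="preserve">My name </w:t></w:r>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>is</w:t></w:r>'
    '<w:r><w:t xml:space="preserve"> Michael</w:t></w:r>'
    '</w:p>'
)


def run_texts(paragraph):
    """Return the text of each direct-child run (hypothetical helper)."""
    return [
        "".join(t.text or "" for t in run.iter(f"{{{W}}}t"))
        for run in paragraph.findall(f"{{{W}}}r")
    ]


def flatten_runs(paragraph):
    """Merge all direct-child runs into the first one (hypothetical helper).

    The run properties (bold, italics, ...) are discarded on purpose.
    """
    runs = paragraph.findall(f"{{{W}}}r")
    if not runs:
        return paragraph
    full_text = "".join(run_texts(paragraph))
    first = runs[0]
    # Empty the first run, including any formatting properties.
    for child in list(first):
        first.remove(child)
    text_node = ET.SubElement(first, f"{{{W}}}t")
    text_node.set(XML_SPACE, "preserve")
    text_node.text = full_text
    # Drop the remaining runs; their text now lives in the first run.
    for run in runs[1:]:
        paragraph.remove(run)
    return paragraph


paragraph = ET.fromstring(PARAGRAPH_XML)
fragments = run_texts(paragraph)   # what a per-run translation would see
flatten_runs(paragraph)
merged = run_texts(paragraph)      # one contiguous sentence
print(fragments)  # ['My name ', 'is', ' Michael']
print(merged)     # ['My name is Michael']
```

In a real .docx the paragraphs live in `word/document.xml` inside the zip package, so a full solution would also have to read and rewrite that part.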

chriswendt1 avatar Jul 08 '19 14:07 chriswendt1

Thanks Chris. Losing the inline markup is acceptable; is there an example of how to achieve that?

micche78 avatar Jul 09 '19 15:07 micche78

Just sent Chris PPTX files with in-line formatting such as italics and bold face...

When parsing files with the OKAPI framework, inline tags no longer impact translations; i.e., the application sends the complete sentence - not fragments - to MSFT Translator.

That said, parsing in Document Translator v2.6 is much better than in v2.1.1. Here is an example translating text with inline tags from German to English, using Document Translator v2.6:

No inline tags in source:
Source: Geben Sie Ihren Text in das Quelltextfeld ein und klicken Sie auf die Schaltfläche Übersetzen.
Target: Enter your text in the source box and click the Translate button.

Inline tags in source:
Source: Geben Sie Ihren Text in das Quelltextfeld ein und klicken Sie auf die Schaltfläche Übersetzen.
Target: Give your Text into the source code field and click button on the Translate button.

Parsed with OKAPI, the target is as follows:

Enter your text in the source box and click the Translate button.

georgkirchner avatar May 08 '20 18:05 georgkirchner