Wrong detection of source language

Open aneesh1122 opened this issue 1 year ago • 10 comments

The sentence is "我們一同追著心中的夢想"

Google Translate detects it as Chinese (Traditional): Screenshot_2024-11-12-02-52-19-386_com.brave.browser-edit.jpg

But your translator reports the source language as being the same as the target language.

For example, the source language is shown as English here: IMG_20241112_025152_366.jpg

and as Russian here: IMG_20241112_025519_608.jpg

even though the source sentence is the same in both translations.

This problem is only with Traditional Chinese. Simplified Chinese works fine.

Translation from Traditional Chinese works, but I'm working on a transliteration process, and for that the detected source language needs to be accurate.

aneesh1122 avatar Nov 11 '24 21:11 aneesh1122

I have not been very active on this library recently, and I apologize in advance if I cannot fix this in a timely manner. However, my first guess is that Language.kt is parsing the language string incorrectly. I may be able to take a closer look soon, but I would start there. I would also check the raw JSON response to verify whether the source language is correct there. It's also possible that the format of the JSON response has changed.

therealbush avatar Nov 12 '24 00:11 therealbush

On a further look, I don't think it's related to the language-string-to-enum parsing; it's more likely just a change in the JSON format.

therealbush avatar Nov 12 '24 00:11 therealbush

> On a further look I don't think it's related to language string to enum parsing, likely just a change in the JSON format

My friend twistios and I looked at the raw output and found that it's possible to transliterate the source sentence directly. He opened a pull request, and you've merged it.

Could you please do a new release so that Twistios and I can use it in a project we contribute to?

aneesh1122 avatar Nov 12 '24 02:11 aneesh1122

I need to fix https://github.com/therealbush/translator/issues/4 at some point as well, but I can probably release 1.1.1 right now.

therealbush avatar Nov 12 '24 16:11 therealbush

> I need to fix https://github.com/therealbush/translator/issues/4 at some point as well but I can probably do 1.1.1 right now.

That would be great. Thanks

aneesh1122 avatar Nov 12 '24 16:11 aneesh1122

It appears that the raw JSON response from the Google API reports the target language as the source language when the source is set to auto, and only for Traditional Chinese. Maybe there is something I can do to work around this, but it seems to be on Google's end, not mine. Still weird that the web translator correctly detects the source language.

therealbush avatar Nov 12 '24 21:11 therealbush

Maybe, if it only happens for Traditional Chinese, you can use the fact that the source and target are the same to assume that the source must be Traditional Chinese? lol
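
A minimal Kotlin sketch of that idea. The `Lang` enum is a hypothetical stand-in for the library's language type, and `correctedSource` is an illustrative helper, not something the library provides:

```kotlin
// Hypothetical stand-in for the library's language enum; only the entries needed here.
enum class Lang { AUTO, ENGLISH, SPANISH, RUSSIAN, CHINESE_TRADITIONAL }

/**
 * If auto-detection was requested and the reported source equals the target,
 * assume the Traditional Chinese misdetection and override the reported value.
 * Otherwise the reported source language is returned unchanged.
 */
fun correctedSource(requested: Lang, reported: Lang, target: Lang): Lang =
    if (requested == Lang.AUTO && reported == target) Lang.CHINESE_TRADITIONAL else reported
```

The obvious caveat is that text which really is written in the target language would also be relabeled as Traditional Chinese, so this only makes sense if that case cannot occur or is handled separately.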

therealbush avatar Nov 12 '24 21:11 therealbush

The following request

translator.translateBlocking("我們一同追著心中的夢想", Language.SPANISH, Language.AUTO).rawData

produces this raw response:

[
   [
      [
         "Persigamos nuestros sueños juntos",
         "我們一同追著心中的夢想",
         null,
         null,
         11,
         null,
         null,
         [
            [
               
            ],
            [
               
            ]
         ],
         [
            [
               [
                  "af64405095a399ceb1e05c7abb7cda66",
                  "zh_en_2023q1.md"
               ]
            ],
            [
               [
                  "e050167b38f2a566522b4157651fc616",
                  "en_es_2023q1.md"
               ]
            ]
         ]
      ],
      [
         null,
         null,
         null,
         "Wǒmen yītóng zhuīzhe xīnzhōng de mèngxiǎng"
      ]
   ],
   null,
   "es",
   null,
   null,
   [
      [
         "我們一同追著心中的夢想",
         null,
         [
            [
               "Persigamos nuestros sueños juntos",
               null,
               true,
               false,
               [
                  11
               ]
            ],
            [
               "Persigamos juntos nuestros sueños",
               null,
               true,
               false,
               [
                  11
               ]
            ]
         ],
         [
            [
               0,
               11
            ]
         ],
         "我們一同追著心中的夢想",
         0,
         0
      ]
   ],
   1,
   [
      
   ],
   [
      [
         "es"
      ],
      null,
      [
         1
      ],
      [
         "es"
      ]
   ]
]
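
Reading the response above: the top-level element at index 2 ("es") appears to be where the source language is reported, and here it equals the target. The romanization of the source sentence sits at index 3 of the last entry in the first element. A rough Kotlin sketch of pulling both out, assuming the response has already been decoded into nested lists; the helper names are hypothetical and the index positions are inferred from this one response only:

```kotlin
// Hypothetical helpers; index positions are inferred from the single response above.

/** Top-level index 2 appears to hold the reported source language code. */
fun reportedSourceCode(response: List<Any?>): String? =
    response.getOrNull(2) as? String

/** The last entry of response[0] appears to carry the source romanization at index 3. */
fun sourceRomanization(response: List<Any?>): String? {
    val sentences = response.getOrNull(0) as? List<*> ?: return null
    val last = sentences.lastOrNull() as? List<*> ?: return null
    return last.getOrNull(3) as? String
}

// Trimmed copy of the response above, already decoded into nested lists.
val sampleResponse: List<Any?> = listOf(
    listOf(
        listOf("Persigamos nuestros sueños juntos", "我們一同追著心中的夢想", null, null, 11),
        listOf(null, null, null, "Wǒmen yītóng zhuīzhe xīnzhōng de mèngxiǎng")
    ),
    null,
    "es"
)
```

With `sampleResponse`, `reportedSourceCode` returns "es" and `sourceRomanization` returns the pinyin string, which is consistent with the romanization being correct even though the reported source language is not.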

therealbush avatar Nov 12 '24 21:11 therealbush

Created a new release

therealbush avatar Nov 12 '24 21:11 therealbush

> maybe if it only happens for traditional chinese, you can use the fact that the source and target are the same to assume that the source must be traditional chinese? lol

I'm already using if (sourceLanguage == targetLanguage) to skip the translation step. Otherwise I would have used that condition to force the sentence to be treated as Chinese Traditional 😅
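
One possible way to tell the two cases apart is to look at the script of the text itself before deciding what "source equals target" means. This is only a sketch under assumptions: the function names and the use of plain language-code strings are hypothetical, and the Han script check cannot distinguish Traditional from Simplified Chinese, or Chinese from Japanese text written in kanji.

```kotlin
/** True if the text contains at least one code point in the Han script. */
fun containsHan(text: String): Boolean {
    var i = 0
    while (i < text.length) {
        val cp = text.codePointAt(i)
        if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) return true
        i += Character.charCount(cp) // step over surrogate pairs correctly
    }
    return false
}

/**
 * Hypothetical disambiguation of "reported source == target":
 * Han text with a non-Chinese target is treated as the misdetection case;
 * anything else is treated as text that really is in the target language.
 */
fun isLikelyMisdetectedChinese(text: String, reportedCode: String, targetCode: String): Boolean =
    reportedCode == targetCode && !targetCode.startsWith("zh") && containsHan(text)
```

For the sentence in this issue with both codes reported as "es", the second function returns true, while a Spanish sentence with the same codes returns false, so the existing equality check could keep skipping translation only in the second case.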

> Created a new release

Thanks

aneesh1122 avatar Nov 12 '24 22:11 aneesh1122