to_kana() doesn't consistently return Hepburn or Kunrei

Open blagarde opened this issue 12 years ago • 1 comments

Hello,

I have already reported a couple of other issues and a PR, but I haven't yet even taken the time to thank you for this neat package... Thank you!!

I am opening this issue because I am a bit confused about which inverse romanization I should expect to_kana(str) to return.

These lines suggest that your intent was for it to return the Hepburn version if possible, otherwise the Kunrei version: https://github.com/soimort/python-romkan/blob/master/src/romkan/common.py#L373-376

Later, however, ROMKAN.update({"ti": "チ"}) explicitly prescribes Kunrei over Hepburn: https://github.com/soimort/python-romkan/blob/master/src/romkan/common.py#L382-383 (for "ti", "チ" is the Kunrei reading and "ティ" is the Hepburn reading).
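To illustrate why the later update wins, here is a minimal sketch. The table below is a made-up two-entry stand-in, not the actual contents of common.py, and it assumes the earlier entry for ti pointed at the Hepburn reading:

```python
# Hypothetical two-entry table for illustration only;
# the real ROMKAN table in romkan/common.py is much larger.
ROMKAN = {"chi": "チ", "ti": "ティ"}  # assumed Hepburn-style starting point

# dict.update replaces the value of an existing key, so this
# later call silently overrides the earlier mapping for "ti".
ROMKAN.update({"ti": "チ"})

print(ROMKAN["ti"])  # prints チ
```

Whatever the earlier lines intended, the last write to a given key decides what to_kana sees for that key.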

What is the rationale behind this? Is the intent to emulate the inverse romanization used by keyboard input methods ("wapuro" style)?

Thanks! Baptiste

Jan 23 '14 12:01 blagarde

Good question. The code is ported from the original Romkan, and I have not changed it since. When converting romaji to kana, ti is ambiguous: depending on the romanization system, it could map to either of these results.

>>> romkan.to_kana('ti')  # if read as Kunrei
'チ'
>>> romkan.to_kana('ti')  # if read as Hepburn
'ティ'

(The same problem applies to di, du, and so on.)

To avoid any confusion, we should have two separate methods for this, to_kana_from_kunrei and to_kana_from_hepburn. Furthermore, the documentation should state that to_kana currently prefers Kunrei over Hepburn.
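As a rough sketch of what the two methods could look like (this is not the package's implementation): the function names come from the proposal above, while the KUNREI and HEPBURN tables and the _make_converter helper are hypothetical and cover only a handful of syllables.

```python
import re

# Tiny illustrative tables (hypothetical); a real implementation
# would need the full syllabary for each romanization system.
KUNREI = {"ti": "チ", "tu": "ツ", "si": "シ", "ka": "カ"}
HEPBURN = {"chi": "チ", "ti": "ティ", "tsu": "ツ", "shi": "シ", "ka": "カ"}


def _make_converter(table):
    """Build a romaji-to-kana function bound to a single table."""
    # Try longer romaji keys first so that "tsu" is matched as a whole.
    keys = sorted(map(re.escape, table), key=len, reverse=True)
    pattern = re.compile("|".join(keys))

    def convert(text):
        # Text that matches no key is passed through unchanged.
        return pattern.sub(lambda m: table[m.group(0)], text.lower())

    return convert


to_kana_from_kunrei = _make_converter(KUNREI)
to_kana_from_hepburn = _make_converter(HEPBURN)

print(to_kana_from_kunrei("ti"))   # prints チ
print(to_kana_from_hepburn("ti"))  # prints ティ
```

Keeping one table per system means the ambiguous keys (ti, and likewise di, du) never have to share a dictionary, so neither reading can overwrite the other.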

I'll fix this issue later, when I have more free time to dive into the code (hopefully in one or two weeks). I will also take a look at the other issues you reported then.

Thank you for your interest! This really helped a lot.

Jan 23 '14 13:01 soimort