
Adjust Persian locale numbers (fa_IR)

Open 5j9 opened this issue 9 years ago • 5 comments

Currently:

>>> from babel import numbers
>>> numbers.format_number(1234567890.123456790, 'fa_IR')
'1,234,567,890.123'

What I was expecting:

>>> from babel import numbers
>>> numbers.format_number(1234567890.123456790, 'fa_IR')
'۱٬۲۳۴٬۵۶۷٬۸۹۰٫۱۲۳'

Also the current currency symbol is "﷼" and the "%" sign is "٪".

I'm using Python 3.5.2 64 bit on a Windows 10 machine.

Thanks!

5j9 avatar Sep 20 '16 01:09 5j9

There are a bunch of problems in this ticket.

  1. Converting Arabic numerals to Persian numerals -> new Babel feature
  2. Incorrect currency symbol -> CLDR data issue
  3. Incorrect number symbols/separators -> Babel bug

Regarding 3. Incorrect number symbols/separators

Ahh, the code that parses number symbols doesn't account for different number systems.

def parse_number_symbols(data, tree):
    number_symbols = data.setdefault('number_symbols', {})
    for elem in tree.findall('.//numbers/symbols/*'):
        if _should_skip_elem(elem):
            continue
        number_symbols[elem.tag] = text_type(elem.text)

In the CLDR, fa_IR.xml doesn't define number symbols, so it inherits them from fa.xml, which lists the latn number system last. Those elements are therefore processed last and overwrite the earlier ones, so the latn symbols are the ones that 'stick'.

import_cldr.py should respect the defaultNumberingSystem tag.

This is a bug.
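To make the idea concrete, here is a minimal sketch of a symbol parser that only keeps the symbols belonging to the default numbering system. It is not Babel's actual fix: the function name `parse_default_number_symbols` is hypothetical, and the XML is a trimmed, hand-written sample in the shape of fa.xml rather than the real file.

```python
import xml.etree.ElementTree as ET

# Hand-written sample shaped like CLDR's fa.xml (an assumption, not the real file).
# \u066b = ARABIC DECIMAL SEPARATOR, \u066c = ARABIC THOUSANDS SEPARATOR,
# \u066a = ARABIC PERCENT SIGN.
SAMPLE_LDML = """
<ldml>
  <numbers>
    <defaultNumberingSystem>arabext</defaultNumberingSystem>
    <symbols numberSystem="arabext">
      <decimal>\u066b</decimal>
      <group>\u066c</group>
      <percentSign>\u066a</percentSign>
    </symbols>
    <symbols numberSystem="latn">
      <decimal>.</decimal>
      <group>,</group>
      <percentSign>%</percentSign>
    </symbols>
  </numbers>
</ldml>
"""


def parse_default_number_symbols(tree):
    """Collect symbols only from the block matching defaultNumberingSystem."""
    # Fall back to 'latn' when the locale does not declare a default.
    default_system = tree.findtext('.//numbers/defaultNumberingSystem', default='latn')
    number_symbols = {}
    for symbols in tree.findall('.//numbers/symbols'):
        # A block without a numberSystem attribute is treated as the default.
        if symbols.get('numberSystem', default_system) != default_system:
            continue
        for elem in symbols:
            number_symbols[elem.tag] = elem.text
    return number_symbols


tree = ET.fromstring(SAMPLE_LDML)
symbols = parse_default_number_symbols(tree)
print(symbols)
```

With this filtering, the order in which the `symbols` blocks appear in the file no longer decides which separators win.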

jtwang avatar Sep 20 '16 18:09 jtwang

Regarding 1. Converting Arabic numerals to Persian numerals

It looks like data for this conversion is supported in the CLDR, but it's not currently read into Babel.

IMO this is a feature request.
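Until something like that exists, the conversion can be approximated by post-processing the formatted string. This is only a rough sketch: `to_persian_digits` is a hypothetical helper, not part of Babel, and it assumes the input uses ASCII digits with ',' as the group separator and '.' as the decimal separator.

```python
# Translation table: ASCII digits 0-9 map to the Extended Arabic-Indic
# (Persian) digits U+06F0..U+06F9.
_TABLE = {ord(str(i)): chr(0x06F0 + i) for i in range(10)}
# Assumption: ',' and '.' are the Latin group and decimal separators.
_TABLE[ord(',')] = '\u066c'  # ARABIC THOUSANDS SEPARATOR
_TABLE[ord('.')] = '\u066b'  # ARABIC DECIMAL SEPARATOR


def to_persian_digits(formatted):
    """Rewrite an already formatted Latin-digit number with Persian digits."""
    return formatted.translate(_TABLE)


result = to_persian_digits('1,234,567,890.123')
print(result)
```

Because this is a blind character substitution, it should only be applied to strings known to contain nothing but a formatted number.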

jtwang avatar Sep 20 '16 18:09 jtwang

Regarding 2. Incorrect currency symbol

>>> from babel.numbers import format_currency
>>> print(format_currency(123.45, 'IRR', locale='fa_IR'))
‎ریال

Which is different from @5j9 's requested symbol '﷼'

CLDR 28 defines the currency symbol for the Iranian Rial as

<currency type="IRR">
    <displayName>ریال ایران</displayName>
    <displayName count="one">ریال ایران</displayName>
    <displayName count="other">ریال ایران</displayName>
    <symbol>ریال</symbol>
</currency>

It's the same in trunk as well. :(

@5j9 - unfortunately, we get this data from the Unicode Consortium. If you want to show a different symbol, you'll have to override it yourself.
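One way to do that override is to swap the symbol in the string after formatting. The sketch below is an assumption about how you might do it, not a Babel feature: `override_currency_symbol` is a hypothetical helper, and the sample input is a made-up string rather than real `format_currency` output.

```python
CLDR_RIAL = '\u0631\u06cc\u0627\u0644'  # the four-letter word used by CLDR
RIAL_SIGN = '\ufdfc'                    # the single-character RIAL SIGN


def override_currency_symbol(formatted, cldr_symbol=CLDR_RIAL, preferred=RIAL_SIGN):
    """Replace the CLDR-provided symbol with a preferred one."""
    return formatted.replace(cldr_symbol, preferred)


# Made-up sample in place of real format_currency() output.
sample = '123.45 ' + CLDR_RIAL
overridden = override_currency_symbol(sample)
print(overridden)
```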

jtwang avatar Sep 20 '16 18:09 jtwang

FYI: although "﷼" and "ریال" look the same here on GitHub (and possibly other browser/console windows), they are not the same.

The first is:

$ unic-inspector '﷼'
 ﷼ | U+FDFC | RIAL SIGN | Currency_Symbol

The second is:

$ unic-inspector 'ریال'
 ر | U+0631 | ARABIC LETTER REH       | Other_Letter
 ی | U+06CC | ARABIC LETTER FARSI YEH | Other_Letter
 ا | U+0627 | ARABIC LETTER ALEF      | Other_Letter
 ل | U+0644 | ARABIC LETTER LAM       | Other_Letter

CLDR has the second form in currency's displayName, which is correct: http://unicode.org/repos/cldr/trunk/common/main/fa.xml
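The same comparison can be reproduced with Python's standard `unicodedata` module, for anyone without `unic-inspector` installed. The helper name `describe` is just for illustration.

```python
import unicodedata

RIAL_SIGN = '\ufdfc'                    # single code point
RIAL_WORD = '\u0631\u06cc\u0627\u0644'  # four separate letters


def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        ('U+%04X' % ord(ch), unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


for row in describe(RIAL_SIGN) + describe(RIAL_WORD):
    print(row)

# The two strings differ as code point sequences, but the sign carries a
# compatibility decomposition, so NFKC normalization maps it onto the word.
same_after_nfkc = unicodedata.normalize('NFKC', RIAL_SIGN) == RIAL_WORD
print(RIAL_SIGN == RIAL_WORD, same_after_nfkc)
```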

behnam avatar Feb 21 '18 20:02 behnam

@akx This issue has been open for a long time, and I’m curious whether there are plans to implement this. The Persian and Arabic numbers are also relevant for dates, e.g.

>>> import datetime 
>>> 
>>> today = datetime.date.today()
>>> today
datetime.date(2024, 7, 13)
>>> 
>>> import babel.dates
>>> print(babel.dates.format_date(today, format='full', locale='ar'))
السبت، 13 يوليو 2024

This output should also use Arabic-Indic digits (١٣ and ٢٠٢٤) instead of ASCII digits.

Furthermore, regarding the comment above, the relevant CLDR transformer file has moved to GitHub here.

jenstroeger avatar Jul 13 '24 11:07 jenstroeger