Re: missing 'en-US' locale in cldr-data

Open casey-speer opened this issue 8 years ago • 19 comments

Regarding this issue: https://github.com/rxaviers/cldr-data-npm/issues/27

I would expect to be able to do Globalize.locale('en-Latn-US') or Globalize.locale('en-US'), which refer to perfectly valid locales, and have globalize understand internally to use en. As it stands, if I do Globalize.locale('en-US') having loaded 'en' data, I get an error (this.xxxFormatter is not a function) when trying to use a formatter e.g. Globalize.formatNumber(5.55). Similarly, I'd like to pass those locales to globalize-compiler without error.

Right now, in order to load a locale like en-US in globalize, I need to know ahead of time that en maps to en-Latn-US and that en-US maps to en-Latn-US.

Note that I'm compiling my formatters ahead of time using globalize-compiler so maybe this isn't an issue when loading CLDR and creating formatters at runtime. Either way, it would be handy if globalize added locale normalization at compile time so users of precompiled formatters who don't know about default content could still use them when specifying en-US as the locale, for instance.

casey-speer avatar Jul 21 '17 17:07 casey-speer

That should work. Could you please share your compiler code too?

rxaviers avatar Jul 21 '17 17:07 rxaviers

Looks like this:

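import globalizeCompiler from 'globalize-compiler';
import formattersModuleTemplate from './formattersModuleTemplate';
import codeStringWithFormatters from './myCode';

// Build the precompiled formatters module for a single locale.
export default locale => {
  // Collect the formatter calls found in the application code.
  const globalizeExtracts = globalizeCompiler.extract(codeStringWithFormatters);
  // Compile them for the given locale using the custom module template.
  return globalizeCompiler.compileExtracts({
    extracts: globalizeExtracts,
    defaultLocale: locale,
    template: formattersModuleTemplate
  });
};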

and the compiled output looks like this:

import Globalize from 'globalize/dist/globalize-runtime';
import 'globalize/dist/globalize-runtime/number';
import 'globalize/dist/globalize-runtime/plural';
import 'globalize/dist/globalize-runtime/currency';
var validateParameterTypeNumber = Globalize._validateParameterTypeNumber;
var validateParameterPresence = Globalize._validateParameterPresence;
var numberRound = Globalize._numberRound;
var numberFormat = Globalize._numberFormat;
var numberFormatterFn = Globalize._numberFormatterFn;
var pluralGeneratorFn = Globalize._pluralGeneratorFn;
var currencyNameFormat = Globalize._currencyNameFormat;
var currencyFormatterFn = Globalize._currencyFormatterFn;

Globalize.b468386326 = numberFormatterFn(["",,1,0,3,,,,3,,"","#,##0.###","-#,##0.###","-","",numberRound(),"∞","NaN",{".":".",",":",","%":"%","+":"+","-":"-","E":"E","‰":"‰"},]);
Globalize.a340063086 = numberFormatterFn(["",,1,,,,,,3,,"%","#,##0%","-#,##0%%","-","%",numberRound(),"∞","NaN",{".":".",",":",","%":"%","+":"+","-":"-","E":"E","‰":"‰"},]);
Globalize.a860983140 = numberFormatterFn(["",,1,2,2,,,,3,,"","#,##0.###","-#,##0.###","-","",numberRound(),"∞","NaN",{".":".",",":",","%":"%","+":"+","-":"-","E":"E","‰":"‰"},]);
Globalize.b957349717 = numberFormatterFn(["'$'",,1,2,2,,,0,3,,"","'$'#,##0.00","-'$''$'#,##0.00","-'$'","",numberRound(),"∞","NaN",{".":".",",":",","%":"%","+":"+","-":"-","E":"E","‰":"‰"},]);
Globalize.b581200589 = numberFormatterFn(["'€'",,1,2,2,,,0,3,,"","'€'#,##0.00","-'€''€'#,##0.00","-'€'","",numberRound(),"∞","NaN",{".":".",",":",","%":"%","+":"+","-":"-","E":"E","‰":"‰"},]);
Globalize.a1662346136 = pluralGeneratorFn(function(n) {
  var s = String(n).split('.'), v0 = !s[1];
  return (n == 1 && v0) ? 'one' : 'other';
});
Globalize.b1223214380 = currencyFormatterFn(Globalize("en").numberFormatter({"raw":"'$'#,##0.00"}));
Globalize.b1162560488 = currencyFormatterFn(Globalize("en").numberFormatter({"raw":"'€'#,##0.00"}));
export default Globalize

If I import this code and then do Globalize.locale('en'); Globalize.formatNumber(123.5) it works fine. If I do Globalize.locale('en-US'); Globalize.formatNumber(123.5), that's an error. Also, if I pass in en-US as defaultLocale to the compiler, it errors out. Thanks for your help!

casey-speer avatar Jul 21 '17 18:07 casey-speer

I also recently had troubles loading the en_US locale, not sure if it's related:

Problem was, the directory in the cldr-data package is named en-US-POSIX. After creating a symbolic link to en-US, loading the locale now works fine...

ray007 avatar Jul 24 '17 13:07 ray007

@ray007 en is the same as en-US, so you could really link en-US -> en as well, but then en-Latn-US should also link to en, fr-Latn-CA to fr-CA, and so on. This approach doesn't scale very well as locale support is added and the possibilities multiply. Perhaps some additional CLDR data must be leveraged to determine the correct default?

casey-speer avatar Aug 02 '17 22:08 casey-speer

I have the problem only with en-US. All other locales seem to have a directory name without the script/variant in it unless necessary, which then works fine for loading.

ray007 avatar Aug 03 '17 07:08 ray007

> If I import this code and then do Globalize.locale('en'); Globalize.formatNumber(123.5) it works fine. If I do Globalize.locale('en-US'); Globalize.formatNumber(123.5), that's an error. Also, if I pass in en-US as defaultLocale to the compiler, it errors out. Thanks for your help!

@casey-speer, when you're using regular globalize, any given locale is processed and normalized (via cldrjs): deducing the bundle, minLanguageId, etc. In the compiled version, that result is computed during compilation and stored in the compiled data. Therefore, if you import the compiled code, it will only work for the locale string you used during compilation. It isn't able to normalize different locale strings.

rxaviers avatar Aug 03 '17 11:08 rxaviers

> After creating a symbolic link to en-US, loading the locale now works fine...

@ray007, you shouldn't need to create any symlinks. Which globalize and cldrjs versions are you using?

rxaviers avatar Aug 03 '17 11:08 rxaviers

As I read this issue, I don't see anything that needs to be fixed or improved. If you believe otherwise, please propose a solution :). Thanks

rxaviers avatar Aug 03 '17 11:08 rxaviers

EDIT: please explain the problem briefly and optionally/ideally propose a solution.

rxaviers avatar Aug 03 '17 11:08 rxaviers

@rxaviers I'm using the cldr-data npm package version 31.0.2. So at least my problem is no fault of your package here.

ray007 avatar Aug 03 '17 13:08 ray007

@ray007 what about globalize and cldrjs versions? :)

rxaviers avatar Aug 03 '17 13:08 rxaviers

@ray007 I could not reproduce your issue...

npm install [email protected] [email protected] # Note you don't need the extra packages listed above
var Globalize = require( "globalize" );
Globalize.load( require( "cldr-data" ).entireSupplemental() );
Globalize.load( require( "cldr-data" ).entireMainFor( "en" ) );

Globalize('en-US').formatNumber(Math.PI);
// > '3.142'

Globalize('en').formatNumber(Math.PI);
// > '3.142'

PS: Note I loaded the en locale, which is the default for en-US. Is figuring this out the issue for you?

rxaviers avatar Aug 03 '17 21:08 rxaviers

@rxaviers I see what you're saying. But this is what happens if I pass 'en-US' as the locale to the compiler:

Error: ENOENT: no such file or directory, scandir '/Users/cspeer/repos/i18n/node_modules/cldr-data/main/en-US'

Would it be possible to add a compiler step to map the given locale, e.g. en-US, to the normalized locale? I want clients of my compiled formatters to be able to use a consistent format for the locale string across locales.

casey-speer avatar Aug 03 '17 22:08 casey-speer

It may have to do with the fact that I don't use .entireSupplemental() and .entireMainFor(), but cldrdata() calls for data subsets, where the localeId is used to construct the file path. Since I'm still in the experimentation phase, I didn't invest too much energy beyond making it work for now...

ray007 avatar Aug 04 '17 07:08 ray007

@casey-speer Ok, I see the issue you're describing too, which is related to https://github.com/globalizejs/globalize-compiler/issues/30. This needs a fix.

rxaviers avatar Aug 04 '17 11:08 rxaviers

> Where the localeId is used to construct the file path.

@ray007 Ok, got it. Basically, it's not localeId you should use, but a bundle id.

@ray007 and @casey-speer, please read https://github.com/rxaviers/cldrjs/blob/master/doc/bundle_lookup_matcher.md

We need to come up with a way to make this process easier, probably by improving the documentation and API. This is somewhat related to https://github.com/rxaviers/cldrjs/issues/30.

Any thoughts?

rxaviers avatar Aug 04 '17 11:08 rxaviers
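
For readers following along, here is a minimal sketch of the idea behind resolving a locale such as en-US to a bundle id such as en. It is not cldrjs's actual implementation: the tiny LIKELY_SUBTAGS table and the helper names (parse, maximize, minimize, lookupBundle) are hypothetical, and real CLDR likely-subtags data is far larger. The approach assumed here is to reduce both the requested locale and each available bundle to a minimal language id, then match on that.

```javascript
// Hand-picked, illustrative subset of CLDR likely-subtags data (an assumption,
// not the real data file).
const LIKELY_SUBTAGS = {
  en: "en-Latn-US",
  fr: "fr-Latn-FR",
  pt: "pt-Latn-BR",
  sr: "sr-Cyrl-RS"
};

// Join the non-empty subtags with "-".
function join(...parts) {
  return parts.filter(Boolean).join("-");
}

// Split a tag into language, script and territory. Variants such as POSIX
// are ignored in this simplified sketch.
function parse(tag) {
  const parts = tag.replace(/_/g, "-").split("-");
  const out = { language: parts[0].toLowerCase(), script: "", territory: "" };
  for (const p of parts.slice(1)) {
    if (/^[A-Za-z]{4}$/.test(p)) {
      out.script = p[0].toUpperCase() + p.slice(1).toLowerCase();
    } else if (/^([A-Za-z]{2}|\d{3})$/.test(p)) {
      out.territory = p.toUpperCase();
    }
  }
  return out;
}

// Fill in missing script/territory from the likely-subtags table.
function maximize(tag) {
  const { language, script, territory } = parse(tag);
  const candidates = [
    join(language, script, territory),
    join(language, territory),
    join(language, script),
    language
  ];
  for (const candidate of candidates) {
    const hit = LIKELY_SUBTAGS[candidate];
    if (hit) {
      const m = parse(hit);
      return join(language, script || m.script, territory || m.territory);
    }
  }
  return join(language, script, territory);
}

// Drop every subtag that maximize() would add back anyway.
function minimize(tag) {
  const max = maximize(tag);
  const { language, script, territory } = parse(max);
  const trials = [language, join(language, territory), join(language, script)];
  for (const trial of trials) {
    if (maximize(trial) === max) return trial;
  }
  return max;
}

// Match the requested locale to an available bundle via minimal language ids.
function lookupBundle(locale, availableBundles) {
  const byMinId = new Map();
  for (const bundle of availableBundles) {
    const key = minimize(bundle);
    if (!byMinId.has(key)) byMinId.set(key, bundle);
  }
  return byMinId.get(minimize(locale)) || null;
}

const bundles = ["en", "en-GB", "fr", "fr-CA", "pt", "pt-PT", "sr", "sr-Latn"];
console.log(lookupBundle("en-US", bundles));      // en
console.log(lookupBundle("fr-Latn-CA", bundles)); // fr-CA
console.log(lookupBundle("en-AU", bundles));      // null (no fallback in this sketch)
```

A build script could presumably run a step like this before handing the locale to the compiler, so that callers can keep passing en-US while the compiler receives en. Whether that belongs in globalize-compiler itself is the open question in this thread.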

I'm dealing with this exact issue myself and have resorted to using a lookup table. I was frustrated that I could not understand why there was no pt-BR folder under /cldr-data/main/ until I read this:

American English [en_US] is the default content locale for English [en].
German (Germany) [de_DE] is the default content locale for German [de].
Portuguese (Brazil) [pt_BR] is the default content locale for Portuguese [pt].
Serbian (Cyrillic) [sr_Cyrl] is the default content locale for Serbian [sr].
Serbian (Cyrillic, Serbia) [sr_Cyrl_RS] is the default content locale for Serbian (Cyrillic) [sr_Cyrl].

Makes logical sense but is not machine friendly. +1 for a more flexible interface.

hhrogersii avatar Sep 14 '17 23:09 hhrogersii
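
The default-content statements quoted above can be made machine friendly with a small lookup. The sketch below is only an illustration: DEFAULT_CONTENT is hand-copied from those five statements (a real list would be much longer), and dataFolderFor is a hypothetical helper name. It assumes that the data for a default content locale lives under its parent, which is found by dropping the last subtag.

```javascript
// Hand-copied from the five default-content statements quoted above.
const DEFAULT_CONTENT = new Set([
  "en-US",
  "de-DE",
  "pt-BR",
  "sr-Cyrl",
  "sr-Cyrl-RS"
]);

// Hypothetical helper: walk up from a default content locale to the locale
// whose folder actually holds the data.
function dataFolderFor(locale) {
  let tag = locale.replace(/_/g, "-");
  // Every entry in DEFAULT_CONTENT contains a "-", so the slice is safe.
  while (DEFAULT_CONTENT.has(tag)) {
    tag = tag.slice(0, tag.lastIndexOf("-"));
  }
  return tag;
}

console.log(dataFolderFor("pt-BR"));      // pt
console.log(dataFolderFor("sr_Cyrl_RS")); // sr
console.log(dataFolderFor("pt-PT"));      // pt-PT
```

This only covers locales that appear in the list verbatim. A tag such as en-Latn-US would still need likely-subtags data to resolve.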

Now that I think about it, I find setting the default locale for a language based on the number of speakers especially troubling. If locale YY is assigned as the default for language xx and then, due to conflict, environmental or economic collapse, or choice, there is a mass migration of xx speakers to locale ZZ, would the default locale assignment change? I would suggest always defining CLDR data under locale-specific folders (xx_YY) and then aliasing them internally. Locale aliasing could be extended to allow aliasing by number of speakers, locale of origin, or utilisation context such as colloquial or corporate.

hhrogersii avatar Sep 21 '17 15:09 hhrogersii