Add 'Locale' faker

Open panilya opened this issue 3 years ago • 13 comments

panilya avatar Aug 31 '22 16:08 panilya

Codecov Report

Merging #319 (419d878) into main (401556f) will decrease coverage by 0.05%. The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main     #319      +/-   ##
============================================
- Coverage     94.87%   94.82%   -0.06%     
- Complexity     1937     1943       +6     
============================================
  Files           203      204       +1     
  Lines          3866     3883      +17     
  Branches        383      385       +2     
============================================
+ Hits           3668     3682      +14     
- Misses          101      102       +1     
- Partials         97       99       +2     
Impacted Files Coverage Δ
src/main/java/net/datafaker/Faker.java 98.39% <100.00%> (+<0.01%) :arrow_up:
src/main/java/net/datafaker/LocaleFaker.java 100.00% <100.00%> (ø)
src/main/java/net/datafaker/Barcode.java 90.47% <0.00%> (-6.96%) :arrow_down:
src/main/java/net/datafaker/CNPJ.java 100.00% <0.00%> (ø)
src/main/java/net/datafaker/Code.java 97.70% <0.00%> (ø)
src/main/java/net/datafaker/Unique.java 100.00% <0.00%> (ø)
src/main/java/net/datafaker/Fallout.java 100.00% <0.00%> (ø)
src/main/java/net/datafaker/Twitter.java 84.61% <0.00%> (ø)
src/main/java/net/datafaker/Computer.java 100.00% <0.00%> (ø)
src/main/java/net/datafaker/StarWars.java 91.66% <0.00%> (ø)
... and 6 more

codecov-commenter avatar Aug 31 '22 16:08 codecov-commenter

Why do we need a *.yml file instead of Java's built-in DateFormat#getAvailableLocales?

snuyanzin avatar Aug 31 '22 16:08 snuyanzin

I thought it would be better to use locale.yml for consistency, since all fakers have a corresponding .yml file.

panilya avatar Aug 31 '22 16:08 panilya

Not every faker uses yml; for example, the ID number generators and the date/time generators do not. Java currently provides more than 1000 locales, and I do not see a reason not to reuse them.

Moreover, the Locale class itself has a number of public methods for retrieving locale-specific data such as language, country, variant, and extension keys. None of this is covered by the yaml-based faker.

snuyanzin avatar Aug 31 '22 16:08 snuyanzin
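To illustrate the point above, here is a minimal sketch of picking a random locale from the JDK instead of from a yml file, and of reading a few of the fields that `java.util.Locale` exposes. The class name `JdkLocaleSketch` and its two methods are hypothetical and are not part of datafaker; only the `java.text.DateFormat` and `java.util.Locale` calls are real JDK APIs.

```java
import java.text.DateFormat;
import java.util.Locale;
import java.util.Random;

// Hypothetical illustration class, not part of datafaker.
public class JdkLocaleSketch {

    /** Picks one of the locales the running JDK reports as available. */
    public static Locale randomJdkLocale(Random random) {
        Locale[] available = DateFormat.getAvailableLocales();
        return available[random.nextInt(available.length)];
    }

    /** Collects a few locale-specific fields that a plain list of names would not carry. */
    public static String describe(Locale locale) {
        return String.format("tag=%s language=%s country=%s variant=%s display=%s",
                locale.toLanguageTag(),
                locale.getLanguage(),
                locale.getCountry(),
                locale.getVariant(),
                locale.getDisplayName(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(describe(randomJdkLocale(random)));
        }
    }
}
```

The set of available locales depends on the JDK version and its locale providers, so the output varies between environments.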

I can't understand why, in most repeated test runs with over 1,000 iterations, there is one Locale.baseLocale() failure.

panilya avatar Sep 14 '22 13:09 panilya

Could you give the name of the test that fails after 1000 iterations?

snuyanzin avatar Sep 14 '22 14:09 snuyanzin

LocaleFakerTest.baseLocale()

panilya avatar Sep 14 '22 14:09 panilya

The Javadoc for java.util.Locale#toString contains the answer:

If both the language and country fields are missing, this function will return the empty string, even if the variant, script, or extensions field is present (you can't have a locale with just a variant, the variant must accompany a well-formed language or country code).

snuyanzin avatar Sep 14 '22 22:09 snuyanzin
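The Javadoc behaviour quoted above can be reproduced with `Locale.ROOT`, which has no language, country, or variant. One plausible reading of the intermittent failure, which is an assumption because the thread does not show the test body, is that the list of available locales occasionally yields such a locale and a non-empty check on its string form then fails. The sketch below filters those locales out; the class name `EmptyLocaleStringSketch` and the method `withNonEmptyString` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical illustration class, not part of datafaker.
public class EmptyLocaleStringSketch {

    /** Keeps only locales whose toString() is non-empty, i.e. those with a language or a country. */
    public static List<Locale> withNonEmptyString(Locale[] candidates) {
        return Arrays.stream(candidates)
                .filter(locale -> !locale.toString().isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Locale.ROOT has an empty language, country, and variant, so toString() is "".
        System.out.println("ROOT is empty: " + Locale.ROOT.toString().isEmpty());

        // How many locales are dropped depends on the JDK in use.
        Locale[] all = Locale.getAvailableLocales();
        List<Locale> usable = withNonEmptyString(all);
        System.out.println((all.length - usable.length) + " locale(s) filtered out");
    }
}
```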

Croatian
Chechen (Cyrillic, Russia)
English (Malta)
Punjabi (Gurmukhi, India)
Albanian (Latin, Albania)
Northern Sami
Inari Sami
English (Macao SAR China)
Koyraboro Senni (Latin, Mali)
Cebuano (Philippines)
French (Belgium)
Sangu
Samburu (Latin, Kenya)
Bodo (India)
Serbian (Cyrillic, Bosnia & Herzegovina)
Arabic (Arabic, Egypt)

This is the result of Locale.displayName(). I don't know if this result is OK.

panilya avatar Sep 15 '22 12:09 panilya

Result of LocaleFaker.baseLocale():

sr_ME
ms_MY
doi_IN
en_VC
el_GR
mg_MG
rm_CH
seh_MZ
nl_BE
mzn_IR
ne_NP
ar_JO

Result of LocaleFaker.displayName():

Swedish
English (Antigua & Barbuda)
Chakma (Bangladesh)
Danish (Denmark)
Konkani (India)
English (Tanzania)
Vai
Lower Sorbian (Germany)
Spanish (Ceuta & Melilla)
French (France)
English (Sint Maarten)
Faroese (Faroe Islands)
Inari Sami (Finland)

panilya avatar Sep 15 '22 13:09 panilya

By accident I noticed there is another class dealing with locales: net.datafaker.service.LocalePicker. It's better to have one class instead of adding more.

snuyanzin avatar Sep 16 '22 06:09 snuyanzin

@snuyanzin If it's better to have one, then how should we implement the generation of random locales? As I see it, LocalePicker is a service and isn't accessible via a Faker instance. The solution I see is to add LocalePicker as a provider in Faker. What do you think?

panilya avatar Sep 16 '22 14:09 panilya

There is no real reason for it not to be a provider. We could convert it to a provider and give it methods that pick a locale based on what datafaker supports, as well as methods based on what Java supports.

snuyanzin avatar Sep 17 '22 08:09 snuyanzin
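The two kinds of methods proposed above could look roughly like the following. This is only a sketch under stated assumptions: `LocaleProviderSketch`, its constructor, and both method names are hypothetical, and the list of library-supported locales is passed in by the caller rather than read from datafaker, because the thread does not show how LocalePicker stores it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;

// Hypothetical sketch; this is not the datafaker LocalePicker API.
public class LocaleProviderSketch {
    private final Random random;
    private final List<Locale> librarySupported;

    /** The supported list is an assumption supplied by the caller. */
    public LocaleProviderSketch(Random random, List<Locale> librarySupported) {
        if (librarySupported.isEmpty()) {
            throw new IllegalArgumentException("librarySupported must not be empty");
        }
        this.random = random;
        this.librarySupported = new ArrayList<>(librarySupported);
    }

    /** Picks a locale from the list the library itself supports. */
    public Locale librarySupportedLocale() {
        return librarySupported.get(random.nextInt(librarySupported.size()));
    }

    /** Picks a locale from whatever the running JDK reports as available. */
    public Locale jdkSupportedLocale() {
        Locale[] available = Locale.getAvailableLocales();
        return available[random.nextInt(available.length)];
    }
}
```

Keeping both methods on one class follows the suggestion to have a single place that deals with locales, while still letting callers choose which source the random locale comes from.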

Given such significant changes in the project, I think it is better to close this pull request and add this provider in a new one.

panilya avatar Sep 25 '22 13:09 panilya