add more test data (company names)
@psolin, would you have any lists of company names that you want to see tested?
Hi, I have some company names such as:
- xxx co ltd
- xxx private limited
- xxx pte limited
- xxx co limited
Do you think it's a good idea to add these additional terms to termdata.py?
Could https://opencorporates.com be used for testing?
@davidheryanto it depends. What countries are those for?
I have added a companies.csv file to the tests directory, but unfortunately it seems we cannot really use bulk ASCII company names for testing: many international companies use common Anglo-American suffixes such as ltd. or inc. in their corporate names, which results in a lot of failures.
If we could get the Unicode versions of the national suffixes (i.e. in native Chinese or Russian characters), that would be useful. But I am not sure whether cleanco even supports that.
Yes, agree with the Unicode approach. It will be applicable to company names in different countries.
The company names I gave are examples of companies in Singapore.
We now have improved Unicode and non-Latin script support, so better test coverage would make sense too.
One option would be to use https://faker.readthedocs.io/en/master/ to generate fake test company names. Manual labour would still be needed to provide the expected base names that cleanco should be able to produce.
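One way to reduce that manual work, as a rough sketch only: build each test name from a known base name plus a known suffix, so the expected base name comes for free. The snippet below uses the standard library's `random` rather than Faker to stay dependency-free; `BASE_NAMES` and `make_cases` are hypothetical names made up for illustration, and the suffixes are the Singapore examples from earlier in this thread. It does not call cleanco itself.

```python
import random

# Suffixes taken from the Singapore examples earlier in this thread.
SUFFIXES = ["co ltd", "private limited", "pte limited", "co limited"]

# Hypothetical base names; any list of plain names without legal suffixes would do.
BASE_NAMES = ["Acme Trading", "Blue Harbour", "Lion City Foods"]


def make_cases(count, seed=0):
    """Return (full_name, expected_base_name) pairs built from known parts."""
    rng = random.Random(seed)  # seeded so the generated test data is reproducible
    cases = []
    for _ in range(count):
        base = rng.choice(BASE_NAMES)
        suffix = rng.choice(SUFFIXES)
        cases.append((f"{base} {suffix}", base))
    return cases


cases = make_cases(5)
for full_name, expected in cases:
    print(f"{full_name!r} -> {expected!r}")
```

Each pair could then be fed to a parametrized test that checks cleanco's output against the second element. The same idea should carry over to Faker if it supplies the base names, provided the generated names do not already contain a legal suffix.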