VAT Id Number validators for all EU countries

Open homebeaver opened this issue 1 year ago • 5 comments

Hi @garydgregory ,

this is EUGen from Europe. With this PR I provide validation and check digit calculation for VAT identification numbers (VATINs) for all countries in the EU. This number is necessary in intra-community trade to apply a zero VAT rate. You can find documentation of the different VATIN structures in my VATIN wiki (in German).

The validator covering all countries is VATINValidator; it implements Jira issue 494.

Please review and merge. Regards, EUGen

homebeaver avatar Oct 08 '24 17:10 homebeaver

Is it really necessary to have a separate class for each country? Also DE has a test class but no main class

sebbASF avatar Oct 09 '24 15:10 sebbASF

Is it really necessary to have a separate class for each country? Also DE has a test class but no main class

@sebbASF

Yes. Each of the 28 countries uses a different algorithm to calculate/validate the check digit. Germany and Croatia use a standard one, ISO/IEC 7064 MOD 11,10, so their tests run against Modulus11TenCheckDigit. Italy uses Luhn.
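For readers unfamiliar with ISO/IEC 7064 MOD 11,10, here is a minimal Java sketch of the hybrid check as it is commonly described. It is not the Modulus11TenCheckDigit class from this PR; the class name `Mod1110Sketch` and its methods are hypothetical, and the digit string in `main` is only an illustrative input that happens to satisfy the checksum.

```java
/** Illustrative sketch of ISO/IEC 7064 MOD 11,10; not the PR's Modulus11TenCheckDigit. */
final class Mod1110Sketch {

    /** Computes the check digit for a payload of decimal digits (check digit not included). */
    static int checkDigit(String payload) {
        int product = 10;
        for (int i = 0; i < payload.length(); i++) {
            char c = payload.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("not a decimal digit: " + c);
            }
            int sum = ((c - '0') + product) % 10;
            if (sum == 0) {
                sum = 10; // a zero sum is replaced by 10 in this scheme
            }
            product = (2 * sum) % 11;
        }
        // product is always in 1..10, so the result is in 0..9
        return (11 - product) % 10;
    }

    /** Returns true if the last digit equals the check digit computed from the preceding digits. */
    static boolean isValid(String digits) {
        if (digits == null || !digits.matches("[0-9]{2,}")) {
            return false;
        }
        int last = digits.length() - 1;
        return checkDigit(digits.substring(0, last)) == digits.charAt(last) - '0';
    }

    public static void main(String[] args) {
        System.out.println(checkDigit("13669597")); // prints 6
        System.out.println(isValid("136695976"));   // prints true
    }
}
```

A country prefix such as DE or HR would have to be stripped before calling `isValid`; that step is left out of the sketch.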

homebeaver avatar Oct 09 '24 18:10 homebeaver

Hi @homebeaver

A nice to have item would be an update to the package-info.java file for the package in play.

garydgregory avatar Oct 10 '24 20:10 garydgregory

Hi @garydgregory where does the test data come from?

From different sources. I documented them in wiki/VATIN (in German). You'll find examples behind the Details link:

[image]

homebeaver avatar Oct 10 '24 21:10 homebeaver

A nice to have item would be an update to the package-info.java file for the package

Hi @garydgregory

I updated package-info.java in org.apache.commons.validator.routines. Regards, EUGen (next week I'm on holiday :-)

homebeaver avatar Oct 11 '24 22:10 homebeaver

I'm afraid the NL validation is incomplete: my VAT ID, NL004350351B91, is not accepted although it is in fact valid (confirmed with the online VIES check). I suspect this has to do with a change in 2020 or so: before that time, the btw-id for self-employed people consisted of 'NL' + their BSN (the Dutch SSN) + "B" + two digits. While 'leaking' an SSN isn't as problematic in NL as it is in some other countries, it's not great, so they stopped doing that. Apparently the BSN adheres to the MOD-11#i pattern, while the new number does not. On request, I can privately share a second number that fails the validator but is correct according to VIES.

raboof avatar Jan 29 '25 21:01 raboof

Is the new NL VAT validation scheme officially documented anywhere? It seems to me that even with access to official documentation, it is going to be a mammoth job keeping track of changes.

Tracking updates to IBAN and Domain entries is time-consuming enough, and they each have a single authoritative source.

Sorry, but I think the VAT validation proposal is not a suitable candidate for Commons Validator.

However, it might be worth adding some of the generic CheckDigit validators such as Modulus11TenCheckDigit and Modulus11XCheckDigit.

sebbASF avatar Jan 29 '25 23:01 sebbASF

Is the new NL VAT validation scheme officially documented anywhere?

I looked around a bit but didn't find anything - the sources I found just say 'NL, 9 digits, B, 2 digits', no further structure described.
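The purely structural rule quoted here ('NL, 9 digits, B, 2 digits') can be written as a regular expression. The following Java sketch covers only that format and says nothing about check digits; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

/** Illustrative format-only check for the 'NL, 9 digits, B, 2 digits' structure. */
final class NlVatinFormatSketch {

    // NL + nine ASCII digits + B + two ASCII digits
    private static final Pattern NL_FORMAT = Pattern.compile("NL[0-9]{9}B[0-9]{2}");

    /** Returns true if the whole string matches the structural pattern. */
    static boolean hasNlFormat(String vatin) {
        return vatin != null && NL_FORMAT.matcher(vatin).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasNlFormat("NL004350351B91")); // prints true
        System.out.println(hasNlFormat("NL00435035B91"));  // prints false (only eight digits)
    }
}
```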

It seems to me that even with access to official documentation, it is going to be a mammoth job keeping track of changes.

Tracking updates to IBAN and Domain entries is time-consuming enough, and they each have a single authoritative source.

Sorry, but I think the VAT validation proposal is not a suitable candidate for Commons Validator.

OTOH, you could also say that this is a good motivation to share this effort. Perhaps we should make sure that there are at least 3 or so parties that are planning to use this in production (and share notes when rules change) before including it?

raboof avatar Jan 29 '25 23:01 raboof

I support @raboof's idea 👍

garydgregory avatar Jan 29 '25 23:01 garydgregory

It's still going to be a lot of work for the limited pool of Commons developers.

Also, although the code can do basic syntax and structure validation, it does not check that the ID is actually in use. So I wonder what the use case is?

Also, how many parties would be needed to get decent coverage of all the countries?

sebbASF avatar Jan 30 '25 00:01 sebbASF

... VAT ID, NL004350351B91, is not accepted while it is in fact valid (confirmed with the online VIES check). I suspect this may have to do with a change in 2020 or so: before that time, the btw-id for self-employed people consisted of 'NL'+their BSN (the Dutch SSN) + "B" + two digits. ...

@raboof thank you for testing. Indeed, my algorithm uses MOD 11 with weights $G_i = i$. This was changed in 2020; there are now two groups of VAT numbers:

- Modulo 11 VAT-ID numbers;
- Modulo 97 VAT-ID numbers.
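As an illustration of these two groups, here is a minimal Java sketch. It is not the code from this PR. It assumes that the MOD 11 variant weights the first eight digits 9 down to 2 and compares the sum modulo 11 with the ninth digit, and that the MOD 97 variant maps letters to numbers (A=10 through Z=35) and requires the resulting number modulo 97 to equal 1. Both assumptions should be checked against the official construction rules. The class and method names are hypothetical.

```java
import java.math.BigInteger;

/** Illustrative sketch of the two NL check variants, under the assumptions stated above. */
final class NlVatinChecksSketch {

    /** Assumed MOD 11 rule: weighted sum of digits 1..8 (weights 9..2) mod 11 equals digit 9. */
    static boolean passesMod11(String nineDigits) {
        if (nineDigits == null || !nineDigits.matches("[0-9]{9}")) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 8; i++) {
            sum += (nineDigits.charAt(i) - '0') * (9 - i);
        }
        // A remainder of 10 can never equal a single digit, so such numbers fail.
        return sum % 11 == nineDigits.charAt(8) - '0';
    }

    /** Assumed MOD 97 rule: letters become A=10..Z=35 and the whole number mod 97 must be 1. */
    static boolean passesMod97(String vatin) {
        if (vatin == null || !vatin.matches("NL[0-9]{9}B[0-9]{2}")) {
            return false;
        }
        StringBuilder numeric = new StringBuilder();
        for (char c : vatin.toCharArray()) {
            if (c >= '0' && c <= '9') {
                numeric.append(c);
            } else {
                numeric.append(c - 'A' + 10); // appends the decimal value, e.g. N -> 23
            }
        }
        return new BigInteger(numeric.toString()).mod(BigInteger.valueOf(97)).intValue() == 1;
    }

    public static void main(String[] args) {
        System.out.println(passesMod11("004350351"));      // prints false
        System.out.println(passesMod97("NL004350351B91")); // prints true
    }
}
```

Under these assumptions, NL004350351B91 from the report above fails the MOD 11 check and passes the MOD 97 check, which matches what was observed.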

I'll correct this.

regards EUGen H.

homebeaver avatar Jan 30 '25 11:01 homebeaver

I looked around a bit but didn't find anything ...

@raboof - I found this BMF_UID_Konstruktionsregeln-Nov-2020.pdf bmf.gv.at

homebeaver avatar Jan 30 '25 16:01 homebeaver

... it does not check that the ID is actually in use. So I wonder what the use case is?

@sebbASF

Compare VATINs to IBANs: validating an IBAN does not check whether the IBAN is actually in use either. Use VIES to check whether a VATIN is actually in use. It delegates the check to the national systems, but it is not available 24/7. We use it only for valid VATINs, when creating an invoice.

The use cases: the European standard EN16931-1:2017 for electronic invoices requires a valid VATIN of the customer. This number is necessary in intra-community trade to apply a zero VAT rate.

In my case I am migrating ERP data from system A to system B. There are a lot of unchecked VATINs, and VIES is not the tool of choice for bulk checking. When a shop or company ceases trading, VIES returns "invalid" (not "invalid since ..."), but invoices are kept for a while for auditing.
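A rough Java sketch of such a bulk pre-check during a migration: split the list into entries that pass an offline check and entries that do not, so that only the first group needs any online lookup. The class name, the method name, and the format-only predicate used in `main` are hypothetical stand-ins, not part of this PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Illustrative bulk pre-filter: partitions VATINs by an offline structural check. */
final class VatinPrefilterSketch {

    /** Key true: passed the offline check; key false: rejected without any online lookup. */
    static Map<Boolean, List<String>> partition(List<String> vatins, Predicate<String> offlineCheck) {
        return vatins.stream().collect(Collectors.partitioningBy(offlineCheck));
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real validator: format check only.
        Predicate<String> nlFormat = s -> s.matches("NL[0-9]{9}B[0-9]{2}");
        List<String> migrated = Arrays.asList("NL004350351B91", "NL12B", "NL004350351X91");
        Map<Boolean, List<String>> result = partition(migrated, nlFormat);
        System.out.println("candidates: " + result.get(true));
        System.out.println("rejected:   " + result.get(false));
    }
}
```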

homebeaver avatar Jan 30 '25 17:01 homebeaver

I agree that testing the checksums and structure of VAT numbers will allow the removal of incorrect numbers, however it won't prove that they were ever valid, only that they could have been.

sebbASF avatar Jan 30 '25 22:01 sebbASF

I looked around a bit but didn't find anything ...

@raboof - I found this BMF_UID_Konstruktionsregeln-Nov-2020.pdf bmf.gv.at

Nice find! I checked and can confirm that both my own btw-id and the one I won't disclose are accepted by the updated implementation.

To justify the additional maintenance burden it'd still be good to find one or two other parties that can confirm they'd use this component in production and help keep it up-to-date before merging it into the main library.

raboof avatar Jan 30 '25 23:01 raboof

I think it is vital to include links to the official documentation of the validation requirements in each national class. The Wikipedia pages are useful, but they are not normative, and as we have seen with NL, the page does not include the full story.

Furthermore, most of the Wikipedia entries give no validation details.

sebbASF avatar Jan 31 '25 00:01 sebbASF

These look like bugs

No. Imprecise comments + inconsistent code. Both resolved. Thanks.

homebeaver avatar Jan 31 '25 10:01 homebeaver

There are some classes with logging, but not all. What determines whether logging is used? Is it really needed?

sebbASF avatar Feb 01 '25 21:02 sebbASF

I wouldn't expect any logging from this component.

garydgregory avatar Feb 01 '25 21:02 garydgregory

I've removed logging.

There is one class with logging left: TidDECheckDigit. The reason is the specification; the link to it is in the class header. The spec is in German. In some cases valid TIDs should not be used, and I warn with the message recommended in the spec. See pages 6 and 7 of the spec or the screenshot: [image]

I wouldn't expect any logging from this component.

@garydgregory To satisfy you I can remove the class from this PR

homebeaver avatar Feb 02 '25 19:02 homebeaver

Logging can be completely disabled, so should not be essential to the functioning of a class.

In this case, the log messages are directly associated with an exception (apart from an unnecessary debug log), so I don't see the point of them; they don't provide any extra information.

I think they should be removed as well please.

sebbASF avatar Feb 02 '25 22:02 sebbASF

Hi @sebbASF , @sebbASF

When merging apache/master into my repo branch I got an error in pom.xml:

- cvc-elt.1.a: Cannot find the declaration of element 'project'.
- Downloading external resources is disabled.

The solution is to use https instead of http in the pom element 'project'.

I resolved this in my pull request and hope you will accept it. It has been waiting to be accepted for more than one year!

regards EUGen

homebeaver avatar Oct 27 '25 15:10 homebeaver

I resolved this in my pull request and hope you will accept it. It has been waiting to be accepted for more than one year!

As mentioned earlier in https://github.com/apache/commons-validator/pull/271#issuecomment-2623175681: keeping this up-to-date looks like quite a maintenance burden. This means the work is valuable to share - but only if we're confident that there will be someone looking after this code after it's merged. I assume you're using this and would be ready to take some of that maintainership on yourself, but it'd be good to have one or two other people showing up.

raboof avatar Oct 27 '25 15:10 raboof

Thanks @raboof. The changes can be merged cleanly. @garydgregory

homebeaver avatar Nov 25 '25 11:11 homebeaver

I resolved this in my pull request and hope you will accept it. It has been waiting to be accepted for more than one year!

As mentioned earlier in #271 (comment): keeping this up-to-date looks like quite a maintenance burden. This means the work is valuable to share - but only if we're confident that there will be someone looking after this code after it's merged. I assume you're using this and would be ready to take some of that maintainership on yourself, but it'd be good to have one or two other people showing up.

There has been no reply from @homebeaver to your comment @raboof, which doesn't inspire confidence that this will be maintained.

garydgregory avatar Nov 25 '25 12:11 garydgregory

I agree that this is likely to become too much of a maintenance burden.

Seems to me this would be better as a separate project (with its own package name). This could of course use existing Validator methods where appropriate.

It might be worth considering adding some or all of the new ModulusCheckDigit classes, if the algorithm is standardised and in general use.

sebbASF avatar Nov 25 '25 15:11 sebbASF

... would be better as a separate project (with its own package name).

I'll do it.

homebeaver avatar Dec 17 '25 15:12 homebeaver