codelyzer

"i18n": [false,"check-text"] fails on text containing only non-word characters

Open jaufgang opened this issue 7 years ago • 10 comments

Is there a way to configure the "i18n": [false,"check-text"] rule so that it passes for any text node containing only numbers or other non-word symbols, even if the wrapper element is missing the i18n attribute?

If not, could this be added as a config option?

jaufgang, May 09 '18 17:05

Would you give more details, show an example?

mgechev, May 09 '18 17:05

Sure,

<div>12345</div>
<span>+</span>

Both produce an "Each element containing text node should have an i18n attribute" error.

jaufgang, May 09 '18 17:05

I do like the idea of always ignoring text that contains only numbers and/or symbols, without extra config... However, we still expect a failure for this case: <div>éáíç</div> (only special chars), right?

rafaelss95, Jun 13 '18 01:06

Not sure how we can solve this without false positives. golint solves this by assigning a confidence coefficient to each failure; however, tslint does not have such a feature.

We can just drop this rule, I suppose.

mgechev, Jun 13 '18 02:06

Oh, please don't drop this rule, it is still very handy.

I can see how it would probably be prohibitively complicated to distinguish alphabetic from non-alphabetic symbols for any Unicode character in general.

Perhaps it would bring the problem down to a more manageable scope to simply allow all strings containing only non-alphabetic printable characters from the Basic Latin (ASCII) and Latin-1 Supplement (Extended ASCII) Unicode blocks without the i18n attribute, and to require it on all strings containing characters outside this set. It may not be perfect, but it would cover a large set of use cases.

Also, if there were an equivalent of the // tslint:disable comments for HTML, it would allow users to enable this rule while selectively disabling it in some places,

e.g. something like:

<div><!--tslint-ignore:i18n-->ignored text<!--end-tslint-ignore--></div>

jaufgang, Jun 13 '18 02:06
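
The Basic Latin / Latin-1 Supplement proposal in the comment above could look roughly like the sketch below. The function name `isLatinNonAlphabetic` is hypothetical and is not part of codelyzer; the sketch also assumes that whitespace inside a text node should be ignored.

```typescript
// Hypothetical helper (not codelyzer API): returns true when the text holds only
// non-alphabetic printable characters from the Basic Latin and Latin-1 Supplement
// blocks, i.e. the case where the i18n attribute could be omitted.
function isLatinNonAlphabetic(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    // Assumption: whitespace (space, tab, newline, carriage return) is ignored.
    if (code === 0x20 || code === 0x09 || code === 0x0a || code === 0x0d) {
      continue;
    }
    // Printable ranges of the two blocks: U+0021..U+007E and U+00A0..U+00FF.
    const printable =
      (code >= 0x21 && code <= 0x7e) || (code >= 0xa0 && code <= 0xff);
    if (!printable) {
      return false; // outside the allowed set: the i18n attribute is required
    }
    const asciiLetter =
      (code >= 0x41 && code <= 0x5a) || (code >= 0x61 && code <= 0x7a);
    // Latin-1 letters: U+00C0..U+00FF except the multiplication sign (U+00D7)
    // and division sign (U+00F7), plus U+00AA, U+00B5 and U+00BA.
    const latin1Letter =
      (code >= 0xc0 && code !== 0xd7 && code !== 0xf7) ||
      code === 0xaa ||
      code === 0xb5 ||
      code === 0xba;
    if (asciiLetter || latin1Letter) {
      return false; // contains a letter: the i18n attribute is required
    }
  }
  return true;
}
```

With this sketch, `isLatinNonAlphabetic("12345")` and `isLatinNonAlphabetic("+")` return true, while text such as `éáíç` still requires the attribute.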

@rafaelss95

> I do like the idea of always ignoring text that contains only numbers and/or symbols, without extra config... However, we still expect a failure for this case: <div>éáíç</div> (only special chars), right?

These are not special chars... These are normal printable letters, which can be matched using the \p{L} Unicode category in XRegExp. Examples of special characters are +-$^&, or more generally the Unicode punctuation and symbol categories.

Hermholtz, Oct 21 '18 11:10
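
A minimal sketch of the letter test described above. XRegExp is the library named in this thread; to stay dependency-free, this sketch instead assumes a runtime that supports ES2018 Unicode property escapes in the built-in `RegExp`. The name `containsLetter` is hypothetical and is not part of codelyzer.

```typescript
// Assumption: the runtime supports ES2018 Unicode property escapes.
// The pattern is built with the RegExp constructor so that it is only
// interpreted at runtime.
const LETTER = new RegExp("\\p{L}", "u");

// Hypothetical helper: true when the text contains at least one letter in any
// script, i.e. when the i18n attribute would still be required.
function containsLetter(text: string): boolean {
  return LETTER.test(text);
}
```

Under this check, `12345`, `+` and `+-$^&` contain no letters, whereas `éáíç` does and would still be reported.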

Thank you @chojrak11, that looks like the perfect solution: matching on \p{L} using https://github.com/slevithan/xregexp

XRegExp looks really cool. I was unaware that this library existed until now and did not know of such a simple way to match letter characters across unicode code blocks.

So forget about my suggestion to only test for letters in the Latin code blocks. This would be a much better approach.

jaufgang, Oct 22 '18 15:10

These Unicode character classes first appeared in Perl 5 on 1998-07-24. .NET adopted them in 2002, Java also uses them, JavaScript for reasons unknown to me doesn't... There's a lot of great additional information here: https://www.regular-expressions.info/unicode.html#category

Hermholtz, Oct 23 '18 00:10

I believe this can be solved by making it possible to define custom regular expressions for elements that should be treated as empty. This is a universal solution, and it is up to developers to define the rules they need. Example:

{
	"rules": {
		"i18n": [
			true,
			"check-id",
			"check-text",
			{
				"empty": [
					"^[0-9\\s]+$",
					"^[+\\-|\\s]+$"
				]
			}
		]
	}
}

janousek, Feb 05 '19 07:02
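
As a rough sketch of how such an option might be applied: the "empty" option is only proposed in this thread and does not exist in codelyzer, and `isTreatedAsEmpty` is a hypothetical name. The patterns below are intended to be equivalent to the two in the example config.

```typescript
// Hypothetical: text matching any of the configured patterns is treated as
// empty, so the rule would not ask for an i18n attribute on its element.
function isTreatedAsEmpty(text: string, emptyPatterns: string[]): boolean {
  const trimmed = text.trim();
  if (trimmed === "") {
    return true; // whitespace-only text is empty regardless of configuration
  }
  return emptyPatterns.some((pattern) => new RegExp(pattern).test(trimmed));
}

// Digits and whitespace only; or plus, minus, pipe and whitespace only.
const emptyPatterns = ["^[0-9\\s]+$", "^[+\\-|\\s]+$"];
```

With these patterns, `isTreatedAsEmpty("12345", emptyPatterns)` and `isTreatedAsEmpty("+", emptyPatterns)` return true, while ordinary words are still checked. Leaving the patterns to the user avoids hard-coding any one definition of "non-word" text.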

I would love it if this existed, even just for a simple set of chars like +-()[]

rbirkgit, Aug 02 '19 16:08