json-formula icon indicating copy to clipboard operation
json-formula copied to clipboard

String to number conversion corner cases

Open Eswcvlad opened this issue 1 year ago • 6 comments

When implementing the spec, I assumed, that for a string to coerce to a number it should, basically, contain a number literal per json-formula rules. But it looks like in the implementation the JavaScript rules are used, which makes an expression like this work:

0 + "0x10" + "0o10" + "0b10" + "010" -> 36

Also there is this expression, which returns an invalid JSON value:

0 + "Infinity" -> Infinity

I stumbled upon this with "_localDate", but there it was just the leading zero.

It seems like something, that would be better to describe more thoroughly in the spec, as this could cause differences in implementations. Especially with leading zeros, which could be treated as octal markers.

Eswcvlad avatar Jul 04 '24 16:07 Eswcvlad

The JS implementation for toNumber() takes different paths depending on whether the radix parameter is something other than 10. Currently: toNumber("0x11") returns 17, but toNumber("0x11", 16) fails (we check for valid hex digits). We could fix that, but still, it's hard to treat these number literals consistently. e.g. in JS, parseInt("0b11") and parseInt("0b11", 2) each return 0, but "0b11" * 1 => 3

I think we should remove support for the JavaScript literals: 0x, 0b, 0o. i.e. an expression like "0b11" * 1 should fail with a TypeError. (also saves us from having to document these literals)

JohnBrinkman avatar Jul 09 '24 15:07 JohnBrinkman

Well there could be other discrepancies, not only with literals prefixes:

Strings are converted by parsing them as if they contain a number literal. Parsing failure results in NaN. There are some minor differences compared to an actual number literal:

  • Leading and trailing whitespace/line terminators are ignored.
  • A leading 0 digit does not cause the number to become an octal literal (or get rejected in strict mode).
  • + and - are allowed at the start of the string to indicate its sign. (In actual code, they "look like" part of the literal, but are actually separate unary operators.) However, the sign can only appear once, and must not be followed by whitespace.
  • Infinity and -Infinity are recognized as literals. In actual code, they are global variables.
  • Empty or whitespace-only strings are converted to 0.
  • Numeric separators are not allowed.

Eswcvlad avatar Jul 09 '24 15:07 Eswcvlad

I will change the implementation to add regex expressions to check for valid numeric literals

JohnBrinkman avatar Jul 09 '24 16:07 JohnBrinkman

Please review https://github.com/adobe/json-formula/pull/181

JohnBrinkman avatar Jul 09 '24 17:07 JohnBrinkman

Looks good.

Maybe change "Number literals may include a leading zero" to "Number literals may include leading zeros"? To be more explicit.

I'll test it a bit locally tomorrow.

Eswcvlad avatar Jul 09 '24 17:07 Eswcvlad

Seems to be that everything is fine.

Eswcvlad avatar Jul 10 '24 13:07 Eswcvlad