#"🚀" cast to string is not "🚀"
I wonder if this is normal but it has caught me off-guard a few times. The gist: "🚀" == (#"🚀" :> string) is false.
See https://rescript-lang.org/try?version=v12.0.0-alpha.8&module=esmodule&code=C4TwDgpgBMULxQNoGIBEheDcAF7qA+yQC6AUADYSwAeAXDPFGlqqeVCDcAvkQFIDOAdCQD2AcwAUGbPARiKUKgD4ovYACcAlgDsRASh08Bw8ahCppUMSHlKVG7XqA
This is definitely a bug, but it makes me wonder what the constraints on exotic identifiers actually are. They don't seem to be clearly defined.
This is broken, too:
type t = {\"🎉": int}
let x = {
\"🎉": 42,
}
https://rescript-lang.org/try?version=v12.0.0-alpha.10&module=esmodule&code=C4TwDgpgBMULxQN4B0BEgeDcJH7qBcUCWAdsAL4BQZANhLAB7xJlRRpa5QAsATADRnlA
And this is even more broken:
let \"🎉" = 42
https://rescript-lang.org/try?version=v12.0.0-alpha.10&module=esmodule&code=DYUwLgBAOgRIPBuEj9mEC8EAsAmAUEA
Should we even allow this?
By the definition of exotic identifiers... yes, since they are legit identifier names in JS 😅
OK, so this is not valid JS, which means there should be a compile error for it: https://rescript-lang.org/try?version=v12.0.0-alpha.10&module=esmodule&code=DYUwLgBAOgRIPBuEj9mEC8EAsAmAUN0kBbAQwGsRUIAKAP1kRgEpUA+CAJRCIGMwA6AZzAAnAJYA7AOaU6SBtiA
However, this is valid JS, so it should be compiled correctly: https://rescript-lang.org/try?version=v12.0.0-alpha.10&module=esmodule&code=C4TwDgpgBMCMUF4oG0DEAiQvBuAC99UA+UqIAugFBkA2EwUAHrAFwzxIY7oWiQwBMiUAN4AddIB4NwJH76ZgEsAdsAC+FarTr8kgslCijJ0qABZeAGjLKgA
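To double-check the two cases against an actual engine, here is a minimal sketch. The snippets fed to the parser are my assumption of what the two cases look like on the JS side (an emoji as a binding name vs. an emoji as a quoted property key), and `parses` is a hypothetical helper, not something from the compiler. It only asks the engine to compile the source via `new Function`; nothing is executed.

```javascript
// Hypothetical helper: true if the engine can parse `source` as a function body.
// `new Function` only compiles the body here; the function is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Assumed shape of the invalid case: emoji used as a binding name.
const emojiBinding = parses('let 🎉 = 42;');

// Assumed shape of the valid case: emoji used as a quoted property key.
const emojiKey = parses('const x = {"🎉": 42}; return x["🎉"];');

// For comparison: a non-ASCII letter used as a binding name.
const letterBinding = parses('let π = 3.14;');

console.log({ emojiBinding, emojiKey, letterBinding });
```

The first case should be rejected and the other two accepted, which matches the split above: bindings need a real identifier, property keys only need a string.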
Hmm, weird. I thought any Unicode sequence was allowed in identifier names since ES6. https://mathiasbynens.be/notes/javascript-identifiers-es6
And I saw some toy projects like https://github.com/Thomas101/emoji-js
I can still find records saying it was supported, and all the LLMs are convinced it's still supported, but in reality it looks like it's not supported in any JS engine?
I found that the spec uses two specific Unicode properties, ID_Start and ID_Continue, to restrict which code points are allowed in identifier names.
- https://tc39.es/ecma262/#prod-IdentifierStart
- https://github.com/dtolnay/unicode-ident
- https://github.com/oxc-project/unicode-id-start
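As a rough illustration of that rule: JS regexes with the `u` flag expose both properties as `\p{ID_Start}` and `\p{ID_Continue}`, so the check can be sketched in a few lines. `isIdentifierName` is a hypothetical name, and this sketch deliberately ignores reserved words and `\uXXXX` escapes inside identifiers.

```javascript
// First code point: `$`, `_`, or anything with the ID_Start property.
// Remaining code points: `$`, ZWNJ (U+200C), ZWJ (U+200D), or ID_Continue.
// Assumption: reserved words and \u escapes are out of scope for this sketch.
const IDENTIFIER_NAME = /^[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*$/u;

// Hypothetical helper: does `name` have the shape of a JS IdentifierName?
function isIdentifierName(name) {
  return IDENTIFIER_NAME.test(name);
}

console.log(isIdentifierName("foo_bar1")); // true
console.log(isIdentifierName("π"));        // true
console.log(isIdentifierName("1abc"));     // false
console.log(isIdentifierName("🎉"));       // false
```

Emoji like 🎉 and 🚀 have neither property, which would explain why no engine accepts them as identifiers.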
Not sure about its perf and size cost, but maybe we can check this in the parser:
https://github.com/dbuenzli/uucp/blob/master/src/uucp__id.ml
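To make the idea concrete, here is a sketch of how such a check could drive the output, written in JS rather than OCaml just so it can be run directly. `emitPropertyKey` and `emitBindingName` are hypothetical names and not existing compiler functions; the sketch assumes the policy discussed above (quote exotic field names, reject exotic binding names) and again skips reserved words.

```javascript
// Same identifier shape as in the ECMAScript grammar (reserved words ignored).
const IDENTIFIER_NAME = /^[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*$/u;

// Hypothetical: render a record field name as a JS object-literal key.
// Names that are not identifier-shaped are emitted as quoted strings.
function emitPropertyKey(name) {
  return IDENTIFIER_NAME.test(name) ? name : JSON.stringify(name);
}

// Hypothetical: render a `let` binding name.
// JS has no quoted form for bindings, so an exotic name becomes an error.
function emitBindingName(name) {
  if (!IDENTIFIER_NAME.test(name)) {
    throw new Error(`${JSON.stringify(name)} is not a valid JS identifier`);
  }
  return name;
}

console.log(`{ ${emitPropertyKey("🎉")}: 42 }`);   // { "🎉": 42 }
console.log(`{ ${emitPropertyKey("count")}: 42 }`); // { count: 42 }
```

Whether an invalid binding name should be a hard error or get mangled into something valid is a separate design question; the error here is only the simplest option.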