RegEx is slow (can we make faster hex functions?)
Noticed a recent attempt at speeding up some functions by switching to RegEx. But RegExes are usually slower than iterating over strings.
For example, this function:
https://github.com/MetaMask/utils/blob/1bea7f378c37117260912af1921556bc57110d20/src/hex.ts#L60
Is probably about 4-5x slower than something like this:
function isHexAddress(str) {
if (typeof str !== "string" || str.length !== 42) return false;
if (str[0] !== '0' || str[1] !== 'x') return false;
for (let i = 2; i < 42; i++) {
let code = str.charCodeAt(i);
// numbers and lowercase a - f
if (!((code >= 48 && code <= 57) || (code >= 97 && code <= 102))) {
return false;
}
}
return true;
}
(don't take my word for it. this should be tested further)
I got into some further optimizations:
function isHexAddressCaseInsensitiveFast(str) {
if (typeof str !== "string" || str.length !== 42) return false;
if (str[0] !== '0' || str[1] !== 'x') return false;
for (let i = 2; i < 42; i++) {
// transform the character code for case-insensitivity
const charCode = str.charCodeAt(i) & 0xDF;
// after transformation, '0'-'9' are in the range 16-25
const numberRange = (charCode - 16) | (25 - charCode);
// after transformation, 'a'-'f' are in the range 65-70 ('A'-'F'), `A`-`F` remains unchanged
const letterRange = (charCode - 65) | (70 - charCode);
// identifies if the charCode is OUTSIDE of both valid ranges.
if (((numberRange & letterRange) & 0x80000000) !== 0) {
return false;
}
}
return true;
}
for a 50% spread of 1000000 valid/invalid mix-cased strings the runtimes are:
isHexAddressCaseInsensitive: 238.701ms (uses `HEX_CHECKSUM_ADDRESS_REGEX`)
isHexAddressCaseInsensitiveFast: 50.609ms
Ok, I'm going to stop now. Use bitwise responsibly. :-)
@davidmurdoch I thought your view was that these changes to the hex functions were in the end micro-optimizations? Do you think it's still worth it to optimize them further?