Consider adding a leading modifier to a string, in the manner of "0b10", to identify numbers of any allowed base.
For example, one could use leading subscript characters in the range 0x2080..0x2089 terminated by a ',' (or other suitable character):
my $x = "\x[2081]\x[2081],10"; # the digits "10" read in base 11
Then $x would be coerced to be a computable number in the same manner as "0b10".
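A minimal sketch of what that coercion could look like in user code today. The helper name `from-prefixed` is hypothetical and not part of core Raku or any module mentioned here; it only assumes the prefix convention described above (subscript digits, then a ',') and the built-in `Str.parse-base`.

```raku
# Hypothetical helper, not part of core Raku: reads leading subscript digits
# (U+2080..U+2089) as the base, then a ',' separator, then the digits.
sub from-prefixed(Str $s) {
    $s ~~ / ^ (<[₀..₉]>+) ',' (.+) $ /
        or die "no subscript base prefix in: $s";
    # Map each subscript digit to its decimal value to recover the base.
    my $base = $0.Str.comb.map({ .ord - 0x2080 }).join.Int;
    # Let the built-in parser interpret the remaining digits in that base.
    $1.Str.parse-base($base);
}

say from-prefixed("\x[2081]\x[2081],10");   # the digits "10" read in base 11
```

The brackets in `\x[2081]` keep the escape from swallowing any hex digits that follow it.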
What is the actual problem this is trying to solve?
https://raku.land/zef:lizmat/Slang::NumberBase
Although on reading this, I'm not sure that's what @tbrowder wanted :-)
> What is the actual problem this is trying to solve?
I would like Raku to be able to use its core capability to deal with real numbers up to base 36 by enabling use of a number string with visible leading modifiers to coerce the string to its base.
For example, to get a base 11 value for decimal 10.1 now:
my $x = (10.1).base: 11; # OUTPUT: A.111111
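For comparison, here is a short sketch of the conversions core Raku already offers in both directions, using only built-ins (`base`, `parse-base`, and the `:11<...>` radix form). The variable names are just for illustration.

```raku
# Number to string in a given base.
say 10.base(11);                  # A

# String in a given base back to a number.
say "A".parse-base(11);           # 10

# Radix literal: the digits "10" read in base 11.
my $n = :11<10>;
say $n;                           # 11

# parse-base also accepts a fractional part.
my $r = "10.1".parse-base(11);
say $r;
```

What is missing is only a way for the string itself to carry its base.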
After the desired core change, use this syntax to provide embedded base information:
my $x = "\x[2081]\x[2081]10.1"; # yes, still ugly, but all ASCII
Or, using @lizmat's module 'Slang::NumberBase':
my $x = ₁₁10.1;
The number '₁₁10.1' could be used as is since it would be coerced to hold that base 11 value.
Note I would prefer to have the base modifier at the end which is more natural and conventional in the literature, but @lizmat reminded me of Larry's desire to not backtrack in parsing.
I intend to provide the trailing modifier capability in my forthcoming 'Number' module. Using its syntax:
my $x = "10.1\x2081\x2081";
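A rough sketch of how a trailing modifier could be read. This is not the 'Number' module's actual code; `from-suffixed` is a hypothetical name. It also shows the cost mentioned above: the base is only known once the end of the string has been reached, so the digits cannot be interpreted as they are read.

```raku
# Hypothetical helper: digits first, then trailing subscript digits
# (U+2080..U+2089) that name the base.
sub from-suffixed(Str $s) {
    $s ~~ / ^ (.+?) (<[₀..₉]>+) $ /
        or die "no subscript base suffix in: $s";
    # The base is recovered from the suffix before the digits can be parsed.
    my $base = $1.Str.comb.map({ .ord - 0x2080 }).join.Int;
    $0.Str.parse-base($base);
}

say from-suffixed("10.1\x[2081]\x[2081]");   # "10.1" read in base 11
```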
Then maybe @lizmat would accept a PR to update 'Slang::NumberBase' to provide that reversed modifier capability to get a prettier input method as shown here:
my $x = 10.1₁₁;
I am making progress on a proof of concept with module 'Number'. See https://github.com/tbrowder/Number. It handles both leading and trailing base modifiers.
> I would like Raku to be able to use its core capability to deal with real numbers up to base 36 by enabling use of a number string with visible leading modifiers to coerce the string to its base.
Why stop at base 36? The highest numeric base in use prior to the Information Age was probably sexagesimal (base 60), used by the Sumerians and later the Babylonians. Knowing this and assuming the upper limit needs to be an even power of 2 to be optimal, base 64 would make the most sense.
If someone here is a bigger nerd about such things and knows of a higher base used by pre-Information Age man, please don't be shy.
What do you think, @tbrowder?
AFAIK, the problem is that there is no agreed way to express number bases > 36. @jeffgazso: or do you know of one?
It looks like my mouth got ahead of my brain on not one but two fronts simultaneously.
@lizmat, you're correct:
- Looking at this from a value representation perspective: [0-9] + [A-Z] gives us 36 possible case-insensitive glyphs. The only way to get more would be to add [a-z] in a non-standard way. I think we all know what XKCD has to say about standards. Further, this would have the effect of making hex values case sensitive, which is not desirable. No thanks.
- After some more digging, I found there are even higher bases that see legitimate use by humans. (That is, not something a math nerd dreamt up as an exercise in world building or something.) Octogesimal (base 80) is used in the Supyire language. I'm willing to bet, given enough time someone will find an even higher base than this used somewhere. There is no limiting principle to the idea of going beyond base 36. It's a bad idea.
Lesson learned: if someone wants to do base 60 arithmetic with Cuneiform numerals, let him write a module.
Yes, many versions. My in-progress module "Number" will provide up to base 90 with real numbers in base 91. The choice of characters is pretty reasonable:
Bases 37 through 90 use the ASCII characters in .ord order following the lower case z, with one exception: the '.' is swapped with the last ASCII char (I believe it is the '"'), so the '.' is used as the radix point.
The bases could be expanded by keeping the .ord order, but the next base characters used would be 16-bit rather than 8-bit. Hence my stop at base 91.
It's easy enough to continue, but with not as much compression potential per new base.
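To make the idea of a digit alphabet beyond base 36 concrete, here is a small sketch for non-negative integers. The alphabet below (0-9, A-Z, a-z, giving 62 glyphs) is an assumption chosen purely for illustration; it is not the ordering used by 'Number' or by Base::Any, and `to-base` and `from-base` are hypothetical names.

```raku
# Hypothetical 62-glyph alphabet; position in the list is the digit's value.
my @digits = flat '0'..'9', 'A'..'Z', 'a'..'z';

# Non-negative integer to a digit string in the given base.
sub to-base(Int $n, Int $base) {
    die "unsupported base $base" unless 2 <= $base <= @digits.elems;
    die "negative numbers are not handled in this sketch" if $n < 0;
    return @digits[0] if $n == 0;
    my $rest = $n;
    my @out;
    while $rest > 0 {
        @out.unshift: @digits[$rest % $base];   # least significant digit first
        $rest div= $base;
    }
    @out.join;
}

# Digit string in the given base back to an integer.
sub from-base(Str $s, Int $base) {
    die "unsupported base $base" unless 2 <= $base <= @digits.elems;
    my %value = @digits.antipairs;              # glyph => value
    my $n = 0;
    for $s.comb -> $c {
        die "bad digit: $c" unless (%value{$c}:exists) && %value{$c} < $base;
        $n = $n * $base + %value{$c};
    }
    $n;
}

say to-base(61, 62);                            # z
say from-base(to-base(123456789, 62), 62);      # 123456789
```

Note that this alphabet makes digits case sensitive, which is exactly the trade-off raised earlier in the thread.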
An existing approach to bases > 36, just FYI: https://raku.land/github:thundergnat/Base::Any
Thanks, Bruce. Yes, I looked at that a long time ago and it seemed too "wild west" for my needs. I prefer something I can get my simple mind around and, of course, using my own coding style in the process.
Have a good time at the gathering in SC!
I just looked at 'Base::Any' again and realized how much is encapsulated in the code. Initially I could see roughly what was being done, but I needed a good technical description of the process so I started looking around and only recently have I found a fair source in the Wolfram docs.
@thundergnat's code now makes much more sense to me, and he packs a LOT into each line. The use of the Unicode glyph types is very instructive, as is the handling of precision. The practical uses of grep and map are good 'keepers', too.
Bruce, I think you could turn Base::Any into a great short talk at the upcoming conference.