factor icon indicating copy to clipboard operation
factor copied to clipboard

request for math.binary library, variant of math.bits

Open nomennescio opened this issue 3 years ago • 7 comments

The math.bits library creates virtual sequences of bits, where each bit is either f of t. When working with binary numbers, it's very useful to treat them as math.vectors. However, math.bits cannot use math.vectors operations, as f and t are no digits. I therefore request a library math.binary (?) that supports the features of math.bits, but using 0 and 1 instead.

Converting math.bits to such vectors is trivial ([ 0 1 ? ] map), but not very elegant nor efficient.

Conversely, it might be useful to directly use such binary vectors as truth values, hence support for that would be nice. Again, such conversion is trivial ([ zero? not ] map), bit not very elegant nor efficient.

nomennescio avatar Jul 21 '22 16:07 nomennescio

Do you have a good name suggestion for the version that maps on binary digits instead of boolean bits?

mrjbq7 avatar Jul 27 '22 10:07 mrjbq7

A suggestion for the name of what exactly? You mean for all the words in math.bits?

The current naming scheme is a bit unfortunate, as math.bits are not really bit vectors in the classical sense, but are actual boolean vectors. In every classical sense of processing bits, everyone is processing the 1 bit numbers 0 and 1. I think the consideration in Factor to use booleans has to do with space efficiency?

So the 1 bit numbers 0 and 1 should actually be called bit, and vectors of bits can be represented efficiently with integers, but could also be in an unpacked format. Being able to have such a bit as a truth value would be very convenient.

nomennescio avatar Jul 27 '22 20:07 nomennescio

In a language like Python or C or Java, having bits be both booleans and 1 and 0 digits might be convenient. I think iterating over the 1 and 0 digits in Factor introduces maybe more complications in all cases except some kind of arithmetic one.

We could introduce a virtual sequence that did this, for example:

TUPLE: binary-digits { bits bits } ;

C: <binary-digits> binary-digits

M: binary-digits length bits>> length ; inline

M: binary-digits nth-unsafe bits>> nth-unsafe 1 0 ? ; inline

INSTANCE: binary-digits virtual-sequence

But I'm not sure binary-digits is a good name, maybe binary-bits?

mrjbq7 avatar Jul 27 '22 22:07 mrjbq7

I think this is language independent. If you look at the history of programmable computing, starting with the Turing machine, devices were used to represent two states, and multiple devices were used to represent 2^n states, typically numbers, therefore these states were interpreted to be numbers 0 and 1. Before that, Boolean algebra used truth tables with 0 and 1 for false and true respectively, which Shannon applied to algebra of switching circuits (https://en.wikipedia.org/wiki/Two-element_Boolean_algebra, https://en.wikipedia.org/wiki/Boolean_domain). This usage has a long history and is universally used. Even computing using Boolean 0 and 1 is very common, using + and * over this set for logical or and and respectively.

I understand why Factor has f and t, and I also know where that comes from conceptually. I also understand that f is the universal "false", and is also used in Factor to flag "error", "empty", "maybe", "not present", "null" to name just a few uses. In that sense it functions like 0 in C, which is an integer, boolean, null-pointer, and 'maybe'/'error'. You don't want to break that.

To be useful, having binary (where I think the -digits is understood, just as in decimal. Not binary-bits, as bits already is an abbreviation for "binary digit") over math.bits should give some clear advantages next to convenience. Making it a thin wrapper on top of math.bits is rather a disadvantage.

It would rather make sense to integrate with the integers, as these are already sequences of bits in Factor (math.bitwise on bignum...) and inside the CPU. And it would make sense to have a convenient way to use 0 and 1 as truth values, i.e. have a non-intrusive bijection of {f, t} to [0,1] (or maybe better, to map {f} to {0} as the universal false integer) . Best case scenario would be support in the compiler to efficiently map these vectors onto native instructions. The choice of representation is crucial to achieve such efficiency, so I'm not sure if e.g. virtual sequences over integer would achieve that, but it would at least integrate nicely with current (!) MATH: operators (close to how math.bitwise operates).

As for computations with 0 and 1 for truth values as in C, in many situations that gives efficient code, and is very convenient. Suppose for instance we had an efficient Factor operator that maps false/true on 0/1, we can code some typical functions:

: !! ( obj -- 0/1 ) 1 0 ? ; ! replaced by efficient code by the optimizing compiler
: conditional-increment ( n ? -- n' ) !! + ;

I can come up with a proposal, but I've noticed that you typically like to discuss first, so let's do that.

nomennescio avatar Jul 28 '22 06:07 nomennescio

Note that I write "current MATH:", as when having multiple dispatch, it would be easy to mix boolean with integer

nomennescio avatar Jul 28 '22 07:07 nomennescio

Hmm, I was thinking, would it make sense to be able to coerce boolean into integer using the existing coercing mechanism?

nomennescio avatar Jul 28 '22 07:07 nomennescio

At least having an explicit word to turn a boolean into a bit would be useful:

: >bit ( ? -- 1/0 ) 1 0 ? ;

and an inverse variant maybe

: >bitnot ( ? -- 0/1 ) 0 1 ? ;

nomennescio avatar Aug 11 '22 09:08 nomennescio

However, math.bits cannot use math.vectors operations, as f and t are no digits.

Maybe this was added after this issue was raised, but math.vectors has vnot vand vor vxor, etc. so you can indeed use the operations working with binary numbers represented as f and t. With this in mind, is an additional vocab still necessary?

Kacarott avatar Apr 01 '23 07:04 Kacarott

The point is to have binary numbers represented as sequences of 0 and 1, not sequences of f and t.

nomennescio avatar Apr 03 '23 20:04 nomennescio

Sure, but I thought the reason for that was to use math.vectors operations, which you can do with f and t anyway.

Kacarott avatar Apr 03 '23 20:04 Kacarott

In other languages zero and non-zero have implicit Boolean-ness. In Factor that’s not the case so this is the interface that developed, I can see wanting binary digits sometimes. On Apr 3, 2023, at 1:10 PM, nomennescio @.***> wrote: The point is to have binary numbers represented as sequences of 0 and 1, not sequences of f and t.

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you commented.Message ID: @.***>

mrjbq7 avatar Apr 03 '23 21:04 mrjbq7

Fixed in 3e275b00f2f614901d1e10b9e6695116df949ae9.

mrjbq7 avatar May 05 '24 17:05 mrjbq7