mathjs icon indicating copy to clipboard operation
mathjs copied to clipboard

feat: add notation for array simplification rules

Open gwhitney opened this issue 4 years ago • 4 comments

Although #1913 is closed, the discussion there raised the possibility of extending array simplification to handle something like x*[3/x, 7] simplifying to [3, 7x]. One ingredient for this was a complexity measure to guide the application of a rule (see #2389), but in addition to have such rules there would need to be some notation for rules that handle matrices.

For example, right now one could try using a rule [n3*n1, n3*n2] -> n3*[n1,n2] and might reasonably expect it to rewrite [x*y, x*z] to x*[y, z] but whatever the details of rule application are right now, it doesn't have any effect. And that particular point aside, it would be impossible to specify not only this but [n4*n1, n4*n2, n4*n3] -> n4*[n1,n2,n3] and [n3*n1; n3*n2] -> n3*[n1; n2] and so on for every possible shape of array.

Hence some notation that can express rules for arbitrarily shaped arrays would be needed to extend the existing array simplification rules (which are primarily incorporated into the simplifyConstant function). One can imagine it could look something like [n*a_i] -> n*[a_i] where "a" is a special letter, like the existing "n", "c", and "v", standing for an array entry, with the convention that each entry of the array can separately match a symbol starting with "a" (whereas there must be only one distinct match to "n" as usual). We might also want to be able to write something like [a1_i] + [a2_i] -> [a1_i + a2_i]. This is not a fully fleshed-out proposal, and any suggestions on how to make it more complete and concrete would be very welcome.

gwhitney avatar Jan 15 '22 23:01 gwhitney

That is an interesting idea.

For cases where there is a common term like [n*a_i] -> n*[a_i] it makes a lot of sense to me. So we basically have three categories: scalar-to-matrix, matrix-to-matrix, and matrix-to-scalar. We could name this element a_i, or the suffix _i is sufficient to denote this "common element wise node", and we could allow using a1_i, a2_i, but also a_i and b_i if you prefer that, or would that just complicate the matter?

josdejong avatar Jan 23 '22 10:01 josdejong

This is a good point. Perhaps as you suggest, the "_i" suffix should universally indicate that this is a "node that may vary per element of an array." Then we could leave the current meanings of the main symbol alone: "c_i" would indicate a constant per element of an array; "v_i" would indicate a non-constant per element of the array, and "n_i" would be an arbitrary node per element of the array. In any case "_i" would only be valid inside explicit array brackets in the pattern. If that sounds like a reasonable start to you, I will work on a PR for this as I am able.

gwhitney avatar Jan 27 '22 21:01 gwhitney

That sounds good indeed! It looks very straightforward and easy to understand. Thanks Glen!

josdejong avatar Jan 30 '22 09:01 josdejong

(formerly issue #2552, should be adressed to the extent possible along with this and #2423) zopieux commented: Current behavior

const diff = simplify('a - b', {a: parse('[a1, a2]'), b: parse('[b1, b2]')})
diff.toString()
// '[a1, a2] - [b1, b2]'
simplify('diff[1]', {diff}).toString()
// '([a1, a2] - [b1, b2])[1]'

Expected behavior

diff.toString()
// Element-wide spread of binary operator '-'
// '[a1 - b1, a2 - b2]'
simplify('diff[1]', {diff}).toString()
// 'a1 - b1'
// (simplification of vector index is already implemented, but here
//  it's prevented by the previous step not simplifying enough)

gwhitney avatar May 07 '22 18:05 gwhitney