matchpy icon indicating copy to clipboard operation
matchpy copied to clipboard

Optional wildcards to match the identity element of the current node

Open Upabjojr opened this issue 5 years ago • 7 comments

When using the optional=value argument in Wildcard, one can specify which value to get in case no matching is found. The value has to be given for every wildcard.

In case of addition and multiplication, the usual optional values are the identity elements, zero and one respectively.

It would be nice to have the possibility to have the optional value to return the identity element of the current node, if available.

Upabjojr avatar Jan 06 '21 09:01 Upabjojr

I like the idea. Would you be interested in working on this? I currently don't have the time.

My idea to implement this would be to define some special "neutral element" value that can be used for optional. In addition, it needs to be possible to provide the desired neutral element when function symbols are created or registered. Then, an easy solution might be to replace wildcards in patterns with that special value with wildcards that use the actual neutral element as the default value. However, I'm not sure if that's sufficient; we need to make sure that something reasonable happens if the same wildcard is used in functions with different neutral elements, for x example f(x+a, x*b).

One could also make this special value the default value for the default argument in the constructor Wildcard.optional(name, default).

hbarthels avatar Jan 09 '21 19:01 hbarthels

However, I'm not sure if that's sufficient; we need to make sure that something reasonable happens if the same wildcard is used in functions with different neutral elements, for x example f(x+a, x*b).

Testing this feature in Wolfram Mathematica (they support it by appending a dot to the wildcard):

In[4]:= f[a, b] /. f[x_. + a, b + x_.] -> {a, b, x}

Out[4]= {a, b, 0}

In[5]:= f[a, b] /. f[x_. + a, b x_.] -> {a, b, x}

Out[5]= f[a, b]

So... no match is found in the second case.

My idea to implement this would be to define some special "neutral element" value that can be used for optional.

Again, Mathematica calls it default value.

In addition, it needs to be possible to provide the desired neutral element when function symbols are created or registered

This can be easily done with a single-dispatched method.

Upabjojr avatar Jan 09 '21 20:01 Upabjojr

no match is found in the second case.

I think, this is only a reasonable answer in this case.

On another hand (here we got the default value for Times instead):

In[7]:= f[a, b] /. f[a, b x_.] -> {a, b, x}

Out[7]= {a, b, 1}

Mathematica calls it default value.

We shouldn't blindly copy Mathematica. "Neutral element" or "identity element" (or just "identity") makes much more sense. BTW, as Operation's may be non-commutative - probably, we must support "left identy" and "right identy" cases.

skirpichev avatar Jan 10 '21 03:01 skirpichev

"Neutral element" or "identity element" (or just "identity") makes much more sense.

That makes sense within an algebraic context. Consider that MatchPy's scope is a lot wider... you could for example use it to match XML files (the commutative property may be helpful there, not sure about the associative one). default as a name makes more sense in these contexts.

BTW, as Operation's may be non-commutative - probably, we must support "left identy" and "right identy" cases.

There it become more complicated. Matrix expressions in SymPy need the size of the matrix for the identity matrix... I think this is too complicated and maybe out of scope for a pattern matching library.

Upabjojr avatar Jan 10 '21 09:01 Upabjojr

you could for example use it to match XML files (the commutative property may be helpful there, not sure about the associative one)

Hmm, why this doesn't fit "algebraic context"? AFAIK, this notion doesn't assume associativity or commutativity.

On another hand, the identity notion assume binary operation (which may be not the case)... But then, the default value may be specific for every argument (more generic case for left and right identities).

There it become more complicated.

Maybe, I don't know the MatchPy codebase enough. But lets at least investigate this first.

The current MatchPy API fits exactly this case. I.e. we can have non-commutative and even non-associative operators. Thus, we can't assume that left and right identities - are equal. On another hand, two-sided identity will be, probably, the most practically used use-case. Maybe we should restrict this mechanism only for monoids (where there are no such complications - and any identity element is both left and right)?

skirpichev avatar Jan 10 '21 10:01 skirpichev

MatchPy does not exactly replicate the behavior of Mathematica. The reason is that there are some cases where Mathematica behaves in a way that could be considered inconsistent (Example: https://arxiv.org/pdf/1705.00907.pdf, pp.11–12. See also https://doi.org/10.1007/978-3-030-23250-4_6 and https://doi.org/10.1016/j.jsc.2008.05.001 for a very formal discussion of what exactly Mathematica might be doing). Thus, I would not use Mathematica as a reference for what MatchPy should do in such special cases.

I would say that now already, MatchPy behaves reasonably for this special case:

from matchpy import *
a = Symbol('a')
b = Symbol('b')
c = Symbol('c')
d = Symbol('d')
one = Symbol('1')
zero = Symbol('0')
x0 = Wildcard.optional("x", zero)
x1 = Wildcard.optional("x", one)
Add = Operation.new('Add', Arity.polyadic, associative=True, commutative=True, one_identity=True)
Mul = Operation.new('Mul', Arity.polyadic, associative=True, commutative=True, one_identity=True)
f = Operation.new('f', Arity.binary)
p = Pattern(f(Add(x0, a), Mul(x1, b)))

>>> list(match(f(a, b), p))
[]
>>> list(match(f(Add(a, c), b), p))
[]
>>> list(match(f(Add(a, c), Mul(b, c)), p))
[{'x': Symbol('c')}]
>>> list(match(f(Add(a, c), Mul(b, d)), p))
[]
>>> list(match(f(Add(a, one), b), p))
[{'x': Symbol('1')}]
>>> list(match(f(a, Mul(b, zero)), p))
[{'x': Symbol('0')}]

Regarding the implementation: After thinking about it for a bit, I would not implement this feature by replacing the wildcards in a pattern. It might be unlikely, but this could lead to unexpected results if the user starts using the pattern for something different than matching. Perhaps I can find some time to take a look at the code soon.

Regarding the name: We also use more "algebraic" names for what is called Flat and Orderless in Mathematica, namely associative and commutative, so from that point of view, it would be consistent to also use an "algebraic" name for such a value.

Regarding left- and right-identities: I wouldn't support this. I expect that it requires a lot more changes, and I don't think that this feature would be used a lot. The way I see it, it should be possible to simulate left- and right-identities now already by manually specifying different identity elements depending on the position in the pattern (with the exception of identity matrices that require a size, but I would consider that out of scope). I would not restrict this option to monoids if we can make it behave reasonably in all cases; I don't want to restrict features of MatchPy only to use-cases that make sense from a mathematical point of view unless it's really necessary.

hbarthels avatar Jan 12 '21 19:01 hbarthels

Maybe we still need the definition of identity elements in MatchPy, but for a different reason: I'm realizing that Wildcard.star is not working in SymPy in case of empty sequence matches. But that's probably a different issue.

Upabjojr avatar Jan 14 '21 06:01 Upabjojr