taco icon indicating copy to clipboard operation
taco copied to clipboard

RFC: Reduction nodes in Index Expressions

Open fredrikbk opened this issue 8 years ago • 3 comments

This is a proposal for adding a reduction node/nodes to the index expression IR.

The rationale is that our current approach of inferring summations based on a convention is flawed, because any such convention rules out possible use cases. For example, we can choose the Einstein convention, which should give us tensor and linear algebra, but which rules out supporting the numpy API. For example, the Einstein convention does not let me express a = sum(i)(c + d(i)), which is a reasonable numpy expression.

The aim is to support many types of reductions, such as sum, product, min, max, exist, all, and user- definable. One question that arrises is whether to capture this in one node (Reduction) or in several nodes (Sum, Product, ..., Custom).

The proposal is to add reduction nodes to the index expression IR, and to build the Einstein convention on top of it. If the user does not use reductions in his expression, then the Einstein convention is enforced, which means variables repeated in a term are summed over that term. If any reduction node is used, then all of the reduction variables must be defined.

I believe the reduction nodes will make it easier to lower the code, as we can be confident about which statements go into which loops. However, nested reduction statements do impose the additional constraint on iteration graph construction that the index variables corresponding to nested reductions must be nested. That is, the nested reductions sum(i)(b(i) + sum(j)(c(j)), requires the i and j index variables to be nested in the iteration graph (but they can be nested either way because sumation is commutative). All the necesarry rules about ordering, coming from operator properties, can be built into iteration graph construction.

There's one case where we can optimize placement, and that is that we can move an operation out of a reduction, if the operation distributes over the reduction. For example, sum(j)(b * c(j)) = b * sum(j)(c(j)).

fredrikbk avatar Feb 02 '18 14:02 fredrikbk

+1. I think this is a good idea, and from the perspective of the backend, should require very few changes.

The aim is to support many types of reductions, such as sum, product, min, max, exist, all, and user- definable. One question that arrises is whether to capture this in one node (Reduction) or in several nodes (Sum, Product, ..., Custom).

I would say one node, with different types represented by an enum.

shoaibkamil avatar Feb 05 '18 18:02 shoaibkamil

Or maybe one node, but configured with a binary operator node, such as AddNode, whose operands are set to undefined IndexExpr. That would compose nicely over time as we'll want to start phrasing our code generator and optimizations in terms of operator properties (commutativity, etc.)

fredrikbk avatar Feb 06 '18 15:02 fredrikbk

I notice that sum has been added, but I haven't seen any other reductions used (at least in taco's tests). Does there currently exist support for other reductions (specifically max)? If not, what would be involved in adding them?

lockshaw avatar Jun 13 '21 04:06 lockshaw