pyfuncol

[FEATURE] Optimise chained transformations

Open Gondolav opened this issue 4 years ago • 5 comments

Describe the solution you'd like

We should optimise chained transformations the way Apache Spark does: they should be lazy and re-ordered so that transformations that reduce the number of elements are always executed first.

Gondolav, Jan 01 '22 11:01

That's a good idea. Do you have any idea how we could detect that calls are being chained?

samuelchassot, Jan 02 '22 09:01

Whenever a transformation is called, we could instantiate an object that keeps track (in a dict) of which methods have been called, re-orders them, and executes them. For example, if we do list.map(...).filter(...), the map call instantiates the object, which records the map operation. The object then records filter as well, re-orders the operations so that filter runs first, and runs them.
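A minimal sketch of such a recording object, to make the idea concrete. Everything here is hypothetical and not part of pyfuncol: the class `LazyChain`, the `sort_by` operation, and the explicit `run()` call are assumptions made for illustration, and the operations are kept in a list rather than a dict so that their order is explicit. It only re-orders one pair, moving a filter in front of a sort, because that swap never changes the result. A map followed by a filter is left in place, since the predicate would otherwise see unmapped values.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple

# One recorded step: the operation name and the user-supplied function.
Op = Tuple[str, Callable[[Any], Any]]


class LazyChain:
    """Hypothetical recorder: stores operations instead of running them."""

    def __init__(self, data: Iterable[Any], ops: Optional[List[Op]] = None) -> None:
        self._data = list(data)
        self._ops: List[Op] = list(ops or [])

    def _with(self, name: str, fn: Callable[[Any], Any]) -> "LazyChain":
        # Each call returns a new chain with one more recorded step.
        return LazyChain(self._data, self._ops + [(name, fn)])

    def map(self, fn: Callable[[Any], Any]) -> "LazyChain":
        return self._with("map", fn)

    def filter(self, pred: Callable[[Any], bool]) -> "LazyChain":
        return self._with("filter", pred)

    def sort_by(self, key: Callable[[Any], Any]) -> "LazyChain":
        return self._with("sort_by", key)

    def _reordered(self) -> List[Op]:
        # Move a filter ahead of an adjacent sort until no such pair is left.
        # Filtering before or after a stable sort yields the same list.
        ops = list(self._ops)
        changed = True
        while changed:
            changed = False
            for i in range(len(ops) - 1):
                if ops[i][0] == "sort_by" and ops[i + 1][0] == "filter":
                    ops[i], ops[i + 1] = ops[i + 1], ops[i]
                    changed = True
        return ops

    def plan(self) -> List[str]:
        """Names of the steps in the order they would be executed."""
        return [name for name, _ in self._reordered()]

    def run(self) -> List[Any]:
        """Execute the recorded steps in their re-ordered sequence."""
        items = self._data
        for name, fn in self._reordered():
            if name == "map":
                items = [fn(x) for x in items]
            elif name == "filter":
                items = [x for x in items if fn(x)]
            else:  # "sort_by"
                items = sorted(items, key=fn)
        return items


chain = (
    LazyChain([5, 3, 8, 1])
    .sort_by(lambda x: x)
    .filter(lambda x: x > 2)
    .map(lambda x: x * 10)
)
print(chain.plan())
print(chain.run())
```

Nothing is computed while the chain is being built; the work happens only inside `run()`, which is where the re-ordered plan is applied.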

Gondolav, Jan 02 '22 09:01

Mmhh yes, sounds good! But how do you know when the last call occurs, so that you can execute them all?

samuelchassot, Jan 02 '22 09:01

I guess that's why Spark has transformations and actions, haha

We could do the same, but I don't like it very much

Gondolav, Jan 02 '22 10:01

haha most probably ^^

Without doing something of the sort, I don't really see yet how we could do it... I'll think about it, but my intuition is that we can't without either manipulating the AST or having an explicit call to execute the transformations.
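For reference, the explicit-call option could look roughly like the sketch below, in the spirit of Spark's split between transformations and actions. The names `Lazy`, `of`, `collect` and `count` are hypothetical and not part of pyfuncol. The assumption is that transformations only wrap a function that builds an iterator, and that only an action actually pulls elements through it.

```python
from typing import Any, Callable, Iterable, Iterator, List


class Lazy:
    """Hypothetical wrapper: transformations describe work, actions perform it."""

    def __init__(self, make: Callable[[], Iterator[Any]]) -> None:
        # `make` builds a fresh iterator each time an action is called.
        self._make = make

    @staticmethod
    def of(items: Iterable[Any]) -> "Lazy":
        data = list(items)
        return Lazy(lambda: iter(data))

    # Transformations: return a new Lazy and compute nothing yet.
    def map(self, fn: Callable[[Any], Any]) -> "Lazy":
        return Lazy(lambda: (fn(x) for x in self._make()))

    def filter(self, pred: Callable[[Any], bool]) -> "Lazy":
        return Lazy(lambda: (x for x in self._make() if pred(x)))

    # Actions: the explicit calls that trigger execution.
    def collect(self) -> List[Any]:
        return list(self._make())

    def count(self) -> int:
        return sum(1 for _ in self._make())


calls: List[int] = []


def double(x: int) -> int:
    calls.append(x)  # record each invocation to show when work happens
    return x * 2


pipeline = Lazy.of([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
assert calls == []          # nothing has run while chaining
result = pipeline.collect() # the action runs the whole chain
assert result == [6, 8]
assert calls == [1, 2, 3, 4]
```

The cost is the one mentioned above: the user has to remember to finish every chain with an action, which changes how the library feels compared with eager methods that return plain lists.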

samuelchassot, Jan 02 '22 10:01