datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Consolidate optimizer passes in optimizer module for better testing

Open alamb opened this issue 3 years ago • 1 comments

Is your feature request related to a problem or challenge? Please describe what you are trying to do. Thinking about the comments from @avantgardnerio on https://github.com/apache/arrow-datafusion/pull/3482/files#r972435250 if/when we want to try rearranging the optimizer passes, we need some way to observe the effect on an overall plan.

Conveniently @andygrove added a test framework for doing so in https://github.com/apache/arrow-datafusion/blob/master/datafusion/optimizer/tests/integration-test.rs

However, while reviewing https://github.com/apache/arrow-datafusion/pull/3482 I noticed there were two separate lists of optimizer's that were instantiated -- one in an integration test and one in the core module.

This is bad as the lists can get out of sync meaning the tests may not provide as much coverage as we would like

Describe the solution you'd like I would like the Optimizer pass lists to be in the datafusion-optimizer crate so it can be used by the integration test, and the interface to the overall optimizer to be more well defined

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Additional context Add any other context or screenshots about the feature request here.

alamb avatar Sep 17 '22 11:09 alamb

Thanks @alamb. I had been intending to do some work in this direction as well. I think this sounds great.

andygrove avatar Sep 20 '22 22:09 andygrove

@alamb I am working on this now because it became important to something that I am working on for this release

andygrove avatar Oct 03 '22 16:10 andygrove

Outstanding @andygrove -- the biggest issue I hit was extracting the passes from the state.

I am standing by to help review it and get it in quickly 👍

alamb avatar Oct 03 '22 16:10 alamb