Lazy statistics for ValueColumn

Open CarloMariaProietti opened this issue 1 month ago • 0 comments

Fixes #1492. The idea is the following: ValueColumnInternal is an interface for statistic values, so they are not exposed as part of the public API. Implementations of ValueColumnInternal hold the actual cache.

Two caches are needed for each statistic (for the moment only max), because the result may differ depending on the skipNaN boolean parameter.

I implemented the solution by adding an overload of aggregateSingleColumn. The overload wraps the original aggregateSingleColumn, so a cached value is returned when one is available and the original computation runs only otherwise.
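
To make the two-cache idea concrete, here is a minimal standalone sketch. It is not the code in this PR: `StatCacheSketch`, `DoubleColumnSketch`, `CachedValue`, `computeMax` and `cachedMax` are hypothetical stand-ins for ValueColumnInternal, its implementation, and the wrapped and wrapping aggregateSingleColumn. It assumes a column of plain Double values.

```kotlin
// Hypothetical, simplified stand-ins; none of these names come from the library.

// Wrapper that lets us tell "not computed yet" (null holder)
// apart from "computed, and the result is null" (empty column).
class CachedValue<T>(val value: T)

// Stand-in for ValueColumnInternal: one cache slot per skipNaN value,
// because the two results can differ. In the PR this interface is not public.
interface StatCacheSketch {
    var maxWithNaN: CachedValue<Double?>?
    var maxSkippingNaN: CachedValue<Double?>?
}

class DoubleColumnSketch(val values: List<Double>) : StatCacheSketch {
    override var maxWithNaN: CachedValue<Double?>? = null
    override var maxSkippingNaN: CachedValue<Double?>? = null
}

// Counts how often the uncached path runs (for demonstration only).
var computations = 0

// Stand-in for the original, uncached aggregation.
fun computeMax(values: List<Double>, skipNaN: Boolean): Double? {
    computations++
    val candidates = if (skipNaN) values.filter { !it.isNaN() } else values
    // maxOrNull returns null for an empty list and NaN if any element is NaN.
    return candidates.maxOrNull()
}

// Stand-in for the new overload: wraps the original and consults the cache first.
fun cachedMax(column: DoubleColumnSketch, skipNaN: Boolean): Double? {
    val cached = if (skipNaN) column.maxSkippingNaN else column.maxWithNaN
    if (cached != null) return cached.value

    val result = computeMax(column.values, skipNaN)
    if (skipNaN) {
        column.maxSkippingNaN = CachedValue(result)
    } else {
        column.maxWithNaN = CachedValue(result)
    }
    return result
}

fun main() {
    val column = DoubleColumnSketch(listOf(1.0, Double.NaN, 3.0))
    println(cachedMax(column, skipNaN = true))  // 3.0
    println(cachedMax(column, skipNaN = false)) // NaN
    println(cachedMax(column, skipNaN = true))  // 3.0, served from the cache
    println(computations)                       // 2
}
```

The example column shows why a single cache would not be enough: the same data gives 3.0 with skipNaN = true and NaN with skipNaN = false.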

For the moment only max is covered, but the same approach would be easy to apply to min, sum, mean and median. Something similar could be done for percentile and std.

CarloMariaProietti, Dec 11 '25 18:12