data-forge-ts icon indicating copy to clipboard operation
data-forge-ts copied to clipboard

Rolling window with select function not behaving correctly

Open e-tang opened this issue 3 years ago • 4 comments

When using rollingwindow with the select function, the rolling windows are correct but select function calls the transformer more than it should.

Please see the following test code:

import { assert, expect } from 'chai';
import 'mocha';
import * as dataForge from './index';
import { DataFrame, Series, ISeries, Index } from './index';

var df = new DataFrame({ index: [10, 20, 30, 40, 50], values: [0, 1, 2, 3, 4] });
var counter = 0;
var windows = df
.rollingWindow(2);

var testSeries = windows.select((window) => {
    console.log("counter:" + counter);
    return [window.getIndex().last(), ++counter];
})   
.withIndex(pair => pair[0])
.inflate(pair => pair[1]);

The following are the outputs:

counter:0
counter:1
counter:2
counter:3
counter:4
counter:5
counter:6
counter:7
counter:8
__index__
---------
20       
30       
40       
50  

I think only 4 times should be the correct number.

e-tang avatar Mar 02 '22 05:03 e-tang

Based on the testing, particularly the first window got called twice

e-tang avatar Mar 02 '22 06:03 e-tang

Tried to use map function on Series directly it turned out the same, the function got called more than it should. So it should be the bug in the iterator.

e-tang avatar Mar 02 '22 06:03 e-tang

OKay, figured out why this happened.

After going through the source code, find out whenever the series is baked, the internal iterators are called and trigger the selector function. During this process, the selector function can be call numerous times from getColumnNames(), toArray(), and toPairs().

e-tang avatar Mar 02 '22 12:03 e-tang

I believe this is due to Data-Forge's laziness and you need to write the transformer function so that it can execute without side effects.

I do realise this is a problem and I'm considering removing the the laziness in Data-Forge v2.

I've labelled this issue and will return to it for v2.

ashleydavis avatar Mar 03 '22 22:03 ashleydavis