tsdownsample icon indicating copy to clipboard operation
tsdownsample copied to clipboard

Rust MinMaxDownsampler seems to miss the global maxima

Open LeiRui opened this issue 2 years ago • 1 comments

Hi, I run into the problem that the MinMaxDownsampler seems to miss the global maxima with the following test code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from tsdownsample import M4Downsampler,EveryNthDownsampler,LTTBDownsampler,MinMaxDownsampler # MinMaxLTTBDownsampler

def read_ucr(filename,col=0): # col starting from 0
    df=pd.read_csv(filename, header=None, delimiter=r"\s+")
    df2=df.T
    v=df2.iloc[:,col]
    v=v.to_numpy()
    t = np.linspace(0, 10, num=len(v))
    return t,v

filename="sample.txt"
t,v=read_ucr(filename,1)
print(max(v)) # 78.546175

s_ds = MinMaxDownsampler().downsample(t, v, n_out=4)
# Select downsampled data
t_ds = t[s_ds]
v_ds = v[s_ds]
print(max(v_ds)) # 27.700287

The original maximal value is 78.546175 while the downsampled maximal value is 27.700287. I don't know where the bug is. Hope to get some suggestions.

sample.txt

LeiRui avatar Nov 25 '23 09:11 LeiRui

Hi @LeiRui,

Thank you for bringing up this issue!

Upon investigation, it appears that the bug is caused by the numpy array not being contiguous. You can determine if a numpy array is contiguous by using the following code: v.flags['C_CONTIGUOUS']. This issue will not manifest itself if you create a new numpy array that contains the data of v (np.array(v)).

To fix this, I guess we should check if the array is contiguous - and when this is not the case, the scalar implementation should be executed.

We should also definitely include some tests for this behavior!

Cheers, Jeroen

jvdd avatar Nov 25 '23 19:11 jvdd