Anomaly score changes with length of the same input signal
- Python version: 3.8.13
- Operating System: Windows 10
Description
When predicting anomaly scores on a dataset with the score_anomalies() function after training a TadGAN model, I find that different lengths of the same series lead to different results. For example, if the signal X has 200 data points, the score at position 100 differs when passing X[:] versus X[50:150]. The sliding window size is set to 50.
Do you have any idea what might cause this problem? I have checked issue #288, but the two cases seem quite different.
Thanks.
Hi @dxiaos, thank you for raising this question!
I suspect that the "windowing" concept in score_anomalies is the reason. By default, both the critic window and the error window are set to 1% of the length of the series.
https://github.com/sintel-dev/Orion/blob/a00440c360309ec27fdd42b35b12f64adcdae189/orion/primitives/tadgan.py#L460-L461
You can specify the exact window you wish to use, e.g. critic_smooth_window=50.
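To see why this default makes the two runs disagree, here is a hypothetical sketch of a 1%-of-length default (the exact computation lives in the tadgan.py lines linked above): the full series and a slice of it get different smoothing windows unless you pin the window explicitly.

```python
import math

# Hypothetical sketch of the "1% of the series length" default described
# above; the exact computation is in the linked tadgan.py lines.
def default_window(n_samples, window=None):
    # Use the caller's window when given, otherwise 1% of the series length.
    return window or math.trunc(n_samples * 0.01)

print(default_window(2000))            # full series: window of 20
print(default_window(100))             # 100-point slice: window of 1
print(default_window(100, window=50))  # pinned explicitly: always 50
```

Passing critic_smooth_window and error_smooth_window explicitly, as suggested above, removes this source of variation.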
To test whether the exact values match up, I recommend isolating the score_anomalies function and testing it with a fixed input. For example, here is a quick sketch of how to do this:
import numpy as np
from orion.primitives.tadgan import score_anomalies
X = np.random.random((2000, 50, 1))
critic_smooth_window = 50
error_smooth_window = 50
long_errors, _, _, _ = score_anomalies(X, ..,
                                       critic_smooth_window=critic_smooth_window,
                                       error_smooth_window=error_smooth_window)
short_errors, _, _, _ = score_anomalies(X[50:150], ..,
                                        critic_smooth_window=critic_smooth_window,
                                        error_smooth_window=error_smooth_window)
@sarahmish Thank you for your suggestion above! I have tried setting the parameters for score_anomalies(), but the calculated results still show some differences.
Here is the sample code I used:
import numpy as np
from orion.primitives.tadgan import score_anomalies
seed_value = 1
np.random.seed(seed_value)
X = np.random.random((2000, 50, 1))
y_hat = np.random.random((2000, 50, 1))
critic = np.random.random((2000, 50, 1))
X_index = np.arange(2000)
comb = 'mult'
critic_smooth_window = 50
error_smooth_window = 50
long_errors, _, _, _ = score_anomalies(
    X,
    y_hat,
    critic,
    X_index,
    comb=comb,
    critic_smooth_window=critic_smooth_window,
    error_smooth_window=error_smooth_window
)
short_errors, _, _, _ = score_anomalies(
    X[50:150],
    y_hat[50:150],
    critic[50:150],
    X_index[50:150],
    comb=comb,
    critic_smooth_window=critic_smooth_window,
    error_smooth_window=error_smooth_window
)
You can try this in your environment and check the results.
Hi @dxiaosa! Apologies for taking long to reply.
I revisited the source code for score_anomalies, and you are correct: the scores will not match. In mult mode, the function normalizes the final result at the end, which makes the two runs differ.
https://github.com/sintel-dev/Orion/blob/4da126f14ad09890faa68c090f9439f3ed64144b/orion/primitives/tadgan.py#L505
However, the general shape (peaks and valleys) should follow the same trajectory.
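A small self-contained sketch (plain Python standing in for scipy.stats.zscore, which the linked line uses) shows why: z-scoring uses the mean and standard deviation of whatever slice it is given, so the values shift and rescale, while the relative ordering of peaks and valleys is preserved.

```python
import random
import statistics

random.seed(0)
scores = [random.random() for _ in range(200)]  # stand-in anomaly scores

def zscore(xs):
    # Same idea as scipy.stats.zscore: subtract the mean, divide by the std.
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]

full = zscore(scores)           # normalized over all 200 points
short = zscore(scores[50:150])  # normalized over the 100-point slice only

# The values differ, because mean/std come from different spans...
print(max(abs(f - s) for f, s in zip(full[50:150], short)) > 1e-6)  # True

# ...but the ordering of peaks and valleys is identical, since z-scoring is
# an affine transform of the same underlying values.
rank = lambda xs: sorted(range(len(xs)), key=xs.__getitem__)
print(rank(full[50:150]) == rank(short))  # True
```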
I made a colab notebook here to help clarify the results.
Moreover, if comb is set to use the reconstruction score only (rec), I believe you will get the same output; you can test it out!
Hello @sarahmish, thank you for checking this so patiently!
Yes, I tested your colab notebook and found that even after changing to comb='rec', plotting the results still shows small differences between the long and short sequences, as shown below.
Since stats.zscore() only performs normalization/scaling, yet the results still differ, I wonder whether something in the reconstruction score computation is responsible?
https://github.com/sintel-dev/Orion/blob/4da126f14ad09890faa68c090f9439f3ed64144b/orion/primitives/tadgan.py#L502
Thanks.
That is slightly odd, @dxiaosa. I imagine it has to do with whether or not the edges of the window are inclusive in the calculation. I'll investigate this a bit further and get back to you.
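To make the edge-effect hypothesis concrete, here is a minimal sketch (not Orion's actual smoothing code) of a centered moving average whose windows are truncated at the series boundaries. Slicing the input moves those boundaries, so points near the cut disagree while interior points match exactly:

```python
# Hedged sketch: a simple centered smoother with truncated edge windows,
# illustrating how edge handling alone makes a sub-series disagree with
# the full series near the cut points.
def smooth(xs, w=5):
    half = w // 2
    out = []
    for i in range(len(xs)):
        lo = max(0, i - half)
        hi = min(len(xs), i + half + 1)  # window truncated at the edges
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out

series = [float(i % 7) for i in range(40)]
full = smooth(series)
short = smooth(series[10:30])

print(full[10] == short[0])   # False: short's first window is truncated
print(full[20] == short[10])  # True: far from both boundaries
```

If score_anomalies's reconstruction-error smoothing behaves similarly near boundaries, only the first and last few window-lengths of each run would differ, which matches the small discrepancies observed in the plots.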
Thank you for your patience!
Hi @dxiaosa, apologies for the delay!
I updated the notebook to fix the issues we were observing.
- First, the main difference between the two runs is that score_anomalies does in fact rescale the data, so you will observe the same shape but on different scales.
- Second, I added reconstruction_errors to show that, in that function, we get the same output.
If you have further questions, please let me know!