[HELP Request] How to use multivariate softDTW barycenter?

Open RafayAK opened this issue 4 years ago • 1 comments

I have a problem similar to the one asked in 87, i.e. my data set looks as follows, where each row is a point in time and the sequence column says which sequence that row belongs to:

sequence x0 y0 z0 x1 y1 z1 x2 y2 z2 ... z17 x18 y18 z18 x19 y19 z19 x20 y20 z20
01.avi 0.709314 0.595564 0.0 0.654333 0.577397 0.014865 0.599553 0.606103 0.006795 ... -0.091750 0.636288 0.801407 -0.114206 0.620451 0.853457 -0.118281 0.608881 0.899898 -0.119697
01.avi 0.719498 0.590370 0.0 0.638796 0.589439 0.032860 0.589824 0.661405 0.037496 ... -0.060961 0.666572 0.975270 -0.060586 0.655359 1.046923 -0.047032 0.649768 1.090954 -0.036204
01.avi 0.765117 0.503310 0.0 0.688468 0.490631 0.007665 0.633087 0.559083 -0.004564 ... -0.094739 0.706533 0.864469 -0.105250 0.692404 0.913337 -0.092633 0.683403 0.929996 -0.081646
02.avi 0.847238 0.123516 0.0 0.802507 0.064425 0.013631 0.762274 0.058606 0.016534 ... -0.046354 0.735729 0.338295 -0.041054 0.740595 0.307241 -0.028013 0.758677 0.274317 -0.017921
02.avi 0.837651 0.125420 0.0 0.792646 0.065793 0.023380 0.755813 0.068872 0.030797 ... -0.033827 0.746565 0.347425 -0.025514 0.745967 0.349261 -0.013675 0.754414 0.329223 -0.005807

According to the documentation (https://tslearn.readthedocs.io/en/stable/variablelength.html?highlight=variable%20lenght), I can do something like this:

import numpy as np
from tslearn.utils import to_time_series_dataset

# Each top-level entry is one time series.
# Each inner list is one time step holding the feature values (3 features here);
# the number of inner lists is the length of that series.
X = to_time_series_dataset([
    [[1,1,1], [1,1,1], [1,1,1]],   # time series of length 3
    [[1,1,1], [1,1,1]],   # time series of length 2
    [[4,4,4], [4,4,4]],   # time series of length 2
])
y = np.array([0, 0, 1])

X.shape
>> (3, 3, 3)

X
>> array([[[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.]],

       [[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [nan, nan, nan]],

       [[ 4.,  4.,  4.],
        [ 4.,  4.,  4.],
        [nan, nan, nan]]])

The conversion to a time-series dataset looks fine, but I get an error when I try to run the following:

from tslearn.barycenters import softdtw_barycenter
from tslearn.utils import ts_zeros

initial_barycenter = ts_zeros(sz=5)
bar = softdtw_barycenter(X, init=initial_barycenter)
...
...
...
~/anaconda3/envs/pytorch-build/lib/python3.8/site-packages/sklearn/metrics/pairwise.py in euclidean_distances(X, Y, Y_norm_squared, squared, X_norm_squared)
    260     paired_distances : distances betweens pairs of elements of X and Y.
    261     """
--> 262     X, Y = check_pairwise_arrays(X, Y)
    263 
    264     # If norms are passed as float32, they are unused. If arrays are passed as

~/anaconda3/envs/pytorch-build/lib/python3.8/site-packages/sklearn/metrics/pairwise.py in check_pairwise_arrays(X, Y, precomputed, dtype, accept_sparse, force_all_finite, copy)
    151                              (X.shape[0], X.shape[1], Y.shape[0]))
    152     elif X.shape[1] != Y.shape[1]:
--> 153         raise ValueError("Incompatible dimension for X and Y matrices: "
    154                          "X.shape[1] == %d while Y.shape[1] == %d" % (
    155                              X.shape[1], Y.shape[1]))

ValueError: Incompatible dimension for X and Y matrices: X.shape[1] == 1 while Y.shape[1] == 3

What am I doing wrong?

RafayAK, Oct 14 '21 12:10

I think the nested structure of the sequences is not supported. You can either compute a barycenter for each set of time series separately, or pass the data as seven separate time series instead of nesting them.

pattern1:
X1 = to_time_series_dataset([[1,1,1], [1,1,1], [1,1,1]])
X2 = to_time_series_dataset([[1,1,1], [1,1,1]])
X3 = to_time_series_dataset([[4,4,4], [4,4,4]])
bar = [softdtw_barycenter(X, init=initial_barycenter) for X in [X1, X2, X3]]

pattern2:
X = to_time_series_dataset(
    [
        [1,1,1], [1,1,1], [1,1,1], [1,1,1], [1,1,1], [4,4,4], [4,4,4]
    ]
)
bar = softdtw_barycenter(X, init=initial_barycenter)

masatakashiwagi, Oct 25 '21 12:10