MLDataPattern.jl

Eachbatch on slidingwindow returns unexpected results

Open adinhobl opened this issue 5 years ago • 1 comments

I've been stuck on this for a while now and just traced it back to the behavior of eachbatch being different than what I would expect.

My data is shown below. Each row presents the values of all features at that timestep. I am using a shortened dataset below to provide an example.

5×19 DataFrame
 Row │ p (mbar)  T (degC)  Tpot (K)  Tdew (degC)  rh (%)    VPmax (mbar)  VPact (mbar)  VPdef (mbar)  sh (g/kg)  ⋯
     │ Float64   Float64   Float64   Float64      Float64   Float64       Float64       Float64       Float64    ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.945308  -1.98247  -2.04189     -1.91897  1.1171        -1.30285      -1.47732     -0.790424   -1.48004  ⋯
   2 │ 0.95977   -2.07837  -2.13817     -2.06096  1.04462       -1.33014      -1.53435     -0.786272   -1.53619
   3 │ 0.986284  -2.07028  -2.13244     -2.04519  1.06274       -1.32884      -1.52723     -0.788348   -1.5287
   4 │ 1.00436   -2.09801  -2.16109     -2.09682  1.00838       -1.33664      -1.54624     -0.782121   -1.54742
   5 │ 1.06101   -2.16503  -2.23215     -2.18718  0.984214      -1.35354      -1.5795      -0.782121   -1.58111  ⋯

My data is formatted using a slidingwindow where I have 19 features, and I want to predict the value of one element from the next timestep given just the current step as history. So I expect slidingwindow to produce samples where my data is a tuple of arrays, where each array is 19 x 1.

julia> data = slidingwindow(i -> i+h:i+h+f-1, Array(df)', h, stride=1)
4-element slidingwindow(::var"#49#50", ::LinearAlgebra.Adjoint{Float64,Array{Float64,2}}, 1) with eltype Tuple:
 ([0.945307599461624; -1.9824732337923627; … ; -0.06105235998429366; 1.4284340764807328], [0.9597698467321998; -2.0783721131577844; … ; -0.060029350634975095; 1.4284235902887745])
 ([0.9597698467321998; -2.0783721131577844; … ; -0.060029350634975095; 1.4284235902887745], [0.9862839667282576; -2.0702842558619055; … ; -0.05900634918042319; 1.4284123819361194])
 ([0.9862839667282576; -2.0702842558619055; … ; -0.05900634918042319; 1.4284123819361194], [1.0043617758164738; -2.098014052304919; … ; -0.05798335614623251; 1.4284004514285258])
 ([1.0043617758164738; -2.098014052304919; … ; -0.05798335614623251; 1.4284004514285258], [1.06100557762623; -2.165027727042202; … ; -0.0569603720579933; 1.4283877987721232])

Indeed, that is what it returns. I now have 4 tuples, each holding a 19 features x 1 timestep array for both X and y. Technically, the type of each array is 19×1 view(::LinearAlgebra.Adjoint{Float64,Array{Float64,2}}, :, 2:2) with eltype Float64.
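For readers without the package at hand, the pairing described above can be imitated in plain Julia with views. This is only a sketch on random stand-in data, not MLDataPattern's implementation; `A`, `window_pairs`, and the choice h = f = 1 are assumptions made for illustration (the window size of 1 and the 19×1 targets in the output above suggest those values).

```julia
# Stand-in for Array(df)': 19 features × 5 timesteps of random values.
A = rand(19, 5)

# Assumed from the output above: one step of history, one step of target.
h, f = 1, 1

# Each sample pairs a 19×h view of the history with a 19×f view of the target.
window_pairs = [(view(A, :, i:i+h-1), view(A, :, i+h:i+h+f-1))
                for i in 1:size(A, 2) - h - f + 1]

@show length(window_pairs)        # 4 samples from 5 timesteps
@show size(window_pairs[1][1])    # (19, 1)
```

With a stride of 1, the target of one sample is the same column as the input of the next, which is worth keeping in mind when reading the batched output further down.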

Now, in my training loop, I would like to iterate over these samples and train a model on each X, y pair. If I do:

julia> for i in data
           @show i
       end
i = ([0.945307599461624; -1.9824732337923627; -2.0418884431001203; -1.9189727676846546; 1.1171015227337124; -1.3028511908231284; -1.4773232104641594; -0.7904236214710557; -1.480036367036059; -1.4826972088343169; 2.2185238106952334; 0.19340923901506027; 0.2211612940667541; 0.11114045471718151; 0.21792787317689125; 0.3661105594628274; 1.3660687962126672; -0.06105235998429366; 1.4284340764807328], [0.9597698467321998; -2.0783721131577844; -2.138166319274252; -2.0609637294850622; 1.0446173421369407; -1.330142569548392; -1.5343539122850658; -0.7862722979194303; -1.5361898079543617; -1.5390345155803171; 2.3257075520585757; 0.17298677377550387; 0.22210086635262455; 0.10945824511441363; 0.22779849852930875; 0.707199726148972; 1.224794368206394; -0.060029350634975095; 1.4284235902887745])
i = ([0.9597698467321998; -2.0783721131577844; -2.138166319274252; -2.0609637294850622; 1.0446173421369407; -1.330142569548392; -1.5343539122850658; -0.7862722979194303; -1.5361898079543617; -1.5390345155803171; 2.3257075520585757; 0.17298677377550387; 0.22210086635262455; 0.10945824511441363; 0.22779849852930875; 0.707199726148972; 1.224794368206394; -0.060029350634975095; 1.4284235902887745], [0.9862839667282576; -2.0702842558619055; -2.1324354933115113; -2.0451869559516838; 1.0627383872861333; -1.3288429800852843; -1.5272250745574525; -0.788347959695243; -1.5287026824985879; -1.531992352237067; 2.3239984719001643; 0.20798270241233952; 0.27626601473376766; 0.11121805128964106; 0.3240784159666602; 1.000099633755878; 1.0000592075027501; -0.05900634918042319; 1.4284123819361194])
i = ([0.9862839667282576; -2.0702842558619055; -2.1324354933115113; -2.0451869559516838; 1.0627383872861333; -1.3288429800852843; -1.5272250745574525; -0.788347959695243; -1.5287026824985879; -1.531992352237067; 2.3239984719001643; 0.20798270241233952; 0.27626601473376766; 0.11121805128964106; 0.3240784159666602; 1.000099633755878; 1.0000592075027501; -0.05900634918042319; 1.4284123819361194], [1.0043617758164738; -2.098014052304919; -2.1610896231252417; -2.096820032970014; 1.0083752518385543; -1.3366405168639308; -1.5462353084977547; -0.7821209743678049; -1.5474204961380218; -1.5531188422668172; 2.3589125379934575; 0.27034296641382005; 0.19526654910958366; 0.24690733563781467; 0.1451755627164811; 1.2248496376407976; 0.7071786439053358; -0.05798335614623251; 1.4284004514285258])
i = ([1.0043617758164738; -2.098014052304919; -2.1610896231252417; -2.096820032970014; 1.0083752518385543; -1.3366405168639308; -1.5462353084977547; -0.7821209743678049; -1.5474204961380218; -1.5531188422668172; 2.3589125379934575; 0.27034296641382005; 0.19526654910958366; 0.24690733563781467; 0.1451755627164811; 1.2248496376407976; 0.7071786439053358; -0.05798335614623251; 1.4284004514285258], [1.06100557762623; -2.165027727042202; -2.232151865063287; -2.1871779177520922; 0.9842138583062976; -1.3535351798843323; -1.5795032178932833; -0.7821209743678049; -1.5811125606890033; -1.5859822712019838; 2.446319780380877; 0.11226402369212508; 0.35081815405696193; 0.048640493357961015; 0.40205326621233467; 1.366133396461856; 0.3661120037946213; -0.0569603720579933; 1.4283877987721232])

to imitate my training loop, everything works well and I can process each sample as I would expect. I would like to do this operation using eachbatch because it is less memory-intensive and the long term goal is to batch these samples together. When I do the same with eachbatch of the data I expect roughly the same thing to happen, and the docs lead me to believe that my tuples will be passed to me 1-by-1.

If I try it manually specifying size=1, it seems to return what I want:

julia> for i in eachbatch(data, size=1)
           @show i
       end
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.945307599461624; -1.9824732337923627; -2.0418884431001203; -1.9189727676846546; 1.1171015227337124; -1.3028511908231284; -1.4773232104641594; -0.7904236214710557; -1.480036367036059; -1.4826972088343169; 2.2185238106952334; 0.19340923901506027; 0.2211612940667541; 0.11114045471718151; 0.21792787317689125; 0.3661105594628274; 1.3660687962126672; -0.06105235998429366; 1.4284340764807328]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9597698467321998; -2.0783721131577844; -2.138166319274252; -2.0609637294850622; 1.0446173421369407; -1.330142569548392; -1.5343539122850658; -0.7862722979194303; -1.5361898079543617; -1.5390345155803171; 2.3257075520585757; 0.17298677377550387; 0.22210086635262455; 0.10945824511441363; 0.22779849852930875; 0.707199726148972; 1.224794368206394; -0.060029350634975095; 1.4284235902887745]])
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9597698467321998; -2.0783721131577844; -2.138166319274252; -2.0609637294850622; 1.0446173421369407; -1.330142569548392; -1.5343539122850658; -0.7862722979194303; -1.5361898079543617; -1.5390345155803171; 2.3257075520585757; 0.17298677377550387; 0.22210086635262455; 0.10945824511441363; 0.22779849852930875; 0.707199726148972; 1.224794368206394; -0.060029350634975095; 1.4284235902887745]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9862839667282576; -2.0702842558619055; -2.1324354933115113; -2.0451869559516838; 1.0627383872861333; -1.3288429800852843; -1.5272250745574525; -0.788347959695243; -1.5287026824985879; -1.531992352237067; 2.3239984719001643; 0.20798270241233952; 0.27626601473376766; 0.11121805128964106; 0.3240784159666602; 1.000099633755878; 1.0000592075027501; -0.05900634918042319; 1.4284123819361194]])
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9862839667282576; -2.0702842558619055; -2.1324354933115113; -2.0451869559516838; 1.0627383872861333; -1.3288429800852843; -1.5272250745574525; -0.788347959695243; -1.5287026824985879; -1.531992352237067; 2.3239984719001643; 0.20798270241233952; 0.27626601473376766; 0.11121805128964106; 0.3240784159666602; 1.000099633755878; 1.0000592075027501; -0.05900634918042319; 1.4284123819361194]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[1.0043617758164738; -2.098014052304919; -2.1610896231252417; -2.096820032970014; 1.0083752518385543; -1.3366405168639308; -1.5462353084977547; -0.7821209743678049; -1.5474204961380218; -1.5531188422668172; 2.3589125379934575; 0.27034296641382005; 0.19526654910958366; 0.24690733563781467; 0.1451755627164811; 1.2248496376407976; 0.7071786439053358; -0.05798335614623251; 1.4284004514285258]])
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[1.0043617758164738; -2.098014052304919; -2.1610896231252417; -2.096820032970014; 1.0083752518385543; -1.3366405168639308; -1.5462353084977547; -0.7821209743678049; -1.5474204961380218; -1.5531188422668172; 2.3589125379934575; 0.27034296641382005; 0.19526654910958366; 0.24690733563781467; 0.1451755627164811; 1.2248496376407976; 0.7071786439053358; -0.05798335614623251; 1.4284004514285258]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[1.06100557762623; -2.165027727042202; -2.232151865063287; -2.1871779177520922; 0.9842138583062976; -1.3535351798843323; -1.5795032178932833; -0.7821209743678049; -1.5811125606890033; -1.5859822712019838; 2.446319780380877; 0.11226402369212508; 0.35081815405696193; 0.048640493357961015; 0.40205326621233467; 1.366133396461856; 0.3661120037946213; -0.0569603720579933; 1.4283877987721232]])

If I exclude size=1, I would expect that to be implicit, per the docs, but that's not what happens:

julia> for i in eachbatch(data)
           @show i 
       end
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.945307599461624 0.9597698467321998; -1.9824732337923627 -2.0783721131577844; -2.0418884431001203 -2.138166319274252; -1.9189727676846546 -2.0609637294850622; 1.1171015227337124 1.0446173421369407; -1.3028511908231284 -1.330142569548392; -1.4773232104641594 -1.5343539122850658; -0.7904236214710557 -0.7862722979194303; -1.480036367036059 -1.5361898079543617; -1.4826972088343169 -1.5390345155803171; 2.2185238106952334 2.3257075520585757; 0.19340923901506027 0.17298677377550387; 0.2211612940667541 0.22210086635262455; 0.11114045471718151 0.10945824511441363; 0.21792787317689125 0.22779849852930875; 0.3661105594628274 0.707199726148972; 1.3660687962126672 1.224794368206394; -0.06105235998429366 -0.060029350634975095; 1.4284340764807328 1.4284235902887745]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9597698467321998 0.9862839667282576; -2.0783721131577844 -2.0702842558619055; -2.138166319274252 -2.1324354933115113; -2.0609637294850622 -2.0451869559516838; 1.0446173421369407 1.0627383872861333; -1.330142569548392 -1.3288429800852843; -1.5343539122850658 -1.5272250745574525; -0.7862722979194303 -0.788347959695243; -1.5361898079543617 -1.5287026824985879; -1.5390345155803171 -1.531992352237067; 2.3257075520585757 2.3239984719001643; 0.17298677377550387 0.20798270241233952; 0.22210086635262455 0.27626601473376766; 0.10945824511441363 0.11121805128964106; 0.22779849852930875 0.3240784159666602; 0.707199726148972 1.000099633755878; 1.224794368206394 1.0000592075027501; -0.060029350634975095 -0.05900634918042319; 1.4284235902887745 1.4284123819361194]])
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9862839667282576 1.0043617758164738; -2.0702842558619055 -2.098014052304919; -2.1324354933115113 -2.1610896231252417; -2.0451869559516838 -2.096820032970014; 1.0627383872861333 1.0083752518385543; -1.3288429800852843 -1.3366405168639308; -1.5272250745574525 -1.5462353084977547; -0.788347959695243 -0.7821209743678049; -1.5287026824985879 -1.5474204961380218; -1.531992352237067 -1.5531188422668172; 2.3239984719001643 2.3589125379934575; 0.20798270241233952 0.27034296641382005; 0.27626601473376766 0.19526654910958366; 0.11121805128964106 0.24690733563781467; 0.3240784159666602 0.1451755627164811; 1.000099633755878 1.2248496376407976; 1.0000592075027501 0.7071786439053358; -0.05900634918042319 -0.05798335614623251; 1.4284123819361194 1.4284004514285258]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[1.0043617758164738 1.06100557762623; -2.098014052304919 -2.165027727042202; -2.1610896231252417 -2.232151865063287; -2.096820032970014 -2.1871779177520922; 1.0083752518385543 0.9842138583062976; -1.3366405168639308 -1.3535351798843323; -1.5462353084977547 -1.5795032178932833; -0.7821209743678049 -0.7821209743678049; -1.5474204961380218 -1.5811125606890033; -1.5531188422668172 -1.5859822712019838; 2.3589125379934575 2.446319780380877; 0.27034296641382005 0.11226402369212508; 0.19526654910958366 0.35081815405696193; 0.24690733563781467 0.048640493357961015; 0.1451755627164811 0.40205326621233467; 1.2248496376407976 1.366133396461856; 0.7071786439053358 0.3661120037946213; -0.05798335614623251 -0.0569603720579933; 1.4284004514285258 1.4283877987721232]])

When I do that, it only passes me back two samples, and each one looks like it has transformed my (X,y) tuples by concatenating an X,y pair together and passing it back as the new X, and then concatenating the next X,y pair together and passing it back as the new y.

This same result is returned if I use size=2 as an argument.

julia> for i in eachbatch(data, size=2)
           @show i 
       end
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.945307599461624 0.9597698467321998; -1.9824732337923627 -2.0783721131577844; -2.0418884431001203 -2.138166319274252; -1.9189727676846546 -2.0609637294850622; 1.1171015227337124 1.0446173421369407; -1.3028511908231284 -1.330142569548392; -1.4773232104641594 -1.5343539122850658; -0.7904236214710557 -0.7862722979194303; -1.480036367036059 -1.5361898079543617; -1.4826972088343169 -1.5390345155803171; 2.2185238106952334 2.3257075520585757; 0.19340923901506027 0.17298677377550387; 0.2211612940667541 0.22210086635262455; 0.11114045471718151 0.10945824511441363; 0.21792787317689125 0.22779849852930875; 0.3661105594628274 0.707199726148972; 1.3660687962126672 1.224794368206394; -0.06105235998429366 -0.060029350634975095; 1.4284340764807328 1.4284235902887745]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9597698467321998 0.9862839667282576; -2.0783721131577844 -2.0702842558619055; -2.138166319274252 -2.1324354933115113; -2.0609637294850622 -2.0451869559516838; 1.0446173421369407 1.0627383872861333; -1.330142569548392 -1.3288429800852843; -1.5343539122850658 -1.5272250745574525; -0.7862722979194303 -0.788347959695243; -1.5361898079543617 -1.5287026824985879; -1.5390345155803171 -1.531992352237067; 2.3257075520585757 2.3239984719001643; 0.17298677377550387 0.20798270241233952; 0.22210086635262455 0.27626601473376766; 0.10945824511441363 0.11121805128964106; 0.22779849852930875 0.3240784159666602; 0.707199726148972 1.000099633755878; 1.224794368206394 1.0000592075027501; -0.060029350634975095 -0.05900634918042319; 1.4284235902887745 1.4284123819361194]])
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9862839667282576 1.0043617758164738; -2.0702842558619055 -2.098014052304919; -2.1324354933115113 -2.1610896231252417; -2.0451869559516838 -2.096820032970014; 1.0627383872861333 1.0083752518385543; -1.3288429800852843 -1.3366405168639308; -1.5272250745574525 -1.5462353084977547; -0.788347959695243 -0.7821209743678049; -1.5287026824985879 -1.5474204961380218; -1.531992352237067 -1.5531188422668172; 2.3239984719001643 2.3589125379934575; 0.20798270241233952 0.27034296641382005; 0.27626601473376766 0.19526654910958366; 0.11121805128964106 0.24690733563781467; 0.3240784159666602 0.1451755627164811; 1.000099633755878 1.2248496376407976; 1.0000592075027501 0.7071786439053358; -0.05900634918042319 -0.05798335614623251; 1.4284123819361194 1.4284004514285258]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[1.0043617758164738 1.06100557762623; -2.098014052304919 -2.165027727042202; -2.1610896231252417 -2.232151865063287; -2.096820032970014 -2.1871779177520922; 1.0083752518385543 0.9842138583062976; -1.3366405168639308 -1.3535351798843323; -1.5462353084977547 -1.5795032178932833; -0.7821209743678049 -0.7821209743678049; -1.5474204961380218 -1.5811125606890033; -1.5531188422668172 -1.5859822712019838; 2.3589125379934575 2.446319780380877; 0.27034296641382005 0.11226402369212508; 0.19526654910958366 0.35081815405696193; 0.24690733563781467 0.048640493357961015; 0.1451755627164811 0.40205326621233467; 1.2248496376407976 1.366133396461856; 0.7071786439053358 0.3661120037946213; -0.05798335614623251 -0.0569603720579933; 1.4284004514285258 1.4283877987721232]])

Also, if I scale up batch sizes for time series, I would expect my (X,y) tuples of 19x1 arrays to become 19x1xn arrays, where n is the batch size, rather than the 19xn arrays I actually get as I increase the size.

julia> for i in eachbatch(data, size=3)
           @show i 
       end
i = (SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.945307599461624 0.9597698467321998 0.9862839667282576; -1.9824732337923627 -2.0783721131577844 -2.0702842558619055; -2.0418884431001203 -2.138166319274252 -2.1324354933115113; -1.9189727676846546 -2.0609637294850622 -2.0451869559516838; 1.1171015227337124 1.0446173421369407 1.0627383872861333; -1.3028511908231284 -1.330142569548392 -1.3288429800852843; -1.4773232104641594 -1.5343539122850658 -1.5272250745574525; -0.7904236214710557 -0.7862722979194303 -0.788347959695243; -1.480036367036059 -1.5361898079543617 -1.5287026824985879; -1.4826972088343169 -1.5390345155803171 -1.531992352237067; 2.2185238106952334 2.3257075520585757 2.3239984719001643; 0.19340923901506027 0.17298677377550387 0.20798270241233952; 0.2211612940667541 0.22210086635262455 0.27626601473376766; 0.11114045471718151 0.10945824511441363 0.11121805128964106; 0.21792787317689125 0.22779849852930875 0.3240784159666602; 0.3661105594628274 0.707199726148972 1.000099633755878; 1.3660687962126672 1.224794368206394 1.0000592075027501; -0.06105235998429366 -0.060029350634975095 -0.05900634918042319; 1.4284340764807328 1.4284235902887745 1.4284123819361194]], SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},Array{Int64,1}},false}[[0.9597698467321998 0.9862839667282576 1.0043617758164738; -2.0783721131577844 -2.0702842558619055 -2.098014052304919; -2.138166319274252 -2.1324354933115113 -2.1610896231252417; -2.0609637294850622 -2.0451869559516838 -2.096820032970014; 1.0446173421369407 1.0627383872861333 1.0083752518385543; -1.330142569548392 -1.3288429800852843 -1.3366405168639308; -1.5343539122850658 -1.5272250745574525 -1.5462353084977547; -0.7862722979194303 -0.788347959695243 -0.7821209743678049; -1.5361898079543617 -1.5287026824985879 -1.5474204961380218; -1.5390345155803171 -1.531992352237067 -1.5531188422668172; 
2.3257075520585757 2.3239984719001643 2.3589125379934575; 0.17298677377550387 0.20798270241233952 0.27034296641382005; 0.22210086635262455 0.27626601473376766 0.19526654910958366; 0.10945824511441363 0.11121805128964106 0.24690733563781467; 0.22779849852930875 0.3240784159666602 0.1451755627164811; 0.707199726148972 1.000099633755878 1.2248496376407976; 1.224794368206394 1.0000592075027501 0.7071786439053358; -0.060029350634975095 -0.05900634918042319 -0.05798335614623251; 1.4284235902887745 1.4284123819361194 1.4284004514285258]])
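To make the two layouts concrete, here is a minimal sketch in plain Julia that builds both from the same windows. It uses random stand-in data and only Base functions; the names `windows`, `batch`, and `flat` are hypothetical and not part of MLDataPattern.

```julia
# Stand-in for three 19×1 window views (random values, not the weather data).
windows = [rand(19, 1) for _ in 1:3]

# Stacking along a new third dimension keeps the timestep axis:
# 19 features × 1 timestep × 3 samples.
batch = cat(windows...; dims=3)
@show size(batch)    # (19, 1, 3)

# Concatenating horizontally merges timestep and sample axes into one,
# which is the 19×n shape seen in the output above.
flat = reduce(hcat, windows)
@show size(flat)     # (19, 3)
```

The difference only matters once a window spans more than one timestep, because in the 19×n layout the boundary between samples is no longer visible.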

I'm not sure if I'm making a simple mistake, or if there is something about slidingwindow that makes this different. I have also tried obsdim = 1, 2, and 3 to see if that made a difference, but it always fails with ERROR: AssertionError: obsdim === default_obsdim(A), so I don't think that's it.

Also, I'm unsure if there is a better suggested method to process and batch timeseries data. I would be happy to hear any such recommendations. :)

Thank you for any help!

adinhobl, Dec 20 '20 16:12

It seems to give me what I would expect if I first convert the sliding window with Array():

julia> data
4-element slidingwindow(::var"#60#61", ::LinearAlgebra.Adjoint{Float64,Array{Float64,2}}, 1, obsdim = 2) with eltype Tuple:
 ([0.945307599461624; -1.9824732337923627; … ; -0.06105235998429366; 1.4284340764807328], [0.9597698467321998; -2.0783721131577844; … ; -0.060029350634975095; 1.4284235902887745])
 ([0.9597698467321998; -2.0783721131577844; … ; -0.060029350634975095; 1.4284235902887745], [0.9862839667282576; -2.0702842558619055; … ; -0.05900634918042319; 1.4284123819361194])
 ([0.9862839667282576; -2.0702842558619055; … ; -0.05900634918042319; 1.4284123819361194], [1.0043617758164738; -2.098014052304919; … ; -0.05798335614623251; 1.4284004514285258])
 ([1.0043617758164738; -2.098014052304919; … ; -0.05798335614623251; 1.4284004514285258], [1.06100557762623; -2.165027727042202; … ; -0.0569603720579933; 1.4283877987721232])

julia> Array(data)
4-element Array{Tuple{SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},false},SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},false}},1}:
 ([0.945307599461624; -1.9824732337923627; … ; -0.06105235998429366; 1.4284340764807328], [0.9597698467321998; -2.0783721131577844; … ; -0.060029350634975095; 1.4284235902887745])
 ([0.9597698467321998; -2.0783721131577844; … ; -0.060029350634975095; 1.4284235902887745], [0.9862839667282576; -2.0702842558619055; … ; -0.05900634918042319; 1.4284123819361194])
 ([0.9862839667282576; -2.0702842558619055; … ; -0.05900634918042319; 1.4284123819361194], [1.0043617758164738; -2.098014052304919; … ; -0.05798335614623251; 1.4284004514285258])
 ([1.0043617758164738; -2.098014052304919; … ; -0.05798335614623251; 1.4284004514285258], [1.06100557762623; -2.165027727042202; … ; -0.0569603720579933; 1.4283877987721232])

julia> for i in eachbatch(Array(data),size=2)
           @show i
       end
i = Tuple{SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},false},SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},false}}[([0.945307599461624; -1.9824732337923627; -2.0418884431001203; -1.9189727676846546; 1.1171015227337124; -1.3028511908231284; -1.4773232104641594; -0.7904236214710557; -1.480036367036059; -1.4826972088343169; 2.2185238106952334; 0.19340923901506027; 0.2211612940667541; 0.11114045471718151; 0.21792787317689125; 0.3661105594628274; 1.3660687962126672; -0.06105235998429366; 1.4284340764807328], [0.9597698467321998; -2.0783721131577844; -2.138166319274252; -2.0609637294850622; 1.0446173421369407; -1.330142569548392; -1.5343539122850658; -0.7862722979194303; -1.5361898079543617; -1.5390345155803171; 2.3257075520585757; 0.17298677377550387; 0.22210086635262455; 0.10945824511441363; 0.22779849852930875; 0.707199726148972; 1.224794368206394; -0.060029350634975095; 1.4284235902887745]), ([0.9597698467321998; -2.0783721131577844; -2.138166319274252; -2.0609637294850622; 1.0446173421369407; -1.330142569548392; -1.5343539122850658; -0.7862722979194303; -1.5361898079543617; -1.5390345155803171; 2.3257075520585757; 0.17298677377550387; 0.22210086635262455; 0.10945824511441363; 0.22779849852930875; 0.707199726148972; 1.224794368206394; -0.060029350634975095; 1.4284235902887745], [0.9862839667282576; -2.0702842558619055; -2.1324354933115113; -2.0451869559516838; 1.0627383872861333; -1.3288429800852843; -1.5272250745574525; -0.788347959695243; -1.5287026824985879; -1.531992352237067; 2.3239984719001643; 0.20798270241233952; 0.27626601473376766; 0.11121805128964106; 0.3240784159666602; 1.000099633755878; 1.0000592075027501; -0.05900634918042319; 1.4284123819361194])]
i = Tuple{SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},false},SubArray{Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},false}}[([0.9862839667282576; -2.0702842558619055; -2.1324354933115113; -2.0451869559516838; 1.0627383872861333; -1.3288429800852843; -1.5272250745574525; -0.788347959695243; -1.5287026824985879; -1.531992352237067; 2.3239984719001643; 0.20798270241233952; 0.27626601473376766; 0.11121805128964106; 0.3240784159666602; 1.000099633755878; 1.0000592075027501; -0.05900634918042319; 1.4284123819361194], [1.0043617758164738; -2.098014052304919; -2.1610896231252417; -2.096820032970014; 1.0083752518385543; -1.3366405168639308; -1.5462353084977547; -0.7821209743678049; -1.5474204961380218; -1.5531188422668172; 2.3589125379934575; 0.27034296641382005; 0.19526654910958366; 0.24690733563781467; 0.1451755627164811; 1.2248496376407976; 0.7071786439053358; -0.05798335614623251; 1.4284004514285258]), ([1.0043617758164738; -2.098014052304919; -2.1610896231252417; -2.096820032970014; 1.0083752518385543; -1.3366405168639308; -1.5462353084977547; -0.7821209743678049; -1.5474204961380218; -1.5531188422668172; 2.3589125379934575; 0.27034296641382005; 0.19526654910958366; 0.24690733563781467; 0.1451755627164811; 1.2248496376407976; 0.7071786439053358; -0.05798335614623251; 1.4284004514285258], [1.06100557762623; -2.165027727042202; -2.232151865063287; -2.1871779177520922; 0.9842138583062976; -1.3535351798843323; -1.5795032178932833; -0.7821209743678049; -1.5811125606890033; -1.5859822712019838; 2.446319780380877; 0.11226402369212508; 0.35081815405696193; 0.048640493357961015; 0.40205326621233467; 1.366133396461856; 0.3661120037946213; -0.0569603720579933; 1.4283877987721232])]

I just wouldn't expect to need to call Array() manually to make it work. This leads me to believe that there's an issue in the way that LabeledSlidingWindows are indexed for the purpose of batching. I also need to try it with timeslices longer than 1 step, although I don't think that should inherently cause any issues.
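As a possible stopgap, whole (X, y) samples can also be grouped by hand with Base's Iterators.partition and stacked into 19x1xn arrays. The following is a sketch under assumptions, not a recommended MLDataPattern pattern: `samples` stands in for Array(data) with random values, and `stack_batch` is a hypothetical helper.

```julia
# Stand-in for Array(data): four (X, y) tuples of 19×1 matrices.
cols = [rand(19, 1) for _ in 1:5]
samples = [(cols[i], cols[i + 1]) for i in 1:4]

# Hypothetical helper: stack the X's and the y's of one chunk along dims=3.
stack_batch(chunk) = (cat(first.(chunk)...; dims=3),
                      cat(last.(chunk)...; dims=3))

# Walk over the samples two at a time.
for chunk in Iterators.partition(samples, 2)
    X, y = stack_batch(chunk)
    @show size(X) size(y)    # (19, 1, 2) for both
end
```

Note that cat copies the data, so this gives up the memory savings of lazy views; it is only meant to show the intended batch shape.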

adinhobl, Dec 23 '20 01:12