Handle unevenly sampled time series
Is your feature request related to a problem? Please describe.
Empirical data are often unevenly sampled. In the state of the art, an interpolation is often performed but this might bias the result.
In TransitionIndicators.jl we want to offer both options: (1) the user can interpolate their data and pass it or (2) directly pass their unevenly sampled time series, in which case no interpolation happens and all the difficulties are handled internally. (Option 2) might have reduced features compared to (Option 1) but introduces no error.
We won't focus too much on (Option 2) for now but defining the basics for the API to handle it should be a step in near future.
Describe the solution you'd like
Here goes an idea of solution. It does not rely on defining an iterator (maybe a downside?) but has a quite sparse syntax:
using TransitionIndicators
abstract type TimeSeries end
struct EvenspacedTS <: TimeSeries
nt::Int64
t::AbstractVector
spacedim::Int64
x::AbstractArray
dt::Real
end
struct UnevenspacedTS <: TimeSeries
nt::Int64
t::AbstractVector
spacedim::Int64
x::AbstractArray
end
function TimeSeries(t, x)
nt = length(t)
spacedim = ndims(x) - 1
if isequispaced(t)
dt = mean(diff(t))
return EvenspacedTS(nt, t, spacedim, x, dt)
else
println(
"\n" *
" Unevenly spaced time series are work in progress. " *
"If you spot bugs, please open an issue on GitHub!\n " *
"Bare in mind that unevenly spaced time series come with limited functionalities! \n"
)
return UnevenspacedTS(nt, t, spacedim, x)
end
end
struct WindowView
width::Int
stride::Int
windows::Vector{UnitRange{Int}}
end
function window_view(x, wv::WindowView)
return [view(x, window) for window in wv.windows]
end
function WindowView(ts::EvenspacedTS; width::Int = default_window_width(ts), stride::Int = default_window_stride(ts))
si = width:stride:ts.nt
windows = [k-width+1:k for k in si]
return WindowView(width, stride, windows)
end
default_window_width(ts::EvenspacedTS) = ts.nt ÷ 100
default_window_stride(ts::EvenspacedTS) = 1
function WindowView(ts::UnevenspacedTS; twidth::Real = default_window_width(ts), stride::Int = default_window_stride(ts))
ibegin = find_nearest(twidth, ts.t)
if ibegin < 1
ibegin = 1
end
windows = [findspan(t, twidth, ts.t) for t in ts.t[ibegin:stride:end]]
return WindowView(0, stride, windows)
end
default_window_width(ts::UnevenspacedTS) = last(ts.t) / 100
default_window_stride(ts::UnevenspacedTS) = 1
function findspan(t, twidth, t_uneven)
i1 = find_nearest(t-twidth, t_uneven)
i2 = find_nearest(t, t_uneven)
if i1 == i2
error("Time series not dense enough before t=$t for the chosen window width")
else
return i1:i2
end
end
function find_nearest(t, tvec::AbstractVector)
return argmin((tvec .- t).^2)
end
t = 0:0.1:100
x = sin.(t)
ts = TimeSeries(t, x)
wv = WindowView(ts)
y = zeros(eltype(x), length(wv.windows))
@btime map!(var, y, window_view(x, wv))
tu = cumsum(0.2 .* rand(length(t)))
xu = sin.(t)
tsu = TimeSeries(tu, xu)
wvu = WindowView(tsu)
yu = zeros(eltype(x), length(wvu.windows))
@btime map!(var, yu, window_view(xu, wvu))
Describe alternatives you've considered
Additional context
One open question is whether the window width on unevenly time series should be defined by the number of points or by a time span. I think the latter one is more rigorous.
Oof, i've thought about this extensively but I haven't written anything here. Sorry, it would have saved you a lot of effort...
From my eyes, the plan is to create a package TimeseriesViewers.jl that defines a Timeseries type and then has on top of it the viewers, e.g. windowed viewers or sub-scale viewers.
Timeseries is an AbstractVector, a wrapper, that wraps a vector and a time vector. It has two concrete subtypes, an evenly and unevenly sampeld. The goal is to allow to ways to access Timeseries: by index, or by time. We can rely on Interpolations.jl to access the unevenly sampled version via interpolating.
The WindowViewers then produce views over timeseries with a given window an stride, in real time units, or integer units for convenience if the timeseris is evenly sampled.
The main takeaway here is that TransitionIndicators.jl will not care at all whether the timeseries is evenly sampled or not, because it gets in a Timeseries and a window viewer, with windows always defined in real time units. Thus, TransitionIndicators.jl works as it does now being agnostic to the nature of the timeseries. I think this is the best design.
it is unfortunate that you went though a lot of time to write your code, but I really think the best way forwards is to just rely on Interpolations.jl for creating windows of an unevenly sampled timeseries...
From a design perspective I agree with you. This would be the easiest. However, one might want to avoid interpolation in some cases, as it biases e.g. the AR1-regression coefficient and the power density spectrum.
Going the way you propose means closing that option, but it might be a choice we want to make. Let's talk about it live at some point.
However, one might want to avoid interpolation in some cases, as it biases e.g. the AR1-regression coefficient and the power density spectrum
Not sure what you mean here. Avoiding interpolation when? The user can choose to not construct the interpolation object, and rather just given in an abstract vector directly, in which case no interpolation would/coould happen.
We can rely on Interpolations.jl to access the unevenly sampled version via interpolating.
I am referring to this sentence. I understood that you (always) want to access unevenly-spaced time series by interpolations. But maybe that was not your point. Let's talk about it live to make sure we understand each other.