Questions and feedback from talks
Here are some interesting questions raised during talks; I share them to help you better understand this research:
- [Q1] How to intuitively understand the definition of temporal operators?
I have some simple notes on temporal operators in the temporal matrix factorization model; please check out this blog post.
- [Q2] Is temporal matrix factorization (TMF) applicable to other time series datasets? As we know, most experiments of this research are centered on Uber movement data.
In this research, our experiments mostly use Uber movement data because these data are high-dimensional, sparse, and nonstationary, but we also tried other datasets such as the Google Flu Trends dataset and an energy consumption dataset. In our blog post, we provide a time series forecasting example of TMF on a fluid dynamics dataset, which is also high-dimensional. The forecasts give an intuitive picture of how well TMF performs.
- [Q3] What is the computational cost of the nonstationary temporal matrix factorization model on the NYC Uber movement dataset?
The Python implementation of the model is written with `numpy`. On a personal computer without a GPU, the running time would be about 10 minutes. If you have a GPU, you can replace `numpy` with `cupy`, and the running time could drop to less than 1 minute.
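A minimal sketch of the `numpy`/`cupy` swap is shown here. It is not the model code itself: the helper `reconstruction_error` and the factor names `W` and `X` are hypothetical, chosen only to illustrate that array code written against a module alias can run on either backend, since `cupy` mirrors much of the `numpy` API.

```python
import numpy as np

try:
    import cupy as xp      # optional GPU backend
    xp.zeros(1)            # raises here if no usable CUDA device is present
except Exception:
    xp = np                # fall back to the CPU backend

def reconstruction_error(Y, W, X):
    """Frobenius norm of Y - W^T X, using whichever backend is active."""
    return float(xp.linalg.norm(Y - W.T @ X))

# Toy factors (hypothetical sizes): rank 3, 20 series, 50 time steps.
rng = np.random.default_rng(0)
W = xp.asarray(rng.standard_normal((3, 20)))
X = xp.asarray(rng.standard_normal((3, 50)))
Y = W.T @ X                # an exactly low-rank data matrix

err = reconstruction_error(Y, W, X)
print("backend:", xp.__name__, "error:", err)
```

Only the import changes between the two backends; the rest of the code stays the same, which is why the swap is cheap to try.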
- [Q4] How to expand from VAR to VARMA and the like?
In general, a moving average is used to remove trends from time series. If differencing operations can achieve the same effect, adding a moving average process seems unnecessary.
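To make the differencing argument concrete, here is a small sketch on synthetic data. The helper `difference`, the lag values, and the toy series are assumptions for illustration only; they are not taken from the model's implementation.

```python
import numpy as np

def difference(x, lag=1):
    """Return x[t] - x[t - lag] along the last axis."""
    x = np.asarray(x, dtype=float)
    return x[..., lag:] - x[..., :-lag]

# Toy series: a linear trend plus a pattern that repeats every 4 steps.
t = np.arange(48)
season = np.tile([0.0, 1.0, 3.0, 1.0], 12)
series = 0.5 * t + season

first = difference(series, lag=1)      # removes the trend, keeps the seasonal pattern
seasonal = difference(series, lag=4)   # removes both the trend and the period-4 pattern

print(first[:8])
print(seasonal[:8])
```

With the lag matched to the period, the differenced series is constant, so there is no trend or seasonality left for an additional moving average term to absorb in this toy case.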
- [Q5] Why should we use the low-rank assumption?
We are working on high-dimensional time series. The basic assumption is that these time series stem from a relatively small number of temporal patterns. If this assumption does not hold in the real world, then the temporal matrix factorization framework would be ineffective.
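One way to check the assumption on a given dataset is to look at the singular values of the data matrix. The sketch below uses synthetic data, with hypothetical sizes and sinusoidal patterns chosen only for illustration: 200 series are built from 4 shared temporal patterns, and the singular value spectrum then shows how much of the data a rank-4 factorization can capture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_series, n_steps, rank = 200, 300, 4

# A small number of shared temporal patterns (sinusoids with different periods).
t = np.arange(n_steps)
patterns = np.stack([np.sin(2 * np.pi * t / p) for p in (7, 24, 50, 90)])

# Each series is a random mixture of those patterns.
loadings = rng.standard_normal((n_series, rank))
Y = loadings @ patterns                      # shape (n_series, n_steps)

# Cumulative share of squared singular values captured by the leading components.
s = np.linalg.svd(Y, compute_uv=False)
energy = np.cumsum(s**2) / np.sum(s**2)

print("numerical rank:", np.linalg.matrix_rank(Y))
print("energy captured by first", rank, "components:", energy[rank - 1])
```

On real data the spectrum will not drop to zero after a few components, but a fast decay is a sign that the low-rank assumption is reasonable, while a flat spectrum suggests the factorization framework may not be a good fit.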
- [Q6] Drastic dimensionality reduction gets close to clustering. Could you identify a set of uncorrelated variables that capture the shape of data? How to utilize side information?