optimize DFT decomposition for improved performance
The current DFT-based series decomposition implementation has performance gaps that can be addressed to improve training and inference speed.
current limitations
- naive implementation without batch optimization
- repeated calculations that could be cached
- ignores GPU memory layout optimization
- misses opportunities for parallel computation
proposed optimizations
1. batched operations
Reimplement the DFT computation to operate on entire batches at once rather than processing each time series individually:
# Instead of:
for i in range(batch_size):
    xf = torch.fft.rfft(x[i])
    # ...processing...

# Use:
xf = torch.fft.rfft(x)  # processes the entire batch in one call
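For concreteness, here is a minimal, self-contained sketch of what a fully batched decomposition could look like; the function name, the [batch, time, channels] layout, and the dim=1 time axis are assumptions rather than the existing TimeMixer code:

import torch

def batched_dft_decomp(x: torch.Tensor, top_k: int = 5):
    # x: [batch, time, channels]; FFT over the time axis for the whole batch
    xf = torch.fft.rfft(x, dim=1)                        # complex, [B, T//2+1, C]
    amp = xf.abs()
    amp[:, 0, :] = 0                                     # ignore the DC component
    top_amp, _ = torch.topk(amp, top_k, dim=1)           # top-k per sample and channel
    threshold = top_amp.min(dim=1, keepdim=True).values  # smallest kept amplitude
    xf = torch.where(amp >= threshold, xf, torch.zeros_like(xf))
    x_season = torch.fft.irfft(xf, n=x.size(1), dim=1)   # back to [B, T, C]
    x_trend = x - x_season
    return x_season, x_trend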
2. pre-compute frequency indices
For stationary time series, the top-k frequencies often remain consistent across batches. Implement a caching mechanism to avoid recomputing these indices on every forward pass:
def forward(self, x):
    batch_size = x.shape[0]
    if self._cached_freq_indices is None or batch_size != self._cached_batch_size:
        # compute the top-k frequency indices and cache them
        self._cached_freq_indices = ...  # computed indices go here
        self._cached_batch_size = batch_size
    # reuse the cached indices for the decomposition
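A fuller sketch of the same idea as a standalone module is below; the class name, attribute names, and the shape-based cache-invalidation rule are illustrative assumptions, not the current DFT_series_decomp implementation:

import torch
import torch.nn as nn

class CachedDFTDecomp(nn.Module):
    # illustrative sketch: reuses top-k frequency indices across forward passes
    def __init__(self, top_k: int = 5):
        super().__init__()
        self.top_k = top_k
        self._cached_freq_indices = None
        self._cached_shape = None

    def forward(self, x: torch.Tensor):
        # x: [batch, time, channels]
        xf = torch.fft.rfft(x, dim=1)
        if self._cached_freq_indices is None or self._cached_shape != x.shape:
            amp = xf.abs()
            amp[:, 0, :] = 0                      # ignore the DC component
            _, self._cached_freq_indices = torch.topk(amp, self.top_k, dim=1)
            self._cached_shape = x.shape
        # keep only the cached frequency bins, zero everything else
        keep = torch.zeros_like(xf.real)
        keep.scatter_(1, self._cached_freq_indices, 1.0)
        x_season = torch.fft.irfft(xf * keep, n=x.size(1), dim=1)
        return x_season, x - x_season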
3. optimize memory operations
Restructure memory operations to minimize data movement (see the sketch after this list):
- use in-place operations where possible
- align memory access patterns
- reduce intermediate tensor allocations
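As a small illustration of the in-place and reduced-allocation points, the following sketch zeroes the discarded frequency components in place instead of materialising a second complex tensor; the function name and the precomputed keep_mask are assumptions:

import torch

def seasonal_trend_split(x: torch.Tensor, keep_mask: torch.Tensor):
    # keep_mask: boolean mask over the rfft bins, broadcastable to xf's shape
    xf = torch.fft.rfft(x, dim=1)       # one unavoidable allocation
    xf.masked_fill_(~keep_mask, 0)      # in place: no extra complex tensor
    x_season = torch.fft.irfft(xf, n=x.size(1), dim=1)
    x_trend = x.sub(x_season)           # out-of-place subtract; x stays untouched
    return x_season, x_trend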
4. CUDA kernel implementation (optional)
For maximum performance, implement a custom CUDA kernel for the specific DFT decomposition pattern used in TimeMixer; as a lighter-weight first step, the same routine can be compiled with TorchScript:
@torch.jit.script
def optimized_dft_decomp(x: torch.Tensor, top_k: int):
    # optimized, fused implementation goes here
    pass
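If the full CUDA-kernel route is taken, one common pattern is to JIT-compile the extension with torch.utils.cpp_extension; the source file names and the exposed forward function below are purely hypothetical:

from torch.utils import cpp_extension

# hypothetical C++/CUDA sources implementing the fused decomposition
dft_decomp_cuda = cpp_extension.load(
    name="dft_decomp_cuda",
    sources=["csrc/dft_decomp.cpp", "csrc/dft_decomp_kernel.cu"],
    verbose=True,
)
# the compiled extension would then expose something like:
# x_season, x_trend = dft_decomp_cuda.forward(x, top_k)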
expected benefits
- 30-50% speedup in DFT decomposition operations
- reduced memory footprint
- improved training throughput
- lower inference latency
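To sanity-check figures like the 30-50% speedup on real hardware, a micro-benchmark along the following lines can be used; it compares a per-sample loop against a single batched FFT call, with illustrative tensor sizes:

import torch
import torch.utils.benchmark as benchmark

def loop_rfft(x: torch.Tensor):
    # baseline: per-sample FFT, as in the "instead of" snippet above
    return torch.stack([torch.fft.rfft(x[i], dim=0) for i in range(x.shape[0])])

def batch_rfft(x: torch.Tensor):
    # optimized: one batched FFT over the time axis
    return torch.fft.rfft(x, dim=1)

x = torch.randn(32, 96, 7)  # batch of 32, sequence length 96, 7 channels (illustrative)
for label, fn in [("per-sample loop", loop_rfft), ("batched", batch_rfft)]:
    timer = benchmark.Timer(stmt="fn(x)", globals={"fn": fn, "x": x})
    print(label, timer.timeit(200))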
affected code
The main class affected is DFT_series_decomp in models/TimeMixer.py.
Thank you very much for your suggestion. Indeed, the DFT module should be further improved. We have also received many recommendations from colleagues in the signal processing community who are very interested in applying Fourier transforms to time-series modeling. If possible, you are welcome to submit a merge request; we will review it carefully and take your feedback fully into account.