Why does the accumulated metric not incorporate any part of the noisy input for the Wan2.1 implementation?

Open oagrawal opened this issue 5 months ago • 0 comments

In teacache_forward():

Here is the accumulated metric in code:

self.accumulated_rel_l1_distance_even += rescale_func(((modulated_inp-self.previous_e0_even).abs().mean() / self.previous_e0_even.abs().mean()).cpu().item())

The modulated input is defined here:

modulated_inp = e0 if self.use_ref_steps else e

Where e and e0 are defined here:

e = self.time_embedding(sinusoidal_embedding_1d(self.freq_dim, t).float())
e0 = self.time_projection(e).unflatten(1, (6, self.dim))

So the accumulated metric is a function of the timestep embedding only. Why is the noisy input not incorporated here, as specified in the paper? In practice, this means the accumulated metric follows the same pattern for a given timestep schedule, no matter what the prompt is.
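To make the concern concrete, here is a minimal pure-Python sketch, not the Wan2.1 code. It assumes a toy `sinusoidal_embedding` in place of `sinusoidal_embedding_1d`, skips `time_embedding` and `time_projection`, uses an identity in place of `rescale_func`, and leaves out any threshold or reset logic. The helper names `rel_l1` and `accumulated_trace` are hypothetical. It only illustrates that when the tracked quantity is built from `t` alone, the accumulated trace cannot vary with the latents.

```python
import math


def sinusoidal_embedding(dim, t):
    # Toy stand-in for sinusoidal_embedding_1d: depends only on the timestep t.
    half = dim // 2
    freqs = [10000 ** (-i / half) for i in range(half)]
    return [math.cos(t * f) for f in freqs] + [math.sin(t * f) for f in freqs]


def rel_l1(current, previous):
    # Relative L1 distance: mean(|current - previous|) / mean(|previous|).
    num = sum(abs(c - p) for c, p in zip(current, previous)) / len(current)
    den = sum(abs(p) for p in previous) / len(previous)
    return num / den


def accumulated_trace(timesteps, noisy_inputs, dim=16):
    # noisy_inputs is accepted but never read, mirroring the quoted code path.
    acc, prev, trace = 0.0, None, []
    for t, _x in zip(timesteps, noisy_inputs):
        emb = sinusoidal_embedding(dim, t)
        if prev is not None:
            acc += rel_l1(emb, prev)  # identity rescale (assumption)
        prev = emb
        trace.append(acc)
    return trace


timesteps = [1000, 900, 800, 700, 600]
trace_a = accumulated_trace(timesteps, ["latents for prompt A"] * 5)
trace_b = accumulated_trace(timesteps, ["latents for prompt B"] * 5)
print(trace_a == trace_b)
```

Under these assumptions, the two traces are computed from identical inputs, so any skip decision derived from them would be fixed by the timestep schedule rather than by the content being denoised.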

oagrawal, Aug 29 '25 20:08