A question about the experiment results
Thank you for your excellent contribution and generous open-source efforts!
I have a question that could probably be answered by reading the code, but for clarity and certainty I would like to ask you directly.
For B2A in the Domain Adaptation task and *2A in the Deterministic Prediction task, is the only difference between them the scope of the training data? From the experiments in the paper, I noticed that using less data sometimes yields better performance: for example, B2A achieves an ADE of 0.66 in Table 1, whereas *2A in Table 3 has an ADE of 0.72. Could this be due to the negative effect of domain shift introduced by the additional data?
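For reference, ADE here is the standard Average Displacement Error: the mean Euclidean distance between predicted and ground-truth positions over all prediction timesteps. A minimal sketch of the metric (function name and array shapes are illustrative, not taken from the repository):

```python
import numpy as np

def ade(pred, gt):
    """Average Displacement Error: mean L2 distance between predicted
    and ground-truth positions over all timesteps.

    pred, gt: arrays of shape (T, 2), one (x, y) position per timestep.
    """
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

# Toy example: a prediction offset from the ground truth by a constant
# vector (0.3, 0.4), whose length is 0.5 at every timestep.
gt = np.zeros((12, 2))
pred = gt + np.array([0.3, 0.4])
print(ade(pred, gt))  # → 0.5
```

Under this definition, a gap of 0.66 vs. 0.72 means the *2A model's predictions are, on average, 0.06 m farther from the ground truth per timestep.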
Thanks again!
Hi @LeonardWan ,
Thanks for your interest in my work! Regarding your question, yes, your analysis is correct. More specifically, scenes A and B are part of the ETH dataset, while C, D, and E are part of the UCY dataset. The ETH dataset was manually annotated, leading to somewhat noisy ground-truth trajectories, whereas the UCY dataset was labeled using a spline-based approach, resulting in much smoother trajectories.
As you might expect, due to the intrinsic bias of the datasets, training with a large number of smooth trajectories can lead to reduced robustness when the model encounters noisy ones, especially in the best-1 prediction case. I believe this is the main reason behind that performance gap.
I'm closing this issue for now. Feel free to open another issue if you have any further questions!