Question about reproducing the results

Open zdy1013 opened this issue 1 year ago • 9 comments

Thank you for publishing such great work. Why are my results so different from yours? I trained for 60 epochs on 2 RTX 3090s using the dataset you provided, then evaluated the model. Here are the results I reproduced:

Avg. driving score: 52.005
Avg. route completion: 85.111
Avg. infraction penalty: 0.647
Collisions with pedestrians: 0.000
Collisions with vehicles: 0.270
Collisions with layout: 0.097
Red lights infractions: 0.070
Stop sign infractions: 0.238
Off-road infractions: 0.198
Route deviations: 0.000
Route timeouts: 0.094
Agent blocked: 0.298

zdy1013 avatar Aug 01 '24 11:08 zdy1013

Hello, can you send me a copy of the dataset?

Mr-ChenSH avatar Aug 05 '24 06:08 Mr-ChenSH

Google Cloud Drive has reached its limit

Mr-ChenSH avatar Aug 05 '24 06:08 Mr-ChenSH

You can download it from https://huggingface.co/datasets/craigwu/tcp_carla_data

penghao-wu avatar Aug 06 '24 07:08 penghao-wu
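
For readers who prefer to script the download, here is a minimal sketch using `snapshot_download` from the `huggingface_hub` package. The repository id comes from the link above. The helper names and the `local_dir` value are assumptions for illustration, and the import is deferred so the helper can be inspected without the package installed.

```python
def download_kwargs(local_dir):
    """Arguments for snapshot_download; repo id taken from the link in the thread."""
    return {
        "repo_id": "craigwu/tcp_carla_data",
        "repo_type": "dataset",   # the link points at a dataset repo, not a model repo
        "local_dir": local_dir,   # hypothetical target directory
    }


def download(local_dir):
    """Fetch the whole dataset repository and return the local path."""
    # Imported lazily: requires `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download
    return snapshot_download(**download_kwargs(local_dir))


if __name__ == "__main__":
    # Large download (the thread mentions roughly 115G); make sure there is disk space.
    print(download("./tcp_carla_data"))
```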

I suppose you are evaluating the model in 48 routes where our model has 57.01 driving score as reported in the paper. I think the difference is reasonable considering the variance in evaluation and training.

penghao-wu avatar Aug 06 '24 08:08 penghao-wu
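
A note on reading the numbers above: multiplying the reported averages gives 85.111 × 0.647 ≈ 55.07, not 52.005. That is expected if, as in the CARLA Leaderboard metric definition, the driving score is computed per route (route completion times infraction penalty) and then averaged over routes. The sketch below illustrates this with made-up per-route values; the `routes` data is hypothetical and not taken from this thread.

```python
from statistics import mean


def driving_score(routes):
    """Average of per-route products.

    routes: list of (route_completion_percent, infraction_penalty) pairs.
    """
    return mean(rc * penalty for rc, penalty in routes)


# Hypothetical per-route results, for illustration only.
routes = [(100.0, 1.0), (100.0, 0.5), (40.0, 0.6)]

avg_rc = mean(rc for rc, _ in routes)
avg_penalty = mean(penalty for _, penalty in routes)

if __name__ == "__main__":
    print("average of per-route products:", driving_score(routes))
    print("product of the two averages:  ", avg_rc * avg_penalty)
```

The two printed values generally differ, so the average driving score cannot be recovered from the average route completion and average infraction penalty alone.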

Thank you for your reply. How can I reproduce your score of 75.14? [screenshot]

zdy1013 avatar Aug 06 '24 08:08 zdy1013

We use all 420K data for training, and we use the ensemble of the TCP and TCP-SB models. Some details can be found in the paper. [image]

penghao-wu avatar Aug 06 '24 08:08 penghao-wu

How do you ensemble the two models? Also, does the dataset you provided cover only 4 towns?

zdy1013 avatar Aug 06 '24 08:08 zdy1013

The dataset file should contain all 8 towns' data. For the ensemble strategy, please refer to the last sentence in the image.

penghao-wu avatar Aug 06 '24 09:08 penghao-wu

Ok, thank you for your patient reply!

zdy1013 avatar Aug 06 '24 09:08 zdy1013

@penghao-wu Hello, I recently reproduced this project, but my test score is similar to the one in the figure above, between 50 and 60. What do you mean by 420K data? Is it the 115G tcp_carla_data? And to get the results in the figure, do I also need to apply this strategy: "If it is a trajectory specialization case, we set α=0.5; if it is a control specialization case, we take the maximum value of the brake control instead of the average value"? I hope you can reply!

HXTYI avatar Aug 08 '24 00:08 HXTYI

Yes, the 420K data is what the data zip contains, covering all 8 towns. And yes, you also need that ensemble strategy.

penghao-wu avatar Aug 08 '24 11:08 penghao-wu
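
The ensemble rule quoted above can be sketched as follows. This is only one possible reading, not the authors' implementation: it assumes each model yields a trajectory-derived command and a control-branch command, that α is the weight on the trajectory-derived command in a convex blend, and that the fused commands of the models are averaged. The `Control` type, the function names, and the `alpha_control` parameter (the thread does not give α for the control-specialized case) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Control:
    steer: float
    throttle: float
    brake: float


def blend(traj, ctrl, alpha):
    """Convex blend; alpha weights the trajectory-derived command (assumption)."""
    return Control(
        steer=alpha * traj.steer + (1 - alpha) * ctrl.steer,
        throttle=alpha * traj.throttle + (1 - alpha) * ctrl.throttle,
        brake=alpha * traj.brake + (1 - alpha) * ctrl.brake,
    )


def ensemble(per_model, trajectory_specialized, alpha_control):
    """per_model: list of (traj_cmd, ctrl_cmd) pairs, one pair per model."""
    # alpha = 0.5 in the trajectory-specialized case, as quoted in the thread.
    alpha = 0.5 if trajectory_specialized else alpha_control
    fused = [blend(traj, ctrl, alpha) for traj, ctrl in per_model]
    n = len(fused)
    out = Control(
        steer=sum(f.steer for f in fused) / n,
        throttle=sum(f.throttle for f in fused) / n,
        brake=sum(f.brake for f in fused) / n,
    )
    if not trajectory_specialized:
        # Control-specialized case: take the maximum brake instead of the average.
        out.brake = max(f.brake for f in fused)
    return out
```

Taking the maximum brake makes the ensemble stop whenever either model wants to stop, which is the more conservative choice.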

@penghao-wu OK, I understand.

Mr-ChenSH avatar Aug 12 '24 09:08 Mr-ChenSH

@zdy1013 Hello, I encountered the same issue as you, but my score is only 42. Could you please let me know whether you modified any training parameters and, if so, what specific configuration you used? I'd like to troubleshoot the reason for the low score. Thank you, and I look forward to your reply!

Louis-WD avatar Sep 03 '25 09:09 Louis-WD