Confusion about `yaw` and `pitch`.

Open shaform opened this issue 1 year ago • 4 comments

In your model, the forward function returns pre_yaw_gaze and pre_pitch_gaze:

https://github.com/Ahmednull/L2CS-Net/blob/a4d8f7fa5436a2b2b9f088471623b552a85811bd/l2cs/model.py#L70

However, in the pipeline, the two variables are assigned as gaze_pitch and gaze_yaw:

https://github.com/Ahmednull/L2CS-Net/blob/a4d8f7fa5436a2b2b9f088471623b552a85811bd/l2cs/pipeline.py#L122

It seems yaw and pitch are reversed. Why would this be the case?

shaform avatar Mar 30 '24 13:03 shaform

Have you found the reason why? If left as is, can it be trained?

tiamo405 avatar Aug 22 '24 08:08 tiamo405

Line 207 of train.py:

`pitch, yaw = model(images_gaze)`

Lines 22 and 23:

`self.fc_yaw_gaze = nn.Linear(512 * block.expansion, num_bins)`
`self.fc_pitch_gaze = nn.Linear(512 * block.expansion, num_bins)`

Lines 68, 69 and 70:

`pre_yaw_gaze = self.fc_yaw_gaze(x)`
`pre_pitch_gaze = self.fc_pitch_gaze(x)`
`return pre_yaw_gaze, pre_pitch_gaze`

Because both heads have the same structure, I think changing the order in the return statement is fine; it will not affect the training result.
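To illustrate the point about identical heads, here is a minimal toy sketch in plain Python (no PyTorch). Everything in it is hypothetical: `TwoHeadModel`, `train`, and the synthetic data are stand-ins and not code from the repository. It only mimics the two quoted details: the forward pass returns the "yaw" head first, and the training loop unpacks the outputs as `pitch, yaw`.

```python
class TwoHeadModel:
    """Hypothetical stand-in: two independent scalar linear heads on one input."""

    def __init__(self):
        self.w_yaw = 0.0    # the attribute name suggests yaw
        self.w_pitch = 0.0  # the attribute name suggests pitch

    def forward(self, x):
        # Same return order as the quoted forward(): ("yaw" head, "pitch" head)
        return self.w_yaw * x, self.w_pitch * x


def train(model, samples, lr=0.1, epochs=200):
    """Plain per-sample gradient descent on squared error for each head."""
    for _ in range(epochs):
        for x, pitch_label, yaw_label in samples:
            # Mirrors the quoted train.py line: the FIRST output is treated as pitch
            pitch_pred, yaw_pred = model.forward(x)
            # The first output comes from w_yaw, so w_yaw receives the pitch gradient
            model.w_yaw -= lr * 2 * (pitch_pred - pitch_label) * x
            model.w_pitch -= lr * 2 * (yaw_pred - yaw_label) * x


# Synthetic, made-up ground truth: pitch = 2 * x, yaw = -3 * x
samples = [(x, 2.0 * x, -3.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]

model = TwoHeadModel()
train(model, samples)

# The head *named* yaw ends up holding the pitch coefficient, and vice versa
print(round(model.w_yaw, 3), round(model.w_pitch, 3))
```

In this toy setup, what a head learns is decided by which label it is paired with in the loss, not by its attribute name. So training works either way, but any code that consumes the outputs has to unpack them in the same order that training used.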

tiamo405 avatar Aug 22 '24 08:08 tiamo405

It turns out the pipeline was not written by the authors, but it was created in https://github.com/Ahmednull/L2CS-Net/pull/18. So perhaps the PR author would know the answer.

shaform avatar Aug 22 '24 12:08 shaform

I'm pretty certain that it was flipped accidentally here and then also in the draw_gaze function (vis.py). So the correct order should be `gaze_yaw, gaze_pitch = self.model(img)`, and then in the draw_gaze function the uses of `pitchyaw[0]` and `pitchyaw[1]` should be flipped.

Reblexis avatar Oct 07 '24 16:10 Reblexis