Emotion-Recognition-RNN

some question about pre-processing

Open HeGaoYuan opened this issue 9 years ago • 6 comments

Hi, in the mapAll2FER.m file in your PreProcessing folder (https://github.com/saebrahimi/Emotion-Recognition-RNN/blob/master/PreProcessing/mapAll2FER.m#L18), you comment:

%add border of 32 for TFD which results in 160 image size

but the code actually adds 48 to pavg.

HeGaoYuan, May 13 '16 05:05

I updated the comments. If you map TFD to FER, you can use a smaller border. I did this because some datasets are cropped tighter, and pooling them without aligning the faces resulted in lower performance. The last block of code, for verification, should show faces roughly matching the mean shape.

saebrahimi, May 16 '16 23:05

Thank you for your reply! I am confused about why you didn't just align every face image to one fixed set of facial keypoint positions. When I follow your method using the mean shape of the dataset, I find that in the verification code the faces hardly match the mean shape. Or maybe we understand "roughly" differently.

Here is my example using your method. [screenshot: 2016-05-17 8 08 28]

HeGaoYuan, May 17 '16 00:05
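One way to make "roughly matching" concrete is to measure how far a face's keypoints are from the mean shape after mapping. The sketch below is only an illustration in plain Python, not the repository's MATLAB verification code; the function name `rms_distance` and the sample coordinates are made up, and the thread gives no threshold for what counts as close enough.

```python
import math


def rms_distance(points, reference):
    """Root-mean-square distance in pixels between corresponding keypoints.

    Both arguments are lists of (x, y) tuples in the same keypoint order.
    """
    squared = [
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py), (rx, ry) in zip(points, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical mean shape (two eyes and a mouth point) and one mapped face
# whose keypoints are all offset by (3, 4) pixels from the mean shape.
mean_shape_points = [(60, 70), (100, 70), (80, 110)]
mapped_face_points = [(63, 74), (103, 74), (83, 114)]

print(rms_distance(mapped_face_points, mean_shape_points))  # 5.0
```

Comparing this number across a few images can show whether the mismatch is a constant offset (for example, a border that is too small or too large) or varies from face to face.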

If your dataset is different from Emotiw15 or FER, you need to run a keypoint detector such as the one I referred to here: https://github.com/saebrahimi/Emotion-Recognition-RNN/tree/master/PreProcessing. If your dataset is cropped tighter than FER, then you need to add a border to roughly have the same location for the face (the line you initially asked about).

Why mean shape:

  • You don't need perfect alignment for CNNs, but we found that an approximate alignment, as done here, helps when the dataset size is limited.
  • It helps if the keypoint detector fails on part of your dataset, and you don't need to detect keypoints at test time because you have a fixed transformation pre-computed on the training set.
  • A simple workaround is to hand-label a small subset of your training data, average the keypoint locations, and use the result as the mean shape of your dataset. (We only map mean shape to mean shape to merge different datasets; if you have only one dataset, you can skip this step.)

saebrahimi, May 17 '16 00:05
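The mean-shape idea above can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's MATLAB code: it averages hand-labelled keypoints into a mean shape, then fits a single least-squares similarity transform (scale, rotation, shift) from one dataset's mean shape to another's. The choice of a similarity transform, the function names, and all coordinates are assumptions made for the example.

```python
def mean_shape(shapes):
    """Average corresponding keypoints over a list of shapes.

    Each shape is a list of (x, y) tuples in the same keypoint order.
    """
    n = len(shapes)
    k = len(shapes[0])
    return [
        (sum(s[i][0] for s in shapes) / n, sum(s[i][1] for s in shapes) / n)
        for i in range(k)
    ]


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Points are treated as complex numbers z = x + iy, so the transform is
    w = a * z + b, with complex a (scale and rotation) and complex b (shift).
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    n = len(zs)
    zc = sum(zs) / n  # centroid of the source points
    wc = sum(ws) / n  # centroid of the destination points
    num = sum((w - wc) * (z - zc).conjugate() for z, w in zip(zs, ws))
    den = sum(abs(z - zc) ** 2 for z in zs)
    a = num / den
    b = wc - a * zc
    return a, b


def apply_similarity(a, b, points):
    """Apply w = a * z + b to a list of (x, y) points."""
    mapped = []
    for x, y in points:
        w = a * complex(x, y) + b
        mapped.append((w.real, w.imag))
    return mapped


# Hypothetical hand-labelled keypoints (two eyes, mouth) from dataset A.
labelled_a = [
    [(30, 40), (66, 40), (48, 70)],
    [(32, 42), (68, 42), (50, 72)],
]
mean_a = mean_shape(labelled_a)

# Hypothetical mean shape of the target dataset.
mean_target = [(39.5, 44.5), (57.5, 44.5), (48.5, 59.5)]

# Fit once on the training set, then reuse (a, b) for every image of dataset A.
a, b = fit_similarity(mean_a, mean_target)
print(abs(a), b)
```

Because the transform is fitted between mean shapes only, it is fixed per dataset: individual faces will land near the target mean shape, not exactly on it, which matches the "approximate alignment" described above.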

Your reply is very helpful! Thank you so much!

HeGaoYuan, May 17 '16 00:05

Sorry to reopen the issue. You updated the comment to:

add border of 48 for EMOTIW( 32 is enough for TFD which results in 160 image size)

Firstly, is the size of TFD images 96*96?

I know that you said

If your dataset is cropped tighter than FER, then you need to add a border to roughly have the same location for the face

I think this is the reason why we add a border, but how do we decide the size of the added border? You use 48 for EMOTIW and 32 for TFD. Did you decide these by experiment?

Thank you!

HeGaoYuan, May 30 '16 09:05
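The numbers in the quoted comment are consistent with a 96×96 input padded on every side: 96 + 2 × 32 = 160. The sketch below shows that arithmetic in plain Python. It assumes a symmetric border and a constant fill value, neither of which is confirmed in this thread, and the helper names `add_border` and `shift_keypoints` are hypothetical rather than taken from mapAll2FER.m.

```python
def add_border(image, border, fill=0):
    """Pad a 2-D image (a list of rows) with `border` pixels on every side."""
    width = len(image[0]) + 2 * border
    padded = [[fill] * width for _ in range(border)]            # top rows
    for row in image:
        padded.append([fill] * border + list(row) + [fill] * border)
    padded.extend([fill] * width for _ in range(border))        # bottom rows
    return padded


def shift_keypoints(points, border):
    """Keypoints move by the border width when the image is padded."""
    return [(x + border, y + border) for x, y in points]


# Stand-in for a 96x96 face crop (constant gray), assumed size for TFD.
face = [[128] * 96 for _ in range(96)]
padded = add_border(face, 32)

print(len(padded), len(padded[0]))            # 160 160
print(shift_keypoints([(30, 40)], 32))        # [(62, 72)]
```

A larger border makes the face occupy a smaller fraction of the padded image, so the border size effectively controls how tight the crop is relative to the other datasets being merged.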

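@HeGaoYuan Hello! I have also been studying this project recently, but the README it comes with is too sparse, and as a beginner I am not sure how to proceed. Could you please describe the steps for running the code? Thank you.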

supercaizehua, Dec 30 '17 06:12