"recurrent conv" in the paper
Hi,
As presented in Table 1 in the paper, I see some layers are called "recurrent conv". However, in the prototxt file specifying the models, I only see Caffe's normal "Convolution" layers. My question is: what is the "recurrent conv" layer? And is it important, or would a normal convolutional layer still produce equally good results?
Thanks.
Hi,

On 28/08/18 02:34, Duc Minh Nguyen wrote:
> My question is what is the "recurrent conv" layer?
Paper to read: https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Lee_Recursive_Recurrent_Nets_CVPR_2016_paper.html
An easy way to implement recurrent convolution in Caffe is weight sharing, i.e. applying a convolution layer N times with the same weights.
something like:
```
layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "bn4"
  top: "conv4_1"
  param { name: "conv4_w" }
  param { name: "conv4_b" }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  param { name: "conv4_w" }
  param { name: "conv4_b" }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
```
The weight sharing is done by the named parameters `param { name: "conv4_w" }` and `param { name: "conv4_b" }`; layers that declare the same param name share (and jointly update) one underlying blob.
In tiny.proto, see conv11 and conv11_2.
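To make the weight-sharing idea concrete outside of Caffe, here is a minimal numpy sketch (the single-channel setup and function names are illustrative, not from the repo): a "recurrent conv" just applies one set of convolution weights N times in a row, exactly what the matching `param { name }` entries achieve in the prototxt.

```python
import numpy as np

def conv2d(x, w, b, pad=1):
    """Naive single-channel 'same' 2D convolution with zero padding."""
    k = w.shape[0]
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * w) + b
    return out

def recurrent_conv(x, w, b, n_steps=2):
    """Apply the SAME weights n_steps times: the weight-sharing trick
    the Caffe prototxt expresses via identical `param { name }` entries."""
    for _ in range(n_steps):
        x = conv2d(x, w, b)
    return x
```

With `n_steps=2` this is exactly the conv4_1/conv4_2 pair above: two stacked convolutions whose weights are the same blob.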
> and is it important? or would a normal convolutional layer still produce equally good results?
The paper Recursive Recurrent Nets With Attention Modeling for OCR in the Wild says it is important.
Hi Michal, Thank you very much for your prompt response.
> Paper to read: https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Lee_Recursive_Recurrent_Nets_CVPR_2016_paper.html
> easy way to implement recurrent convolution in caffe is by weight sharing - ie N times convolution layer with same weights.
Thanks for the pointer and explanation.
> in tiny.proto see conv11 and conv11_2
I see it now. However, as I understand, tiny.prototxt defines the RPN instead of the recognition network. So it seems like even without recurrent conv, your recognition net still performs decently. I will try using recurrent conv as well.
I have another small question: the training code in this repo is not fully "end-to-end", right? The gradients from the recognition network do not flow to the RPN network. Right now, as I understand, the two networks are trained using different solvers, and there are some normalization steps being performed in numpy to connect the outputs from the RPN to the recognition model. Am I missing anything? Have you tried making it fully end-to-end?
Thank you again and have a nice day.
On 28/08/18 12:44, Duc Minh Nguyen wrote:
> Have you tried making it fully end-to-end?
You are right - it is just 2 networks. The OCR net is learning on "imperfect" proposals, so you can see it as additional data augmentation.
We tried full end-to-end with no success (the usual story: you make it fully differentiable and then you need to break it again, something like the FOTS approach https://arxiv.org/abs/1801.01671v2 ).
Got it. Thank you very much for the clarification and, of course, for kindly sharing the code and models.
@MichalBusta Hi, I think there is a problem with "recurrent conv". You said it appears in tiny.proto, but in the paper "recurrent conv" is in Table 1, the fully-convolutional network for text recognition (which corresponds to model_cv.proto). In model_cv.proto there is no conv layer with a shared (reused) param.
Hi, the shared version is just a fast demo (the goal was to fit a 1GB GPU for a demonstration almost 2 years ago). There are quite a lot of deviations from the full "paper" version (smaller detection network, smaller OCR ...).
So that's it, thank you.