training_extensions

PyTorch text recognition training limits the output to 29 characters; how do I change the maximum number of output characters?

Open cnxdeveloper opened this issue 3 years ago • 4 comments

Thank you very much for the wonderful project. I ran into a problem when training the text recognition model text-recognition-0016 (YATR). The maximum label length in my dataset is 40 characters, so I changed config_0016.yaml as shown below. But when I test the model, the output is at most 29 characters. How can I solve this? Thank you very much for your support.

    head:
      type: TextRecognitionHeadAttention
      encoder_input_size: 1024
      encoder_dim_internal: 1024
      encoder_num_layers: 3
      decoder_input_feature_size: [3, 12]
      decoder_max_seq_len: 40
      decoder_dim_hidden: 1024
      decoder_sos_index: 0
      decoder_rnn_type: "GRU"
      dropout_ratio: 0.3

cnxdeveloper avatar Feb 12 '22 13:02 cnxdeveloper

Hi, @cnxdeveloper!

Thank you for your interest in the project. I will do my best to help you with your issue. I think the most probable reason is that at some step the model predicts the end symbol, and every symbol after the end symbol is removed in the post-processing stage. Could you please share the details of how (and where) you see the maximum of 29 characters? Is it the number of decoded output characters, or the number of indices predicted by the model?
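The post-processing behavior described above can be sketched as follows. This is a hypothetical illustration, not the repository's actual code; the end-symbol index and sequence values are made up for the example:

```python
# Assumed index of the end symbol (hypothetical; the real value
# depends on the model's vocabulary).
EOS_INDEX = 1

def truncate_at_eos(indices, eos_index=EOS_INDEX):
    """Keep predicted indices only up to (not including) the first end symbol."""
    if eos_index in indices:
        return indices[:indices.index(eos_index)]
    return indices

# The model may emit up to decoder_max_seq_len (e.g. 40) indices per sample,
# but an early end symbol shortens the decoded output:
raw = [5, 9, 9, 7, 1, 3, 3]  # end symbol (1) predicted at step 4
print(truncate_at_eos(raw))  # [5, 9, 9, 7]
```

So even with `decoder_max_seq_len: 40`, the decoded string is only as long as the prefix before the first predicted end symbol.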

morkovka1337 avatar Feb 14 '22 04:02 morkovka1337

I checked the results in the report in the val_result directory. In the log txt file, the predicted text and the GT text are shown side by side; here are some examples. I can see the prediction misses some characters at the end, and I don't know why. Yes, I understand that post-processing stops decoding at the end symbol. How do I fix this? Thank you very much.

Predict: 1 9 2 3 ) i s a c a n a d i a n p r o f e s
GT: 1 9 2 3 ) i s a c a n a d i a n p r o f e s s o r

Predict: b ả n m ẫ u : t h á n g t r o n g n ă m 1
GT: b ả n m ẫ u : t h á n g t r o n g n ă m 1 9 0 4

cnxdeveloper avatar Feb 14 '22 06:02 cnxdeveloper

> I can see the prediction misses some characters at the end

The chance of missing a character grows as the decoding procedure moves through the phrase, due to the architecture of the recurrent units. The model emits the end token at a particular step because it is not confident in the tokens after that point (i.e., they have low confidence values).
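A toy illustration of this early-stop behavior (assumed, not the project's code): greedy decoding takes the highest-confidence token at each step, and once the end token's score beats every other token, decoding ends even though more steps were allowed. The vocabulary and score values here are invented:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

EOS = 0  # assumed position of the end token in a tiny vocab [EOS, 'a', 'b']

# Made-up per-step scores from a decoder:
steps = [
    [0.1, 2.0, 0.5],  # 'a' is confident  -> continue
    [0.2, 0.4, 1.8],  # 'b' is confident  -> continue
    [1.5, 0.3, 0.2],  # nothing else is confident -> EOS wins, stop
]

decoded = []
for scores in steps:
    probs = softmax(scores)
    token = probs.index(max(probs))  # greedy argmax
    if token == EOS:
        break
    decoded.append(token)

print(decoded)  # [1, 2] -- shorter than the allowed number of steps
```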

> Yes, I understand that post-processing stops decoding at the end symbol. How do I fix this?

You can change the value of ignore_end_token in the function here.
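A sketch of what such a switch could do in post-processing (hypothetical signature; check the linked function for the actual implementation):

```python
def postprocess(indices, eos_index, ignore_end_token=False):
    """Decode predicted indices, optionally ignoring the end symbol.

    Hypothetical helper for illustration only.
    """
    if ignore_end_token:
        # Keep every predicted index, dropping only the end symbols themselves,
        # so the full decoder_max_seq_len output is visible.
        return [i for i in indices if i != eos_index]
    # Default behavior: cut the sequence at the first end symbol.
    out = []
    for i in indices:
        if i == eos_index:
            break
        out.append(i)
    return out

print(postprocess([5, 9, 1, 3], eos_index=1))                          # [5, 9]
print(postprocess([5, 9, 1, 3], eos_index=1, ignore_end_token=True))   # [5, 9, 3]
```

Note that ignoring the end token only exposes the trailing low-confidence predictions; it does not make the model more confident about them.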

morkovka1337 avatar Feb 14 '22 07:02 morkovka1337

> You can change the value of ignore_end_token in the function here.

How could I configure the RNN architecture to get more characters? :D But then when does decoding stop if I set ignore_end_token? I think the end token marks the end of a word during decoding.

cnxdeveloper avatar Feb 17 '22 11:02 cnxdeveloper

Closing this issue; please reopen it if there is any problem.

sungmanc avatar Apr 20 '23 04:04 sungmanc