DeepTextSpotter: How to prepare training data for retraining?

I am using the ICDAR data and have converted it into this format:

0 0.5203125 0.21458333333333332 0.5325 0.135 0 PROPER
0 0.5234375 0.5104166666666666 0.535 0.205 0 FOOD
0 0.521875 0.7947916666666667 0.53 0.13125 0 PRONTO

What should I do next? I am new to Caffe. I saw that the MNIST demo converts its data to LMDB format, but I don't know how to prepare data for this project.

I have another question: in tiny.proto, the data layer is named "OnDiskData", but I can't find this layer in the project files, which seems strange.

Thanks a lot.

Oct 16 '18 12:10 xxlxx1

See this issue: https://github.com/MichalBusta/DeepTextSpotter/issues/10

Oct 16 '18 13:10 Gitchenguang

ondisk_data_layer.cpp is in the Caffe code, in caffe/src/caffe/layers/.

Make a text file listing the image filenames (png or jpg), using either absolute paths or paths relative to the directory that holds the text file. In tiny.prototext, set the source to your text file of image filenames:

data_param { source: "/path/to/your/list/list_icdar2015.txt" }
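If it helps, here is a minimal sketch of how such a list file could be generated. The function name write_image_list and the choice to write absolute paths are my own assumptions, not something from the DeepTextSpotter repo; relative paths should work as well, as described above.

```python
import os

# Extensions treated as images; adjust if your dataset uses others.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def write_image_list(image_dir, list_path):
    """Write one absolute image path per line to list_path and return the paths.

    Hypothetical helper: it only builds the list of image filenames that
    the data layer's `source` parameter points to.
    """
    paths = sorted(
        os.path.abspath(os.path.join(image_dir, name))
        for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    with open(list_path, "w") as f:
        for path in paths:
            f.write(path + "\n")
    return paths
```

For example, write_image_list("/path/to/your/images", "/path/to/your/list/list_icdar2015.txt") would produce the file referenced by source (both paths are placeholders).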

Oct 16 '18 15:10 mattroos

@xxlxx1 How did you convert your ICDAR data into that format?

Oct 17 '18 02:10 ghost

@a41888936 With dup_boxes_icdar17.py from https://github.com/MichalBusta/dataset_conversions
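To sanity-check converted files, a small parser for lines like the ones in the first post might look like the sketch below. The field names (label, cx, cy, w, h, angle, text) are only my guess at the meaning of the seven columns and are not confirmed by the repo; the code relies only on the layout "six numbers followed by a transcription".

```python
def parse_annotation_line(line):
    """Split one annotation line into six numeric fields and the text.

    Field names are assumptions for illustration; only the layout
    (six numbers, then the transcription) is taken from the example lines.
    """
    parts = line.split(None, 6)  # at most 7 pieces; the last keeps any spaces
    if len(parts) != 7:
        raise ValueError("expected 6 numeric fields and a text: %r" % line)
    label = int(parts[0])
    cx, cy, w, h, angle = (float(v) for v in parts[1:6])
    return {
        "label": label,
        "cx": cx,
        "cy": cy,
        "w": w,
        "h": h,
        "angle": angle,
        "text": parts[6].strip(),
    }


record = parse_annotation_line(
    "0 0.5203125 0.21458333333333332 0.5325 0.135 0 PROPER"
)
```

A malformed line (too few columns or a non-numeric value) raises ValueError, which makes it easy to find broken annotation files before starting training.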

Oct 17 '18 07:10 xxlxx1

@linchenguang @mattroos Thanks a lot. Now I can train, but I still have some questions.

What is the role of cmp_trie in validation? I am not very familiar with the ICDAR competition.
There are too many "Train net output #484763: trans" lines during training. Why? I only trained for 20 iterations, but there are hundreds of thousands of these lines.

Oct 18 '18 07:10 xxlxx1

What is the role of cmp_trie in validation? I am not very familiar with the ICDAR competition.

None, it is just for dictionary decoding.

There are too many "Train net output #484763: trans" lines during training. Why? I only trained for 20 iterations, but there are hundreds of thousands of these lines.

That is default Caffe behavior: it expects leaf blobs to have just one output (the loss value), so it logs one line for every element of a larger output blob such as "trans".

To suppress it, you can change the log level.

Oct 18 '18 08:10 MichalBusta

In Python:

import os
os.environ['GLOG_minloglevel'] = '2'  # hide messages below severity 2
import caffe

You have to set it before importing caffe.

The glog severity levels are: 0 - INFO, 1 - WARNING, 2 - ERROR, 3 - FATAL. Messages below the value you set are suppressed (for me, a value of 1 still gave a LOT of output). See https://stackoverflow.com/questions/29788075/setting-glog-minloglevel-1-to-prevent-output-in-shell-from-caffe

Oct 18 '18 12:10 mattroos