Jonghwan Mun
I also wonder how to obtain the scores in BBC and OVSD.
The video mask indicates the length of each video and guides the network to compute attention weights only on unmasked positions. This is required for batch-level training, where videos of different lengths are padded to a common length. For example,...
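As a rough illustration of the masking idea (not the code from the repository), here is a minimal pure-Python sketch. The names `length_mask` and `masked_softmax` and the example scores are hypothetical; it assumes a mask value of 1 marks a real video segment and 0 marks padding.

```python
import math


def length_mask(length, max_length):
    """Return 1 for real positions (t < length) and 0 for padded ones."""
    return [1 if t < length else 0 for t in range(max_length)]


def masked_softmax(scores, mask):
    """Softmax over `scores` that assigns zero weight where mask == 0."""
    valid = [s for s, m in zip(scores, mask) if m]
    if not valid:
        raise ValueError("mask must keep at least one position")
    # Subtract the max over valid positions for numerical stability.
    top = max(valid)
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical batch: two videos padded to 4 segments, true lengths 4 and 2.
batch_scores = [[0.5, 1.0, -0.3, 2.0], [0.1, 0.4, 9.9, 9.9]]
lengths = [4, 2]
batch_weights = [
    masked_softmax(scores, length_mask(n, 4))
    for scores, n in zip(batch_scores, lengths)
]
```

In the second video the padded positions carry large scores, yet they receive zero attention weight, and the weights over the two real segments still sum to 1.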
Sorry for the syntax errors. I updated the code as you mentioned. Thanks!
Sorry for the typo. I will revise it. Thanks!
Evaluation scores for the metrics are calculated in lines 118-124 of "ranking_caps.py". To reach lines 118-124, "eval_after_rerank" should be on and "isTest" should be off. Run the following line. stdbuf -oL python ranking_caps.py...
No, I did not train the original skip-thought model. I wrote code to extract the weights for the vocabulary of the Text-guided attention model from the full set of weights trained using Theano ([original...
The class "Accumulator" collects the results (e.g., [email protected]) of individual instances and returns their averaged value. So, value means the score of a metric, while count denotes ```the number of instances``` (or...
At each iteration at test time, the value (i.e., score) is computed as the average over the ```batch``` (i.e., v/nb), and count is incremented by 1 for the iteration. Then,...
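To make the value/count bookkeeping concrete, here is a minimal sketch of such an accumulator. It is an assumption-based illustration, not the repository's class: the method names `add` and `average` and the example numbers are hypothetical.

```python
class Accumulator:
    """Running average of a metric (hypothetical sketch)."""

    def __init__(self):
        self.total = 0.0  # sum of the values added so far
        self.count = 0    # how many units those values cover

    def add(self, value, count=1):
        self.total += value
        self.count += count

    def average(self):
        return self.total / self.count if self.count else 0.0


# Hypothetical test run: (sum of scores v, batch size nb) per iteration.
batches = [(3.0, 4), (1.0, 2)]

# Per-iteration usage as described above: add v / nb, count grows by 1.
per_batch = Accumulator()
for v, nb in batches:
    per_batch.add(v / nb, 1)

# Per-instance usage: add the summed score, count grows by nb.
per_instance = Accumulator()
for v, nb in batches:
    per_instance.add(v, nb)

batch_level_mean = per_batch.average()        # (0.75 + 0.5) / 2
instance_level_mean = per_instance.average()  # 4.0 / 6
```

The two averages agree only when every batch has the same size; with a smaller last batch, as in this example, the batch-level mean weights its instances more heavily.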
"resNet/prediction_result/res_predictions_10NN_test5000.json" file consists of output captions obtained by running "run_evaluation.sh" on "TextguidedATT/002_inference/resNet/" and I wrote the run_inference.sh to run the run_evaluation.sh file automatically. Can you check that you really have...
I wrote the train.lua file so that it can train models with both 'cudnn' and 'nn', but the layers of the ResNet that I downloaded from "https://github.com/facebook/fb.resnet.torch" are constructed to use 'cudnn', thus you need...