Where does the score file in FastRerank come from?
In config.py, on lines 26-28,
it seems the preprocessing step needs article.txt, title.txt, template, samples.index.json and _score.json to be set up before the whole process can run.
But after running the retrieve step, I only got train/test/dev.sample.index in addition to the original article and title files.
So how can I get the other data I need, such as sample.index.json and _score.json?
You can use the index file to get the templates (the summaries of the corresponding training articles); each line contains the 30 indices for one sample. The score is the ROUGE-1 of each template evaluated against the true summary. I'm sorry I didn't include this code: when I implemented this project there was no suitable Python wrapper for ROUGE evaluation, so I had to run the Perl version separately for this job. That script was, however, later overwritten by other code.
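Since the original scoring script is not available, here is a minimal sketch of how the scores could be regenerated in pure Python. It is not the author's code and not the official Perl ROUGE: rouge1_f1 is a simple unigram-overlap F1 on whitespace tokens, and whether the original scores were recall, precision or F1 is an assumption. The function names, the whitespace-separated index format, and the use of the line number as art_idx are also assumptions.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between two whitespace-tokenized strings.

    Simplified stand-in for ROUGE-1 (no stemming, no stopword removal).
    """
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    # Clipped overlap: each unigram counts at most as often as it
    # appears in both strings.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def build_score_records(index_lines, sample_titles, train_titles):
    """Build one record per sample with art_idx, scores and tp_idx.

    index_lines:   lines of the retrieved index file, assumed to hold
                   whitespace-separated indices into train_titles.
    sample_titles: true summaries of the samples, one per index line.
    train_titles:  summaries of the training articles (the templates).
    """
    records = []
    for art_idx, line in enumerate(index_lines):
        tp_idx = [int(tok) for tok in line.split()]
        scores = [
            rouge1_f1(train_titles[i], sample_titles[art_idx])
            for i in tp_idx
        ]
        records.append(
            {"art_idx": str(art_idx), "scores": scores, "tp_idx": tp_idx}
        )
    return records
```

The resulting list can be written out with json.dump. For scores comparable to published ROUGE numbers, the Perl script or a maintained wrapper would be the safer choice.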
Could you give a quick example of what these 2 json files look like? I'm now getting a text file after running the Retrieve part and am not sure how the required .json files are supposed to be formatted.
The score.json looks like this, [{"art_idx":"0","scores":[0.25,0.1333333333,0.25,0.25,0.1111111111,0.125,0.1428571429,0.2666666667,0.2857142857,0.2666666667,0.2666666667,0.2,0.25,0.125,0.2666666667,0.2666666667,0.2666666667,0.2,0.1538461538,0.375,0.625,0.25,0.5,0.2666666667,0.125,0.5333333333,0.2666666667,0.2666666667,0.25,0.125],"tp_idx":[280563,468740,2977802,2978740,1305283,810428,143628,3305902,96755,227145,227356,228893,2668569,2669230,2669605,2579854,2579826,86884,54116,88311,186211,342885,414963,558914,1305361,897042,2608945,2832328,554728,98514]}]
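To check that a generated file matches this layout, a small sketch like the following may help. It assumes the file is a JSON list of records with the three keys shown above, where scores and tp_idx are parallel lists; the helper name ranked_templates is hypothetical and is not part of the project.

```python
import json


def ranked_templates(record):
    """Pair each template index with its score, highest score first."""
    scores, tp_idx = record["scores"], record["tp_idx"]
    if len(scores) != len(tp_idx):
        raise ValueError(
            "scores and tp_idx differ in length for art_idx "
            + str(record["art_idx"])
        )
    # sorted() is stable, so templates with equal scores keep their
    # original retrieval order.
    return sorted(zip(tp_idx, scores), key=lambda pair: pair[1], reverse=True)


# A shortened, made-up record in the same shape as the example above.
raw = '[{"art_idx": "0", "scores": [0.25, 0.625, 0.5], "tp_idx": [11, 22, 33]}]'
for record in json.loads(raw):
    print(record["art_idx"], ranked_templates(record))
```

For a real file, the string would be replaced by json.load on the opened score file.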
Actually, I didn't use sample.index.json in my code; it is left over from a previous version that I forgot to delete.