
How can I get *.summary.final.org_sents and *.gold files?

Open friskit-china opened this issue 7 years ago • 3 comments

Hi,

I tried your code and got an error after the first epoch:

...
STEP A: Epoch 1 : Covered 83540/83564 : Minibatch CE Loss= 19.836004, Minibatch Accuracy= 0.878014
STEP A: Epoch 1 : Covered 83560/83564 : Minibatch CE Loss= 10.907549, Minibatch Accuracy= 0.878505
STEP A: Epoch 1 : Saving model after epoch completion
STEP A: Epoch 1 : Performance on the validation data
STEP A: Epoch 1 : Validation (1220) accuracy= 0.852109
STEP A: Epoch 1 : Writing final validation summaries
Writing predictions and final summaries ...
Traceback (most recent call last):
  File "document_summarizer_gpu2.py", line 252, in <module>
    tf.app.run()
  File "/s1_md0/v-botsh/anaconda/tf0.10_py36/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "document_summarizer_gpu2.py", line 245, in main
    train()
  File "document_summarizer_gpu2.py", line 169, in train
    validation_data.write_prediction_summaries(validation_logits, "step-a.model.ckpt.epoch-"+str(epoch), session=sess)
  File "/s1_md0/v-botsh/Research/Repo/sidenet/data_utils.py", line 61, in write_prediction_summaries
    self.process_predictions_rankedtopthree(modelname+"."+self.data_type)
  File "/s1_md0/v-botsh/Research/Repo/sidenet/data_utils.py", line 116, in process_predictions_rankedtopthree
    docsents = open(sent_filename).readlines()
IOError: [Errno 2] No such file or directory: 'CNN-DailyMail/JP-Hermann/cnn/validation-sent/8f6b39e6c63b0ae3546cdfeb8209693f292b060e.summary.final.org_sents'

I only have "*.summary.final" files, not "*.summary.final.org_sents" files.

Where can I get those org_sents files?

Besides, I noticed that the "*.gold" files are also needed for ROUGE evaluation. How can I get them?

Thanks!

friskit-china, Oct 24 '18 02:10

They are in the "Dataset with sideinfo" archive: http://kinloch.inf.ed.ac.uk/public/cnn-dm-sideinfo-data.zip. I have also updated the README so that those preprocessed files can be downloaded directly.

shashiongithub, Oct 24 '18 09:10

@shashiongithub Thank you for your response.

The zip you provided contains only "*.summary.final" files. The "*.summary.final.org_sents" and "*.gold" files are not included.

Or is there a script I can use to generate the .gold and .summary.final.org_sents files from these .summary.final files?

Thank you :)

friskit-china, Oct 24 '18 09:10

Oh, thanks. I have uploaded them as well. Shashi

shashiongithub, Oct 24 '18 10:10
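
For anyone hitting the same IOError: the crash only appears after a full epoch, so it may be worth checking the data directory before training. The sketch below is not part of the sidenet repository. `find_missing_companions` is a hypothetical helper, and it assumes that each companion file sits in the same directory as its "*.summary.final" file and is named by appending a suffix, which is what the path in the traceback suggests for org_sents. The naming of the "*.gold" files is not shown in this thread, so the ".gold" suffix is a guess you may need to adjust. It is written for Python 3.

```python
import os


def find_missing_companions(directory,
                            have_suffix=".summary.final",
                            need_suffix=".summary.final.org_sents"):
    """Return IDs of documents that have a `have_suffix` file in
    `directory` but no matching `need_suffix` file next to it.

    Assumption (not confirmed by the repository): both files live in
    the same directory and share the same document-ID prefix.
    """
    names = set(os.listdir(directory))
    missing = []
    for name in sorted(names):
        # Only look at the base files, e.g. "<id>.summary.final".
        if not name.endswith(have_suffix):
            continue
        doc_id = name[:-len(have_suffix)]
        if doc_id + need_suffix not in names:
            missing.append(doc_id)
    return missing
```

For example, `find_missing_companions("CNN-DailyMail/JP-Hermann/cnn/validation-sent")` uses the directory from the traceback above. An empty list means every document there has its org_sents file, and a non-empty list names the documents that would trigger the error.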