
raw text -mode test_text -task ext --> min/max length not working

Open ghost opened this issue 5 years ago • 3 comments

Hello.

When doing extractive summarization of raw text using bertext_cnndm_transformer and trying to tweak min/max length, the output is always the same.

e.g. python train.py -mode test_text -task ext -test_from /home/peter/projects/summary/PreSumm/models/bertext_cnndm_transformer.pt -text_src /home/peter/projects/summary/PreSumm/raw_data/raw_data.txt -min_length 200 -max_length 1000

and

python train.py -mode test_text -task ext -test_from /home/peter/projects/summary/PreSumm/models/bertext_cnndm_transformer.pt -text_src /home/peter/projects/summary/PreSumm/raw_data/raw_data.txt -min_length 1000 -max_length 2000

produce exactly the same output.

Has anyone run into the same problem?

ghost, May 15 '20 08:05

Hi, I am also trying to test on raw text, but I am unable to find the output file. Can you tell me where I can find it? I can only see .candidate and .gold files in the results folder. The .candidate file contains the same text as the raw text file.

After running this command, it only says "Validation xent: 0 at step -1".

AyeshaSarwar, May 18 '20 14:05

The final output should be in the .candidate file. If you run into this kind of problem, see this comment: https://github.com/nlpyang/PreSumm/issues/130#issuecomment-600965008
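To check whether a run actually changed the summary length, it can help to count the sentences in the .candidate file. The sketch below assumes the file holds one summary per line with sentences joined by a `<q>` marker; that separator and the helper names `split_candidate_line` and `count_sentences` are assumptions for illustration, so adjust them to what your file really contains.

```python
def split_candidate_line(line, sep="<q>"):
    """Split one summary line into its sentences (separator is an assumption)."""
    return [part.strip() for part in line.split(sep) if part.strip()]


def count_sentences(path, sep="<q>"):
    """Return the number of sentences in each non-empty line of the file."""
    with open(path, encoding="utf-8") as handle:
        return [len(split_candidate_line(line, sep))
                for line in handle if line.strip()]
```

If two runs with different -min_length / -max_length values give the same counts, the flags had no effect on the extractive output.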

ghost, Jun 05 '20 07:06

If this is still a problem: I found a hard-coded limit of 3 sentences here, in one of the files. Changing that value made the model produce longer extractive summaries.

You can change it to whatever summary length you want. It looks like args is accessible at that point, but I haven't tried using it for this.

# if ((not cal_oracle) and (not self.args.recall_eval) and len(_pred) == 3): break
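For illustration only, here is a minimal standalone sketch of what that check does, with the limit turned into a parameter. It is not the PreSumm code: `select_sentences`, `ranked_ids`, and `max_sentences` are hypothetical names, and the sketch assumes the model has already ranked sentence indices from best to worst.

```python
def select_sentences(sentences, ranked_ids, max_sentences=3):
    """Pick sentences in ranked order, stopping at max_sentences.

    sentences     : list of source sentences (strings)
    ranked_ids    : sentence indices sorted best-first (assumed model output)
    max_sentences : hypothetical replacement for the hard-coded 3
    """
    picked = []
    for idx in ranked_ids:
        if idx >= len(sentences):
            # Skip indices that point past the end of the document.
            continue
        picked.append(sentences[idx].strip())
        if len(picked) == max_sentences:
            # This is the early exit that caps the summary length.
            break
    return picked


doc = ["s0", "s1", "s2", "s3", "s4"]
ranking = [4, 2, 0, 9, 1, 3]
short_summary = select_sentences(doc, ranking)                  # default cap of 3
long_summary = select_sentences(doc, ranking, max_sentences=5)  # raised cap
```

With a cap like this in place, the summary length depends only on the sentence count, which would explain why changing other length settings leaves the output unchanged.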

Andrei997, Jul 08 '20 08:07