parseq
parseq copied to clipboard
The model hallucinates blank images and creates output out of thin air
When using the ParseQ model for inference, if a blank image is input, the output content is out of thin air.
To solve this problem, I generated 10,000 blank images of varying lengths and set their labels to a single space ‘ ’, but these data were ignored when finetuning the model:
https://github.com/baudm/parseq/blob/1902db043c029a7e03a3818c616c06600af574be/strhub/data/dataset.py#L115
How can I solve this problem? Should I simply set remove_whitespace to false? For blank images of varying lengths, should I set their labels to a single space or to different numbers of spaces depending on the length of the image?