keras-image-ocr
Words with repeated characters omit the second occurrence of that character

I solved this by replacing decode_predict_ctc with the decode_batch function below.
import itertools

import numpy as np

# For a real OCR application, this should be beam search with a dictionary
# and language model. For this example, best path is sufficient.
def decode_batch(test_func, word_batch):
    out = test_func([word_batch])[0]
    ret = []
    for j in range(out.shape[0]):
        # Skip the first two timesteps, then take the best label per step.
        out_best = list(np.argmax(out[j, 2:], 1))
        # Collapse consecutive repeated labels.
        out_best = [k for k, g in itertools.groupby(out_best)]
        outstr = labels_to_text(out_best)
        ret.append(outstr)
    return ret
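
Why this keeps doubled letters may be easier to see on a toy input. The sketch below is not the code from the example: ALPHABET, BLANK, greedy_ctc_decode and one_hot_path are hypothetical names made up for illustration, and it assumes the CTC blank is the last class index. The point is the order of operations: collapse repeats first, then drop blanks, so a blank between two identical labels keeps them apart.

```python
import itertools

import numpy as np

# Hypothetical alphabet for illustration; assumes the CTC blank is the
# last class index (one past the end of the alphabet).
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
BLANK = len(ALPHABET)


def greedy_ctc_decode(scores):
    """Best-path decode of one (timesteps, classes) score matrix."""
    best = np.argmax(scores, axis=1)
    # Collapse runs of identical labels first...
    collapsed = [int(k) for k, _ in itertools.groupby(best)]
    # ...then drop blanks, so "l, blank, l" survives as two letters.
    return "".join(ALPHABET[k] for k in collapsed if k != BLANK)


def one_hot_path(labels, num_classes=BLANK + 1):
    """Build a score matrix whose argmax at each timestep is the given label."""
    return np.eye(num_classes)[labels]


h, e, l, o = (ALPHABET.index(c) for c in "helo")

# A blank separates the two l's, so both are kept.
with_blank = [h, h, e, l, l, BLANK, l, o, o]
# Without the blank, the run of l's collapses into one.
without_blank = [h, h, e, l, l, l, o, o]

print(greedy_ctc_decode(one_hot_path(with_blank)))     # hello
print(greedy_ctc_decode(one_hot_path(without_blank)))  # helo
```

If a decoder drops the second letter even when the network emits a blank between the two, the decoder is likely merging repeats after removing blanks, or merging them twice.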