Yuntian Deng
Are you using the raw images? Those could be too large; we crop the images so that only the equations are kept (script `preprocess_images.py`).
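For illustration, cropping an image to its content can be sketched roughly as below. This is not the code in `preprocess_images.py`; it is a minimal pure-Python sketch that assumes a grayscale image stored as a list of pixel rows with a white (255) background. The function name `crop_to_content` and the `background` parameter are hypothetical.

```python
def crop_to_content(pixels, background=255):
    """Crop a grayscale image (list of rows) to the bounding box of its
    non-background pixels. Assumes all rows have the same length."""
    # Rows that contain at least one non-background pixel.
    rows = [i for i, row in enumerate(pixels)
            if any(p != background for p in row)]
    if not rows:
        return []  # blank image: nothing to keep
    # Columns that contain at least one non-background pixel.
    cols = [j for j in range(len(pixels[0]))
            if any(row[j] != background for row in pixels)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


# Toy 4x5 image with two dark pixels surrounded by white margin.
image = [
    [255, 255, 255, 255, 255],
    [255,   0, 255, 255, 255],
    [255, 255,  10, 255, 255],
    [255, 255, 255, 255, 255],
]
cropped = crop_to_content(image)
```

Only the 2x2 region around the dark pixels survives here, which is why cropped inputs take much less memory than raw pages.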
Hmm, did you subsample the images to 0.5x the original size? We keep the original size because this math dataset contains fractions, integrals and matrices, which have varying heights,...
Thanks for the reply! But it throws an OOM error on a single Titan X GPU; it'd be nice if there were a flag like accumulate-gradients/update-freq to be able to...
Sorry, but I did not quite get the question. Did you mean how to prepare the training data from the synthetic word dataset? The number (71711, 64541, etc.) corresponds to the...
And you can easily verify that the generated file is correct, since the characters are also indicated in the filename. E.g., for the line `./3000/7/182_slinking_71711.jpg 71711`, the word is "slinking".
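A quick way to run that check over a whole file might look like the sketch below. It assumes each line has the form `<path> <number>` and that the filename follows `<index>_<word>_<number>.jpg`, as in the example above; the helper name `parse_annotation` and the returned field names are hypothetical.

```python
import os


def parse_annotation(line):
    """Split one '<path> <number>' line and recover the word from the filename."""
    path, number = line.split()
    stem = os.path.splitext(os.path.basename(path))[0]  # e.g. "182_slinking_71711"
    # The first field is the index and the last is the number;
    # whatever sits in between is the word.
    index, rest = stem.split("_", 1)
    word, number_in_name = rest.rsplit("_", 1)
    return {
        "path": path,
        "word": word,
        "number": number,
        # True when the number in the filename matches the one on the line.
        "consistent": number == number_in_name,
    }


entry = parse_annotation("./3000/7/182_slinking_71711.jpg 71711")
print(entry["word"], entry["consistent"])
```

Any line where `consistent` is false would point to a mismatch between the filename and the number written next to it.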
Thanks for pointing this out! I'm still working on cleaning up the evaluation script, but the script we used is already included as part of this repo: https://github.com/da03/markup2im/blob/main/eval_utils/image_evals.py
Yeah, it was just a placeholder; q is not actually used in that mode.
Nope, it's not included here. Relaxed alignment underperforms baseline soft attention (see Table 1 in our paper), and since we need to pretrain with E_p \log p and then finetune...
Sorry, I just noticed this issue... It's because the mixture approach is only used on GSM8K, not on multiplication. Multiplication's CoT is deterministic given the input by design, so...