wangpeng
Yes, this error is reported when I run the code: we can see that the shape of z is [4, 2048, 4096, 2], while Q is [4, 4096,...
Another question: I encountered this error, so I moved conv1 before the upsample. Why is conv1 placed after the upsample?
Which paper are you reading?
Hi, currently MGP-STR cannot process Chinese because the model has not been trained on Chinese data, and we have not found an effective method for segmenting Chinese words....
Of course, it is possible. One thing to note is that you will need to gather data in your language and retrain the language model of LevOCR.
Thanks for your reply, I will try it.
Answer the question using a single word or phrase.
Our evaluation is based on the VLMEvalKit framework with some minor modifications. We have adjusted MIN_PIXELS and MAX_PIXELS to 1,003,520 and 4,014,080, respectively. The prompt format is as follows: `...
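For reference, the two pixel bounds quoted above can be sanity-checked with a short sketch. The 28x28 unit and the `pixel_budget_scale` helper are assumptions made for illustration only; they are not taken from VLMEvalKit or from the evaluated model's preprocessing code.

```python
import math

# Values quoted in the comment above.
MIN_PIXELS = 1_003_520
MAX_PIXELS = 4_014_080

# Assumption (not stated in the source): budgets are counted in 28x28-pixel units.
UNIT = 28 * 28


def pixel_budget_scale(width: int, height: int) -> float:
    """Hypothetical helper: uniform scale factor that brings width * height
    into the range [MIN_PIXELS, MAX_PIXELS], preserving aspect ratio."""
    pixels = width * height
    if pixels > MAX_PIXELS:
        return math.sqrt(MAX_PIXELS / pixels)
    if pixels < MIN_PIXELS:
        return math.sqrt(MIN_PIXELS / pixels)
    return 1.0


# Both bounds are whole multiples of the assumed 28x28 unit.
print(MIN_PIXELS // UNIT, MAX_PIXELS // UNIT)  # 1280 5120
```

Under these assumptions, an image of 4000x4000 pixels would be scaled down by roughly half per side, while a 1400x1000 image already falls inside the range and is left unchanged.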