AR Vision issues

Results: 9 issues of AR Vision

Export pix2struct models to ONNX

How can the pix2struct models be exported to ONNX? I think these models are not yet supported by the conversion command: `!python -m transformers.onnx --model=google/pix2struct-docvqa-base --feature=vision2seq-lm scratch/onnx --atol 1e-3`

DocVQA memory error while generating Hugging Face dataset for fine-tuning

I'm trying to fine-tune Donut from scratch (a lot of images) using the notebooks provided by nielsr. While I'm trying to generate the dataset with the Hugging Face libraries, my RAM memory...

DIT classifier fine tuning

I would like to fine-tune the dit classification model with a different number of classes, starting from the dit-rvlcdip model. Is there a tutorial notebook ready for this...

DIT Text Detection Inference

I would like to perform a simple inference with the dit model for text detection that you provide, given an input image. The readme of this component only details how...

DIT classifier training on custom dataset

I would like to train the dit classifier rvlcdip model on a custom dataset. My dataset would be organized in folders, where each folder would correspond to a class. To...
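A minimal sketch of how the folder-per-class layout described above could be indexed into (image path, label id) pairs before training. The function name `index_class_folders`, the list of accepted file extensions, and the alphabetical ordering of class ids are assumptions made for illustration; they do not come from the issue or from the DiT repository.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the files actually present.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def index_class_folders(root):
    """Index a dataset where each sub-folder of `root` is one class.

    Returns (samples, label2id): `samples` is a list of
    (image_path, label_id) tuples and `label2id` maps each folder
    name to an integer id assigned in alphabetical order.
    """
    root = Path(root)
    # Every direct sub-folder is treated as one class.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    label2id = {name: idx for idx, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            # Skip anything that does not look like an image file.
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((str(path), label2id[name]))
    return samples, label2id
```

The length of `label2id` then gives the number of classes that a new classification head would need.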

UDOP different models

Good morning @NielsRogge! As I understand it, the UDOP model can be used for different tasks such as docvqa, classification or information extraction. Looking at the notebooks you have...

UDOP - Fine tuning with bad metrics

I have obtained a fine-tuned model on FUNSD following the steps in your [notebook](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/UDOP/Fine_tune_UDOPEncoderModel_on_FUNSD_(HuggingFace_Trainer).ipynb); the only change introduced is the base model: `"microsoft/udop-large-512-300k"`. Train configuration: ``` training_args = TrainingArguments(output_dir="test", max_steps=3000,...

Different image inferences with same result

Hi, I'm running different tests with `demo/clipiqa_single_image_demo.py` and the `attribute_list = ['Quality', 'Brightness', 'Sharpness', 'Noisiness', 'Colorfulness', 'Contrast']`. First, I’ve seen that fitting a good size to the input image is...

Issues Fine Tuning for new Task

Hi, I am fine-tuning the Florence model on a new key-value extraction task. I started from the notebook at https://colab.research.google.com/drive/1hKDrJ5AH_o7I95PtZ9__VlCTNAo1Gjpf?usp=sharing#scrollTo=zqDWEWDcaSxN Parameters: ``` EPOCHS = 200 LR = 2e-6 Model base...