AR Vision
How can the pix2struct models be exported to ONNX? I don't think these models are supported yet by the export command: `!python -m transformers.onnx --model=google/pix2struct-docvqa-base --feature=vision2seq-lm scratch/onnx --atol 1e-3`
I'm trying to fine-tune Donut from scratch (on a large number of images) using the notebooks provided by nielsr. While generating the dataset with the Hugging Face libraries, my RAM...
I would like to fine-tune the DiT classification model with a different number of classes, starting from the dit-rvlcdip model. Is there a tutorial notebook ready for this...
I would like to run a simple inference on an input image with the DiT model you provide for text detection. The README of this component only details how...
I would like to train the DiT rvlcdip classifier model on a custom dataset. My dataset would be organized in folders, where each folder corresponds to a class. To...
Good morning @NielsRogge! As I understand it, the UDOP model can be used for different tasks such as DocVQA, classification, or information extraction. Looking at the notebooks you have...
I have obtained a fine-tuned model on FUNSD by following the steps in your [notebook](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/UDOP/Fine_tune_UDOPEncoderModel_on_FUNSD_(HuggingFace_Trainer).ipynb); the only change introduced is the base model: `"microsoft/udop-large-512-300k"`. Training configuration: ``` training_args = TrainingArguments(output_dir="test", max_steps=3000,...
Hi, I'm running different tests with `demo/clipiqa_single_image_demo.py` and `attribute_list = ['Quality', 'Brightness', 'Sharpness', 'Noisiness', 'Colorfulness', 'Contrast']`. First, I've seen that choosing a suitable size for the input image is...
Hi, I am fine-tuning the Florence model on a new key-value extraction task. I started from the notebook at https://colab.research.google.com/drive/1hKDrJ5AH_o7I95PtZ9__VlCTNAo1Gjpf?usp=sharing#scrollTo=zqDWEWDcaSxN Parameters: ``` EPOCHS = 200 LR = 2e-6 Model base...
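The DiT classifier question above describes a custom dataset with one folder per class. As a rough sketch of how such a layout could be indexed, the snippet below uses only the Python standard library. The function name `index_image_folder`, the extension list, and the return shape are all hypothetical choices for illustration, not part of any DiT or Hugging Face API.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def index_image_folder(root):
    """Index a folder-per-class dataset (hypothetical helper).

    Returns (samples, label2id, id2label), where samples is a list of
    (image_path, class_id) pairs and class ids follow the sorted folder names.
    """
    root = Path(root)
    # Each immediate subdirectory of root is treated as one class.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    label2id = {name: idx for idx, name in enumerate(class_names)}
    id2label = {idx: name for name, idx in label2id.items()}

    samples = []
    for name in class_names:
        # Walk each class folder recursively and keep only image files.
        for path in sorted((root / name).rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, label2id[name]))
    return samples, label2id, id2label
```

The length of `label2id` gives the number of classes for the new classification head, and the two mappings can be kept alongside the model so that predictions are readable. How they are passed to the model depends on the training notebook being followed.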