
Will you release the in-context data generation pipelines?

Open kelisiya opened this issue 1 year ago • 4 comments

kelisiya • Apr 15 '25 06:04

We plan to release some data synthesis scripts, but we first have to pass internal control approval at ByteDance. Our plan B is to upload a demo dataset with just a few samples (10 or fewer).

If you are in a hurry, you can go straight to the paper, especially supplementary Section F. It is not hard to prototype a replica of our process from the description there, which starts from a t2i script. 😊
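The first stage of such a pipeline would be composing prompts that make a t2i model render the same subject twice in one image. The sketch below is purely illustrative: the template wording and function names are my own assumptions, not the prompts from the UNO paper (see supplementary F for the actual instructions).

```python
# Hypothetical first stage of an in-context data generation pipeline:
# build a single t2i prompt that yields a paired (side-by-side) image
# of one subject in two scenes. The template here is an assumption,
# not the one used in the UNO paper.

DIPTYCH_TEMPLATE = (
    "a photo of {subject}, shown twice side by side "
    "in two different scenes: {scene_a} and {scene_b}"
)

def build_diptych_prompt(subject: str, scene_a: str, scene_b: str) -> str:
    """Compose one t2i prompt producing an in-context image pair."""
    return DIPTYCH_TEMPLATE.format(
        subject=subject, scene_a=scene_a, scene_b=scene_b
    )

if __name__ == "__main__":
    # Example: one subject, two contexts -> one diptych prompt.
    print(build_diptych_prompt(
        "a red ceramic mug", "on a wooden desk", "on a beach at sunset"
    ))
```

Each generated prompt is then fed to whatever t2i model you use; the resulting wide image is later split into a reference/target pair.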

fenfenfenfan • Apr 15 '25 07:04

I read the paper and tried to reproduce the process, and I have some questions. Given the two images, should the input (ref) image be the entire left image, or the ref image cropped after text detection?

[image attachment]

kelisiya • Apr 16 '25 08:04


If you are trying to train a single-image-conditioned S2I model, we recommend treating the entire image (either the left or the right one) as the ref_img and the other as the tgt_img. This is the approach we used in our paper.
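The split recommended above — whole left half as ref_img, whole right half as tgt_img, with no cropping to the text-detection box — can be sketched as follows. The helper name is mine, not from the UNO codebase; the boxes are in PIL's `(left, upper, right, lower)` convention.

```python
# Sketch of turning one generated side-by-side image into a
# (ref_img, tgt_img) training pair: the entire left half becomes
# the reference and the entire right half the target.

def split_pair_boxes(width: int, height: int):
    """Return PIL-style crop boxes (left, upper, right, lower)
    for the left and right halves of a width x height diptych."""
    mid = width // 2
    ref_box = (0, 0, mid, height)        # left half  -> ref_img
    tgt_box = (mid, 0, width, height)    # right half -> tgt_img
    return ref_box, tgt_box

# With Pillow you would then crop the actual images, e.g.:
#   ref_box, tgt_box = split_pair_boxes(*image.size)
#   ref_img = image.crop(ref_box)
#   tgt_img = image.crop(tgt_box)
```

Swapping which half plays ref vs. tgt is a cheap way to double the pair count, since the paper treats either side as a valid reference.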

fenfenfenfan • Apr 17 '25 13:04

Hello, we have open-sourced the dataset (UNO-1M) used in our paper and released all the instructions used in the in-context data generation pipeline. We hope these help you.

fenfenfenfan • Aug 18 '25 07:08