guided-diffusion-keras

Training sample from custom dataset

Open axel588 opened this issue 2 years ago • 3 comments

I couldn't manage to train on a custom dataset; many parts of the sample training code call an external dataset. Is it possible to have sample training code for custom datasets? Using utils_load_dataset didn't work for the training case. From what I understand, the embedding is a CLIP-encoded list of strings using their tokeniser, but much of this is hard to set up. The idea would be to have a simple training sample that can be trained on a custom dataset; it's something truly missing in many repositories.

Thanks for the work you've done !

axel588 · Feb 15 '23 23:02

Hi - what dataset are you trying to use? For the text embedding you can use the get_text_encodings function, and the images can just be resized to the appropriate size and saved as a NumPy file.
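A rough sketch of the image half of that suggestion, for a folder of paired .png and .txt files. Everything here is an assumption for illustration rather than code from the repository: the helper name `folder_to_arrays`, the 64-pixel default size, the RGB uint8 layout, and the use of Pillow for loading. The target size and any pixel normalization should match whatever the training script expects. The returned captions are the strings that would then be passed to get_text_encodings, whose exact signature is not shown in this thread.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def folder_to_arrays(image_dir, out_path, size=64):
    """Resize every PNG in image_dir to size x size RGB, save them as one
    uint8 array, and collect the caption stored in the matching .txt file.

    Hypothetical helper: the name, default size, and array layout are
    assumptions, not part of guided-diffusion-keras.
    """
    paths = sorted(Path(image_dir).glob("*.png"))
    if not paths:
        raise ValueError(f"no .png files found in {image_dir}")

    images, captions = [], []
    for path in paths:
        with Image.open(path) as im:
            # Force 3 channels and a fixed square size (aspect ratio is not kept).
            resized = im.convert("RGB").resize((size, size))
            images.append(np.asarray(resized, dtype=np.uint8))
        # Caption file shares the image's name, e.g. cat.png -> cat.txt.
        captions.append(path.with_suffix(".txt").read_text(encoding="utf-8").strip())

    data = np.stack(images)  # shape: (N, size, size, 3)
    np.save(out_path, data)
    return data, captions
```

Calling `folder_to_arrays("my_images", "train_images.npy", size=64)` would write the image array and return the captions in the same order, so row `i` of the array and caption `i` stay aligned.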

apapiu · Feb 16 '23 00:02

Thanks @apapiu for your answer. The main issue I have is with 16_16_latent_embeddings.npy: I have no idea how to reproduce this kind of file, and I'm not sure how to transform images into a 'latent embedding'. I have a folder of images.png and images.txt ... files, and I don't know how to convert it to a latent embedding. My attempt so far was to create a dataset that returns a numpy array of the image and calls get_text_encodings on the text in getitem, without success.

axel588 · Feb 18 '23 19:02