Training sample from custom dataset
I couldn't manage to train on a custom dataset: many parts of the sample training code call an external dataset. Would it be possible to have a sample training script for custom datasets? Using utils_load_dataset didn't work for the training case. From what I understand, the embedding is a list of strings encoded with CLIP using its tokenizer, but much of this is hard to set up. The idea would be a simple training example that can run on a custom dataset; it's something truly missing in many repositories.
Thanks for the work you've done!
Hi - what dataset are you trying to use? For the text embeddings you can use the get_text_encodings function, and the images can just be resized to the appropriate size and saved as a numpy file.
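Something along these lines should work for the preprocessing (rough sketch only; `get_text_encodings` in the repo is the authoritative way to build the text side - the CLIP checkpoint, the image size, and the use of the pooled embedding below are my assumptions, so adjust them to match the repo):

```python
# Rough preprocessing sketch. Assumptions: CLIP checkpoint, target image size,
# and use of the pooled text embedding; get_text_encodings in the repo is the
# authoritative version for the text encodings.
import numpy as np
import torch
from pathlib import Path
from PIL import Image
from transformers import CLIPTokenizer, CLIPTextModel

data_dir = Path("my_dataset")   # hypothetical folder of .png + .txt pairs
img_size = 128                  # assumed target resolution

# Text: encode the captions with CLIP's text encoder
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_model = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14").eval()

captions = [p.read_text().strip() for p in sorted(data_dir.glob("*.txt"))]
tokens = tokenizer(captions, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    text_emb = text_model(**tokens).pooler_output.numpy()
np.save("text_encodings.npy", text_emb)

# Images: just resize and stack into one array
imgs = np.stack([
    np.asarray(Image.open(p).convert("RGB").resize((img_size, img_size)))
    for p in sorted(data_dir.glob("*.png"))
])
np.save("images.npy", imgs)
```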
Thanks @apapiu for your answer. The main issue I have is with 16_16_latent_embeddings.npy: I have no idea how to reproduce this kind of file, and I'm not sure how to turn images into "latent embeddings". I have a folder of image .png and .txt files and don't know how to convert that into latent embeddings. My attempt so far was to create a dataset that returns a numpy array of the image and calls get_text_encodings for the text in __getitem__, without success.
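Is the idea to run the images through a VAE encoder to get the 16x16 latents? Something like the sketch below is what I imagine (the VAE checkpoint, the 128 px input size and the 0.18215 scaling factor are guesses on my part, not taken from this repo):

```python
# Sketch of how I imagine the 16x16 latents are produced -- is this the right idea?
# Assumes a Stable Diffusion style VAE from diffusers; the exact checkpoint,
# input size and scaling factor used by this repo are guesses.
import numpy as np
import torch
from pathlib import Path
from PIL import Image
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema").eval()

latents = []
for p in sorted(Path("my_dataset").glob("*.png")):
    img = Image.open(p).convert("RGB").resize((128, 128))       # 128 px -> 16x16 latent (factor 8)
    x = torch.from_numpy(np.asarray(img)).float() / 127.5 - 1.0 # scale pixels to [-1, 1]
    x = x.permute(2, 0, 1).unsqueeze(0)                         # HWC -> 1xCxHxW
    with torch.no_grad():
        z = vae.encode(x).latent_dist.sample() * 0.18215        # guessed SD scaling factor
    latents.append(z.squeeze(0).numpy())

np.save("16_16_latent_embeddings.npy", np.stack(latents))
```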