e4t-diffusion
Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models
**How many images and resources are needed for pre-training?** Thanks! I use a single A100 GPU (40 GB), and pre-training on one image takes 24 hours.
This line is `assert ckpt_path in MODELS, f"Choose from {list(MODELS.keys())}"`, and "e4t-diffusion-ffhq-celebahq-v1" is the only key of MODELS. So, in the function load_e4t_unet, if os.path.exists(ckpt_path) is False, you WILL get an assert...
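For context, here is a minimal sketch (not the repository's actual code) of the control flow described above, assuming MODELS maps release names to checkpoint locations: when ckpt_path is neither an existing local file nor a key of MODELS, the assert fires.

```python
# Sketch only: the MODELS contents and the loading step are assumptions;
# only the assert mirrors the line quoted above.
import os

MODELS = {
    "e4t-diffusion-ffhq-celebahq-v1": "<download-url>",  # hypothetical entry
}

def load_e4t_unet(ckpt_path):
    if not os.path.exists(ckpt_path):
        # Any value that is not a local path must be a known release name,
        # otherwise this assert raises.
        assert ckpt_path in MODELS, f"Choose from {list(MODELS.keys())}"
        ckpt_path = MODELS[ckpt_path]
    # ... load the UNet weights from ckpt_path
```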
Hello, I really admire your implementation and I am planning to use the e4t code, but unfortunately I can't run it. The main problem is at https://github.com/mkshing/e4t-diffusion/blob/main/pretrain_e4t.py#L238, where we...
the accelerate yaml file
```yaml
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
downcast_bf16: 'no'
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: fp8
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: ...
```
Thank you for your implementation! However, I have some questions about the settings and results. I use your **pretrained encoder.pt and weight_offsets.pt** on CelebA-HQ and FFHQ. I had to use the...
In the fine-tuning code, there is an assert that hard-codes the special token as {placeholder_token}.
```python
assert (
    "{placeholder_token}" in args.prompt_template
), "You must specify the location of placeholder token...
```
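As a hedged illustration of why the assert checks for the literal string, here is a small sketch (the argument names follow the issue; the template handling itself is an assumption): the "{placeholder_token}" marker in --prompt_template is validated first and only later replaced with the actual token.

```python
# Sketch only: the literal "{placeholder_token}" marker is required in the template,
# then substituted with the real token via str.format.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--prompt_template", default="a photo of {placeholder_token}")
parser.add_argument("--placeholder_token", default="*s")
args = parser.parse_args()

assert (
    "{placeholder_token}" in args.prompt_template
), "You must specify the location of the placeholder token with '{placeholder_token}'"

# The marker is later filled in with the actual placeholder token:
prompt = args.prompt_template.format(placeholder_token=args.placeholder_token)
print(prompt)  # e.g. "a photo of *s"
```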
Hi. Your implementation is working well for me, but in some cases the output is less than ideal. Would it be possible to upgrade this to work with Stable Diffusion...
I ran the pretraining script
```
CUDA_VISIBLE_DEVICES=1 accelerate launch pretrain_e4t.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --clip_model_name_or_path="ViT-H-14::laion2b_s32b_b79k" \
  --domain_class_token="cat" \
  --placeholder_token="*s" \
  --prompt_template=normal \
  --save_sample_prompt="a photo of the *s, a photo of the *s in monet style" \
  --reg_lambda=0.01 \
  --domain_embed_scale=0.1 \
  --output_dir="pretrained-cat" \
  ...
```