xiaotingxuan
When we load CLIP models, e.g. `model_1, preprocess = clip.load("RN50", device=device, jit=False)` and `model_2, preprocess = clip.load("ViT-B/16", device=device, jit=False)`, the image encoders in model_1 and model_2 are obviously different (ResNet vs. ViT),...
Hi, I am a greenhorn in diffusion models. According to the DALL-E 2 paper, the prior model is used to predict CLIP image embeddings from CLIP text embeddings. I think they design this...
Hi, I am a greenhorn in diffusion models. I find something strange when I use the diffusion prior model to generate image embeddings. First, I set `prior_cond_scale = 2.` and sample...
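For context on what a cond scale does: assuming `prior_cond_scale` behaves like the standard classifier-free guidance scale (this is an assumption about the library, not confirmed by the post), a scale of 1 reproduces the purely conditional prediction and larger values extrapolate away from the unconditional one. A minimal sketch:

```python
# Sketch of classifier-free guidance scaling (assumption: prior_cond_scale acts
# like the usual CFG scale; names here are illustrative, not the library's API).
def guided_prediction(pred_uncond, pred_cond, cond_scale):
    # cond_scale = 1.0 returns pred_cond unchanged; cond_scale > 1.0 pushes
    # the result further along the (cond - uncond) direction.
    return [u + cond_scale * (c - u) for u, c in zip(pred_uncond, pred_cond)]

# With scale 2, each conditional offset from the unconditional baseline doubles.
print(guided_prediction([0.0, 0.0], [1.0, 2.0], 2.0))  # → [2.0, 4.0]
```

This is why sampled embeddings can look "strange" at larger scales: the output is an extrapolation, not a convex mix, so it can leave the region the model was trained on.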
Hello, thanks for sharing the data. Could you please tell me the method used to generate the video captions in the WebVid dataset? Please provide some insights into whether the captions...
Hello, thanks for sharing your code, it is really helpful. I notice there is a hyperparameter top-p; the code is [here](https://github.com/Shark-NLP/DiffuSeq/blob/8bfafcbb26df218073b8117234afb9de9dfcbec9/diffuseq/gaussian_diffusion.py#L381-390). When we run decoding, this hyperparameter is set...