DAI comments

Can LlamaGen predict a [EOS] token when inferencing?

No, we do not need to.

Can LlamaGen predict a [EOS] token when inferencing?

Hmm, it would work, but it would be useless: the image has a fixed latent size, so the model always runs exactly the same number of steps during inference.
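
To make the fixed-length point concrete, here is a minimal sketch, not LlamaGen's actual sampling code. The image size, downsample factor, and codebook size are assumed values, and the random draw is a hypothetical stand-in for the model's next-token prediction.

```python
import random

def sample_image_tokens(image_size=256, downsample=16, codebook_size=16384, seed=0):
    """Generate a fixed number of image tokens; no [EOS] check is needed."""
    rng = random.Random(seed)
    side = image_size // downsample      # latent grid side length
    num_tokens = side * side             # sequence length is known in advance
    tokens = []
    for _ in range(num_tokens):
        # Hypothetical stand-in for one autoregressive decoding step.
        tokens.append(rng.randrange(codebook_size))
    return tokens, side

tokens, side = sample_image_tokens()
print(len(tokens), side)
```

Since the loop bound comes from the latent grid size, a stop token would never change when decoding ends.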

Can LlamaGen predict a [EOS] token when inferencing?

But it would be useful if you want to train on images with different aspect ratios and add that information as the start token.

Can LlamaGen predict a [EOS] token when inferencing?

Yes. Another suggestion, based on my experience training many t2i models: using cross-attention instead of prepending the tokens to the sequence produces more promising results.

About ROPE in sample process

The PE tells the model about relative positions within the image; for example, a pixel should be more strongly related to its nearby pixels. But in...
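
As a rough illustration of the relative-position idea, the sketch below applies a 1D rotary embedding using complex numbers. It is not LlamaGen's implementation; `rope_rotate`, `attention_score`, and the base of 10000 are assumptions chosen for illustration. The point is that the query-key score depends only on the offset between the two positions.

```python
import cmath

def rope_rotate(vec, pos, base=10000.0):
    """Rotate each complex feature pair by a position-dependent angle."""
    dim = len(vec)  # number of complex pairs
    return [v * cmath.exp(1j * pos * base ** (-i / dim)) for i, v in enumerate(vec)]

def attention_score(q, k, q_pos, k_pos):
    """Unnormalized attention score between a rotated query and a rotated key."""
    q_rot = rope_rotate(q, q_pos)
    k_rot = rope_rotate(k, k_pos)
    # The real part of q * conj(k) equals the dot product of the paired real features.
    return sum((a * b.conjugate()).real for a, b in zip(q_rot, k_rot))

q = [1.0 + 2.0j, 0.5 - 1.0j]
k = [0.3 + 0.7j, -1.2 + 0.4j]

same_offset_a = attention_score(q, k, 3, 1)    # offset 2
same_offset_b = attention_score(q, k, 10, 8)   # offset 2 again
other_offset = attention_score(q, k, 10, 3)    # offset 7
print(same_offset_a, same_offset_b, other_offset)
```

The first two scores match because only the difference of positions survives in the product; the third differs because the offset changed.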

About ROPE in sample process

Oh, I see. So the text only takes effect via xv?

About ROPE in sample process

And does that also mean the text attention mask is useless?

Which parameters are trainable? Are the encoder and decoder in VQGAN fixed? Is the llama fixed?

If you only train the VQGAN, then obviously the VQGAN is trainable. If you train the GPT for image generation, then you only need to train the GPT model...

Mask guidance, inpainting and outpainting

In my opinion, autoregressive models are actually not very good at mask-guided image generation.

Mask guidance, inpainting and outpainting

If the mask looks just like a causal mask, I think it will work great, but I do not think we always have a causal mask in real life.
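
A small sketch of what "just like a causal mask" could mean, assuming raster-scan token order (an assumption here, as are the helper names `is_causal_like` and `unseen_context`). A mask is causal-like when every token to generate comes after every known token, so the model can condition on all of the known content.

```python
def is_causal_like(mask):
    """mask: 2D list of bools, True = token to generate.
    Returns True if, in raster order, all tokens to generate form a suffix."""
    flat = [cell for row in mask for cell in row]
    first_unknown = flat.index(True) if True in flat else len(flat)
    return all(flat[first_unknown:])

def unseen_context(mask):
    """Count known tokens that come after the first token to generate,
    i.e. context a left-to-right model cannot see when it starts filling in."""
    flat = [cell for row in mask for cell in row]
    if True not in flat:
        return 0
    first_unknown = flat.index(True)
    return sum(1 for cell in flat[first_unknown:] if not cell)

# Outpainting downward: the bottom half of a 4x4 grid is missing.
bottom_half = [[False] * 4, [False] * 4, [True] * 4, [True] * 4]

# Inpainting: a 2x2 hole in the middle of a 4x4 grid.
center_hole = [
    [False, False, False, False],
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, False],
]

print(is_causal_like(bottom_half), unseen_context(bottom_half))
print(is_causal_like(center_hole), unseen_context(center_hole))
```

The bottom-half case fits plain autoregressive decoding, while the center hole leaves known tokens to the right of and below the hole that the model cannot attend to while generating it.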