jinxixiang

Results 19 comments of jinxixiang

Same question here. I am wondering whether the TPS transform can be applied to high-resolution images simply by increasing the number of keypoints and TPS transformations.

I had the same issue. My server has no internet connection, which makes this painful. I found a workaround that may help you. **Step 1**: download...

Thank you for your help! The training loss and accuracy of masked prediction are attached.

Notes:
- i2t_train_acc, t2i_train_acc: contrastive top-1 accuracy
- mim_image_train_acc: monomodal image accuracy
- mim_train_acc: vl-ffn...

And the plot of the **MIM + MLM loss (same as BEIT3)**:

We set batch size = 1024. How does the contrastive loss on the VL-FFN help, given that we only use the V-FFN and L-FFN to compute cosine similarity for retrieval?
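For readers following the thread, here is a minimal sketch of what "cosine similarity for retrieval" with a symmetric image-to-text / text-to-image contrastive loss can look like. It is not the VLMO or BEIT3 code: the function names, the temperature of 0.07, and the use of plain Python lists instead of tensors are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity_matrix(image_embs, text_embs):
    """sim[i][j] = cosine similarity between image i and text j."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[k]
    return total / len(logits)

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of the i2t and t2i losses; temperature is an assumed value."""
    sim = cosine_similarity_matrix(image_embs, text_embs)
    logits_i2t = [[s / temperature for s in row] for row in sim]
    logits_t2i = [list(col) for col in zip(*logits_i2t)]  # transpose
    return 0.5 * (cross_entropy_diag(logits_i2t) + cross_entropy_diag(logits_t2i))

def top1_accuracy(sim):
    """Fraction of rows whose highest-scoring column is the matching index."""
    hits = 0
    for k, row in enumerate(sim):
        best = max(range(len(row)), key=row.__getitem__)
        hits += int(best == k)
    return hits / len(sim)
```

In this sketch, retrieval only needs `cosine_similarity_matrix` on the unimodal embeddings, while the loss is what shapes those embeddings during training. The batch size matters because each row is contrasted against every other sample in the batch.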

OK, thank you for your advice. I followed the implementation of the contrastive loss from VLMO. But maybe vl_i2t and vl_t2i are not the main reason preventing convergence? Also, I...

Thank you for your reply. _torchscale_ is a helpful toolkit for large model training, and we are happy to try it out later. But I suppose that the issue is...

I don't know whether the spatial palette mode in T2I adapter can fulfill your requirements. https://github.com/TencentARC/T2I-Adapter

> Yes, we urgently need a control for color. img2img is not very good for color control because img2img not only influences the color of the output but also...

@loboere Sure, we plan to release a Webui demo later.