Zhenwei

17 comments by Zhenwei

Was the pretrained model you provided (at http://nlp.cs.unc.edu/data/model_LXRT.pth) trained for 20 epochs or 12 epochs? Can the 12-epoch pretrained model achieve 79% accuracy on the VQA dataset?

Thanks for your reply, but I'm not sure I've made my point clear. Sorry, I still have some questions.

> I assume the issue with `print(model.config.torch_dtype)` might be...
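For context, a minimal sketch of the kind of check under discussion, assuming a Hugging Face `transformers` model; the checkpoint name is just an example. `model.config.torch_dtype` is config metadata and does not have to match the dtype the weights are actually held in:

```python
import torch
from transformers import AutoModel

# Example checkpoint; any transformers model works the same way.
model = AutoModel.from_pretrained("bert-base-uncased", torch_dtype=torch.float16)

# What the config records about the dtype (metadata).
print(model.config.torch_dtype)

# What the parameters are actually stored as in memory.
print(next(model.parameters()).dtype)
```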

## Platform
MacBook Air (Retina, 13-inch, 2020), 1.1 GHz quad-core Intel Core i5
OS: macOS Catalina, v10.15.7 (19H2)

## Python version
As you can see in the screenshot

## pip...

The deploy video output is still being debugged. (Since I didn't re-fork, the file changes from the previous assignment are still visible under "Files changed"; just collapse those and look only at the chapter3 files. Sorry about that.)

> https://github.com/ParadoxZW/LLaVA-UHD-Better/blob/main/llava_uhd/adapt_llava.py#L136-L138
>
> Since the first token here is for CLS, shouldn't
>
> ```python
> m[:w * h] = True
> ```
>
> be changed to
>
> ```python
> m[:w *...
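To make the off-by-one concrete, a minimal sketch under the assumption that the sequence is one CLS token followed by `w * h` patch tokens; `m`, `w`, and `h` mirror the names in the quoted snippet, but the values are hypothetical:

```python
import torch

w, h = 24, 24          # patch grid size (hypothetical values)
n_tokens = 1 + w * h   # one CLS token plus one token per patch

m = torch.zeros(n_tokens, dtype=torch.bool)

# m[:w * h] = True would cover CLS plus only the first w * h - 1
# patches; shifting past the CLS token covers exactly the patches.
m[1:1 + w * h] = True
```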

> Also, I checked CLIP's code; in CLIP, shouldn't `-float('inf')` be what marks a masked position in the attention mask, rather than 0 or 1?

This is what I copied from the comments on the `forward` function of the `CLIPEncoder` class in `~/miniconda3/envs/llv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py`:

```
attention_mask (`torch.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
    Mask to avoid performing attention on padding token indices. Mask values selected in `[0, 1]`:
    - 1 for...
```
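The two views are compatible: the documented interface takes a `[0, 1]` mask, and internally the library expands it into an additive mask (0 where attention is allowed, a very large negative number where it is not) before adding it to the attention scores. A minimal sketch of that conversion, not the library's exact code:

```python
import torch

def expand_mask(mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    """Expand a (batch, seq_len) mask of 0s/1s into an additive
    (batch, 1, 1, seq_len) mask: 0.0 where attending is allowed,
    a large negative value where it is masked."""
    inverted = 1.0 - mask[:, None, None, :].to(dtype)
    return inverted.masked_fill(inverted.bool(), torch.finfo(dtype).min)

mask = torch.tensor([[1, 1, 1, 0]])           # last position is padding
additive = expand_mask(mask, torch.float32)   # 0, 0, 0, ~-3.4e38
# The additive mask is summed with the raw attention scores before
# softmax, driving masked positions' weights to effectively zero.
```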

Thank you for your attention and feedback. P.S. Support for training with the vision encoder unfrozen still has a small issue; I will update it as soon as possible (mainly because I have no GPUs these days, so training is delayed). If you plan to run experiments, I suggest not enabling that option for now.

> > > Also, I checked CLIP's code; in CLIP, shouldn't `-float('inf')` be what marks a masked position in the attention mask, rather than 0 or 1?
> >
> > This is what I copied from the comments on the `forward` function of the `CLIPEncoder` class in `~/miniconda3/envs/llv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py`:
> >
> > ```
> > attention_mask (`torch.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
> > Mask to avoid performing...

If you'd like, you can submit a PR based on the current version of the code to fix the issues above, and I can merge it. Of course, I can also make the update myself.