zhangao comments

Results: 26 comments of zhangao

gqa_bal_qla_testdev.json and gqa_all_qla_testdev.json missing

It seems that gqa_bal_qla_val.json is actually gqa_bal_qla_testdev.json, and the real gqa_all_qla_val.json has been merged into gqa_all_qla_train.json.

code problem

When DataParallel is applied to model.full_im_net, it scatters every input of full_im_net into several parts. However, only one input, rois, needs to be divided. You can use "rois_feature = self.full_im_net(full_im.repeat(gpu_num, 1,1,1),...
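The effect described above can be illustrated with a toy, pure-Python model. It assumes the wrapper splits each positional input along its first dimension into at most one chunk per device and then pairs the chunks per replica, so the shortest input decides how many replicas receive work. `chunk` and `scatter_inputs` are illustrative helpers written for this sketch, not PyTorch functions, and the lists stand in for tensors.

```python
import math


def chunk(seq, n):
    """Split seq into at most n contiguous chunks (toy stand-in for a dim-0 split)."""
    if not seq:
        return []
    size = math.ceil(len(seq) / n)
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def scatter_inputs(inputs, n):
    """Chunk every input, then pair the chunks per replica (zip stops at the shortest)."""
    return list(zip(*(chunk(x, n) for x in inputs)))


gpu_num = 2
full_im = ["image"]                      # a batch holding a single image
rois = ["roi0", "roi1", "roi2", "roi3"]  # the only input that should be divided

# Without repeating, the single image yields one chunk, so only one replica is fed.
naive = scatter_inputs((full_im, rois), gpu_num)

# Repeating the image gpu_num times gives every replica its own copy plus a rois chunk.
repeated = scatter_inputs((full_im * gpu_num, rois), gpu_num)
```

In this model `naive` feeds a single replica and silently leaves the second half of `rois` unprocessed, while `repeated` gives each of the `gpu_num` replicas one image copy and its own slice of `rois`, which is the point of the `full_im.repeat(gpu_num, 1,1,1)` call.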

bug

> /data/anaconda3/envs/scene_graph_benchmark/lib/python3.8/site-packages/torch/include/ATen/core/TensorBody.h:322:1: note: declared here
> T * data() const {
> ^ ~~
> /data/swfcode/Scene-Graph-Benchmark.pytorch/maskrcnn_benchmark/csrc/cuda/deform_conv_kernel_cuda.cu:855:106: warning: ‘T* at::Tensor::data() const [with T = c10::Half]’ is deprecated [-Wdeprecated-declarations]
> const scalar_t...

scene graph generation for custom image

Yes, you just need to provide the image. The visualization results are of [vg1800_motif_sgcls_I90_E100.pth](https://thunlp.oss-cn-qingdao.aliyuncs.com/ietrans/vg1800_motif_sgcls_I90_E100.pth). The corresponding command is:

```shell
bash cmds/1000/motif/demo_sgdet.sh visualization/demo_imgs/
```

About the metrics of VG-1800?

Sorry for the late reply. Please copy the file tools/vg1800/eval.py to your output dir (the dir with eval_results.pytorch). Then, run `python eval.py`. The relational triplet (e.g. top50 triplet: xxxx) results...
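The two steps above (copy the script, then run it from the output directory) could be scripted roughly as follows. This is only a sketch: `stage_eval_script` and `run_eval` are hypothetical helper names, and the check for `eval_results.pytorch` is an added safeguard based on the description above, not something eval.py is known to require.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def stage_eval_script(repo_root, output_dir):
    """Copy tools/vg1800/eval.py into output_dir and return the copied path."""
    src = Path(repo_root) / "tools" / "vg1800" / "eval.py"
    out = Path(output_dir)
    # The comment says the script belongs in the dir that holds eval_results.pytorch.
    if not (out / "eval_results.pytorch").is_file():
        raise FileNotFoundError(f"eval_results.pytorch not found in {out}")
    return Path(shutil.copy(src, out))


def run_eval(output_dir):
    """Run `python eval.py` with output_dir as the working directory."""
    return subprocess.run([sys.executable, "eval.py"], cwd=output_dir, check=True)
```

Calling `stage_eval_script(repo_root, output_dir)` followed by `run_eval(output_dir)` reproduces the manual steps; both paths are placeholders to be filled in with your own checkout and output directory.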

When will the source code for the real-time multimodal interaction demo be open-sourced? Could you describe the approach?

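Thanks for your interest. The real-time interaction demo is a pipeline system overall: the ChatGPT API integrates the descriptions of multiple frames, and the results are indeed not very stable. To mitigate this instability as much as possible, the framework we implemented roughly works as follows:

1. Keep the OmniLMM/MiniCPM-V description of each frame as short as possible to avoid too much irrelevant content. The prompt we use is "What is happening in the image?".
2. After each answer, run a history summarization step to keep the context from growing too long. The prompt we use is: "Before coming to the next round, please make summarization. You need to first illustrate the overall event. For...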
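The two ideas above (short per-frame captions, then a rolling summary after every answer) can be sketched as a small session object. This is a minimal sketch, not the authors' implementation: `FrameChatSession`, `describe_frame`, and `chat` are hypothetical names, the two callables stand in for the vision model and the ChatGPT API, and dropping the raw captions after each summary is an assumption. The summary prompt is cut off in the comment, so only its visible beginning is used here.

```python
from dataclasses import dataclass, field
from typing import Callable, List

FRAME_PROMPT = "What is happening in the image?"
# The full prompt is truncated in the source; only the visible beginning is kept.
SUMMARY_PROMPT = (
    "Before coming to the next round, please make summarization. "
    "You need to first illustrate the overall event."
)


@dataclass
class FrameChatSession:
    describe_frame: Callable[[object, str], str]  # hypothetical: (frame, prompt) -> caption
    chat: Callable[[str], str]                    # hypothetical: prompt text -> reply
    summary: str = ""                             # rolling summary of earlier rounds
    captions: List[str] = field(default_factory=list)

    def add_frame(self, frame) -> None:
        """Caption one frame with the short prompt and queue the caption."""
        self.captions.append(self.describe_frame(frame, FRAME_PROMPT))

    def ask(self, question: str) -> str:
        """Answer from the summary plus queued captions, then refresh the summary."""
        lines = []
        if self.summary:
            lines.append(f"Summary of earlier rounds: {self.summary}")
        lines += [f"Frame {i}: {c}" for i, c in enumerate(self.captions, 1)]
        context = "\n".join(lines)
        answer = self.chat(f"{context}\nQuestion: {question}")
        # Compress this round so the next prompt stays short (assumed behaviour).
        self.summary = self.chat(
            f"{context}\nQuestion: {question}\nAnswer: {answer}\n{SUMMARY_PROMPT}"
        )
        self.captions.clear()
        return answer
```

Because each round's prompt contains only the previous summary and the captions collected since then, the context length stays roughly bounded no matter how long the interaction runs.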

zhangao

About Oscar/Oscar+ model

About Oscar/Oscar+ model

Any detail report of minicpm v2?

SGCLS