ddk-sysu issues

Results: 9 issues of ddk-sysu

Is this a bug when flipping the image?

Hi @suyukun666, I wonder whether these [two lines](https://github.com/suyukun666/S2CNet/blob/main/data/augmentations.py#L334-L335) are a bug. When the image is flipped, x1 and x2 should be flipped correspondingly so that x2 >= x1. When I try...
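To illustrate the concern, here is a minimal sketch of a horizontal flip that keeps x2 >= x1. It is not the code from the linked file: the function name `hflip_boxes`, the `(x1, y1, x2, y2)` box layout, and the continuous-coordinate convention (`new_x = width - x`) are assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 <= x2.
Box = Tuple[float, float, float, float]


def hflip_boxes(boxes: List[Box], image_width: float) -> List[Box]:
    """Mirror boxes around the vertical centre line of the image.

    After mirroring, the old right edge becomes the new left edge,
    so x1 and x2 must swap roles to keep x2 >= x1.
    """
    flipped = []
    for x1, y1, x2, y2 in boxes:
        new_x1 = image_width - x2  # old right edge -> new left edge
        new_x2 = image_width - x1  # old left edge -> new right edge
        flipped.append((new_x1, y1, new_x2, y2))
    return flipped


boxes = [(10.0, 20.0, 50.0, 80.0)]
flipped = hflip_boxes(boxes, image_width=100.0)
```

If the flip only computed `width - x1` and `width - x2` without swapping them, the result would have x1 > x2, which is the situation the issue describes.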

Is there still a plan to release the training code and the training dataset collected for the paper?

About the input format of the bbox or ROI

Hi @suyukun666, in the croppingdataset.py file, the code that processes the bbox or ROI in TransformFunction is as follows: ``` MOS = [] for annotation in annotations: transformed_bbox['xmin'].append(math.floor(float(annotation[1]) * scale_width))...

Inference on a video with 32 frames is too slow for mplug_owl3

Hi @LukeForeverYoung, when running inference on a video with 32 frames and params = {'num_beams': 3, 'repetition_penalty': 1.2,}, the inference time is about 30s~120s. Any idea how to speed it up?

Can internvideo-2.5 support multiple images as input?

Hi, the example given on HF uses a video path as input. Can it also support multiple images as input? If so, could you give a code example showing how...

Which torch version should be used?

When loading the torch model, it fails with 'rms_norm' not found. After searching on Google, I found that torch>=2.4.0 supports it.
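A quick way to check this before loading the model is to compare the installed version string against the 2.4 minimum mentioned above. The sketch below is only illustrative: the helper names `parse_major_minor` and `meets_minimum` are made up for this example, and it uses only the standard library so it can be run without torch installed.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.4.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: Tuple[int, int] = (2, 4)) -> bool:
    """Return True if `version` is at least `minimum` (major, minor)."""
    return parse_major_minor(version) >= minimum


# With torch installed, the check would look like:
#   import torch
#   meets_minimum(torch.__version__)
```

Comparing integer tuples avoids the string-comparison pitfall where '2.10' would sort before '2.4'.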

When trying to use flash_attention_v2, `_prepare_4d_causal_attention_mask_for_sdpa` is not found

After installing `flash_attn` and running the scorer.py script, I hit this problem: ``` NameError: name '_prepare_4d_causal_attention_mask_for_sdpa' is not defined. Did you mean: '_prepare_4d_causal_attention_mask'? ``` Any idea how to solve it?
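One way to narrow down a `NameError` like this is to check whether the installed library actually exposes the missing name. The helper below is a hypothetical diagnostic written for this example, not part of any library. It assumes the name is expected to come from `transformers.modeling_attn_mask_utils`; whether the model code imports it from there is not confirmed by the issue.

```python
import importlib


def module_has_name(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and defines `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is missing, or one of its own imports failed.
        return False
    return hasattr(module, attr)


# Assumed location of the helper; adjust if the model code imports it elsewhere:
#   module_has_name(
#       "transformers.modeling_attn_mask_utils",
#       "_prepare_4d_causal_attention_mask_for_sdpa",
#   )
```

If the check returns True, the name exists in the installed package and the problem is more likely a skipped or conditional import in the model code; if it returns False, the installed version probably does not provide it.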

I found the reason why the composition acc is lower when training end2end

Hi @bo-zhang-cs, when testing your released model, I find that the acc is lower than in the paper, which is 73%. I tried to figure out how to fix this...