Hubert issues

Results: 4 issues of Hubert

Variable-length recognition problems

Hello, I ran variable-length tests with the open-source model you provided and ran into two problems:

**1. Variable image width:** with transformer = dataset.resizeNormalize((280, 32)), any width other than 280 raises an error. CRNN fixes the height at 32 and scales the width proportionally, so the input should be (x, 32).

**2. Variable text length:** possibly because every training sample contained 10 characters, the prediction comes out at roughly 10 characters no matter how many characters the image contains. For example, after removing a few characters from the image ![20436312_1683447152](https://user-images.githubusercontent.com/29619323/54262929-75863f80-45aa-11e9-91d3-db90dec0dc04.jpg) and still feeding it in at 280*32, ![2043](https://user-images.githubusercontent.com/29619323/54262972-9189e100-45aa-11e9-8be8-39202d7f7f39.jpg) the result is:

predict_str:，__不愿意意意资 (9 characters) => prob:0.002346405293792486

![20437421_](https://user-images.githubusercontent.com/29619323/54267706-aa978f80-45b4-11e9-8837-8f83854d9ad9.jpg) predict_str:中国通信信位主办、《 (10 characters) => prob:0.05960559844970703

![20437421_21](https://user-images.githubusercontent.com/29619323/54267730-b6835180-45b4-11e9-9bde-992c4161c62d.jpg) predict_str:，（通信学会主主府 (9 characters) => prob:0.000349084148183465

![204](https://user-images.githubusercontent.com/29619323/54267929-1ed23300-45b5-11e9-82db-6cb1675b06e8.jpg) predict_str:叶国通信学会主里”《 (10 characters) =>...
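The proportional scaling described in point 1 (fix the height at 32, scale the width by the same factor) can be sketched as below. This is only an illustration of the arithmetic: `proportional_size` is a hypothetical helper, not part of the repository, and whether the released model accepts widths other than 280 depends on its architecture, which the issue does not establish.

```python
def proportional_size(width, height, target_height=32, min_width=1):
    """Return (new_width, target_height), preserving the aspect ratio.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target_height / height
    # Round to the nearest pixel and never return a zero-width image.
    new_width = max(min_width, round(width * scale))
    return new_width, target_height


# A 560x64 crop maps to 280x32; a 300x64 crop maps to 150x32.
print(proportional_size(560, 64))
print(proportional_size(300, 64))
```

The resulting (width, height) pair would then replace the fixed (280, 32) in the resize step, assuming the network can handle the resulting sequence length.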

how to generate a polygon label for "polygons in polygons"

I have the images and masks and can get the "BitMasks", but transfiner needs another ground truth, namely poly_masks: https://github.com/SysCV/transfiner/blob/5b61fb53d8df5484f44c8b7d8415f398fd283ddc/detectron2/data/detection_utils.py#L437 I don't know how to express the instances with mask_format=='polygon'. my mask...

Football temporal localization project: it uses PCM instead of the audio feature?

Following the documentation step by step, the feature-extraction stage saves three kinds of data to the pkl: video_features = { 'image_feature': np_image_features, 'audio_feature': np_audio_features, 'pcm_feature': np_pcm_features } But get_instance_for_bmn.py does not use audio_feature: feature_video = np.concatenate((image_feature, pcm_feature), axis=1) Also, in train_proposal/configs/bmn_football_v2.0.yaml: feat_dim: 2688 #train bmn with image feature. If add audio...

OverflowError: cannot fit 'int' into an index-sized integer

python -m scripts.animate --config configs/prompts/v1/v1-1-ToonYou.yaml

File "/export/software/anaconda3/envs/animatediff/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3370, in _pad
encoded_inputs["attention_mask"] = encoded_inputs["attention_mask"] + [0] * difference
OverflowError: cannot fit 'int' into an index-sized integer
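The failing line multiplies a list by `difference`, and Python raises this exact error when that multiplier is too large to be a valid sequence length. One plausible cause, which the traceback alone does not confirm, is that the tokenizer's maximum length was left at a very large placeholder value. The sketch below reproduces the error in plain Python and shows one way to guard against it; `pad_attention_mask`, `pad_attention_mask_capped`, `VERY_LARGE` and the cap of 77 are all assumptions made for illustration, not the library's code.

```python
def pad_attention_mask(attention_mask, max_length):
    """Mimic the failing line: pad the mask with zeros up to max_length."""
    difference = max_length - len(attention_mask)
    return attention_mask + [0] * difference


# Assumed stand-in for a tokenizer whose maximum length was never set.
VERY_LARGE = int(1e30)

try:
    pad_attention_mask([1, 1, 1], VERY_LARGE)
except OverflowError as exc:
    # The multiplier does not fit in a sequence index, so no list is built.
    print(type(exc).__name__, exc)


def pad_attention_mask_capped(attention_mask, max_length, cap=77):
    """Pad to min(max_length, cap); cap=77 is an assumed context length."""
    target = min(max_length, cap)
    difference = max(0, target - len(attention_mask))
    return attention_mask + [0] * difference


print(len(pad_attention_mask_capped([1, 1, 1], VERY_LARGE)))
```

If this is the cause, passing an explicit, finite maximum length when tokenizing would be the corresponding fix on the caller's side.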