xjtupanda

Results 81 comments of xjtupanda

Met the same issue when loading the multilingual pre-trained wav2vec 2.0 (XLSR) models, and I used the sample code from the documentation.
```
import torch
import fairseq

cp_path = './ckpt/xlsr_53_56k.pt'
model, cfg,...
```

> Any update on this issue? I have a similar issue, though when trying to run `fairseq.checkpoint_utils.load_model_ensemble_and_task` on a wav2vec model that I fine tuned myself with `fairseq-hydra-train`. My issue...

@kaleko Basically this is because some keys are missing in the AudioPretrainingConfig, which leads to an inconsistency. I guess dropping those keys may solve your problem, but I don't...
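A minimal sketch of the "drop those keys" idea: strip task-config entries that the installed fairseq's AudioPretrainingConfig no longer defines, then re-save the checkpoint. The assumptions here are that the checkpoint is a dict with a `"cfg"` section holding a `"task"` mapping (as fairseq checkpoints are), and the key name `"eval_wer"` is only a placeholder for whichever keys your traceback actually names.

```python
# Hedged workaround sketch: remove stale keys from a checkpoint's task config.
# "eval_wer" is a hypothetical example key, not necessarily the one you must drop.
import torch

def drop_stale_task_keys(ckpt_path, out_path, stale_keys):
    # weights_only=False because fairseq checkpoints pickle config objects,
    # not just tensors.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    task_cfg = ckpt["cfg"]["task"]
    for key in stale_keys:
        task_cfg.pop(key, None)  # tolerate keys that are already absent
    torch.save(ckpt, out_path)
    return ckpt

# e.g. drop_stale_task_keys("checkpoint_best.pt", "fixed.pt", ["eval_wer"])
```

Whether this is safe depends on what the dropped keys controlled during pre-training; back up the original checkpoint first.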

@KiriKoppelgaard I suppose not, since I was just trying to extract features using pretrained models. But to work around this, in the end I used the transformers package and loaded the...
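A sketch of that transformers-based workaround for feature extraction, assuming `facebook/wav2vec2-large-xlsr-53` is the Hugging Face port of the `xlsr_53_56k.pt` checkpoint:

```python
# Load XLSR through transformers instead of fairseq and extract frame features.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "facebook/wav2vec2-large-xlsr-53"  # assumed HF port of xlsr_53_56k.pt
extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id)
model.eval()

# One second of dummy 16 kHz audio just to show the call shape;
# replace with your real waveform.
wav = torch.zeros(16000)
inputs = extractor(wav.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # (batch, frames, hidden)
```

This sidesteps the fairseq config mismatch entirely, since transformers ships its own config with the checkpoint.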

This is because the BERT encoder still needs to be downloaded from Hugging Face. You may download it from [Link](https://huggingface.co/bert-base-uncased/tree/main) and put those files into `~/.cache/huggingface/hub/models--bert-base-uncased/`.
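If you can reach the hub from a terminal, a programmatic alternative to copying the files by hand is `huggingface_hub.snapshot_download`, which places them in exactly that cache layout (assumes the `huggingface_hub` package, which ships alongside transformers):

```python
# Pre-populate the Hugging Face cache with the bert-base-uncased files.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    "bert-base-uncased",
    # For this demo, fetch only the small tokenizer/config files; drop the
    # filter to also pull the model weights.
    allow_patterns=["*.json", "*.txt"],
)
print(local_dir)  # a snapshot folder under ~/.cache/huggingface/hub/models--bert-base-uncased/
```

The returned path can also be used directly wherever a model id is expected.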

I solved it by downloading the files specified above and changing ```text_encoder_type``` in https://github.com/IDEA-Research/GroundingDINO/blob/60d796825e1266e56f7e4e9e00e88de662b67bd3/groundingdino/models/GroundingDINO/groundingdino.py#L107-L108 to the path of the downloaded files.

Download the files and save them to a local folder; then put the path of that folder in the parentheses (i.e., pass it as the argument).
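The reason pointing `text_encoder_type` at a folder works is that transformers' `from_pretrained` accepts a filesystem path anywhere it accepts a hub id. A small round-trip sketch (the temporary directory stands in for wherever you saved the downloaded files):

```python
# from_pretrained works the same on a local folder as on a hub id.
import tempfile
from transformers import AutoTokenizer

with tempfile.TemporaryDirectory() as local_dir:
    # Stand-in for the manual download step: write the BERT files locally.
    AutoTokenizer.from_pretrained("bert-base-uncased").save_pretrained(local_dir)
    # Offline-style load: only the files inside local_dir are read.
    tokenizer = AutoTokenizer.from_pretrained(local_dir)
    ids = tokenizer("a cat")["input_ids"]
```

For GroundingDINO you would do the same with the model weights as well, so that both the tokenizer and the BERT encoder resolve from the local folder.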

> > Download the files and save them to a local folder; then put in parentheses the path of the folder. > > when i down all the files, and...

The work has been added to our repo. Please consider citing our works:
```
@article{yin2023survey,
  title={A Survey on Multimodal Large Language Models},
  author={Yin, Shukang and Fu, Chaoyou and Zhao, Sirui...
```

The item has been updated. Please consider citing our works:
```
@article{yin2023survey,
  title={A Survey on Multimodal Large Language Models},
  author={Yin, Shukang and Fu, Chaoyou and Zhao, Sirui and Li, Ke...
```