Artanic30
> This fixed it for my code:
>
> ```python
> dataset = dataset.map(lambda samples: tokenizer.__call__(samples["text"]), batched=True)
>
> tokenizer.pad_token = tokenizer.eos_token
>
> tokenizer.add_special_tokens({'pad_token': '[PAD]'})
> model.resize_token_embeddings(len(tokenizer))
> ```
>
> ...
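For context, here is a minimal, self-contained sketch of what the quoted fix does, assuming a Hugging Face causal LM and tokenizer. The checkpoint name and the `train.txt` data file are placeholders, not from the thread; the quote shows two alternatives (reusing EOS as the pad token, or adding a dedicated `[PAD]` token), and this sketch uses the second.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint and data file; substitute your own.
checkpoint = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Llama-2 tokenizers ship without a pad token, so one has to be registered.
# Adding a dedicated '[PAD]' token requires growing the embedding matrix to
# cover the new token id.
tokenizer.add_special_tokens({"pad_token": "[PAD]"})
model.resize_token_embeddings(len(tokenizer))

# Tokenize the text column in batches, as in the quoted snippet.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]
dataset = dataset.map(
    lambda samples: tokenizer(samples["text"], truncation=True),
    batched=True,
)
```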
We use the original [llama2 7b](https://huggingface.co/meta-llama/llama-2-7b-hf) from Meta without any modification. There is no usage of InternViT-6B. More details can be found in our technical report, which will be released soon.
We clarify that the specific version of the llama2 text encoder is [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf).
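For anyone reproducing this, a minimal sketch of loading that exact checkpoint with plain `transformers` (access to the gated `meta-llama` repo is assumed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The unmodified text encoder referenced above.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
text_encoder = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
```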
Hi, I observe a significant performance drop with batch inference. The accuracy on MMMU decreased from 41 to 34 when running inference with batch size 8 on Bunny-1.1 4B. I'm wondering if...
> @mtsysin
>
> You may refer to [batch_inference.py](https://github.com/BAAI-DCAI/Bunny/blob/main/script/batch_inference.py).
>
> However, we failed to set the `attention_mask` of left-padding tokens to be `0`. So the `attention_mask` of inputs are...
Thanks for the fast reply. I'm currently testing the reproduced models; maybe I will try it later.
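As an aside on the left-padding point quoted above, here is a hedged sketch (plain `transformers` generation, not Bunny's actual multimodal pipeline) of batched inference where the tokenizer left-pads and the resulting `attention_mask` is `0` on the padding positions, so the model ignores them. The checkpoint name and prompts are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; Bunny's inference code differs from this sketch.
name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16).to("cuda")

# Decoder-only models should be left-padded for batched generation; otherwise
# pad tokens end up between the prompt and the generated continuation.
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompts = ["Question one goes here.", "A much longer second question goes here."]
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")

# inputs.attention_mask is 0 at the left-padding positions, which is exactly
# the mask the quoted comment says was not being set.
outputs = model.generate(
    **inputs,  # forwards both input_ids and attention_mask
    max_new_tokens=64,
    pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```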