yqy2001

21 comments by yqy2001

@cherry956 Just delete the [`scheduler`](https://github.com/haotian-liu/LLaVA/blob/3e337ad269da3245643a2724a1d694b5839c37f9/scripts/zero3_offload.json#L22) field in [zero3_offload.json](https://github.com/haotian-liu/LLaVA/blob/main/scripts/zero3_offload.json); that makes the lr scheduler behave properly.
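As a sketch of that workaround (the `scheduler` field name comes from the linked file; the helper name and file path here are illustrative), dropping the block from the DeepSpeed config hands lr scheduling back to the HF Trainer:

```python
import json


def drop_scheduler(cfg: dict) -> dict:
    """Return a copy of a DeepSpeed config without its "scheduler" block,
    so the Trainer's own lr scheduler takes effect instead."""
    cfg = dict(cfg)  # shallow copy; leave the original untouched
    cfg.pop("scheduler", None)
    return cfg


# Hypothetical usage against the LLaVA config:
# with open("scripts/zero3_offload.json") as f:
#     cfg = drop_scheduler(json.load(f))
```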

Yes, this is a normal phenomenon, as the output format of a pretrained model is hard to control. The behavior is also similar to that of LLaMA-1-30B, especially when the model...

Great! Thanks for the prompt response.

> Hi @yqy2001 now our [inference example](https://github.com/FoundationVision/VAR/blob/main/demo_sample.ipynb), [demo platform](https://var.vision/demo), and [model weights](https://huggingface.co/FoundationVision/var) are ready. I'm cleaning up the training code and it'll be ready in the next few days. Stay...

Hi there! I think this is a critical issue and urgently need a fix, as I am attempting to train on a very large-scale dataset using `datasets`. It is...

Hi there, what is the current status of this PR? It seems that everything works well. Will this be merged?

+1. I deleted the [`scheduler`](https://github.com/haotian-liu/LLaVA/blob/3e337ad269da3245643a2724a1d694b5839c37f9/scripts/zero3_offload.json#L22) field in [zero3_offload.json](https://github.com/haotian-liu/LLaVA/blob/main/scripts/zero3_offload.json), which makes the lr scheduler behave properly.

Hi there, it doesn't matter if you don't have an A100. You could try converting the Flan-T5 model to `float16` with the latest version of `transformers` (the old versions may have some bugs...

> > Anyone found out why this happens?
>
> The reason seems obvious
>
> ```
> >>> import torch
> >>> torch.compiled_with_cxx11_abi()
> False
> >>> torch.__version__
> ...
> ```