bug-fixed

Results: 7 comments by bug-fixed

The problem also exists in `Zero3-offload`. It seems to lie in the parameter-partitioning part of Zero3: if the model has multiple parallel modules or frozen parameters, the offload...
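For readers unfamiliar with the setup being discussed, here is a minimal sketch of what a ZeRO-3 offload configuration generally looks like, written as a Python dict. The values are illustrative assumptions, not the settings from the script mentioned in this thread, and the `offload_targets` helper is hypothetical, added only to make the structure easy to inspect.

```python
import json

# Illustrative ZeRO stage 3 config with CPU offload.
# Assumption: these values are placeholders, not taken from the thread.
ds_config = {
    "zero_optimization": {
        "stage": 3,  # stage 3 partitions parameters, gradients, and optimizer state
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
}


def offload_targets(config: dict) -> dict:
    """Hypothetical helper: report where optimizer state and parameters are offloaded."""
    zero = config.get("zero_optimization", {})
    return {
        key: zero[key]["device"]
        for key in ("offload_optimizer", "offload_param")
        if key in zero
    }


if __name__ == "__main__":
    # Dump as JSON, the format DeepSpeed config files normally use.
    print(json.dumps(ds_config, indent=2))
    print(offload_targets(ds_config))
```

Dropping the `offload_param` entry while keeping `stage: 3` would be one way to check whether the behaviour is tied to parameter offload specifically.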

@tjruwase please try this example (https://github.com/haotian-liu/LLaVA/blob/main/scripts/v1_5/finetune.sh) with zero3-offload. Thanks.

@jomayeri, thanks for the response. The file needed in the script can be downloaded here: https://huggingface.co/liuhaotian/llava-v1.5-mlp2x-336px-pretrain-vicuna-13b-v1.5/tree/main. Unfortunately, I think it's difficult for me to prepare a more concise...

> @bug-fixed Does the same thing happen when you offload to CPU?

@jomayeri The machine I'm working on has very limited memory and is shared with others. It is difficult...

Same here. The `generate` speed in `gemma 2 9b` is very slow. Any ideas here? Thanks.

Hello, I'm also confused about a related problem. In the `ssl_default_config.yaml`, the parameters are `batch_size_per_gpu: 64` and `OFFICIAL_EPOCH_LENGTH: 1250`. And in the README, it says `Run DINOv2 training on 4...
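To make the arithmetic behind this kind of question concrete, here is a rough sketch of how the two config values combine. It assumes `OFFICIAL_EPOCH_LENGTH` counts iterations per epoch and that the global batch is the per-GPU batch times the number of GPUs. The `num_gpus` value is a hypothetical placeholder, since the hardware setup in the quoted README line is cut off above.

```python
def samples_per_epoch(batch_size_per_gpu: int, epoch_length: int, num_gpus: int) -> int:
    """Samples seen in one epoch, assuming epoch_length is iterations per epoch."""
    global_batch = batch_size_per_gpu * num_gpus  # samples per iteration across all GPUs
    return global_batch * epoch_length


if __name__ == "__main__":
    # batch_size_per_gpu and epoch_length come from the config quoted above;
    # num_gpus=8 is a made-up value used only for illustration.
    print(samples_per_epoch(batch_size_per_gpu=64, epoch_length=1250, num_gpus=8))
```

Under these assumptions, changing the GPU count while keeping both config values fixed changes the number of samples per "epoch" proportionally.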