
Results 55 comments of Thomas-MMJ

> i tried `pip install triton==1.0.0`

There is no PyPI 1.0.0 build of triton for Windows, and there is nothing recent either. You will have to build triton from source on...

Updated to the latest, but I still get the two failures if I run the one test first.

```
pytest ./tests/test_layers_utils.py::AttentionBlockTest ./tests/test_models_unet.py
================================================================================= short test summary info ==================================================================================
FAILED tests/test_models_unet.py::UNetLDMModelTests::test_from_pretrained_accelerate - IndexError:...
```

Now this is bizarre: if I run test_attention_block_default and then test_from_pretrained_accelerate immediately after each other, test_from_pretrained_accelerate passes; if I run all three, the first two pass and the third fails....

Changing the order also changes the results. Here none fail:

```
pytest ./tests/test_layers_utils.py::AttentionBlockTests::test_attention_block_default \
  ./tests/test_models_unet.py::UNetLDMModelTests::test_from_pretrained_accelerate_wont_change_results \
  ./tests/test_models_unet.py::UNetLDMModelTests::test_from_pretrained_accelerate \
  ./tests/test_models_unet.py::UNetLDMModelTests::test_from_pretrained_hub
```

Here 1 fails:

```
pytest ./tests/test_layers_utils.py::AttentionBlockTests::test_attention_block_default \
  ./tests/test_models_unet.py::UNetLDMModelTests::test_from_pretrained_accelerate \
  ./tests/test_models_unet.py::UNetLDMModelTests::test_from_pretrained_accelerate_wont_change_results \
  ./tests/test_models_unet.py::UNetLDMModelTests::test_from_pretrained_hub
```

Note that it is reproducible in WSL Debian Linux on this same device; the Debian install is using a different PyTorch, etc.

@sgugger suggested that the memory wasn't being cleared. If I add

```python
import gc

import torch


def clear_memory(self):
    if torch.cuda.is_available():
        torch.cuda.synchronize()
        torch.cuda.empty_cache()
    # https://forums.fast.ai/t/clearing-gpu-memory-pytorch/14637
    gc.collect()
```

(from https://www.programcreek.com/python/?CodeExample=clear+memory) and run it at the end...

I've submitted a pull request to fix this; hopefully it will be reviewed and merged this coming week.

To work on a 3090 with 12GB you need to use DeepSpeed.

```
accelerate launch --use_deepspeed --zero_stage=2 --gradient_accumulation_steps=1 \
  --offload_param_device=cpu --offload_optimizer_device=cpu train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME --use_auth_token \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR...
```
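For reference, the same DeepSpeed settings can also live in accelerate's YAML config (normally written by running `accelerate config`) instead of being passed as launch flags. The key names below are my assumption based on accelerate's DeepSpeed plugin and are worth double-checking against your accelerate version:

```yaml
# Sketch of a default_config.yaml -- key names assumed, verify
# against the output of `accelerate config` for your version
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2
  gradient_accumulation_steps: 1
  offload_optimizer_device: cpu
  offload_param_device: cpu
mixed_precision: fp16
num_processes: 1
```

With that in place, the launch line reduces to `accelerate launch train_dreambooth.py ...` with only the script arguments.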