It seems that at least one of the inputs must have `requires_grad=True` for [torch.utils.checkpoint](https://pytorch.org/docs/stable/checkpoint.html) to work. A simple workaround for UNet-only training is to set `unet.conv_in.requires_grad_(True)`.
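For illustration, a minimal sketch of this workaround in a diffusers-style setup with a frozen base UNet (the checkpoint id and the frozen-base assumption are mine, not from the thread):

```python
from diffusers import UNet2DConditionModel

# Example checkpoint id; any diffusers UNet behaves the same way.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

# With the base weights frozen (as in LoRA training), no tensor entering the
# checkpointed blocks requires grad, so torch.utils.checkpoint builds no graph
# and the trainable parameters never receive gradients.
unet.requires_grad_(False)
unet.enable_gradient_checkpointing()

# Workaround: make the very first conv trainable so every downstream
# checkpointed block sees an input with requires_grad=True.
unet.conv_in.requires_grad_(True)
```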
> I face the same problem when enabling gradient checkpointing. Is there a way to solve this under joint text-encoder and UNet training? Kohya's repo seems to have solved...
> Interesting, but what if you didn't want to train the embeddings? In the case of LoRA, the embedding parameters are not passed to the optimizer. Therefore, it is not...
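A sketch of that trick (my wording, assuming a transformers `CLIPTextModel` as the text encoder): the embeddings get `requires_grad=True` purely so checkpointing works, but they are left out of the optimizer's parameter list and are therefore never updated.

```python
from transformers import CLIPTextModel

# Example checkpoint id for illustration.
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

text_encoder.requires_grad_(False)            # frozen base, LoRA-style
text_encoder.gradient_checkpointing_enable()

# Give gradient checkpointing a grad-requiring input at the very start of
# the network. The embedding weights are NOT handed to the optimizer, so
# they accumulate (unused) grads but are never updated.
text_encoder.text_model.embeddings.requires_grad_(True)

# Only the LoRA parameters (created elsewhere; hypothetical here) would be
# optimized, e.g.:
# optimizer = torch.optim.AdamW(lora_parameters, lr=1e-4)
```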
Maybe p. 14 of https://arxiv.org/abs/2202.00512 ("Progressive Distillation for Fast Sampling of Diffusion Models").
> Let's fix the tests :)

@sayakpaul I removed the meaningless spaces for the test. Are there any other changes required?
@sayakpaul Is this error relevant to this PR?

```
=========================== short test summary info ============================
FAILED tests/pipelines/unidiffuser/test_unidiffuser.py::UniDiffuserPipelineFastTests::test_attention_slicing_forward_pass - requests.exceptions.ReadTimeout: (ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 869e35f5-5627-4acf-9853-7187ce7d0656)')
====...
```
DeepSpeed does not seem to support 8bitAdam: https://github.com/huggingface/diffusers/pull/735. I got train_db.py working with DeepSpeed by removing the 8bitAdam option. DeepSpeed offloads optimizer states to the CPU, so VRAM usage is reduced...
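A sketch of that change (the flag names and the `build_optimizer` helper are placeholders for whatever the training script actually uses):

```python
import torch


def build_optimizer(params_to_optimize, lr, use_8bit_adam, deepspeed_enabled):
    """DeepSpeed (which offloads optimizer states to CPU) does not seem to
    support bitsandbytes' 8-bit Adam, so fall back to plain AdamW when
    DeepSpeed is enabled."""
    if use_8bit_adam and not deepspeed_enabled:
        import bitsandbytes as bnb  # only needed on the 8-bit path
        return bnb.optim.AdamW8bit(params_to_optimize, lr=lr)
    return torch.optim.AdamW(params_to_optimize, lr=lr)
```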
> On Windows? If so, how'd you install DeepSpeed?

Linux.
Same issue as #38.
I implemented it like this: https://github.com/laksjdjf/dezero-diffusion/blob/a223c7e2bb06e149ff0a8b0714fcc88fb38b08b7/modules/unet.py#L10-L40