[Dtype] Align dtype casting behavior with Transformers and Accelerate
Recent changes in transformers (https://github.com/huggingface/transformers/pull/20602) and accelerate (https://github.com/huggingface/accelerate/pull/920) force us to align the dtype casting behavior in diffusers as well. For more information, see: https://discuss.pytorch.org/t/discrepancy-between-loading-models-with-meta-tensors-and-normal-load-from-state-dict/168295
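For illustration, here is a minimal PyTorch sketch of the kind of discrepancy the linked forum thread's title points at. It is an assumption that this is the exact case these PRs address. `load_state_dict` copies checkpoint values into the existing parameters and so keeps the module's dtype, while replacing a meta parameter with the checkpoint tensor takes on the checkpoint's dtype. The manual `setattr` replacement is a hypothetical stand-in for a meta-tensor loader, not the Accelerate implementation.

```python
import torch
from torch import nn

# A float16 "checkpoint" for a tiny layer (illustrative only).
checkpoint = {
    "weight": torch.ones(2, 2, dtype=torch.float16),
    "bias": torch.zeros(2, dtype=torch.float16),
}

# Path 1: regular loading. load_state_dict copies values in place,
# so the parameters keep the module's dtype (float32 by default).
regular = nn.Linear(2, 2)
regular.load_state_dict(checkpoint)

# Path 2: meta-tensor loading. The module holds no real data, so each
# parameter is replaced by the checkpoint tensor and inherits its dtype.
# (Hypothetical stand-in for what a meta-tensor loader does.)
meta_model = nn.Linear(2, 2, device="meta")
for name, tensor in checkpoint.items():
    setattr(meta_model, name, nn.Parameter(tensor, requires_grad=False))

print(regular.weight.dtype)     # dtype kept from the module
print(meta_model.weight.dtype)  # dtype taken from the checkpoint
```

Passing an explicit target dtype on both paths is one way to make the two results agree.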
This pretty much reverses: https://github.com/huggingface/diffusers/pull/1449
Let's wait to merge this until https://github.com/huggingface/accelerate/pull/920 is merged.
@pcuenca @patil-suraj feel free to merge whenever. A short message here explaining what changed would also make sense.