
[Core] introduce `ContinuousTransformer2DModelBlock` to replace intermediate `Transformer2DModel`

Open · sayakpaul opened this issue 1 year ago · 1 comment

What does this PR do?

This PR introduces a ContinuousTransformer2DModelBlock to replace the Transformer2DModel that is used as an intermediate block in model classes such as UNet2DConditionModel.

Currently, all our UNets that have cross-attention (UNetMotionModel, UNet2DConditionModel, and UNet3DConditionModel) use Transformer2DModel as an intermediate block.

However, this goes against our design. Transformer2DModel inherits from ModelMixin and ConfigMixin even though it is used as an intermediate block here. ModelMixin and ConfigMixin should be reserved for top-level model classes such as PixArtTransformer2DModel. Intermediate blocks should only inherit from nn.Module.

Since these UNets only use Transformer2DModel when the input is continuous, ContinuousTransformer2DModelBlock can be narrowly scoped: it only needs the code that handles continuous inputs.
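To make the idea concrete, here is a rough sketch of what such a block could look like. It is not the code from this PR: the class name `ContinuousTransformer2DBlockSketch`, its arguments, and the use of `nn.TransformerEncoderLayer` as a stand-in for the library's transformer blocks are all assumptions made for illustration, and cross-attention is left out for brevity. The point is only that the block inherits from nn.Module alone and contains just the continuous-input path (normalize, project in, flatten the spatial dimensions, run the transformer layers, restore the shape, project out, add the residual).

```python
import torch
from torch import nn


class ContinuousTransformer2DBlockSketch(nn.Module):
    """Hypothetical intermediate block: a plain nn.Module, continuous inputs only."""

    def __init__(
        self,
        in_channels: int,
        num_heads: int = 4,
        num_layers: int = 1,
        norm_num_groups: int = 32,
    ):
        super().__init__()
        self.norm = nn.GroupNorm(norm_num_groups, in_channels, eps=1e-6, affine=True)
        self.proj_in = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        # Stand-in for the library's transformer blocks (an assumption;
        # self-attention only, no cross-attention in this sketch).
        self.layers = nn.ModuleList(
            [
                nn.TransformerEncoderLayer(
                    d_model=in_channels,
                    nhead=num_heads,
                    dim_feedforward=4 * in_channels,
                    dropout=0.0,
                    batch_first=True,
                    norm_first=True,
                )
                for _ in range(num_layers)
            ]
        )
        self.proj_out = nn.Conv2d(in_channels, in_channels, kernel_size=1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        batch, channels, height, width = hidden_states.shape
        residual = hidden_states

        hidden_states = self.proj_in(self.norm(hidden_states))
        # (B, C, H, W) -> (B, H*W, C): one token per spatial location.
        hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(
            batch, height * width, channels
        )
        for layer in self.layers:
            hidden_states = layer(hidden_states)
        # (B, H*W, C) -> (B, C, H, W)
        hidden_states = hidden_states.reshape(
            batch, height, width, channels
        ).permute(0, 3, 1, 2)

        return self.proj_out(hidden_states) + residual
```

Used as an intermediate block, it maps a feature map to a feature map of the same shape, for example `ContinuousTransformer2DBlockSketch(in_channels=32, norm_num_groups=8)(torch.randn(2, 32, 8, 8))`. Because it is a plain nn.Module, it carries no config registration or `from_pretrained` machinery; that stays with the top-level model that owns it.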

sayakpaul · Jul 03 '24 08:07

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Sep 14 '24 15:09