CogVideo
Visualizing video DiT attention maps
System Info
Visualize attention maps for video generation models based on Diffusers transformers.
Information
- [ ] The official example scripts
- [X] My own modified scripts
Reproduction
1. Reference paper: https://arxiv.org/abs/2412.18597
2. Code: https://github.com/Passenger12138/attention-map-diffusers-vdm.git
Expected behavior
Text-to-text attention map
Text-to-video attention map
word "jacket"
image
video https://github.com/user-attachments/assets/8ae0f67e-abdb-4aa4-bb65-95b31feae222
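
For readers unfamiliar with the two kinds of map named above, here is a minimal NumPy sketch of how they can be sliced out of one joint attention matrix. It is an illustration under stated assumptions, not the code from the linked repository: it assumes a single attention head, a joint sequence with the text tokens first and the video patch tokens after them, and the function names `joint_attention_weights`, `text_to_text_map`, and `text_to_video_map` are hypothetical.

```python
import numpy as np


def joint_attention_weights(query, key):
    """Softmax(Q K^T / sqrt(d)) for one head; query and key are (seq_len, head_dim)."""
    head_dim = query.shape[-1]
    scores = query @ key.T / np.sqrt(head_dim)
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


def text_to_text_map(weights, num_text_tokens):
    """Block where text queries attend to text keys (assumes text tokens come first)."""
    return weights[:num_text_tokens, :num_text_tokens]


def text_to_video_map(weights, num_text_tokens, token_index, frames, height, width):
    """Attention from every video patch (query) to one text token (key).

    Returns an array of shape (frames, height, width), one heat map per latent frame.
    """
    num_video_tokens = frames * height * width
    if weights.shape[0] != num_text_tokens + num_video_tokens:
        raise ValueError("sequence length does not match text + video token counts")
    column = weights[num_text_tokens:, token_index]
    return column.reshape(frames, height, width)


if __name__ == "__main__":
    # Toy sizes chosen only for illustration: 4 text tokens, 2 frames of 3x3 patches.
    rng = np.random.default_rng(0)
    num_text, frames, height, width, head_dim = 4, 2, 3, 3, 8
    seq_len = num_text + frames * height * width
    q = rng.standard_normal((seq_len, head_dim))
    k = rng.standard_normal((seq_len, head_dim))

    w = joint_attention_weights(q, k)
    print(text_to_text_map(w, num_text).shape)
    print(text_to_video_map(w, num_text, 2, frames, height, width).shape)
```

In a real model the query and key tensors would have to be captured inside each attention layer, typically averaged over heads and layers, and the index of a word such as "jacket" would come from the tokenizer; none of that is shown here.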