Query Regarding Input Specifications for Animation Generation
Dear MagicAnimate Contributors,
I hope this message finds you well. I am reaching out to inquire about the specific input materials required for generating animations using the MagicAnimate tool. While the README.md provides a comprehensive guide, I would appreciate further clarification on the following points:
- What are the preferred dimensions and file formats for the input human images?
- Is there a recommended resolution or aspect ratio that ensures optimal animation results?
- Could you provide examples of the target description or reference that should accompany the input images for animation?
- Are there any limitations or considerations we should be aware of when selecting images for animation (e.g., background complexity, clothing type, pose)?
- Additionally, I am interested in understanding the process for training the MagicAnimate model with custom datasets. Could you provide insights or documentation on how to train the model from scratch or fine-tune it with specific data? This information would be invaluable for tailoring the animations to the unique requirements of my project.
I am keen to utilise MagicAnimate for an upcoming project and want to ensure that I prepare the input materials correctly to achieve the best possible outcomes.
Thank you for developing such an innovative tool and for your assistance with my query. I look forward to your guidance.
Best wishes, yihong1120
Hi, thanks for the interest. Below are my answers:
- There are no preferred dimensions, but we only experimented with 512x512. We use PNG as the image format.
- If you use our inference code directly, the recommended resolution is 512x512.
- We use null text, i.e. no text description accompanies the input image.
- There are no limitations, but the result could be better if the reference image shows the full body or is close to the target pose.
- We haven't prepared any documents for training or fine-tuning.
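For anyone preparing inputs based on the answers above, here is a minimal sketch of converting an arbitrary photo into a 512x512 PNG. It is not part of the MagicAnimate code base: the helper names `center_square_box` and `prepare_reference` are hypothetical, Pillow is an assumed dependency, and center-cropping is only one possible choice (it can cut off parts of a full-body shot, in which case padding to a square first may work better).

```python
from pathlib import Path

TARGET_SIZE = 512  # the only resolution the maintainers report experimenting with


def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, upper, right, lower) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def prepare_reference(src: str, dst: str, size: int = TARGET_SIZE) -> Path:
    """Center-crop src to a square, resize it to size x size, and save as PNG."""
    # Imported lazily so the crop helper above has no third-party dependency.
    from PIL import Image  # assumption: Pillow is installed (pip install Pillow)

    with Image.open(src) as img:
        rgb = img.convert("RGB")
        box = center_square_box(*rgb.size)
        out = rgb.crop(box).resize((size, size), Image.LANCZOS)

    out_path = Path(dst).with_suffix(".png")  # force a .png extension
    out.save(out_path, format="PNG")
    return out_path
```

Calling `prepare_reference("photo.jpg", "reference.png")` would write a 512x512 `reference.png`; the file names here are placeholders.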
Yes, I tested it, and 768x768 produces horrific results.
@zcxu-eric Can we use other custom fine-tuned models, such as Realistic Vision 5.1? I have yet to test this.