Query Regarding Input Specifications for Animation Generation

Open yihong1120 opened this issue 2 years ago • 2 comments

Dear MagicAnimate Contributors,

I hope this message finds you well. I am reaching out to inquire about the specific input materials required for generating animations using the MagicAnimate tool. While the README.md provides a comprehensive guide, I would appreciate further clarification on the following points:

  1. What are the preferred dimensions and file formats for the input human images?
  2. Is there a recommended resolution or aspect ratio that ensures optimal animation results?
  3. Could you provide examples of the target description or reference that should accompany the input images for animation?
  4. Are there any limitations or considerations we should be aware of when selecting images for animation (e.g., background complexity, clothing type, pose)?
  5. Additionally, I am interested in understanding the process for training the MagicAnimate model with custom datasets. Could you provide insights or documentation on how to train the model from scratch or fine-tune it with specific data? This information would be invaluable for tailoring the animations to the unique requirements of my project.

I am keen to utilise MagicAnimate for an upcoming project and want to ensure that I prepare the input materials correctly to achieve the best possible outcomes.

Thank you for developing such an innovative tool and for your assistance with my query. I look forward to your guidance.

Best wishes, yihong1120

yihong1120 • Dec 07 '23 07:12

Hi, thanks for the interest. Below are my answers:

  1. There are no preferred dimensions, but we only experimented with 512x512. We use PNG as the image format.
  2. If you use our inference code directly, the recommended resolution is 512x512.
  3. We use null text.
  4. There are no limitations, but results may be better if the reference image shows the full body or is close to the target pose.
  5. We haven't prepared any documents for training or fine-tuning.

zcxu-eric • Dec 07 '23 08:12

Yes, I tested it, and 768x768 produces horrific results.

@zcxu-eric Can we use other custom fine-tuned models? I have yet to test this.

For example, Realistic Vision 5.1.

FurkanGozukara • Dec 07 '23 19:12
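
For anyone preparing inputs based on the answers in this thread (512x512, PNG), here is a minimal sketch of one way to get a reference image into that shape. It is not the maintainers' preprocessing code: the function names `center_square_box` and `prepare_reference` are hypothetical, the center crop is an assumption, and the second function assumes Pillow is installed.

```python
from typing import Tuple

# Resolution reported in this thread as the one the maintainers experimented with.
TARGET_SIZE = 512


def center_square_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square crop."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def prepare_reference(src_path: str, dst_path: str) -> None:
    """Center-crop to a square, resize to 512x512, and save as PNG.

    Hypothetical helper: requires Pillow, which is imported lazily so that
    center_square_box stays usable without it.
    """
    from PIL import Image

    with Image.open(src_path) as img:
        img = img.convert("RGB")
        img = img.crop(center_square_box(*img.size))
        img = img.resize((TARGET_SIZE, TARGET_SIZE))
        img.save(dst_path, format="PNG")
```

A call such as `prepare_reference("person.jpg", "person_512.png")` would write a 512x512 PNG (both file names are placeholders). Note that a center crop can cut off part of the subject in a tall portrait; since a full-body reference is reported above to help, padding to a square before resizing may be a better choice for such images.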