[Feature] HunyuanVideo-1.5
Feature Summary
Support for Tencent's new 8.3B-parameter video generation model
Detailed Description
In their own words:
HunyuanVideo-1.5 is a video generation model that delivers top-tier quality with only 8.3B parameters, significantly lowering the barrier to usage. It runs smoothly on consumer-grade GPUs, making it accessible for every developer and creator. This repository provides the implementation and tools needed to generate creative videos.
https://huggingface.co/tencent/HunyuanVideo-1.5
Additional context
This achievement is built upon several key components, including meticulous data curation, an advanced DiT architecture with selective and sliding tile attention (SSTA), enhanced bilingual understanding through glyph-aware text encoding, progressive pre-training and post-training, and an efficient video super-resolution network.
Also "Flex-Block-Attention": https://github.com/Tencent-Hunyuan/flex-block-attn
wip
"Requirements: Hopper (SM90) GPUs, or other architectures with SM90 PTX ISA support"
So it's not feasible for most people.
That requirement only applies to their specific implementation and has little to do with us.