[Feature Request] Implement batch inference for multiple prompts in single forward pass
Description
I would like to request the implementation of batch inference in LightX2V, allowing multiple prompts to be processed in a single forward pass to improve horizontal scaling and performance.
Motivation
Currently, LightX2V processes one prompt at a time through the `generate()` method of `LightX2VPipeline`. To process multiple prompts, the current approach requires running multiple servers in parallel, as shown in `post_multi_servers_tv2.py`.
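For reference, today's workaround fans prompts out to several single-prompt servers. A rough sketch of that pattern (the `post_task()` stub and endpoint URLs are illustrative placeholders, not the actual payload from `post_multi_servers_tv2.py`):

```python
# Sketch of the current multi-server workaround: each prompt goes to one
# single-prompt server; prompts are distributed round-robin across servers.
from concurrent.futures import ThreadPoolExecutor

SERVERS = ["http://localhost:8000", "http://localhost:8001"]

def post_task(server, prompt, seed):
    # Stub: a real client would POST the generation task to the server here.
    return {"server": server, "prompt": prompt, "seed": seed}

def run_on_servers(prompts, seeds):
    with ThreadPoolExecutor(max_workers=len(SERVERS)) as pool:
        futures = [
            pool.submit(post_task, SERVERS[i % len(SERVERS)], p, s)
            for i, (p, s) in enumerate(zip(prompts, seeds))
        ]
        # Results come back in submission order.
        return [f.result() for f in futures]
```

This scales horizontally but multiplies deployment cost, since every server holds its own copy of the model weights.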
Implementing batch inference would provide:
- Better performance: Reduced overhead by processing multiple prompts in a single forward pass
- Horizontal scalability: Easier processing of large volumes of requests
- Resource optimization: Better utilization of GPU memory and compute
Current Behavior
- `LightX2VPipeline.generate()` accepts individual parameters: `seed`, `prompt`, `negative_prompt`, etc.
- Models like `WanModel` and `HunyuanVideo15Model` process with `batch_size=1`
- The server handles tasks one at a time to manage GPU memory effectively
Proposed Solution
- Extend pipeline interface: modify `generate()` to accept lists of prompts
- Batch support in models: add a batch dimension in the `_infer_cond_uncond()` methods
- Memory management: adjust memory handling for larger batches
- Maintain compatibility: Preserve current API for individual use
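On the "batch dimension in models" point: per-prompt text embeddings typically have different token lengths, so before stacking them along a batch dimension they need padding to a common length plus an attention mask. A pure-Python sketch of that step (the function name and the toy `EMBED_DIM` are illustrative, not LightX2V internals):

```python
# Toy embedding width; real text encoders use much larger dimensions.
EMBED_DIM = 4

def pad_and_stack(embeddings, pad_value=0.0):
    """Pad per-prompt embeddings (each a seq_len_i x EMBED_DIM list of lists)
    to a common seq_len, stack them into a batch, and return an attention
    mask marking real tokens (1) vs padding (0)."""
    max_len = max(len(e) for e in embeddings)
    batch, mask = [], []
    for e in embeddings:
        pad_rows = [[pad_value] * EMBED_DIM for _ in range(max_len - len(e))]
        batch.append(e + pad_rows)
        mask.append([1] * len(e) + [0] * (max_len - len(e)))
    return batch, mask
```

In a real implementation the stacked batch would be a tensor, and the mask would be passed to the attention layers so padded tokens are ignored.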
Possible Implementations
```python
# Proposed API
pipe.generate_batch(
    prompts=["prompt1", "prompt2", "prompt3"],
    negative_prompts=["neg1", "neg2", "neg3"],
    seeds=[42, 43, 44],
    save_result_paths=["out1.mp4", "out2.mp4", "out3.mp4"],
)
```
Expected Impact
- Significant reduction in processing time for multiple videos
- Better GPU resource utilization
- Easier high-volume production deployments
Note: I'm including the batch inference implementation from diffusers for your reference. I hope it helps! :D
Wan2.2: https://github.com/huggingface/diffusers/blob/262ce19b/src/diffusers/pipelines/wan/pipeline_wan.py (The important sections are lines 191 to 194, lines 384 to 392, lines 510 to 554, and lines 592 to 610)
HunyuanVideo 1.5: https://github.com/huggingface/diffusers/blob/262ce19b/src/diffusers/pipelines/hunyuan_video1_5/pipeline_hunyuan_video1_5.py (The important sections are lines 397 to 407, lines 546 to 553, lines 664 to 669, lines 707 to 717, and lines 741 to 803)