elmuz

Results 19 comments of elmuz

I would also appreciate this.

Sure!

```python
@pipeline_def(num_threads=2, device_id=0)
def SpeechPipeline(sample_list_path: Union[str, Path], shuffle: bool = False):
    frames, video_id, frame_num = fn.readers.video(
        name="speech_reader",
        device="gpu",
        file_list=f"{sample_list_path}",
        sequence_length=5,
        step=1,
        random_shuffle=shuffle,
        initial_fill=128,
        file_list_frame_num=True,
        enable_frame_num=True,
        file_list_include_preceding_frame=True,
    )
    return frames,...
```

One thing that I noticed is that the only "boosting parameter" is the `sequence_length`. In fact, for example, pushing that value to 20, I get:

```
Total time: 11.23sec (89.08FPS)...
```

Hi @JanuszL, thanks for the message. Honestly, this performance wouldn't justify abandoning a frame-extraction preprocessing step in favor of hardware decoding on the fly: sure, there's...

I see, thanks. It seems the experimental decoder is more similar to the `torchvision.io.read_video` function (with the same memory limitation). Please keep me posted. I will do the same in...

Ideally both. I am working on synthetic lipsync and (depending on the models involved) during training I randomly sample one or a few consecutive frames, randomly chosen from different "talking" videos. This...

In the meantime, I wrote a wrapper of sorts for my PyTorch Lightning project. It basically implements the _sliding window_ approach by unfolding a linear tensor. So, if for example we...
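To illustrate the sliding-window idea mentioned above, here is a minimal pure-Python sketch. It is not the wrapper from the comment: the function name `sliding_windows` and its parameters are hypothetical, and it only enumerates frame indices. In PyTorch, `torch.arange(num_frames).unfold(0, sequence_length, step)` should produce the same windows as a tensor.

```python
from typing import List


def sliding_windows(num_frames: int, sequence_length: int, step: int = 1) -> List[List[int]]:
    """Return every window of `sequence_length` consecutive frame indices.

    Hypothetical helper for illustration only; windows start at
    0, step, 2*step, ... and never run past the last frame.
    """
    if sequence_length <= 0 or step <= 0:
        raise ValueError("sequence_length and step must be positive")
    last_start = num_frames - sequence_length
    # If the video is shorter than one window, the range is empty.
    return [
        list(range(start, start + sequence_length))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # A 7-frame clip split into overlapping windows of 3 frames.
    for window in sliding_windows(7, 3):
        print(window)
```

Each window overlaps the previous one by `sequence_length - step` frames, so with `step=1` every frame except those near the clip boundaries appears in `sequence_length` different windows.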

Yes, I agree. In fact, this is only for the `predict_dataloader()` (from a Lightning perspective).

I understand. It all makes sense. Thank you.

@klecki sorry to bother you again. I am using PyTorch Lightning in my system and I am writing the DALI block as a `DALIGenericIterator` to be used as a `DataLoader`. However...