Shakeel Ahmad issues

Results: 9 issues of Shakeel Ahmad

Kaldi-like Data Augmentation

How can we do Kaldi-like data augmentation in the API, on acoustic data only?
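For orientation, Kaldi-style recipes typically augment audio by mixing noise into the waveform at a chosen signal-to-noise ratio, alongside speed perturbation and reverberation. The following is a minimal sketch of the additive-noise step only, written in plain Python and not tied to any particular API. The function name `add_noise_at_snr` and the list-of-floats waveform representation are assumptions made for illustration.

```python
import math


def mean_power(samples):
    """Average power of a waveform given as a sequence of floats."""
    return sum(x * x for x in samples) / len(samples)


def add_noise_at_snr(signal, noise, snr_db):
    """Mix `noise` into `signal` so the result has the requested SNR in dB.

    Hypothetical helper: the noise is repeated (tiled) when it is shorter
    than the signal, then scaled so that
    10 * log10(signal_power / scaled_noise_power) == snr_db.
    """
    if not signal or not noise:
        raise ValueError("signal and noise must be non-empty")
    # Repeat the noise clip until it covers the whole signal.
    tiled = [noise[i % len(noise)] for i in range(len(signal))]
    noise_power = mean_power(tiled)
    if noise_power == 0.0:
        raise ValueError("noise is silent, cannot scale it to a target SNR")
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(signal, tiled)]
```

A lower `snr_db` gives a noisier mixture. Drawing the SNR at random from a small set of values per utterance is a common way to vary the augmentation.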

Feature Extraction using Pre-trained Conformer Model

Is it possible to use the pre-trained Conformer model for feature extraction on another speech dataset? Have you uploaded your pre-trained model, and is there a tutorial on how to extract...

Dice Loss is Negative for some steps

Hi, I am trying to follow the 3D segmentation tutorial. When I run the code on a local CPU machine, the tutorial works fine, but the moment I start training...
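The excerpt does not say what causes the negative values, so the following is only a sketch of one general way a soft Dice loss of the form `1 - Dice` can drop below zero: label values that are not binarized. It uses a generic formulation in plain Python, which is an assumption and not necessarily the loss used in the tutorial. The name `soft_dice_loss` and the sample values are illustrative.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """Generic soft Dice loss, 1 - Dice, over flat sequences of floats.

    Stays within [0, 1] when pred is in [0, 1] and target is in {0, 1}.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


# Predicted foreground probabilities for four voxels (made-up values).
pred = [0.9, 0.8, 0.1, 0.2]

# Binary labels keep the loss inside [0, 1].
loss_binary = soft_dice_loss(pred, [1, 1, 0, 0])

# Raw class indices (here 2 instead of 1) inflate the intersection,
# push Dice above 1, and make the loss negative.
loss_raw_labels = soft_dice_loss(pred, [2, 2, 0, 0])

print(loss_binary, loss_raw_labels)
```

If this is the situation, checking the range of the label values and of the network outputs just before the loss is a reasonable first step.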

Extraction of features with AV-HuBERT

The tutorial mentions feature extraction. Are these the learned representations of AV-HuBERT, or are the features just extracted from the input video file, which needs to be passed to the AV...

Cannot register duplicate model (av_hubert)

When I try to extract features after following the installation, I get the following error when I run the Python code --> feature extraction. Do I need to run...

Could you please help me with the label files? https://raw.githubusercontent.com/apple/ml-stuttering-events-dataset/main/SEP-28k_labels.csv What do the numbers 0, 1, 2, 3 mean here? It is confusing. Could you please clarify it a bit? For...

'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

When I try to read a TextGrid file using `grid = textgrids.TextGrid(annot)`, it shows the following error: _'utf-8' codec can't decode byte 0xff in position 0: invalid start byte_...
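A byte 0xff at position 0 is consistent with a UTF-16 little-endian byte order mark (FF FE), which Praat can write when a TextGrid contains non-ASCII text. A minimal sketch, assuming that is the cause here: detect the byte order mark with the standard library and save a UTF-8 copy before parsing. The helper names `detect_encoding` and `to_utf8` are hypothetical, and files without a byte order mark are assumed to be UTF-8.

```python
import codecs
from pathlib import Path


def detect_encoding(raw: bytes) -> str:
    """Guess a text encoding from a leading byte order mark (BOM)."""
    # UTF-32 is checked first because its little-endian BOM begins
    # with the same two bytes as the UTF-16 little-endian BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # this codec reads the BOM to pick the byte order
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return "utf-8"  # assumption: no BOM means plain UTF-8


def to_utf8(src: str, dst: str) -> str:
    """Write a UTF-8 copy of `src` to `dst`; return the detected encoding."""
    raw = Path(src).read_bytes()
    encoding = detect_encoding(raw)
    text = raw.decode(encoding)
    # Write bytes directly so line endings are left untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
    return encoding
```

The converted copy can then be passed to `textgrids.TextGrid(...)` in place of the original file. If the error persists, the file is probably in some other encoding and this sketch does not apply.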

How to extract embeddings from a specific layer

And how do I specify a particular layer for feature extraction in `feature, _ = model.extract_finetune(source={'video': frames, 'audio': None}, padding_mask=None, output_layer=None)`? What should be passed to `output_layer` to have specific...

Dataset Directory Structure

What is the format for input data? I mean, what directory tree structure should we use if we apply it to another speech problem? Say a simple binary...