Audio preview in dataset viewer for audio array data without a path/filename

Open Lauler opened this issue 1 year ago • 0 comments

Feature request

Hugging Face has quite a comprehensive set of guides for audio datasets. However, all of these guides seem to assume that the audio array data being decoded or inserted into a HF dataset always originates from individual files. The Audio class appears to be designed with this assumption in mind. Looking at its source code, it returns a dictionary with the keys path, array and sampling_rate.

However, users sometimes have pipelines where they decode the audio array themselves. This feature request asks for the guides to clarify whether it is possible, and if so how, to insert already decoded audio array data into datasets (pandas DataFrame, HF dataset or whatever) that are later saved as parquet, and still get a functioning audio preview in the dataset viewer.

Do I perhaps need to write a tempfile of my audio array slice to wav and capture the bytes object with io.BytesIO and pass that to Audio()?
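
A minimal sketch of that idea, using only the Python standard library, could look like the following. It skips the temp file and writes the WAV container straight into an io.BytesIO buffer. The helper names (encode_wav_bytes, make_audio_row) are hypothetical, and the {"bytes", "path"} cell layout is an assumption based on how the datasets Audio feature appears to store encoded audio. Whether the dataset viewer picks this up from a plain parquet file is exactly the open question here.

```python
import io
import math
import struct
import wave


def encode_wav_bytes(samples, sampling_rate):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes, in memory."""
    # Clip to the valid range and scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)             # mono
        wav_file.setsampwidth(2)             # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    # wave does not close a file object it did not open, so the buffer is still readable.
    return buffer.getvalue()


def make_audio_row(samples, sampling_rate):
    """Build one row whose audio cell holds encoded bytes and no file path.

    Assumption: the {"bytes", "path"} layout mirrors what the datasets Audio
    feature stores; it is not confirmed as what the dataset viewer requires.
    """
    return {"audio": {"bytes": encode_wav_bytes(samples, sampling_rate), "path": None}}


# Example: a 0.1 second, 440 Hz tone standing in for a decoded slice of a larger file.
sampling_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sampling_rate) for n in range(1600)]
row = make_audio_row(tone, sampling_rate)
```

Rows built this way could then be collected into a pandas DataFrame or a HF dataset and written to parquet. The sampling rate travels inside the WAV header, so no separate sampling_rate key is stored in the cell.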

Motivation

I'm working with large audio datasets, and my pipeline reads (decodes) audio from larger files, and slices the relevant portions of audio from that larger file based on metadata I have available.

The pipeline is designed this way to avoid having to store multiple copies of data, and to avoid having to store tens of millions of small files.

I tried test-uploading parquet files where the audio array data of the decoded slices is stored in an audio column as a dictionary with the keys path, array and sampling_rate. But I don't know the secret sauce of what the Hugging Face Hub expects and requires in order to display audio previews correctly.

Your contribution

I could contribute a tool-agnostic guide to creating HF audio datasets directly as parquet to the HF documentation if there is interest, provided you help me figure out the secret sauce of what the dataset viewer expects in order to display the preview correctly.

Lauler • Oct 02 '24 16:10