ffmpeg-python: How to use ffmpeg-python with a continuous stream of audio chunks?

I receive Opus/WebM audio chunks from a web browser and need to convert them to PCM16. I tried using ffmpeg-python on the continuous stream of chunks. The code converts the first chunk, which contains the EBML header, but throws an error when the subsequent headerless chunks arrive, such as: EBML header not found, Invalid data.

Is there any other way to handle the headerless chunks arriving over a websocket with ffmpeg-python, or is there an alternative to ffmpeg-python that handles a continuous stream of audio chunks?

import subprocess

class OpusToPCMConverter:
    def __init__(self):
        # Start an ffmpeg process that keeps running, converting chunks of Opus WebM to PCM
        self.ffmpeg_process = subprocess.Popen(
            [
                "ffmpeg",
                "-f", "webm",  # Input format
                "-i", "pipe:0",  # Input from stdin (we will feed chunks here)
                "-f", "s16le",  # Output format (PCM 16-bit little endian)
                "-acodec", "pcm_s16le",  # PCM codec
                "-ar", "16000",  # Audio sample rate (16kHz as an example)
                "-ac", "1",  # Mono channel (you can adjust if needed)
                "pipe:1"  # Output to stdout (we will read the PCM output from here)
            ],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            bufsize=10**6  # Buffer size to handle streamed data efficiently
        )

    def convert_chunk(self, opus_chunk):
        """
        Convert a chunk of WebM/Opus data to PCM.

        Parameters:
        opus_chunk (bytes): The chunk of WebM/Opus data.

        Returns:
        bytes: The converted PCM data, or None if an error occurred.
        """
        
        # Feed the chunk into the ffmpeg process
        self.ffmpeg_process.stdin.write(opus_chunk)
        self.ffmpeg_process.stdin.flush()

        # Read the output PCM data.
        # Note: read(4096) blocks until 4096 bytes are available or ffmpeg closes stdout.
        pcm_data = self.ffmpeg_process.stdout.read(4096)  # Read in chunks of PCM

        return pcm_data
    

    def close(self):
        # Close the ffmpeg process properly
        self.ffmpeg_process.stdin.close()
        self.ffmpeg_process.stdout.close()
        self.ffmpeg_process.stderr.close()
        self.ffmpeg_process.terminate()

Sep 25 '24 11:09 ThiruRJST

Hey, did you find a solution for this? I have the same question.

Sep 23 '25 20:09 vyashh

I also have the same question. I think the solution is to strip the header from the first chunk and prepend it to each subsequent chunk.

I'm going to try it.

This article has some great specifics https://darkcoding.net/software/reading-mediarecorders-webm-opus-output/
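
A rough sketch of that idea, not tested against real MediaRecorder output. It assumes everything before the first Cluster element in the first chunk counts as "the header", and that every later chunk happens to start on a Cluster boundary, which the browser does not guarantee. `split_header` and `with_header` are names made up for this example; the two byte strings are the standard EBML and Cluster element IDs.

```python
# Matroska/WebM element IDs (fixed values from the container format).
EBML_MAGIC = b"\x1a\x45\xdf\xa3"   # start of the EBML header
CLUSTER_ID = b"\x1f\x43\xb6\x75"   # start of each Cluster (audio data)


def split_header(first_chunk):
    """Split the first chunk into (header, rest) at the first Cluster ID.

    Uses a plain byte search, so a false match inside the header bytes
    is possible in principle; a real parser would walk the EBML elements.
    """
    if not first_chunk.startswith(EBML_MAGIC):
        raise ValueError("first chunk does not start with an EBML header")
    pos = first_chunk.find(CLUSTER_ID)
    if pos == -1:
        # No Cluster seen yet: treat the whole chunk as header.
        return first_chunk, b""
    return first_chunk[:pos], first_chunk[pos:]


def with_header(header, chunk):
    """Prepend the saved header so the chunk looks like a standalone file."""
    return header + chunk
```

Each later chunk would then be passed through `with_header(header, chunk)` before decoding it on its own.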

Sep 29 '25 14:09 lukedupin

I tried a bunch of different methods. Messing with the headers is extremely error prone. However, ffmpeg can decode a stream fed continuously through its stdin. Here is some code that takes WebM chunks and outputs PCM16 in real time.

import queue
import subprocess
import threading

class AudioStreamer:
    def __init__(self, sample_rate=16000, channels=1):
        self.pcm_queue = queue.Queue()

        self.ffmpeg = subprocess.Popen(
            [
                'ffmpeg',
                '-f', 'webm',
                '-i', 'pipe:0',
                '-f', 's16le',
                '-ar', str(sample_rate),
                '-ac', str(channels),
                '-loglevel', 'warning',  # Reduce stderr noise
                'pipe:1'
            ],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            bufsize=8096
        )

        # Start thread to read PCM output
        self.reader_thread = threading.Thread(target=self._read_output)
        self.reader_thread.daemon = True
        self.reader_thread.start()

    def _read_output(self):
        """Continuously read PCM data from FFmpeg stdout"""
        while True:
            chunk = self.ffmpeg.stdout.read(4096)
            if not chunk:
                break
            self.pcm_queue.put(chunk)

    def write_chunk(self, webm_chunk):
        """Write WebM chunk to FFmpeg"""
        self.ffmpeg.stdin.write(webm_chunk)
        self.ffmpeg.stdin.flush()

    def get_pcm(self, block=True, timeout=None):
        """Get PCM data; returns None if nothing is available"""
        try:
            return self.pcm_queue.get(block=block, timeout=timeout)
        except queue.Empty:
            return None

    def close(self):
        """Close the stream"""
        self.ffmpeg.stdin.close()
        self.reader_thread.join()
        self.ffmpeg.wait()

Usage

    streamer = AudioStreamer(
        sample_rate=16000,
        channels=1
    )

    # `data` is one WebM chunk received from the websocket
    streamer.write_chunk(data)
    while (out := streamer.get_pcm(block=False)) is not None:
        ...  # consume the PCM bytes in `out` here
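
Since ffmpeg may not have produced any output by the time a non-blocking read runs, the PCM for one WebM chunk can show up on a later call. A small helper along these lines can collect whatever is ready at the moment. `drain_pcm` is a hypothetical name, and it takes the queue directly (for example `streamer.pcm_queue`) so it can be tried without ffmpeg running.

```python
import queue


def drain_pcm(pcm_queue):
    """Return all PCM bytes currently queued, without blocking."""
    parts = []
    while True:
        try:
            parts.append(pcm_queue.get_nowait())
        except queue.Empty:
            # Nothing more is available right now.
            break
    return b"".join(parts)
```

Calling it once more after `close()` picks up anything the reader thread queued before ffmpeg exited.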

Sep 30 '25 08:09 lukedupin