picamera2 icon indicating copy to clipboard operation
picamera2 copied to clipboard

[BUG] Unresponsive picam2.stop_encoder() when using with threads

Open kullachartp opened this issue 2 years ago • 11 comments

Please only report one bug per issue! A live stream program unresponsive at picam2.stop_encoder() (also, picam2.stop_recording())

Describe the bug I wrote a python program to manage pi camera that it started recording only when there is any client request, and when there is no client left, the pi camera will stop working to reduce CPU usage and temperature.

Unfortunately, after I upgraded from picam2 v.0.3.12 to 0.3.14, the program stop working at picam2.stop_recording(). I tried separated the command into two part and discovered that the problem is with the line picam2.stop_encoder()

To Reproduce This is the part of program that manage start/stop recording pi camera.

def generate():
    try:
        if output is None:
            output = StreamingOutput()

        with output.condition:
            active_sessions += 1
            if active_sessions == 1:
                picam2.start_recording(MJPEGEncoder(), FileOutput(output))

        while True:
            try:
                with output.condition:
                    output.condition.wait(1)
                    if output.frame is not None:
                        frame = output.frame
                    else:
                        continue
                yield (b'--FRAME\r\n'
                       b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')
            except KeyboardInterrupt:
                break

    except Exception as e:
        syslog.syslog(syslog.LOG_WARNING, f'Removed streaming client: {str(e)}')

    finally:
        with output.condition:
            output.in_use = False
            active_sessions -= 1

            if active_sessions == 0:
                try:
                    output = None
                    picam2.stop()
                    picam2.stop_encoder()
                except Exception as e:
                    print(e)

Expected behaviour picam2.stop_encoder() stop encoding properly and run the next command

Console Output, Screenshots N/A

Hardware : Raspberry pi Zero with Pi Camera V.2

Additional context My Raspberry Pi runs with Raspberry Pi OS Bookworm

kullachartp avatar Nov 10 '23 12:11 kullachartp

I wonder if trying to stop the encoders while holding the output.condition is causing a deadlock. Because they might still be trying to flush out the last frames, and won't be able to get that lock. It certainly feels like something to avoid. Maybe try moving picam2.stop() and picam2.stop_encoder() outside with with output.condition.

davidplowman avatar Nov 10 '23 14:11 davidplowman

I wonder if trying to stop the encoders while holding the output.condition is causing a deadlock. Because they might still be trying to flush out the last frames, and won't be able to get that lock. It certainly feels like something to avoid. Maybe try moving picam2.stop() and picam2.stop_encoder() outside with with output.condition.

I tried revising the code by your suggestion and the result seems better. Now for one client session, the camera stream can start and stop camera properly. But for two or more clients access the streaming concurrently, After all sessions were closed, a new client cannot access anymore. Try debugging and see that the program stop at picam2.stop_encoder().

Below is my code revision, also added the StreamingOutput class. I am not good at threading (and threading.condition) usage. I wondered if it is my improper programming, or the library. However, with picamera2 0.3.12, the code runs with multiple client sessions properly.

class StreamingOutput(io.BufferedIOBase):
    def __init__(self):
        self.frame = None
        self.condition = threading.Condition()

    def write(self, buf):
        with self.condition:
            self.frame = buf
            self.condition.notify_all()

def generate():
    global picam2, output, active_sessions

    try:
        if output is None:
            output = StreamingOutput()

        with output.condition:
            active_sessions += 1
            if active_sessions == 1:
                picam2.start_recording(MJPEGEncoder(), FileOutput(output))
        print(active_sessions)
        while True:
            try:
                with output.condition:
                    output.condition.wait()
                    if output.frame is not None:
                        frame = output.frame
                    else:
                        continue
                yield (b'--FRAME\r\n'
                       b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')
            except KeyboardInterrupt:
                break

    except Exception as e:
        syslog.syslog(syslog.LOG_WARNING, f'Removed streaming client: {str(e)}')

    finally:
        active_sessions -= 1
        if active_sessions == 0:
            output = None
            picam2.stop()
            picam2.stop_encoder()

kullachartp avatar Nov 11 '23 03:11 kullachartp

I think this issue relates to MJPEGEncoder's parent Class. v4l2_encoder.py. that the latest commit address the issue of preventing dropping final frames.

kullachartp avatar Nov 11 '23 11:11 kullachartp

As you say, the thing that's changed is that when you stop an encoder, it flushes any pending output (this was reported as a bug previously because it had received and encoded frames which were then lost).

It's certainly true that when when you take a lock inside an API using some kind of callback (as you are in StreamingOutput.write), then you probably shouldn't hold the same lock when you call into the API. It's quite common for libraries to behave like this, and that's what was happening previously.

I can only debug really code the I'm sent if it's completely self-contained (I can just copy/paste and it works), preferably avoids using other packages that I don't have or don't understand, and are absolutely as short as you can make them. On this occasion, I've tried to turn your example into something I can run:

import io
import time
from threading import Condition, Event, Lock, Thread

from picamera2 import Picamera2
from picamera2.encoders import MJPEGEncoder
from picamera2.outputs import FileOutput

class StreamingOutput(io.BufferedIOBase):
    def __init__(self):
        self.frame = None
        self.condition = Condition()

    def write(self, buf):
        with self.condition:
            self.frame = buf
            self.condition.notify_all()

output = StreamingOutput()
picam2 = Picamera2()
lock = Lock()
active_sessions = 0

def generate(abort):
    global output, picam2, active_sessions

    with lock:
        active_sessions += 1
        print("active_sessions", active_sessions)
        if active_sessions == 1:
            print("Starting")
            picam2.start_recording(MJPEGEncoder(), FileOutput(output))

    while True:
        with output.condition:
            output.condition.wait(1)
            if output.frame:
                yield()

        if abort.is_set():
            break

    with lock:
        active_sessions -= 1
        print("active_sessions", active_sessions)
        if active_sessions == 0:
            print("Stopping")
            picam2.stop()
            picam2.stop_encoder()

def thread_func(abort):
    for _ in generate(abort):
        pass

aborts = [Event() for _ in range(5)]
threads = [Thread(target=thread_func, args=(abort, )) for abort in aborts]

for thread in threads:
    time.sleep(1)
    thread.start()

for i, abort in enumerate(aborts):
    time.sleep(1)
    abort.set()
    threads[i].join()

If your application is still seeing deadlocks, maybe put a print statement just before each attempt to grab each lock, another when you acquire the lock, and a final print every time you release it.

davidplowman avatar Nov 13 '23 10:11 davidplowman

Thank you for your advice. Your code help me a lot. It solved the problems of RuntimeError and BrokenPipeError that I am researching for solution. I will learn more on Python thread programming :)

I tried running your code without modification. When I run the program using Picamera2 v0.3.12, it runs without errors and exits completely. However, when I use Picamera2 v0.3.14, the program stops at 'Stopping' and waits indefinitely.

$ python3 test.py
[0:21:53.035673678] [2559]  INFO Camera camera_manager.cpp:284 libcamera v0.1.0+99-4a23664b
[0:21:53.263820789] [2563]  WARN RPiSdn sdn.cpp:39 Using legacy SDN tuning - please consider moving SDN inside rpi.denoise
[0:21:53.281247645] [2563]  WARN RPI vc4.cpp:390 Mismatch between Unicam and CamHelper for embedded data usage!
[0:21:53.287279595] [2563]  INFO RPI vc4.cpp:444 Registered camera /base/soc/i2c0mux/i2c@1/imx219@10 to Unicam device /dev/media3 and ISP device /dev/media0
[0:21:53.289452577] [2563]  INFO RPI pipeline_base.cpp:1142 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
[0:21:53.307860424] [2559]  INFO Camera camera_manager.cpp:284 libcamera v0.1.0+99-4a23664b
[0:21:53.512487730] [2566]  WARN RPiSdn sdn.cpp:39 Using legacy SDN tuning - please consider moving SDN inside rpi.denoise
[0:21:53.529040593] [2566]  WARN RPI vc4.cpp:390 Mismatch between Unicam and CamHelper for embedded data usage!
[0:21:53.536173534] [2566]  INFO RPI vc4.cpp:444 Registered camera /base/soc/i2c0mux/i2c@1/imx219@10 to Unicam device /dev/media3 and ISP device /dev/media0
[0:21:53.538350516] [2566]  INFO RPI pipeline_base.cpp:1142 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
active_sessions 1
Starting
[0:21:54.625777525] [2570]  INFO Camera camera.cpp:1181 configuring streams: (0) 1280x720-XBGR8888 (1) 1920x1080-SBGGR10_CSI2P
[0:21:54.630337488] [2566]  INFO RPI vc4.cpp:608 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 1920x1080-SBGGR10_1X10 - Selected unicam format: 1920x1080-pBAA
active_sessions 2
active_sessions 3
active_sessions 4
active_sessions 5
active_sessions 4
active_sessions 3
active_sessions 2
active_sessions 1
active_sessions 0
Stopping

kullachartp avatar Nov 13 '23 17:11 kullachartp

Thanks for the update. I'm quite confused because I'm running this on a Pi 4 with the latest software (0.3.14) and an imx219, and for me it's behaving correctly. Can you put in some more print statements just to confirm where it is locking up? Maybe after picam2.stop(), after picam2.stop_encoder() and after threads[i].join().

If you have a spare SD card somewhere, are you able to try out a fresh image as I have done? Thanks.

davidplowman avatar Nov 13 '23 17:11 davidplowman

I modified the code by your suggestion:

    with lock:
        active_sessions -= 1
        print("active_sessions", active_sessions)
        if active_sessions == 0:
            print("Stopping")
            picam2.stop()
            print("After stop")
            picam2.stop_encoder()
            print("After stop_encoder")

def thread_func(abort):
    for _ in generate(abort):
        pass

aborts = [Event() for _ in range(5)]
threads = [Thread(target=thread_func, args=(abort, )) for abort in aborts]

for thread in threads:
    time.sleep(1)
    thread.start()

for i, abort in enumerate(aborts):
    time.sleep(1)
    abort.set()
    threads[i].join()

print("After thread.join")

with picamera2 v0.3.14 (installed via apt), the result are:

active_sessions 2
active_sessions 3
active_sessions 4
active_sessions 5
active_sessions 4
active_sessions 3
active_sessions 2
active_sessions 1
active_sessions 0
Stopping
After stop

and definitely, with picamera v0.3.12 (installed via pip [having already uninstalled v0.3.14 via apt]):

active_sessions 2
active_sessions 3
active_sessions 4
active_sessions 5
active_sessions 4
active_sessions 3
active_sessions 2
active_sessions 1
active_sessions 0
Stopping
After stop
After stop_encoder
After thread.join
$

Certainly, I will help test this issue!

kullachartp avatar Nov 13 '23 21:11 kullachartp

Thanks for doing that. I've been trying hard to reproduce this, without any luck at all until I got hold of a Pi Zero. On this platform it happens quite a lot, though at times it seems to go away, so it's all a bit random.

I don't really see anything wrong with the v4l2 encoder, though it does look as though the last frame queued into the v4l2 device just doesn't come out. I guess it's the slowness of the Pi Zero that somehow makes this show up.

So I'm a bit suspicious of the kernel driver, though I wouldn't really want to point the finger without investigating further. Nonetheless it should be easy enough to put in a workaround up in the v4l2 encoder along the lines of "if we've been told to stop, and have outstanding frames but the v4l2 device keeps timing out, then give up and quit anyway". I'll look into that.

davidplowman avatar Nov 14 '23 16:11 davidplowman

Today, I just upgraded the raspi-firmware to version (1:1.20231024+ds-1+rpt1, 1:1.20231024+ds-1+rpt2). After the upgrade, I reinstalled picamera2 v0.3.14 and conducted a test. Fortunately, the problem has been resolved. Your analysis appears to be accurate based on this evidence.

I will continue to run tests to see if any incidents occur.

kullachartp avatar Nov 15 '23 02:11 kullachartp

Hmm, I not entirely convinced. I installed the latest image and firmware yesterday and found that the problem comes and goes a bit. My inclination is to include the workaround as there's not much to it, and it will have no effect when things work properly. See here: https://github.com/raspberrypi/picamera2/pull/864.

davidplowman avatar Nov 15 '23 10:11 davidplowman

I saw similar behavior in my photobooth code. It was a bit random, more likely on a rpi 3b+ to happen, less likely on the pi 4. unfortunately I could not reduce it to a code to confirm, so I hope this workaround proposed will solve my case also 👍

mgineer85 avatar Nov 15 '23 10:11 mgineer85