[BUG] Unresponsive picam2.stop_encoder() when using with threads
Please only report one bug per issue! A live stream program unresponsive at picam2.stop_encoder() (also, picam2.stop_recording())
Describe the bug I wrote a python program to manage pi camera that it started recording only when there is any client request, and when there is no client left, the pi camera will stop working to reduce CPU usage and temperature.
Unfortunately, after I upgraded from picam2 v.0.3.12 to 0.3.14, the program stop working at picam2.stop_recording(). I tried separated the command into two part and discovered that the problem is with the line picam2.stop_encoder()
To Reproduce This is the part of program that manage start/stop recording pi camera.
def generate():
try:
if output is None:
output = StreamingOutput()
with output.condition:
active_sessions += 1
if active_sessions == 1:
picam2.start_recording(MJPEGEncoder(), FileOutput(output))
while True:
try:
with output.condition:
output.condition.wait(1)
if output.frame is not None:
frame = output.frame
else:
continue
yield (b'--FRAME\r\n'
b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')
except KeyboardInterrupt:
break
except Exception as e:
syslog.syslog(syslog.LOG_WARNING, f'Removed streaming client: {str(e)}')
finally:
with output.condition:
output.in_use = False
active_sessions -= 1
if active_sessions == 0:
try:
output = None
picam2.stop()
picam2.stop_encoder()
except Exception as e:
print(e)
Expected behaviour picam2.stop_encoder() stop encoding properly and run the next command
Console Output, Screenshots N/A
Hardware : Raspberry pi Zero with Pi Camera V.2
Additional context My Raspberry Pi runs with Raspberry Pi OS Bookworm
I wonder if trying to stop the encoders while holding the output.condition is causing a deadlock. Because they might still be trying to flush out the last frames, and won't be able to get that lock. It certainly feels like something to avoid. Maybe try moving picam2.stop() and picam2.stop_encoder() outside with with output.condition.
I wonder if trying to stop the encoders while holding the
output.conditionis causing a deadlock. Because they might still be trying to flush out the last frames, and won't be able to get that lock. It certainly feels like something to avoid. Maybe try movingpicam2.stop()andpicam2.stop_encoder()outside withwith output.condition.
I tried revising the code by your suggestion and the result seems better. Now for one client session, the camera stream can start and stop camera properly. But for two or more clients access the streaming concurrently, After all sessions were closed, a new client cannot access anymore. Try debugging and see that the program stop at picam2.stop_encoder().
Below is my code revision, also added the StreamingOutput class. I am not good at threading (and threading.condition) usage. I wondered if it is my improper programming, or the library. However, with picamera2 0.3.12, the code runs with multiple client sessions properly.
class StreamingOutput(io.BufferedIOBase):
def __init__(self):
self.frame = None
self.condition = threading.Condition()
def write(self, buf):
with self.condition:
self.frame = buf
self.condition.notify_all()
def generate():
global picam2, output, active_sessions
try:
if output is None:
output = StreamingOutput()
with output.condition:
active_sessions += 1
if active_sessions == 1:
picam2.start_recording(MJPEGEncoder(), FileOutput(output))
print(active_sessions)
while True:
try:
with output.condition:
output.condition.wait()
if output.frame is not None:
frame = output.frame
else:
continue
yield (b'--FRAME\r\n'
b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')
except KeyboardInterrupt:
break
except Exception as e:
syslog.syslog(syslog.LOG_WARNING, f'Removed streaming client: {str(e)}')
finally:
active_sessions -= 1
if active_sessions == 0:
output = None
picam2.stop()
picam2.stop_encoder()
I think this issue relates to MJPEGEncoder's parent Class. v4l2_encoder.py. that the latest commit address the issue of preventing dropping final frames.
As you say, the thing that's changed is that when you stop an encoder, it flushes any pending output (this was reported as a bug previously because it had received and encoded frames which were then lost).
It's certainly true that when when you take a lock inside an API using some kind of callback (as you are in StreamingOutput.write), then you probably shouldn't hold the same lock when you call into the API. It's quite common for libraries to behave like this, and that's what was happening previously.
I can only debug really code the I'm sent if it's completely self-contained (I can just copy/paste and it works), preferably avoids using other packages that I don't have or don't understand, and are absolutely as short as you can make them. On this occasion, I've tried to turn your example into something I can run:
import io
import time
from threading import Condition, Event, Lock, Thread
from picamera2 import Picamera2
from picamera2.encoders import MJPEGEncoder
from picamera2.outputs import FileOutput
class StreamingOutput(io.BufferedIOBase):
def __init__(self):
self.frame = None
self.condition = Condition()
def write(self, buf):
with self.condition:
self.frame = buf
self.condition.notify_all()
output = StreamingOutput()
picam2 = Picamera2()
lock = Lock()
active_sessions = 0
def generate(abort):
global output, picam2, active_sessions
with lock:
active_sessions += 1
print("active_sessions", active_sessions)
if active_sessions == 1:
print("Starting")
picam2.start_recording(MJPEGEncoder(), FileOutput(output))
while True:
with output.condition:
output.condition.wait(1)
if output.frame:
yield()
if abort.is_set():
break
with lock:
active_sessions -= 1
print("active_sessions", active_sessions)
if active_sessions == 0:
print("Stopping")
picam2.stop()
picam2.stop_encoder()
def thread_func(abort):
for _ in generate(abort):
pass
aborts = [Event() for _ in range(5)]
threads = [Thread(target=thread_func, args=(abort, )) for abort in aborts]
for thread in threads:
time.sleep(1)
thread.start()
for i, abort in enumerate(aborts):
time.sleep(1)
abort.set()
threads[i].join()
If your application is still seeing deadlocks, maybe put a print statement just before each attempt to grab each lock, another when you acquire the lock, and a final print every time you release it.
Thank you for your advice. Your code help me a lot. It solved the problems of RuntimeError and BrokenPipeError that I am researching for solution. I will learn more on Python thread programming :)
I tried running your code without modification. When I run the program using Picamera2 v0.3.12, it runs without errors and exits completely. However, when I use Picamera2 v0.3.14, the program stops at 'Stopping' and waits indefinitely.
$ python3 test.py
[0:21:53.035673678] [2559] INFO Camera camera_manager.cpp:284 libcamera v0.1.0+99-4a23664b
[0:21:53.263820789] [2563] WARN RPiSdn sdn.cpp:39 Using legacy SDN tuning - please consider moving SDN inside rpi.denoise
[0:21:53.281247645] [2563] WARN RPI vc4.cpp:390 Mismatch between Unicam and CamHelper for embedded data usage!
[0:21:53.287279595] [2563] INFO RPI vc4.cpp:444 Registered camera /base/soc/i2c0mux/i2c@1/imx219@10 to Unicam device /dev/media3 and ISP device /dev/media0
[0:21:53.289452577] [2563] INFO RPI pipeline_base.cpp:1142 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
[0:21:53.307860424] [2559] INFO Camera camera_manager.cpp:284 libcamera v0.1.0+99-4a23664b
[0:21:53.512487730] [2566] WARN RPiSdn sdn.cpp:39 Using legacy SDN tuning - please consider moving SDN inside rpi.denoise
[0:21:53.529040593] [2566] WARN RPI vc4.cpp:390 Mismatch between Unicam and CamHelper for embedded data usage!
[0:21:53.536173534] [2566] INFO RPI vc4.cpp:444 Registered camera /base/soc/i2c0mux/i2c@1/imx219@10 to Unicam device /dev/media3 and ISP device /dev/media0
[0:21:53.538350516] [2566] INFO RPI pipeline_base.cpp:1142 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
active_sessions 1
Starting
[0:21:54.625777525] [2570] INFO Camera camera.cpp:1181 configuring streams: (0) 1280x720-XBGR8888 (1) 1920x1080-SBGGR10_CSI2P
[0:21:54.630337488] [2566] INFO RPI vc4.cpp:608 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 1920x1080-SBGGR10_1X10 - Selected unicam format: 1920x1080-pBAA
active_sessions 2
active_sessions 3
active_sessions 4
active_sessions 5
active_sessions 4
active_sessions 3
active_sessions 2
active_sessions 1
active_sessions 0
Stopping
Thanks for the update. I'm quite confused because I'm running this on a Pi 4 with the latest software (0.3.14) and an imx219, and for me it's behaving correctly. Can you put in some more print statements just to confirm where it is locking up? Maybe after picam2.stop(), after picam2.stop_encoder() and after threads[i].join().
If you have a spare SD card somewhere, are you able to try out a fresh image as I have done? Thanks.
I modified the code by your suggestion:
with lock:
active_sessions -= 1
print("active_sessions", active_sessions)
if active_sessions == 0:
print("Stopping")
picam2.stop()
print("After stop")
picam2.stop_encoder()
print("After stop_encoder")
def thread_func(abort):
for _ in generate(abort):
pass
aborts = [Event() for _ in range(5)]
threads = [Thread(target=thread_func, args=(abort, )) for abort in aborts]
for thread in threads:
time.sleep(1)
thread.start()
for i, abort in enumerate(aborts):
time.sleep(1)
abort.set()
threads[i].join()
print("After thread.join")
with picamera2 v0.3.14 (installed via apt), the result are:
active_sessions 2
active_sessions 3
active_sessions 4
active_sessions 5
active_sessions 4
active_sessions 3
active_sessions 2
active_sessions 1
active_sessions 0
Stopping
After stop
and definitely, with picamera v0.3.12 (installed via pip [having already uninstalled v0.3.14 via apt]):
active_sessions 2
active_sessions 3
active_sessions 4
active_sessions 5
active_sessions 4
active_sessions 3
active_sessions 2
active_sessions 1
active_sessions 0
Stopping
After stop
After stop_encoder
After thread.join
$
Certainly, I will help test this issue!
Thanks for doing that. I've been trying hard to reproduce this, without any luck at all until I got hold of a Pi Zero. On this platform it happens quite a lot, though at times it seems to go away, so it's all a bit random.
I don't really see anything wrong with the v4l2 encoder, though it does look as though the last frame queued into the v4l2 device just doesn't come out. I guess it's the slowness of the Pi Zero that somehow makes this show up.
So I'm a bit suspicious of the kernel driver, though I wouldn't really want to point the finger without investigating further. Nonetheless it should be easy enough to put in a workaround up in the v4l2 encoder along the lines of "if we've been told to stop, and have outstanding frames but the v4l2 device keeps timing out, then give up and quit anyway". I'll look into that.
Today, I just upgraded the raspi-firmware to version (1:1.20231024+ds-1+rpt1, 1:1.20231024+ds-1+rpt2). After the upgrade, I reinstalled picamera2 v0.3.14 and conducted a test. Fortunately, the problem has been resolved. Your analysis appears to be accurate based on this evidence.
I will continue to run tests to see if any incidents occur.
Hmm, I not entirely convinced. I installed the latest image and firmware yesterday and found that the problem comes and goes a bit. My inclination is to include the workaround as there's not much to it, and it will have no effect when things work properly. See here: https://github.com/raspberrypi/picamera2/pull/864.
I saw similar behavior in my photobooth code. It was a bit random, more likely on a rpi 3b+ to happen, less likely on the pi 4. unfortunately I could not reduce it to a code to confirm, so I hope this workaround proposed will solve my case also 👍