mmal_vc_port_disable() stuck in mmal_vc_sendwait_message() #1428
Comments
Can you try adding |
I have been trying for more than a week, but haven't succeeded to reproduce it with |
I haven't succeeded reproducing this with
|
That last assert in ve_worker_close is that it's asked part of the codec to disable, and it hasn't done so within a second, so something would appear to have gone fairly badly wrong. Can I confirm that this is H264 encoding (not MJPEG)? How "busy" a scene are you encoding? Sorry I'm busy on other things at present, but could you try dropping the bitrate lower still and see if that increases the reproduction rate? |
Yes, this is H264 encoding. I just leave the camera on my desk (in the office), which is not busy.
Sure, I'll try that and update later. |
I shortened the sleep time between raspivid invocations from 3s to 1s, and tried to make a "busy" scene by shaking a paper card in front of the camera. Each test ran for 5 minutes.
|
Thanks. |
I seem to be seeing the same problem on the RPi4. It looks like @janus926 has already got a lot further, but here are my details anyway. Let me know if I can provide more useful info.
    for _ in camera.record_sequence(filename_seq, sps_timing=True, format="h264"):
        camera.wait_recording(60)
Today the video reader process on one of our devices froze. Nothing obviously unique about the device we saw it on; this is the first time I have seen this problem in ~10k hours of the same code running across a small fleet of identical devices. There was a partly written video file still open with no further bytes being appended. Restarting the process didn't help but rebooting the device fixed it. I'm not familiar with how to debug this kind of problem but I ran the following, which led me via Google to this ticket.
|
Seen again today, I think. I had the same symptom of the Python process getting stuck, running the same code and hardware type as yesterday but on a different physical device. The stack trace is slightly different. It does involve Please let me know if this turns out to be irrelevant, and I will file a separate ticket.
NB there were another 57 (!) stack frames but I've left them out since they were all in Python. |
Hi all - this problem is holding back our machine vision project. @6by9 I wonder if you've had the chance to look into it? I could pay for your/someone's time to bring it up the priority list if you think you can solve it! |
@Omccormack-blip have you confirmed that it is exactly the same assert firing in your case? There's the ugly way of solving this (just shut down always), and the clean way (find the right flags and set them appropriately when RC asks for a skip). I'll see what I can come up with. |
I have tried putting |
Any update on this? We have gone to some lengths to write a watchdog to detect this problem and reboot on our fleet of ~100 devices. This seems to work except just now and then the problem appears to spill over into stuck kernel workers and other general carnage so even our watchdog can't reboot the device. |
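The watchdog mentioned above is not shown in this thread. As a rough illustration only, the symptom reported earlier (a partly written video file that stops growing) could be detected along these lines. The class name `StallDetector`, the function `watch`, and the timeout values are all hypothetical, and the sketch assumes a single output file rather than a rotating sequence.

```python
import os
import time


class StallDetector:
    """Reports a stall when an observed file size has not changed for timeout_s seconds.

    Hypothetical helper for illustration; not part of picamera or MMAL.
    """

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_size = None
        self.last_change = None

    def update(self, size, now):
        """Record a size sample taken at time `now`; return True if stalled."""
        if self.last_size is None or size != self.last_size:
            # First sample, or the file changed: reset the stall timer.
            self.last_size = size
            self.last_change = now
            return False
        return (now - self.last_change) >= self.timeout_s


def watch(path, timeout_s=30.0, poll_s=1.0):
    """Block until `path` stops changing size for timeout_s seconds, then return.

    The caller decides what to do next (for example, request a reboot).
    """
    detector = StallDetector(timeout_s)
    while not detector.update(os.path.getsize(path), time.monotonic()):
        time.sleep(poll_s)
```

Keeping the timing logic in `update` separate from the polling loop makes it possible to check the detector with synthetic timestamps, without a camera attached.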
It has been a while, are there any chances to move this forward? Thanks. |
I think I've stumbled upon the same issue. After a number of restarts of the binary that uses the camera and invokes the
Upon the next start of the binary, the camera is stuck and the only way to resolve this is to reboot the pi. |
Describe the bug
I was working with the RPi camera using picamera v1.13 and found that Python sometimes does not return from stop_recording(). It is blocked in mmal_vc_sendwait_message(), called from mmal_port_disable(). I then tried the default image with raspivid and found the problem is reproducible there too.
To reproduce
Run the command
while true; do raspivid -w 640 -h 480 -b 450000 -fps 24 -t 5000 -v -o tcp://x.x.x.x:yyyy; sleep 3; done
then leave it running (it usually takes hours to reproduce).
Expected behaviour
mmal_port_disable() to return.
Actual behaviour
mmal_port_disable() stuck in mmal_vc_sendwait_message(). If I stop raspivid (Ctrl+C) when the issue occurs and rerun it, it will be blocked in the first call to mmal_vc_sendwait_message(). A reboot is required to recover.
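Because the hang can take hours to appear, it may help to run each invocation under a deadline so the first stuck run is flagged automatically. The following is a minimal sketch, not something used in this report: the function name `run_with_deadline` and the deadline values are assumptions. Note that a process blocked inside the kernel may not die when killed, so the sketch does not wait indefinitely after `kill()`.

```python
import subprocess


def run_with_deadline(cmd, deadline_s, reap_s=5.0):
    """Run cmd and return "exited" if it finishes within deadline_s, else "hung".

    Hypothetical helper for illustration. On timeout the child is killed, but
    we only wait reap_s seconds for it to go away, since a process stuck in an
    uninterruptible kernel call may never be reaped.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=deadline_s)
        return "exited"
    except subprocess.TimeoutExpired:
        proc.kill()
        try:
            proc.wait(timeout=reap_s)
        except subprocess.TimeoutExpired:
            pass  # Child did not exit even after SIGKILL; give up waiting.
        return "hung"


# Example use with the command from this report (the 30 s deadline is an
# assumption; raspivid itself is asked to run for 5 s via -t 5000):
#
#   cmd = ["raspivid", "-w", "640", "-h", "480", "-b", "450000", "-fps", "24",
#          "-t", "5000", "-v", "-o", "tcp://x.x.x.x:yyyy"]
#   while run_with_deadline(cmd, 30) == "exited":
#       time.sleep(3)   # needs `import time`
```

When the loop stops, the time of the last successful run gives a rough bound on when the firmware stopped responding.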
System
Pi Zero v1.3 and Pi Camera v2.1
cat /etc/rpi-issue
Raspberry Pi reference 2020-05-27
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 825107f04027269db77426046f5085475b1a22f, stage4
vcgencmd version
Apr 15 2020 11:44:24
Copyright (c) 2012 Broadcom
version 82f9bb929ce2186eb1824178c1ae82902ad6275c (clean) (release) (start_x)
uname -a
Linux raspberrypi 4.19.118+ #1311 Mon Apr 27 14:16:15 BST 2020 armv6l GNU/Linux
Logs
Additional context
I added some logging to vchiq_2835_arm.c in the Linux kernel and found that vchiq_doorbell_irq() was not triggered for BELL0.