Skip to content

Vchiq errors when plug in HDMI cable #875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Aiexis opened this issue Sep 13, 2017 · 22 comments
Open

Vchiq errors when plug in HDMI cable #875

Aiexis opened this issue Sep 13, 2017 · 22 comments
Labels

Comments

@Aiexis
Copy link

Aiexis commented Sep 13, 2017

Hello
as said in a previous closed issue #253 I get some odds messages with dmesg when a plug the hdmi cable:

[ 5374.930044] WARNING: CPU: 0 PID: 9633 at drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c:2501 vchiq_release_internal+0xbc/0x254
[ 5374.932348] Modules linked in: brcmfmac brcmutil cfg80211 rfkill bcm2835_gpiomem uio_pdrv_genirq fixed uio snd_bcm2835 snd_pcm snd_timer snd ipv6
[ 5374.934879] CPU: 0 PID: 9633 Comm: HTV Notify Not tainted 4.9.35-v7+ #1014
[ 5374.937262] Hardware name: BCM2835
[ 5374.939660] [<8010fb3c>] (unwind_backtrace) from [<8010c058>] (show_stack+0x20/0x24)
[ 5374.942118] [<8010c058>] (show_stack) from [<80455880>] (dump_stack+0xd4/0x118)
[ 5374.944556] [<80455880>] (dump_stack) from [<8011d344>] (__warn+0xf8/0x110)
[ 5374.946957] [<8011d344>] (__warn) from [<8011d42c>] (warn_slowpath_null+0x30/0x38)
[ 5374.949463] [<8011d42c>] (warn_slowpath_null) from [<80600010>] (vchiq_release_internal+0xbc/0x254)
[ 5374.952029] [<80600010>] (vchiq_release_internal) from [<80600cc4>] (vchiq_ioctl+0x800/0x1968)
[ 5374.954654] [<80600cc4>] (vchiq_ioctl) from [<802835d4>] (do_vfs_ioctl+0xac/0x820)
[ 5374.957308] [<802835d4>] (do_vfs_ioctl) from [<80283d8c>] (SyS_ioctl+0x44/0x6c)
[ 5374.959962] [<80283d8c>] (SyS_ioctl) from [<801080c0>] (ret_fast_syscall+0x0/0x1c)
[ 5374.962624] ---[ end trace d651be8fc8262ca7 ]---
[ 5374.965561] vchiq: vchiq_ioctl: cmd VCHIQ_IOC_RELEASE_SERVICE returned error -1 for service TVNT:9630

To reproduce this error you can download a complete minibian image here:
https://drive.google.com/file/d/0B7wUr-s2z0lNTjZkUWlyUloxUVU/view?usp=sharing
(login root / raspberry)
Tested on pi3 with 2A power supply, and this screen:
https://www.amazon.fr/gp/product/B01FFAAZRC/ref=oh_aui_detailpage_o01_s00?ie=UTF8&psc=1

Just plug and unplug hdmi cable...
There is a small video loop starting at boot, but I'm not sure this is related to the issue.
Note that that this issue seems to have no effect, but well... may be this is just a part of an iceberg.

Thank you !

@Aiexis Aiexis changed the title Vchiq errors when plugin HDMI cable Vchiq errors when plug in HDMI cable Sep 13, 2017
@Aiexis Aiexis closed this as completed Sep 14, 2017
@Aiexis Aiexis reopened this Sep 14, 2017
@popcornmix
Copy link
Contributor

I can reproduce. Had a chat with @pelwell and the belief is this is harmless.
It is related to reference counting of vchiq services used by suspend which isn't supported on pi.
The bug appears to be in tvservice which reference counts hdmi attach/detach events.

I'll try to log exactly what is occurring and get a fix, but in the mean time it should be safe to ignore it.

@Aiexis
Copy link
Author

Aiexis commented Sep 15, 2017

OK @popcornmix thank you !
Indeed it's harmless in most cases, but one in thousand times (very approximated of course) it causes omxplayer to freeze. I had this situation this morning at home.

To be more precise, we run about 200 raspberry pi for a in-store network, and everyday we have to reboot one or two of them because omxplayer is freezed. In the same time the load average increases dramaticaly to 4, 5 or even 10 11 if we do nothing. Somethimes a "reboot" command doesn't work and we have to do
echo 1 > /proc/sys/kernel/sysrq ; echo b > /proc/sysrq-trigger
to be able to reboot.
Along with this, while freezed, all tvservice/vcgendcm commands hang

I investigate on this issue for 2 years... tried everything that is in my limited debian knowledge: upgrades/downgrades of raspbian/minibian/omxplayer, changing pi cards, power supply, sd card brand, even hdmi cable...
We can have 90 days without problem, and suddenly 4 reboot needed in the same week (with the same videos).
I can upload a complete image if you want to test, but maybe you will have to wait 3 month...

So, maybe this the issue here is related to the freeze problems but I don't know.
Thanks for your help !

@popcornmix
Copy link
Contributor

Can you try getting this libbcm_host.so and replacing the one in /opt/vc/lib
Does that fix the kernel warning?

@Aiexis
Copy link
Author

Aiexis commented Sep 15, 2017

Warning is still here, even after replacing file and reboot

[ 11.657331] brcmfmac: power management disabled
[ 12.211904] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 12.211958] brcmfmac: power management disabled
[ 12.639176] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 12.639402] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 13.076043] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 47.608978] random: crng init done
[ 108.856767] ------------[ cut here ]------------
[ 108.856808] WARNING: CPU: 2 PID: 916 at drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c:2501 vchiq_release_internal+0xbc/0x254
[ 108.856819] Modules linked in: brcmfmac brcmutil cfg80211 rfkill bcm2835_gpiomem uio_pdrv_genirq fixed uio snd_bcm2835 snd_pcm snd_timer snd ipv6
[ 108.856893] CPU: 2 PID: 916 Comm: HTV Notify Not tainted 4.9.28-v7+ #998
[ 108.856903] Hardware name: BCM2835
[ 108.856932] [<8010fb3c>] (unwind_backtrace) from [<8010c058>] (show_stack+0x20/0x24)
[ 108.856957] [<8010c058>] (show_stack) from [<804552c0>] (dump_stack+0xd4/0x118)
[ 108.856978] [<804552c0>] (dump_stack) from [<8011d2c4>] (__warn+0xf8/0x110)
[ 108.856998] [<8011d2c4>] (__warn) from [<8011d3ac>] (warn_slowpath_null+0x30/0x38)
[ 108.857017] [<8011d3ac>] (warn_slowpath_null) from [<805ff68c>] (vchiq_release_internal+0xbc/0x254)
[ 108.857037] [<805ff68c>] (vchiq_release_internal) from [<80600340>] (vchiq_ioctl+0x800/0x1968)
[ 108.857058] [<80600340>] (vchiq_ioctl) from [<80283214>] (do_vfs_ioctl+0xac/0x820)
[ 108.857082] [<80283214>] (do_vfs_ioctl) from [<802839cc>] (SyS_ioctl+0x44/0x6c)
[ 108.857104] [<802839cc>] (SyS_ioctl) from [<801080c0>] (ret_fast_syscall+0x0/0x1c)
[ 108.857147] ---[ end trace 7b8b90b64a21086b ]---
[ 108.857186] vchiq: vchiq_ioctl: cmd VCHIQ_IOC_RELEASE_SERVICE returned error -1 for service TVNT:913

@popcornmix
Copy link
Contributor

Yes. Can reproduce (now with raspbian).
I think my testing was unplugging and replugging while video was playing (which works).
The failure comes from unplugging, a new omxplayer starting, and then replugging.

@Aiexis
Copy link
Author

Aiexis commented Sep 17, 2017

I don't notice the same behavior. The failure can occurs while playing a video without changing or stopping it. But the failure never occurs when playing the first video after a reboot. Omxplayer must have stop at least once before there is a problem.
And indeed I had this issue with raspbian since Wheezy.

@Aiexis
Copy link
Author

Aiexis commented Sep 18, 2017

Just a question, is that normal that memory usage rapidly falls down to 10M (and sometimes below) after a few seconds of omxplayer use ?
~# free -htl
total used free shared buffers cached
Mem: 923M 912M 10M 17M 832K 806M
Low: 923M 912M 10M
High: 0B 0B 0B
-/+ buffers/cache: 105M 817M
Swap: 0B 0B 0B
Total: 923M 912M 10M

I have never noticed that (but I have never checked). Maybe it's related to my omxplayer's freezes issues...

@P33M
Copy link

P33M commented Sep 18, 2017

http://www.linuxatemyram.com/

This is normal behaviour for any program that does lots of file I/O.

@popcornmix
Copy link
Contributor

popcornmix commented Sep 19, 2017

Can you test this libbrcm_host.so ?

@Aiexis
Copy link
Author

Aiexis commented Sep 19, 2017

No more message !
Perfect, thank you.
Is there any chance that it also fixes the freeze issue ?

(And thank you @P33M for your answer, sorry I missed it)

@popcornmix
Copy link
Contributor

I think the belief is still that the kernel warning is harmless, so this fix probably won't be related to the freezing.
Check if freezing still occurs and if it does try to find an example that freezes quickly.

@Aiexis
Copy link
Author

Aiexis commented Sep 22, 2017

Freezing just occurs, and dmesg shows thoses messages:

[79869.927589] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[79869.927613]       Not tainted 4.9.28-v7+ #998
[79869.927622] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[79869.927633] HCEC Notify     D    0 18535      1 0x00000005
[79869.927683] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[79869.927705] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[79869.927727] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[79869.927750] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[79869.927773] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[79869.927798] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[79869.927819] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[79869.927842] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[79869.927862] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[79869.927883] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[79869.927901] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[79869.927924] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[79869.927946] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[79869.927967] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[79992.807592] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[79992.807615]       Not tainted 4.9.28-v7+ #998
[79992.807625] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[79992.807635] HCEC Notify     D    0 18535      1 0x00000005
[79992.807686] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[79992.807708] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[79992.807729] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[79992.807753] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[79992.807776] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[79992.807799] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[79992.807820] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[79992.807842] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[79992.807863] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[79992.807883] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[79992.807902] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[79992.807924] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[79992.807947] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[79992.807968] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80115.687598] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80115.687620]       Not tainted 4.9.28-v7+ #998
[80115.687630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80115.687640] HCEC Notify     D    0 18535      1 0x00000005
[80115.687688] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80115.687709] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80115.687730] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80115.687752] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80115.687775] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80115.687797] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80115.687818] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80115.687839] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80115.687860] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80115.687878] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80115.687897] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80115.687919] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80115.687941] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80115.687962] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80238.567626] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80238.567651]       Not tainted 4.9.28-v7+ #998
[80238.567661] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80238.567672] HCEC Notify     D    0 18535      1 0x00000005
[80238.567723] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80238.567745] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80238.567767] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80238.567791] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80238.567813] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80238.567836] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80238.567857] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80238.567879] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80238.567900] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80238.567919] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80238.567939] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80238.567961] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80238.567983] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80238.568005] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80361.447596] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80361.447620]       Not tainted 4.9.28-v7+ #998
[80361.447630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80361.447640] HCEC Notify     D    0 18535      1 0x00000005
[80361.447689] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80361.447711] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80361.447732] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80361.447754] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80361.447777] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80361.447800] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80361.447820] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80361.447843] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80361.447864] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80361.447885] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80361.447903] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80361.447925] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80361.447947] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80361.447968] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80484.327609] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80484.327631]       Not tainted 4.9.28-v7+ #998
[80484.327641] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80484.327652] HCEC Notify     D    0 18535      1 0x00000005
[80484.327702] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80484.327724] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80484.327745] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80484.327768] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80484.327792] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80484.327815] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80484.327835] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80484.327857] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80484.327879] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80484.327898] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80484.327917] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80484.327939] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80484.327962] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80484.327983] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80607.207593] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80607.207617]       Not tainted 4.9.28-v7+ #998
[80607.207626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80607.207637] HCEC Notify     D    0 18535      1 0x00000005
[80607.207687] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80607.207709] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80607.207731] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80607.207755] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80607.207778] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80607.207801] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80607.207822] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80607.207844] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80607.207865] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80607.207885] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80607.207904] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80607.207926] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80607.207948] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80607.207969] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80730.087590] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80730.087611]       Not tainted 4.9.28-v7+ #998
[80730.087620] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80730.087630] HCEC Notify     D    0 18535      1 0x00000005
[80730.087677] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80730.087699] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80730.087719] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80730.087742] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80730.087764] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80730.087787] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80730.087807] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80730.087829] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80730.087849] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80730.087867] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80730.087885] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80730.087907] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80730.087929] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80730.087950] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80852.967593] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80852.967617]       Not tainted 4.9.28-v7+ #998
[80852.967626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80852.967637] HCEC Notify     D    0 18535      1 0x00000005
[80852.967688] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80852.967711] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80852.967732] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80852.967756] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80852.967778] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80852.967802] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80852.967822] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80852.967844] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80852.967865] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80852.967886] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80852.967905] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80852.967927] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80852.967949] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80852.967969] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)
[80975.847597] INFO: task HCEC Notify:18535 blocked for more than 120 seconds.
[80975.847620]       Not tainted 4.9.28-v7+ #998
[80975.847630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[80975.847640] HCEC Notify     D    0 18535      1 0x00000005
[80975.847690] [<80713b70>] (__schedule) from [<80714094>] (schedule+0x50/0xa8)
[80975.847712] [<80714094>] (schedule) from [<80717304>] (schedule_timeout+0x220/0x364)
[80975.847733] [<80717304>] (schedule_timeout) from [<80715d48>] (__down+0x88/0xc0)
[80975.847756] [<80715d48>] (__down) from [<8016ace8>] (down+0x54/0x68)
[80975.847780] [<8016ace8>] (down) from [<805ff958>] (vchiq_release+0x134/0x31c)
[80975.847802] [<805ff958>] (vchiq_release) from [<80271d4c>] (__fput+0x9c/0x1e8)
[80975.847823] [<80271d4c>] (__fput) from [<80271f08>] (____fput+0x18/0x1c)
[80975.847845] [<80271f08>] (____fput) from [<8013aad0>] (task_work_run+0xcc/0xfc)
[80975.847866] [<8013aad0>] (task_work_run) from [<801212d4>] (do_exit+0x370/0xaec)
[80975.847887] [<801212d4>] (do_exit) from [<80121ae8>] (do_group_exit+0x4c/0xe4)
[80975.847905] [<80121ae8>] (do_group_exit) from [<8012cb68>] (get_signal+0x370/0x6d4)
[80975.847927] [<8012cb68>] (get_signal) from [<8010b2cc>] (do_signal+0x74/0x3e4)
[80975.847949] [<8010b2cc>] (do_signal) from [<8010b824>] (do_work_pending+0xb8/0xd0)
[80975.847970] [<8010b824>] (do_work_pending) from [<801080e8>] (slow_work_pending+0xc/0x20)

After 22 hours of non-stop video playing.
I have no idea of how to reproduce this quicky...

@popcornmix
Copy link
Contributor

Can you add start_debug=1 to config.txt, reboot and when the next crash happens post the output of

sudo vcdbg log msg
sudo vcdbg log assert

Also can you confirm you are up to date? (e.g. vcgencmd version and uname -a have recent dates).

@Aiexis
Copy link
Author

Aiexis commented Sep 22, 2017

vcgencmd version
May 15 2017 16:57:15
Copyright (c) 2012 Broadcom
version 9469ea3706e34c4de62f38a5008f69a429b4b43e (clean) (release)

Uname -a
Linux Raspberry 4.9.28-v7+ #998 SMP Mon May 15 16:55:39 BST 2017 armv7l GNU/Linux

Rebooted with start_debug, now waiting... see you later, thank you for help

popcornmix added a commit that referenced this issue Sep 22, 2017
kernel: Touchscreen spurious event fix
See: raspberrypi/linux#2207

firmware: tvservice: Avoid unmatched reference counting on attach/unplugged events
See: #875
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Sep 22, 2017
kernel: Touchscreen spurious event fix
See: raspberrypi/linux#2207

firmware: tvservice: Avoid unmatched reference counting on attach/unplugged events
See: raspberrypi/firmware#875
popcornmix added a commit that referenced this issue Sep 24, 2017
kernel: Touchscreen spurious event fix
See: raspberrypi/linux#2207

firmware: tvservice: Avoid unmatched reference counting on attach/unplugged events
See: #875
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Sep 24, 2017
kernel: Touchscreen spurious event fix
See: raspberrypi/linux#2207

firmware: tvservice: Avoid unmatched reference counting on attach/unplugged events
See: raspberrypi/firmware#875
@Aiexis
Copy link
Author

Aiexis commented Sep 29, 2017

It crashed this night, after about 6 days.
log msg:
https://pastebin.com/eN8uNtLV
(can't access the first lines)

log assert:
https://pastebin.com/bRWRf8aa

popcornmix added a commit to raspberrypi/userland that referenced this issue Oct 6, 2017
jfmherokiller pushed a commit to jfmherokiller/userland that referenced this issue Feb 4, 2018
@JamesH65
Copy link
Contributor

@Aiexis Any luck with this?

This issue will be closed within 30 days unless further interactions are posted. If you wish this issue to remain open, please add a comment. A closed issue may be reopened if requested.

@JamesH65 JamesH65 added the Close within 30 days This issue will be closed within 30 days unless further interactions are posted. If you wish this is label Jan 10, 2019
@Aiexis
Copy link
Author

Aiexis commented Jan 10, 2019

hi, no luck, i'm still experiment omxplayer's freezes.
We have about 350 RPI running with 24/24 videos, and we got an average 10 RPI by day suffuring from this issue. I've made a script that checks load average and "[tvservice]" or "[omxplayer]" in processes every minute and reboot if so.
I gave up finding a solution...

@JamesH65 JamesH65 added bug and removed Close within 30 days This issue will be closed within 30 days unless further interactions are posted. If you wish this is labels Jan 10, 2019
@JamesH65
Copy link
Contributor

Are you using the most recent Raspbian? If not what are you using?

@Aiexis
Copy link
Author

Aiexis commented Jan 10, 2019

We use different versions, based on minibian stretch
4.14.34-v7+ #1110 SMP Mon Apr 16 15:18:51 BST 2018 armv7l GNU/Linux (the last for us)
but also some older version based on jessie
4.9.35-v7+ #1014
4.9.28+ #998
4.1.18-v7+ #846
4.4.41-v7+ #942
etc

@JamesH65
Copy link
Contributor

I guess the first question to ask is does this still happen on the latest Raspbian? Seems that Minibian hasn't been updated for a few years (in the links I saw), although you must have updated the kernel fairly recently.

@JamesH65
Copy link
Contributor

Also, reading your logs, I suspect that the latest error is actually different from the original one. The latest logs show an error in the h264 system somewhere.

What sort of videos are you playing back, and how do you schedule them? Is this some sort of billboard system? We do have a lot of customers who do that sort of
thing without problem.

@Aiexis
Copy link
Author

Aiexis commented Jan 10, 2019

I haven't tested on raspbian since several months, so I don't know... But there was the issue on the last raspbian I tested.
The last minibian provided in the official page is outdated but I always update the kernel to the last stretch version.

Videos are most of the time mp4 h264. It's indeed a billboard system with, to simplify, a simple php script playing a playlist of videos like this:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants