Skip to content

Polling /sys/class/thermal/thermal_zone0/temp eventually results in kernel backtrace #132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
MilhouseVH opened this issue Jan 7, 2013 · 15 comments

Comments

@MilhouseVH
Copy link

Link to Forum Discussion thread

Repeatedly reading/polling /sys/class/thermal/thermal_zone0/temp will eventually result in a kernel backtrace and the process attempting to read the temperature will hang. This can occur after a couple of days when polling /sys/class/thermal/thermal_zone0/temp every two seconds, or sooner if polling more frequently.

With the following Raspbian 512MB system:

pi@raspberrypi /var/log $ vcgencmd version
Oct 29 2012 23:13:42
Copyright (c) 2012 Broadcom
version 346980 (release)
pi@raspberrypi /var/log $ uname -a
Linux raspberrypi 3.2.27+ #250 PREEMPT Thu Oct 18 19:03:02 BST 2012 armv6l GNU/Linux

I ran the following test, commencing Jan 6 17:06

X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done

On Jan 7 at 01:48:46, the process became hung:

Mon Jan  7 01:48:46 GMT 2013: 57300 (1386574)
Mon Jan  7 01:48:46 GMT 2013: 57838 (1386575)
Mon Jan  7 01:48:46 GMT 2013: 57838 (1386576)

In syslog and kern.log, the following corresponding entries were found:

Jan  7 01:48:46 raspberrypi kernel: [31346.634917] bcm2708_vcio: mbox chan 8 overflow - drop 5d892008
Jan  7 01:52:18 raspberrypi kernel: [31558.618283] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:52:18 raspberrypi kernel: [31558.618299] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:52:18 raspberrypi kernel: [31558.618310] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:52:18 raspberrypi kernel: [31558.618354] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:52:18 raspberrypi kernel: [31558.618389] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:52:18 raspberrypi kernel: [31558.618414] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:52:18 raspberrypi kernel: [31558.618443] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:52:18 raspberrypi kernel: [31558.618466] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:52:18 raspberrypi kernel: [31558.618497] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:52:18 raspberrypi kernel: [31558.618521] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:52:18 raspberrypi kernel: [31558.618547] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:52:18 raspberrypi kernel: [31558.618573] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:52:18 raspberrypi kernel: [31558.618588] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:52:18 raspberrypi kernel: [31558.618596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:52:18 raspberrypi kernel: [31558.618606] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:52:18 raspberrypi kernel: [31558.618631] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:52:18 raspberrypi kernel: [31558.618651] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:52:18 raspberrypi kernel: [31558.618672] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:52:18 raspberrypi kernel: [31558.618693] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:52:18 raspberrypi kernel: [31558.618716] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:52:18 raspberrypi kernel: [31558.618739] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:52:18 raspberrypi kernel: [31558.618760] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:52:18 raspberrypi kernel: [31558.618789] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:52:18 raspberrypi kernel: [31558.618818] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:52:18 raspberrypi kernel: [31558.618843] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:52:18 raspberrypi kernel: [31558.618862] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:52:18 raspberrypi kernel: [31558.618883] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 01:54:18 raspberrypi kernel: [31678.608927] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:54:18 raspberrypi kernel: [31678.608942] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:54:18 raspberrypi kernel: [31678.608954] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:54:18 raspberrypi kernel: [31678.608997] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:54:18 raspberrypi kernel: [31678.609033] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:54:18 raspberrypi kernel: [31678.609058] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:54:18 raspberrypi kernel: [31678.609087] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:54:18 raspberrypi kernel: [31678.609109] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:54:18 raspberrypi kernel: [31678.609141] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:54:18 raspberrypi kernel: [31678.609165] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:54:18 raspberrypi kernel: [31678.609190] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:54:18 raspberrypi kernel: [31678.609216] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:54:18 raspberrypi kernel: [31678.609231] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:54:18 raspberrypi kernel: [31678.609239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:54:18 raspberrypi kernel: [31678.609249] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:54:18 raspberrypi kernel: [31678.609274] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:54:18 raspberrypi kernel: [31678.609294] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:54:18 raspberrypi kernel: [31678.609314] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:54:18 raspberrypi kernel: [31678.609337] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:54:18 raspberrypi kernel: [31678.609360] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:54:18 raspberrypi kernel: [31678.609383] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:54:18 raspberrypi kernel: [31678.609404] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:54:18 raspberrypi kernel: [31678.609433] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:54:18 raspberrypi kernel: [31678.609463] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:54:18 raspberrypi kernel: [31678.609487] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:54:18 raspberrypi kernel: [31678.609508] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:54:18 raspberrypi kernel: [31678.609528] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 01:56:18 raspberrypi kernel: [31798.599576] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:56:18 raspberrypi kernel: [31798.599591] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:56:18 raspberrypi kernel: [31798.599602] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:56:18 raspberrypi kernel: [31798.599646] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:56:18 raspberrypi kernel: [31798.599681] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:56:18 raspberrypi kernel: [31798.599707] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:56:18 raspberrypi kernel: [31798.599735] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:56:18 raspberrypi kernel: [31798.599758] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:56:18 raspberrypi kernel: [31798.599790] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:56:18 raspberrypi kernel: [31798.599815] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:56:18 raspberrypi kernel: [31798.599839] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:56:18 raspberrypi kernel: [31798.599867] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:56:18 raspberrypi kernel: [31798.599881] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:56:18 raspberrypi kernel: [31798.599889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:56:18 raspberrypi kernel: [31798.599899] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:56:18 raspberrypi kernel: [31798.599924] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:56:18 raspberrypi kernel: [31798.599944] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:56:18 raspberrypi kernel: [31798.599964] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:56:18 raspberrypi kernel: [31798.599987] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:56:18 raspberrypi kernel: [31798.600010] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:56:18 raspberrypi kernel: [31798.600034] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:56:18 raspberrypi kernel: [31798.600055] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:56:18 raspberrypi kernel: [31798.600083] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:56:18 raspberrypi kernel: [31798.600112] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:56:18 raspberrypi kernel: [31798.600137] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:56:18 raspberrypi kernel: [31798.600157] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:56:18 raspberrypi kernel: [31798.600178] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 01:58:18 raspberrypi kernel: [31918.590229] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:58:18 raspberrypi kernel: [31918.590244] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:58:18 raspberrypi kernel: [31918.590255] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:58:18 raspberrypi kernel: [31918.590299] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:58:18 raspberrypi kernel: [31918.590334] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:58:18 raspberrypi kernel: [31918.590360] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:58:18 raspberrypi kernel: [31918.590388] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:58:18 raspberrypi kernel: [31918.590411] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:58:18 raspberrypi kernel: [31918.590443] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:58:18 raspberrypi kernel: [31918.590468] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:58:18 raspberrypi kernel: [31918.590493] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:58:18 raspberrypi kernel: [31918.590519] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:58:18 raspberrypi kernel: [31918.590533] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:58:18 raspberrypi kernel: [31918.590542] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:58:18 raspberrypi kernel: [31918.590551] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:58:18 raspberrypi kernel: [31918.590576] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:58:18 raspberrypi kernel: [31918.590596] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:58:18 raspberrypi kernel: [31918.590617] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:58:18 raspberrypi kernel: [31918.590639] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:58:18 raspberrypi kernel: [31918.590662] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:58:18 raspberrypi kernel: [31918.590686] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:58:18 raspberrypi kernel: [31918.590706] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:58:18 raspberrypi kernel: [31918.590735] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:58:18 raspberrypi kernel: [31918.590764] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:58:18 raspberrypi kernel: [31918.590789] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:58:18 raspberrypi kernel: [31918.590809] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:58:18 raspberrypi kernel: [31918.590829] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 02:00:18 raspberrypi kernel: [32038.580902] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 02:00:18 raspberrypi kernel: [32038.580919] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 02:00:18 raspberrypi kernel: [32038.580930] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 02:00:18 raspberrypi kernel: [32038.580974] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 02:00:18 raspberrypi kernel: [32038.581010] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 02:00:18 raspberrypi kernel: [32038.581036] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 02:00:18 raspberrypi kernel: [32038.581065] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 02:00:18 raspberrypi kernel: [32038.581088] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 02:00:18 raspberrypi kernel: [32038.581121] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 02:00:18 raspberrypi kernel: [32038.581145] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 02:00:18 raspberrypi kernel: [32038.581170] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 02:00:18 raspberrypi kernel: [32038.581197] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 02:00:18 raspberrypi kernel: [32038.581212] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 02:00:18 raspberrypi kernel: [32038.581220] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 02:00:18 raspberrypi kernel: [32038.581229] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 02:00:18 raspberrypi kernel: [32038.581254] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 02:00:18 raspberrypi kernel: [32038.581274] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 02:00:18 raspberrypi kernel: [32038.581296] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 02:00:18 raspberrypi kernel: [32038.581319] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 02:00:18 raspberrypi kernel: [32038.581341] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 02:00:18 raspberrypi kernel: [32038.581364] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 02:00:18 raspberrypi kernel: [32038.581385] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 02:00:18 raspberrypi kernel: [32038.581414] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 02:00:18 raspberrypi kernel: [32038.581444] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 02:00:18 raspberrypi kernel: [32038.581468] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 02:00:18 raspberrypi kernel: [32038.581488] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 02:00:18 raspberrypi kernel: [32038.581508] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)

In addition, when running two (or more) concurrent tests, the values returned by the thermal driver are often invalid (ie. large, negative values). The likelihood of invalid values being returned by the thermal driver increases with the number of concurrently running tests. This might suggest there is an underlying concurrency issue in the thermal driver.

@MilhouseVH
Copy link
Author

Reproducible with latest firmware and kernel:

pi@raspberrypi ~ $ vcgencmd version
Jan  6 2013 20:27:42
Copyright (c) 2012 Broadcom
version 360762 (release)
pi@raspberrypi ~ $ uname -a
Linux raspberrypi 3.6.11+ #348 PREEMPT Tue Jan 1 16:33:22 GMT 2013 armv6l GNU/Linux

Just run the following test:

X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &
X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &
X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &
X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &

And you should see the processes all hang within 20 seconds, and the following stacktrace will appear in syslog/kern.log after a few minutes:

Jan  7 20:30:04 raspberrypi kernel: [   89.930207] bcm2708_vcio: mbox chan 8 overflow - drop 5e044008
Jan  7 20:32:35 raspberrypi kernel: [  241.102918] INFO: task kworker/0:3:289 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.102933] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.102941] kworker/0:3     D c0397440     0   289      2 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.102995] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0397e24>] (schedule_preempt_disabled+0x24/0x34)
Jan  7 20:32:35 raspberrypi kernel: [  241.103016] [<c0397e24>] (schedule_preempt_disabled+0x24/0x34) from [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103048] [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c) from [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103066] [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8) from [<c001a738>] (bcm_mailbox_property+0x68/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103090] [<c001a738>] (bcm_mailbox_property+0x68/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103108] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02af348>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 20:32:35 raspberrypi kernel: [  241.103127] [<c02af348>] (thermal_zone_device_update+0x2c/0x240) from [<c0035010>] (process_one_work+0x150/0x40c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103142] [<c0035010>] (process_one_work+0x150/0x40c) from [<c00355f0>] (worker_thread+0x150/0x498)
Jan  7 20:32:35 raspberrypi kernel: [  241.103163] [<c00355f0>] (worker_thread+0x150/0x498) from [<c003a7b4>] (kthread+0x88/0x94)
Jan  7 20:32:35 raspberrypi kernel: [  241.103186] [<c003a7b4>] (kthread+0x88/0x94) from [<c000e9fc>] (kernel_thread_exit+0x0/0x8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103214] INFO: task cat:3003 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.103219] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.103224] cat             D c0397440     0  3003   2207 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.103252] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0396180>] (schedule_timeout+0x158/0x1e0)
Jan  7 20:32:35 raspberrypi kernel: [  241.103270] [<c0396180>] (schedule_timeout+0x158/0x1e0) from [<c0396e70>] (__down+0x88/0xc0)
Jan  7 20:32:35 raspberrypi kernel: [  241.103292] [<c0396e70>] (__down+0x88/0xc0) from [<c00400c0>] (down+0x60/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103313] [<c00400c0>] (down+0x60/0x64) from [<c001a698>] (bcm_mailbox_read+0x60/0x98)
Jan  7 20:32:35 raspberrypi kernel: [  241.103330] [<c001a698>] (bcm_mailbox_read+0x60/0x98) from [<c001a784>] (bcm_mailbox_property+0xb4/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103348] [<c001a784>] (bcm_mailbox_property+0xb4/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103364] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02ae9c0>] (temp_show+0x28/0x54)
Jan  7 20:32:35 raspberrypi kernel: [  241.103386] [<c02ae9c0>] (temp_show+0x28/0x54) from [<c0233c48>] (dev_attr_show+0x1c/0x48)
Jan  7 20:32:35 raspberrypi kernel: [  241.103407] [<c0233c48>] (dev_attr_show+0x1c/0x48) from [<c011dc38>] (sysfs_read_file+0x90/0x134)
Jan  7 20:32:35 raspberrypi kernel: [  241.103433] [<c011dc38>] (sysfs_read_file+0x90/0x134) from [<c00c0b78>] (vfs_read+0x98/0x16c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103450] [<c00c0b78>] (vfs_read+0x98/0x16c) from [<c00c0c84>] (sys_read+0x38/0x70)
Jan  7 20:32:35 raspberrypi kernel: [  241.103468] [<c00c0c84>] (sys_read+0x38/0x70) from [<c000da60>] (ret_fast_syscall+0x0/0x30)
Jan  7 20:32:35 raspberrypi kernel: [  241.103476] INFO: task cat:3009 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.103482] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.103487] cat             D c0397440     0  3009   2219 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.103515] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0397e24>] (schedule_preempt_disabled+0x24/0x34)
Jan  7 20:32:35 raspberrypi kernel: [  241.103533] [<c0397e24>] (schedule_preempt_disabled+0x24/0x34) from [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103555] [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c) from [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103572] [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8) from [<c001a738>] (bcm_mailbox_property+0x68/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103590] [<c001a738>] (bcm_mailbox_property+0x68/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103605] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02ae9c0>] (temp_show+0x28/0x54)
Jan  7 20:32:35 raspberrypi kernel: [  241.103622] [<c02ae9c0>] (temp_show+0x28/0x54) from [<c0233c48>] (dev_attr_show+0x1c/0x48)
Jan  7 20:32:35 raspberrypi kernel: [  241.103639] [<c0233c48>] (dev_attr_show+0x1c/0x48) from [<c011dc38>] (sysfs_read_file+0x90/0x134)
Jan  7 20:32:35 raspberrypi kernel: [  241.103658] [<c011dc38>] (sysfs_read_file+0x90/0x134) from [<c00c0b78>] (vfs_read+0x98/0x16c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103675] [<c00c0b78>] (vfs_read+0x98/0x16c) from [<c00c0c84>] (sys_read+0x38/0x70)
Jan  7 20:32:35 raspberrypi kernel: [  241.103692] [<c00c0c84>] (sys_read+0x38/0x70) from [<c000da60>] (ret_fast_syscall+0x0/0x30)
Jan  7 20:32:35 raspberrypi kernel: [  241.103701] INFO: task cat:3010 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.103706] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.103711] cat             D c0397440     0  3010   2244 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.103738] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0397e24>] (schedule_preempt_disabled+0x24/0x34)
Jan  7 20:32:35 raspberrypi kernel: [  241.103757] [<c0397e24>] (schedule_preempt_disabled+0x24/0x34) from [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103777] [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c) from [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103793] [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8) from [<c001a738>] (bcm_mailbox_property+0x68/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103811] [<c001a738>] (bcm_mailbox_property+0x68/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103827] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02ae9c0>] (temp_show+0x28/0x54)
Jan  7 20:32:35 raspberrypi kernel: [  241.103844] [<c02ae9c0>] (temp_show+0x28/0x54) from [<c0233c48>] (dev_attr_show+0x1c/0x48)
Jan  7 20:32:35 raspberrypi kernel: [  241.103861] [<c0233c48>] (dev_attr_show+0x1c/0x48) from [<c011dc38>] (sysfs_read_file+0x90/0x134)
Jan  7 20:32:35 raspberrypi kernel: [  241.103880] [<c011dc38>] (sysfs_read_file+0x90/0x134) from [<c00c0b78>] (vfs_read+0x98/0x16c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103897] [<c00c0b78>] (vfs_read+0x98/0x16c) from [<c00c0c84>] (sys_read+0x38/0x70)
Jan  7 20:32:35 raspberrypi kernel: [  241.103914] [<c00c0c84>] (sys_read+0x38/0x70) from [<c000da60>] (ret_fast_syscall+0x0/0x30)

@popcornmix
Copy link
Contributor

Ah, I see an issue. I think we need a mutex around bcm_mailbox_property.
I'm guessing we are getting a second bcm_mailbox_property call before the first has responded.
I'll have a look.

@popcornmix
Copy link
Contributor

Update is pushout out. Use rpi-update and test please.

@simonjhall
Copy link

Out of interest, will this affect me doing mailbox execute? My calls are all synchronous (from my POV) but that doesn't mean I can't fall foul of a mutex ;-)

@MilhouseVH
Copy link
Author

Thanks Dom.

Hopefully I've picked up the right version:

pi@raspberrypi ~ $ vcgencmd version
Jan  7 2013 21:59:59
Copyright (c) 2012 Broadcom
version 360961 (release)
pi@raspberrypi ~ $ uname -a
Linux raspberrypi 3.6.11+ #350 PREEMPT Mon Jan 7 21:51:11 GMT 2013 armv6l GNU/Linux

4 concurrent read loops seem to be progressing nicely, so I think this can be closed.

However, I should add that I still notice the occasional bogus value being returned, eg:

Mon Jan  7 22:52:48 GMT 2013: 59451 (1368)
Mon Jan  7 22:52:48 GMT 2013: -650788984 (1363)
Mon Jan  7 22:52:48 GMT 2013: 85000 (1372)
Mon Jan  7 22:52:49 GMT 2013: 59451 (1364)

Both the second and third values are clearly not valid.

Any idea if that needs to be looked at? I see these bogus values even with a single loop, for example when starting a second ssh window you'll see a bogus value appear in the window running the loop.

@popcornmix
Copy link
Contributor

The bogus values are a different issue to the hang.

The currently disabled message "Failed to get temperature" does get output when we get a spurious value,
so the temperature driver spots the problem.

I could add a retry, which would avoid the problem, but I'd prefer to understand it...

@MilhouseVH
Copy link
Author

Sure, understood - I'll open a separate issue for the bogus value just so that it's on the list, though it's a very minor problem and easily worked around with a retry until a sane value is acquired.

I've been running the 4-loop torture test for about 4.5 hours now, and there have been no hangs and syslog/kern.log is clean, so I'm going to close this issue as fixed. Many thanks for the quick resolution.

@popcornmix
Copy link
Contributor

@simonjhall
Do you use the bcm_mailbox_property function?
If so then previously you could have been clobbered by another mailbox message (like a temperature read).
However you would know about it ("mbox chan 8 overflow" and kernel task hang).

@simonjhall
Copy link

I use your ioctl + IOCTL_MBOX_PROPERTY system, called from user code.

@popcornmix
Copy link
Contributor

Yes, that makes use of bcm_mailbox_property function.

@simonjhall
Copy link

Oooh. I send run lots of VPU jobs, and I could imagine that this could clash if someone had a CPU temp display on their desktop or something. Also, what about the clock speed freq system?

@popcornmix
Copy link
Contributor

Yes, on demand cpufreq scheduler will cause mailbox property writes.

@stupid-boy
Copy link

fix from Jan 7 doesn't solve problem. most likely it only stashes it temporally.
other symptoms for may be same bug and new logs with Jan 7 build:
http://www.raspberrypi.org/phpBB3/viewtopic.php?f=31&t=27908

@popcornmix
Copy link
Contributor

@stupid-boy
Your problem is unrelated. Looks like the GPU has been hosed (which eventually caused a timeout from cpufreq driver not getting a response). Best open another issue.

@rce1086
Copy link

rce1086 commented Jul 28, 2015

I see this lockup when attempting to read the core temp regularly, since I'm trying to use RPIMonitor... same kernel messages as above; this particular Pi hosed at around midday today, even though it wasn't yet running RPImonitor at that point, so I don't think anything would have been reading temp, but I do have the 'on demand' CPU govenor running...
I updated yesterday to "3.18.16+ #796 PREEMPT Fri Jun 19 21:01:03 BST 2015 armv6l" via a targetted rpi-update, so should be pretty current for firmware.
Is there anyting I can do to help you understand this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants