Polling /sys/class/thermal/thermal_zone0/temp eventually results in kernel backtrace #132

MilhouseVH · 2013-01-07T19:25:37Z

Repeatedly reading/polling /sys/class/thermal/thermal_zone0/temp will eventually result in a kernel backtrace and the process attempting to read the temperature will hang. This can occur after a couple of days when polling /sys/class/thermal/thermal_zone0/temp every two seconds, or sooner if polling more frequently.

With the following Raspbian 512MB system:

pi@raspberrypi /var/log $ vcgencmd version
Oct 29 2012 23:13:42
Copyright (c) 2012 Broadcom
version 346980 (release)
pi@raspberrypi /var/log $ uname -a
Linux raspberrypi 3.2.27+ #250 PREEMPT Thu Oct 18 19:03:02 BST 2012 armv6l GNU/Linux

I ran the following test, commencing Jan 6 17:06

X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done

On Jan 7 at 01:48:46, the process became hung:

Mon Jan  7 01:48:46 GMT 2013: 57300 (1386574)
Mon Jan  7 01:48:46 GMT 2013: 57838 (1386575)
Mon Jan  7 01:48:46 GMT 2013: 57838 (1386576)

In syslog and kern.log, the following corresponding entries were found:

Jan  7 01:48:46 raspberrypi kernel: [31346.634917] bcm2708_vcio: mbox chan 8 overflow - drop 5d892008
Jan  7 01:52:18 raspberrypi kernel: [31558.618283] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:52:18 raspberrypi kernel: [31558.618299] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:52:18 raspberrypi kernel: [31558.618310] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:52:18 raspberrypi kernel: [31558.618354] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:52:18 raspberrypi kernel: [31558.618389] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:52:18 raspberrypi kernel: [31558.618414] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:52:18 raspberrypi kernel: [31558.618443] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:52:18 raspberrypi kernel: [31558.618466] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:52:18 raspberrypi kernel: [31558.618497] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:52:18 raspberrypi kernel: [31558.618521] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:52:18 raspberrypi kernel: [31558.618547] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:52:18 raspberrypi kernel: [31558.618573] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:52:18 raspberrypi kernel: [31558.618588] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:52:18 raspberrypi kernel: [31558.618596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:52:18 raspberrypi kernel: [31558.618606] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:52:18 raspberrypi kernel: [31558.618631] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:52:18 raspberrypi kernel: [31558.618651] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:52:18 raspberrypi kernel: [31558.618672] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:52:18 raspberrypi kernel: [31558.618693] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:52:18 raspberrypi kernel: [31558.618716] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:52:18 raspberrypi kernel: [31558.618739] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:52:18 raspberrypi kernel: [31558.618760] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:52:18 raspberrypi kernel: [31558.618789] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:52:18 raspberrypi kernel: [31558.618818] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:52:18 raspberrypi kernel: [31558.618843] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:52:18 raspberrypi kernel: [31558.618862] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:52:18 raspberrypi kernel: [31558.618883] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 01:54:18 raspberrypi kernel: [31678.608927] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:54:18 raspberrypi kernel: [31678.608942] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:54:18 raspberrypi kernel: [31678.608954] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:54:18 raspberrypi kernel: [31678.608997] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:54:18 raspberrypi kernel: [31678.609033] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:54:18 raspberrypi kernel: [31678.609058] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:54:18 raspberrypi kernel: [31678.609087] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:54:18 raspberrypi kernel: [31678.609109] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:54:18 raspberrypi kernel: [31678.609141] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:54:18 raspberrypi kernel: [31678.609165] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:54:18 raspberrypi kernel: [31678.609190] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:54:18 raspberrypi kernel: [31678.609216] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:54:18 raspberrypi kernel: [31678.609231] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:54:18 raspberrypi kernel: [31678.609239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:54:18 raspberrypi kernel: [31678.609249] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:54:18 raspberrypi kernel: [31678.609274] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:54:18 raspberrypi kernel: [31678.609294] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:54:18 raspberrypi kernel: [31678.609314] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:54:18 raspberrypi kernel: [31678.609337] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:54:18 raspberrypi kernel: [31678.609360] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:54:18 raspberrypi kernel: [31678.609383] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:54:18 raspberrypi kernel: [31678.609404] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:54:18 raspberrypi kernel: [31678.609433] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:54:18 raspberrypi kernel: [31678.609463] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:54:18 raspberrypi kernel: [31678.609487] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:54:18 raspberrypi kernel: [31678.609508] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:54:18 raspberrypi kernel: [31678.609528] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 01:56:18 raspberrypi kernel: [31798.599576] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:56:18 raspberrypi kernel: [31798.599591] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:56:18 raspberrypi kernel: [31798.599602] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:56:18 raspberrypi kernel: [31798.599646] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:56:18 raspberrypi kernel: [31798.599681] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:56:18 raspberrypi kernel: [31798.599707] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:56:18 raspberrypi kernel: [31798.599735] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:56:18 raspberrypi kernel: [31798.599758] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:56:18 raspberrypi kernel: [31798.599790] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:56:18 raspberrypi kernel: [31798.599815] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:56:18 raspberrypi kernel: [31798.599839] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:56:18 raspberrypi kernel: [31798.599867] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:56:18 raspberrypi kernel: [31798.599881] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:56:18 raspberrypi kernel: [31798.599889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:56:18 raspberrypi kernel: [31798.599899] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:56:18 raspberrypi kernel: [31798.599924] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:56:18 raspberrypi kernel: [31798.599944] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:56:18 raspberrypi kernel: [31798.599964] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:56:18 raspberrypi kernel: [31798.599987] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:56:18 raspberrypi kernel: [31798.600010] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:56:18 raspberrypi kernel: [31798.600034] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:56:18 raspberrypi kernel: [31798.600055] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:56:18 raspberrypi kernel: [31798.600083] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:56:18 raspberrypi kernel: [31798.600112] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:56:18 raspberrypi kernel: [31798.600137] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:56:18 raspberrypi kernel: [31798.600157] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:56:18 raspberrypi kernel: [31798.600178] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 01:58:18 raspberrypi kernel: [31918.590229] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 01:58:18 raspberrypi kernel: [31918.590244] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:58:18 raspberrypi kernel: [31918.590255] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 01:58:18 raspberrypi kernel: [31918.590299] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 01:58:18 raspberrypi kernel: [31918.590334] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 01:58:18 raspberrypi kernel: [31918.590360] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 01:58:18 raspberrypi kernel: [31918.590388] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:58:18 raspberrypi kernel: [31918.590411] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 01:58:18 raspberrypi kernel: [31918.590443] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 01:58:18 raspberrypi kernel: [31918.590468] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 01:58:18 raspberrypi kernel: [31918.590493] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 01:58:18 raspberrypi kernel: [31918.590519] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 01:58:18 raspberrypi kernel: [31918.590533] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 01:58:18 raspberrypi kernel: [31918.590542] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 01:58:18 raspberrypi kernel: [31918.590551] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 01:58:18 raspberrypi kernel: [31918.590576] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 01:58:18 raspberrypi kernel: [31918.590596] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 01:58:18 raspberrypi kernel: [31918.590617] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 01:58:18 raspberrypi kernel: [31918.590639] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 01:58:18 raspberrypi kernel: [31918.590662] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 01:58:18 raspberrypi kernel: [31918.590686] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 01:58:18 raspberrypi kernel: [31918.590706] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 01:58:18 raspberrypi kernel: [31918.590735] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 01:58:18 raspberrypi kernel: [31918.590764] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 01:58:18 raspberrypi kernel: [31918.590789] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 01:58:18 raspberrypi kernel: [31918.590809] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 01:58:18 raspberrypi kernel: [31918.590829] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)
Jan  7 02:00:18 raspberrypi kernel: [32038.580902] INFO: task kworker/0:2:1656 blocked for more than 120 seconds.
Jan  7 02:00:18 raspberrypi kernel: [32038.580919] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 02:00:18 raspberrypi kernel: [32038.580930] kworker/0:2     D c037a204     0  1656      2 0x00000000
Jan  7 02:00:18 raspberrypi kernel: [32038.580974] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154)
Jan  7 02:00:18 raspberrypi kernel: [32038.581010] [<c037b720>] (__mutex_lock_slowpath+0xb8/0x154) from [<c001a218>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 02:00:18 raspberrypi kernel: [32038.581036] [<c001a218>] (bcm_mailbox_write+0x34/0xb8) from [<c001a390>] (bcm_mailbox_property+0x5c/0xe4)
Jan  7 02:00:18 raspberrypi kernel: [32038.581065] [<c001a390>] (bcm_mailbox_property+0x5c/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 02:00:18 raspberrypi kernel: [32038.581088] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c029704c>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 02:00:18 raspberrypi kernel: [32038.581121] [<c029704c>] (thermal_zone_device_update+0x2c/0x240) from [<c003da30>] (process_one_work+0x134/0x38c)
Jan  7 02:00:18 raspberrypi kernel: [32038.581145] [<c003da30>] (process_one_work+0x134/0x38c) from [<c003e4a4>] (worker_thread+0x1a0/0x354)
Jan  7 02:00:18 raspberrypi kernel: [32038.581170] [<c003e4a4>] (worker_thread+0x1a0/0x354) from [<c0042e3c>] (kthread+0x84/0x8c)
Jan  7 02:00:18 raspberrypi kernel: [32038.581197] [<c0042e3c>] (kthread+0x84/0x8c) from [<c000e930>] (kernel_thread_exit+0x0/0x8)
Jan  7 02:00:18 raspberrypi kernel: [32038.581212] INFO: task cat:18732 blocked for more than 120 seconds.
Jan  7 02:00:18 raspberrypi kernel: [32038.581220] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 02:00:18 raspberrypi kernel: [32038.581229] cat             D c037a204     0 18732   2185 0x00000000
Jan  7 02:00:18 raspberrypi kernel: [32038.581254] [<c037a204>] (__schedule+0x2bc/0x568) from [<c037aea8>] (schedule_timeout+0x198/0x230)
Jan  7 02:00:18 raspberrypi kernel: [32038.581274] [<c037aea8>] (schedule_timeout+0x198/0x230) from [<c037bb9c>] (__down+0x88/0xc0)
Jan  7 02:00:18 raspberrypi kernel: [32038.581296] [<c037bb9c>] (__down+0x88/0xc0) from [<c004809c>] (down+0x60/0x64)
Jan  7 02:00:18 raspberrypi kernel: [32038.581319] [<c004809c>] (down+0x60/0x64) from [<c001a2fc>] (bcm_mailbox_read+0x60/0x98)
Jan  7 02:00:18 raspberrypi kernel: [32038.581341] [<c001a2fc>] (bcm_mailbox_read+0x60/0x98) from [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4)
Jan  7 02:00:18 raspberrypi kernel: [32038.581364] [<c001a3d4>] (bcm_mailbox_property+0xa0/0xe4) from [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 02:00:18 raspberrypi kernel: [32038.581385] [<c0297e2c>] (bcm2835_get_temp+0x3c/0x64) from [<c0296934>] (temp_show+0x28/0x54)
Jan  7 02:00:18 raspberrypi kernel: [32038.581414] [<c0296934>] (temp_show+0x28/0x54) from [<c02232f8>] (dev_attr_show+0x1c/0x48)
Jan  7 02:00:18 raspberrypi kernel: [32038.581444] [<c02232f8>] (dev_attr_show+0x1c/0x48) from [<c0114ff8>] (sysfs_read_file+0x90/0x134)
Jan  7 02:00:18 raspberrypi kernel: [32038.581468] [<c0114ff8>] (sysfs_read_file+0x90/0x134) from [<c00b88c4>] (vfs_read+0x98/0x16c)
Jan  7 02:00:18 raspberrypi kernel: [32038.581488] [<c00b88c4>] (vfs_read+0x98/0x16c) from [<c00b89d0>] (sys_read+0x38/0x70)
Jan  7 02:00:18 raspberrypi kernel: [32038.581508] [<c00b89d0>] (sys_read+0x38/0x70) from [<c000d980>] (ret_fast_syscall+0x0/0x30)

In addition, when running two (or more) concurrent tests, the values returned by the thermal driver are often invalid (ie. large, negative values). The likelihood of invalid values being returned by the thermal driver increases with the number of concurrently running tests. This might suggest there is an underlying concurrency issue in the thermal driver.

The text was updated successfully, but these errors were encountered:

MilhouseVH · 2013-01-07T20:35:14Z

Reproducible with latest firmware and kernel:

pi@raspberrypi ~ $ vcgencmd version
Jan  6 2013 20:27:42
Copyright (c) 2012 Broadcom
version 360762 (release)
pi@raspberrypi ~ $ uname -a
Linux raspberrypi 3.6.11+ #348 PREEMPT Tue Jan 1 16:33:22 GMT 2013 armv6l GNU/Linux

Just run the following test:

X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &
X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &
X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &
X=0; while true; do echo "$(date): $(cat /sys/class/thermal/thermal_zone0/temp) ($(echo $X))"; X=$(($X+1)); done &

And you should see the processes all hang within 20 seconds, and the following stacktrace will appear in syslog/kern.log after a few minutes:

Jan  7 20:30:04 raspberrypi kernel: [   89.930207] bcm2708_vcio: mbox chan 8 overflow - drop 5e044008
Jan  7 20:32:35 raspberrypi kernel: [  241.102918] INFO: task kworker/0:3:289 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.102933] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.102941] kworker/0:3     D c0397440     0   289      2 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.102995] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0397e24>] (schedule_preempt_disabled+0x24/0x34)
Jan  7 20:32:35 raspberrypi kernel: [  241.103016] [<c0397e24>] (schedule_preempt_disabled+0x24/0x34) from [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103048] [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c) from [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103066] [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8) from [<c001a738>] (bcm_mailbox_property+0x68/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103090] [<c001a738>] (bcm_mailbox_property+0x68/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103108] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02af348>] (thermal_zone_device_update+0x2c/0x240)
Jan  7 20:32:35 raspberrypi kernel: [  241.103127] [<c02af348>] (thermal_zone_device_update+0x2c/0x240) from [<c0035010>] (process_one_work+0x150/0x40c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103142] [<c0035010>] (process_one_work+0x150/0x40c) from [<c00355f0>] (worker_thread+0x150/0x498)
Jan  7 20:32:35 raspberrypi kernel: [  241.103163] [<c00355f0>] (worker_thread+0x150/0x498) from [<c003a7b4>] (kthread+0x88/0x94)
Jan  7 20:32:35 raspberrypi kernel: [  241.103186] [<c003a7b4>] (kthread+0x88/0x94) from [<c000e9fc>] (kernel_thread_exit+0x0/0x8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103214] INFO: task cat:3003 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.103219] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.103224] cat             D c0397440     0  3003   2207 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.103252] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0396180>] (schedule_timeout+0x158/0x1e0)
Jan  7 20:32:35 raspberrypi kernel: [  241.103270] [<c0396180>] (schedule_timeout+0x158/0x1e0) from [<c0396e70>] (__down+0x88/0xc0)
Jan  7 20:32:35 raspberrypi kernel: [  241.103292] [<c0396e70>] (__down+0x88/0xc0) from [<c00400c0>] (down+0x60/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103313] [<c00400c0>] (down+0x60/0x64) from [<c001a698>] (bcm_mailbox_read+0x60/0x98)
Jan  7 20:32:35 raspberrypi kernel: [  241.103330] [<c001a698>] (bcm_mailbox_read+0x60/0x98) from [<c001a784>] (bcm_mailbox_property+0xb4/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103348] [<c001a784>] (bcm_mailbox_property+0xb4/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103364] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02ae9c0>] (temp_show+0x28/0x54)
Jan  7 20:32:35 raspberrypi kernel: [  241.103386] [<c02ae9c0>] (temp_show+0x28/0x54) from [<c0233c48>] (dev_attr_show+0x1c/0x48)
Jan  7 20:32:35 raspberrypi kernel: [  241.103407] [<c0233c48>] (dev_attr_show+0x1c/0x48) from [<c011dc38>] (sysfs_read_file+0x90/0x134)
Jan  7 20:32:35 raspberrypi kernel: [  241.103433] [<c011dc38>] (sysfs_read_file+0x90/0x134) from [<c00c0b78>] (vfs_read+0x98/0x16c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103450] [<c00c0b78>] (vfs_read+0x98/0x16c) from [<c00c0c84>] (sys_read+0x38/0x70)
Jan  7 20:32:35 raspberrypi kernel: [  241.103468] [<c00c0c84>] (sys_read+0x38/0x70) from [<c000da60>] (ret_fast_syscall+0x0/0x30)
Jan  7 20:32:35 raspberrypi kernel: [  241.103476] INFO: task cat:3009 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.103482] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.103487] cat             D c0397440     0  3009   2219 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.103515] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0397e24>] (schedule_preempt_disabled+0x24/0x34)
Jan  7 20:32:35 raspberrypi kernel: [  241.103533] [<c0397e24>] (schedule_preempt_disabled+0x24/0x34) from [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103555] [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c) from [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103572] [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8) from [<c001a738>] (bcm_mailbox_property+0x68/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103590] [<c001a738>] (bcm_mailbox_property+0x68/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103605] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02ae9c0>] (temp_show+0x28/0x54)
Jan  7 20:32:35 raspberrypi kernel: [  241.103622] [<c02ae9c0>] (temp_show+0x28/0x54) from [<c0233c48>] (dev_attr_show+0x1c/0x48)
Jan  7 20:32:35 raspberrypi kernel: [  241.103639] [<c0233c48>] (dev_attr_show+0x1c/0x48) from [<c011dc38>] (sysfs_read_file+0x90/0x134)
Jan  7 20:32:35 raspberrypi kernel: [  241.103658] [<c011dc38>] (sysfs_read_file+0x90/0x134) from [<c00c0b78>] (vfs_read+0x98/0x16c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103675] [<c00c0b78>] (vfs_read+0x98/0x16c) from [<c00c0c84>] (sys_read+0x38/0x70)
Jan  7 20:32:35 raspberrypi kernel: [  241.103692] [<c00c0c84>] (sys_read+0x38/0x70) from [<c000da60>] (ret_fast_syscall+0x0/0x30)
Jan  7 20:32:35 raspberrypi kernel: [  241.103701] INFO: task cat:3010 blocked for more than 120 seconds.
Jan  7 20:32:35 raspberrypi kernel: [  241.103706] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  7 20:32:35 raspberrypi kernel: [  241.103711] cat             D c0397440     0  3010   2244 0x00000000
Jan  7 20:32:35 raspberrypi kernel: [  241.103738] [<c0397440>] (__schedule+0x2c4/0x5b0) from [<c0397e24>] (schedule_preempt_disabled+0x24/0x34)
Jan  7 20:32:35 raspberrypi kernel: [  241.103757] [<c0397e24>] (schedule_preempt_disabled+0x24/0x34) from [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103777] [<c03969e4>] (__mutex_lock_slowpath+0xac/0x13c) from [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8)
Jan  7 20:32:35 raspberrypi kernel: [  241.103793] [<c001a5b4>] (bcm_mailbox_write+0x34/0xb8) from [<c001a738>] (bcm_mailbox_property+0x68/0x148)
Jan  7 20:32:35 raspberrypi kernel: [  241.103811] [<c001a738>] (bcm_mailbox_property+0x68/0x148) from [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64)
Jan  7 20:32:35 raspberrypi kernel: [  241.103827] [<c02b03b0>] (bcm2835_get_temp+0x3c/0x64) from [<c02ae9c0>] (temp_show+0x28/0x54)
Jan  7 20:32:35 raspberrypi kernel: [  241.103844] [<c02ae9c0>] (temp_show+0x28/0x54) from [<c0233c48>] (dev_attr_show+0x1c/0x48)
Jan  7 20:32:35 raspberrypi kernel: [  241.103861] [<c0233c48>] (dev_attr_show+0x1c/0x48) from [<c011dc38>] (sysfs_read_file+0x90/0x134)
Jan  7 20:32:35 raspberrypi kernel: [  241.103880] [<c011dc38>] (sysfs_read_file+0x90/0x134) from [<c00c0b78>] (vfs_read+0x98/0x16c)
Jan  7 20:32:35 raspberrypi kernel: [  241.103897] [<c00c0b78>] (vfs_read+0x98/0x16c) from [<c00c0c84>] (sys_read+0x38/0x70)
Jan  7 20:32:35 raspberrypi kernel: [  241.103914] [<c00c0c84>] (sys_read+0x38/0x70) from [<c000da60>] (ret_fast_syscall+0x0/0x30)

popcornmix · 2013-01-07T20:54:12Z

Ah, I see an issue. I think we need a mutex around bcm_mailbox_property.
I'm guessing we are getting a second bcm_mailbox_property call before the first has responded.
I'll have a look.

popcornmix · 2013-01-07T22:37:39Z

Update is pushout out. Use rpi-update and test please.

simonjhall · 2013-01-07T22:54:34Z

Out of interest, will this affect me doing mailbox execute? My calls are all synchronous (from my POV) but that doesn't mean I can't fall foul of a mutex ;-)

MilhouseVH · 2013-01-07T23:02:04Z

Thanks Dom.

Hopefully I've picked up the right version:

pi@raspberrypi ~ $ vcgencmd version
Jan  7 2013 21:59:59
Copyright (c) 2012 Broadcom
version 360961 (release)
pi@raspberrypi ~ $ uname -a
Linux raspberrypi 3.6.11+ #350 PREEMPT Mon Jan 7 21:51:11 GMT 2013 armv6l GNU/Linux

4 concurrent read loops seem to be progressing nicely, so I think this can be closed.

However, I should add that I still notice the occasional bogus value being returned, eg:

Mon Jan  7 22:52:48 GMT 2013: 59451 (1368)
Mon Jan  7 22:52:48 GMT 2013: -650788984 (1363)
Mon Jan  7 22:52:48 GMT 2013: 85000 (1372)
Mon Jan  7 22:52:49 GMT 2013: 59451 (1364)

Both the second and third values are clearly not valid.

Any idea if that needs to be looked at? I see these bogus values even with a single loop, for example when starting a second ssh window you'll see a bogus value appear in the window running the loop.

popcornmix · 2013-01-08T01:09:20Z

The bogus values are a different issue to the hang.

The currently disabled message "Failed to get temperature" does get output when we get a spurious value,
so the temperature driver spots the problem.

I could add a retry, which would avoid the problem, but I'd prefer to understand it...

MilhouseVH · 2013-01-08T03:10:17Z

Sure, understood - I'll open a separate issue for the bogus value just so that it's on the list, though it's a very minor problem and easily worked around with a retry until a sane value is acquired.

I've been running the 4-loop torture test for about 4.5 hours now, and there have been no hangs and syslog/kern.log is clean, so I'm going to close this issue as fixed. Many thanks for the quick resolution.

popcornmix · 2013-01-08T11:23:53Z

@simonjhall
Do you use the bcm_mailbox_property function?
If so then previously you could have been clobbered by another mailbox message (like a temperature read).
However you would know about it ("mbox chan 8 overflow" and kernel task hang).

simonjhall · 2013-01-08T11:31:07Z

I use your ioctl + IOCTL_MBOX_PROPERTY system, called from user code.

popcornmix · 2013-01-08T11:32:06Z

Yes, that makes use of bcm_mailbox_property function.

simonjhall · 2013-01-08T11:34:36Z

Oooh. I send run lots of VPU jobs, and I could imagine that this could clash if someone had a CPU temp display on their desktop or something. Also, what about the clock speed freq system?

popcornmix · 2013-01-08T11:58:35Z

Yes, on demand cpufreq scheduler will cause mailbox property writes.

stupid-boy · 2013-01-08T13:39:48Z

fix from Jan 7 doesn't solve problem. most likely it only stashes it temporally.
other symptoms for may be same bug and new logs with Jan 7 build:
http://www.raspberrypi.org/phpBB3/viewtopic.php?f=31&t=27908

popcornmix · 2013-01-08T14:32:24Z

@stupid-boy
Your problem is unrelated. Looks like the GPU has been hosed (which eventually caused a timeout from cpufreq driver not getting a response). Best open another issue.

rce1086 · 2015-07-28T22:37:33Z

I see this lockup when attempting to read the core temp regularly, since I'm trying to use RPIMonitor... same kernel messages as above; this particular Pi hosed at around midday today, even though it wasn't yet running RPImonitor at that point, so I don't think anything would have been reading temp, but I do have the 'on demand' CPU govenor running...
I updated yesterday to "3.18.16+ #796 PREEMPT Fri Jun 19 21:01:03 BST 2015 armv6l" via a targetted rpi-update, so should be pretty current for firmware.
Is there anyting I can do to help you understand this?

MilhouseVH closed this as completed Jan 8, 2013

MilhouseVH mentioned this issue Jan 8, 2013

Bogus value sometimes returned by /sys/class/thermal/thermal_zone0/temp #133

Closed

MilhouseVH mentioned this issue Jun 29, 2013

kworker task blocked - reading temperature? #192

Closed

popcornmix mentioned this issue Jun 15, 2015

bcm2835-thermal and mailbox property retry raspberrypi/linux#1025

Closed

cyoung mentioned this issue May 11, 2017

Rework fan PWM control cyoung/stratux#593

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Polling /sys/class/thermal/thermal_zone0/temp eventually results in kernel backtrace #132

Polling /sys/class/thermal/thermal_zone0/temp eventually results in kernel backtrace #132

MilhouseVH commented Jan 7, 2013

MilhouseVH commented Jan 7, 2013

Uh oh!

popcornmix commented Jan 7, 2013

Uh oh!

popcornmix commented Jan 7, 2013

Uh oh!

simonjhall commented Jan 7, 2013

Uh oh!

MilhouseVH commented Jan 7, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

MilhouseVH commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

simonjhall commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

simonjhall commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

stupid-boy commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

rce1086 commented Jul 28, 2015

Uh oh!

Polling /sys/class/thermal/thermal_zone0/temp eventually results in kernel backtrace #132

Polling /sys/class/thermal/thermal_zone0/temp eventually results in kernel backtrace #132

Comments

MilhouseVH commented Jan 7, 2013

MilhouseVH commented Jan 7, 2013

Uh oh!

popcornmix commented Jan 7, 2013

Uh oh!

popcornmix commented Jan 7, 2013

Uh oh!

simonjhall commented Jan 7, 2013

Uh oh!

MilhouseVH commented Jan 7, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

MilhouseVH commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

simonjhall commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

simonjhall commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

stupid-boy commented Jan 8, 2013

Uh oh!

popcornmix commented Jan 8, 2013

Uh oh!

rce1086 commented Jul 28, 2015

Uh oh!