w1 (1-wire) discovery locks up the 1-wire bus when a ds2408 is available #3491

Tom36524 opened this issue Mar 7, 2020 · 2 comments

Tom36524 commented Mar 7, 2020

Describe the bug
The w1 driver in the kernel gets stuck on the bus_mutex when I use a ds2408 device on the 1-wire bus.
It seems that the w1 driver hangs when it tries to perform a mutex_lock on the bus_mutex a second time. This occurs in the discovery process when add_slave is called for the ds2408.

The 1-wire bus is then unusable for any requests and access.

All newer kernels seem to have the same problem; I have tried a couple of them with the same result.
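
The suspected double lock can be illustrated with a small userspace analogy. This is only a sketch using pthreads, not kernel code: `bus_mutex_init`, `add_slave_locking` and `search_holding_lock` are hypothetical names invented for the illustration. It assumes an error-checking pthread mutex, which reports `EDEADLK` on a relock by the owning thread; a kernel mutex is not recursive either, but there the task simply blocks forever.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the w1 master's bus_mutex. */
static pthread_mutex_t bus_mutex;

/* Use an error-checking mutex so a relock returns EDEADLK instead of hanging. */
static void bus_mutex_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&bus_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Stand-in for an add_slave callback that takes the bus lock itself. */
static int add_slave_locking(void)
{
    int rc = pthread_mutex_lock(&bus_mutex);

    if (rc != 0)
        return rc; /* EDEADLK when this thread already holds the lock */
    pthread_mutex_unlock(&bus_mutex);
    return 0;
}

/* Stand-in for a search routine that calls back with the lock still held. */
static int search_holding_lock(void)
{
    int rc;

    pthread_mutex_lock(&bus_mutex);
    rc = add_slave_locking();
    pthread_mutex_unlock(&bus_mutex);
    return rc;
}
```

After `bus_mutex_init()`, calling `search_holding_lock()` reports the relock as an error, while calling `add_slave_locking()` on its own succeeds.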

To reproduce
Install w1 kernel driver and let the master discover the devices on the bus.
Things go wrong when the ds2408 is discovered.

Expected behaviour
No hang.

Actual behaviour
The w1_bus_master1 kernel thread blocks in mutex_lock and the 1-wire bus stays unusable (see the kernel log under Logs).

System

  • Which model of Raspberry Pi?
    Pi3B+ Pi4

  • Which OS and version (cat /etc/rpi-issue)?
    Raspberry Pi reference 2020-02-13
    Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 5f884374b6ac6e155330c58caa1fb7249b8badf1, stage4

  • Which firmware version (vcgencmd version)?
    Feb 12 2020 12:38:08
    Copyright (c) 2012 Broadcom
    version 53a54c770c493957d99bf49762dfabc4eee00e45 (clean) (release) (start)

  • Which kernel version (uname -a)?
    While trying to track down the error I updated the kernel to 4.19.106, but older kernels behave the same way.
    Linux rpi3bplus 4.19.106-v7+ #1 SMP Mon Mar 2 21:41:45 CET 2020 armv7l GNU/Linux

Logs
Here is the kernel log from the error:
Linux raspberrypi 4.19.97-v7+ #1294 SMP Thu Jan 30 13:15:58 GMT 2020 armv7l GNU/Linux

[ 100.712786] w1_master_driver w1_bus_master1: Attaching one wire slave 28.0000018f6de9 crc b4
[ 100.728719] w1_master_driver w1_bus_master1: Attaching one wire slave 81.000000272bc2 crc 25
[ 100.740602] w1_master_driver w1_bus_master1: Attaching one wire slave 29.000000045ec6 crc 37
[ 243.688691] INFO: task w1_bus_master1:727 blocked for more than 120 seconds.
[ 243.688703] Tainted: G C 4.19.97-v7+ #1294
[ 243.688709] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.688718] w1_bus_master1 D 0 727 2 0x00000000
[ 243.688766] [<8085d70c>] (__schedule) from [<8085dd7c>] (schedule+0x50/0xa8)
[ 243.688786] [<8085dd7c>] (schedule) from [<8085e1d8>] (schedule_preempt_disabled+0x18/0x1c)
[ 243.688805] [<8085e1d8>] (schedule_preempt_disabled) from [<8085f288>] (__mutex_lock.constprop.5+0x1a8/0x590)
[ 243.688824] [<8085f288>] (__mutex_lock.constprop.5) from [<8085f78c>] (__mutex_lock_slowpath+0x1c/0x20)
[ 243.688843] [<8085f78c>] (__mutex_lock_slowpath) from [<8085f7ec>] (mutex_lock+0x5c/0x60)
[ 243.688875] [<8085f7ec>] (mutex_lock) from [<7f811664>] (w1_f29_disable_test_mode+0x60/0xb8 [w1_ds2408])
[ 243.688954] [<7f811664>] (w1_f29_disable_test_mode [w1_ds2408]) from [<7f7e5c40>] (w1_attach_slave_device+0x24c/0x470 [wire])
[ 243.689019] [<7f7e5c40>] (w1_attach_slave_device [wire]) from [<7f7e5ff4>] (w1_slave_found+0xcc/0xd4 [wire])
[ 243.689065] [<7f7e5ff4>] (w1_slave_found [wire]) from [<7f7f7904>] (ds9490r_search+0x17c/0x238 [ds2490])
[ 243.689112] [<7f7f7904>] (ds9490r_search [ds2490]) from [<7f7e8e00>] (w1_search_devices+0x4c/0x58 [wire])
[ 243.689172] [<7f7e8e00>] (w1_search_devices [wire]) from [<7f7e6904>] (w1_search_process_cb+0x74/0x120 [wire])
[ 243.689231] [<7f7e6904>] (w1_search_process_cb [wire]) from [<7f7e6b38>] (w1_process+0x104/0x13c [wire])
[ 243.689271] [<7f7e6b38>] (w1_process [wire]) from [<80142ac4>] (kthread+0x138/0x168)
[ 243.689291] [<80142ac4>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28)
[ 243.689298] Exception stack(0xb1579fb0 to 0xb1579ff8)
[ 243.689309] 9fa0: 00000000 00000000 00000000 00000000
[ 243.689322] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 243.689333] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000

Additional context
My suggestion for fixing the problem (made without really understanding how the mutex in w1 is intended to work) is added below.
The add_slave function for the ds2408 is w1_f29_disable_test_mode, found in the file w1_ds2408.c.
add_slave gets called while the bus_mutex is already locked, and that is when the problem occurs.
There is no recovery from this problem.

diff --git a/drivers/w1/slaves/w1_ds2408.c b/drivers/w1/slaves/w1_ds2408.c
index edf0bc980..bde9cc79b 100644
--- a/drivers/w1/slaves/w1_ds2408.c
+++ b/drivers/w1/slaves/w1_ds2408.c
@@ -299,7 +299,6 @@ static int w1_f29_disable_test_mode(struct w1_slave *sl)
        memcpy(&magic[1], &rn, 8);
        magic[9] = 0x3C;

-       mutex_lock(&sl->master->bus_mutex);

        res = w1_reset_bus(sl->master);
        if (res)
@@ -308,7 +307,6 @@ static int w1_f29_disable_test_mode(struct w1_slave *sl)

        res = w1_reset_bus(sl->master);
 out:
-       mutex_unlock(&sl->master->bus_mutex);
        return res;
 }

6by9 commented Mar 11, 2020

It is very significant that you are using a DS9490B USB to 1-wire interface. Please state such facts rather than leaving us to deduce them from the backtrace. It is possible to bit-bash 1-wire via a GPIO, and this tends to be the more common approach with Pi users.

It appears that only the DS9490B and DS1WM drivers acquire and hold the bus_mutex when searching (https://github.com/raspberrypi/linux/blob/rpi-5.4.y/drivers/w1/masters/ds2490.c#L734).
If you compare to the generic search at https://github.com/raspberrypi/linux/blob/rpi-5.4.y/drivers/w1/w1.c#L980, whilst it does acquire the bus_mutex, it releases it before making the callback to add the device.

None of this code is modified from the mainline Linux code, so this really needs reporting upstream (it'll affect ALL platforms).

Removing the bus_mutex lock/unlock in w1_ds2408.c is almost certainly the wrong thing, and it's the DS9490B driver that should be releasing it before making the callback.
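
As a rough illustration of that ordering, here is a userspace sketch with pthreads, not the actual ds2490 or w1 code. The names `bus_mutex_init`, `add_slave_locking`, `search_releasing_lock` and the `attached` counter are hypothetical. The search holds the lock only for its own bus work and drops it around each callback, so a callback that takes the lock itself no longer collides with the caller.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the w1 master's bus_mutex. */
static pthread_mutex_t bus_mutex;
/* Counts devices attached by the callback (illustration only). */
static int attached;

/* Error-checking mutex: a relock by the owner returns EDEADLK instead of hanging. */
static void bus_mutex_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&bus_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Stand-in for an add_slave callback that takes the bus lock itself. */
static int add_slave_locking(void)
{
    int rc = pthread_mutex_lock(&bus_mutex);

    if (rc != 0)
        return rc; /* would be EDEADLK if the caller still held the lock */
    attached++;
    pthread_mutex_unlock(&bus_mutex);
    return 0;
}

/* Search that releases the lock before each callback and retakes it afterwards. */
static int search_releasing_lock(int ndevices)
{
    int i, rc = 0;

    pthread_mutex_lock(&bus_mutex);
    for (i = 0; i < ndevices && rc == 0; i++) {
        /* Bus traffic to find the next device would happen here, under the lock. */
        pthread_mutex_unlock(&bus_mutex);
        rc = add_slave_locking();
        pthread_mutex_lock(&bus_mutex);
    }
    pthread_mutex_unlock(&bus_mutex);
    return rc;
}
```

With this ordering the callback keeps its own lock/unlock pair, which matches the point that the slave driver's locking is not the part that needs to change.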


Tom36524 commented Mar 14, 2020 via email
