Commit 0650bf5

vladimiroltean authored and davem330 committed
net: dsa: be compatible with masters which unregister on shutdown
Lino reports that on his system with bcmgenet as DSA master and KSZ9897 as a switch, rebooting or shutting down never works properly.

What does the bcmgenet driver have special to trigger this, that other DSA masters do not? It has an implementation of ->shutdown which simply calls its ->remove implementation. Otherwise said, it unregisters its network interface on shutdown.

This message can be seen in a loop, and it hangs the reboot process there:

unregister_netdevice: waiting for eth0 to become free. Usage count = 3

So why 3?

A usage count of 1 is normal for a registered network interface, and any virtual interface which links itself as an upper of that will increment it via dev_hold. In the case of DSA, this is the call path:

dsa_slave_create
-> netdev_upper_dev_link
   -> __netdev_upper_dev_link
      -> __netdev_adjacent_dev_insert
         -> dev_hold

So a DSA switch with 3 interfaces will result in a usage count elevated by two, and netdev_wait_allrefs will wait until they have gone away.

Other stacked interfaces, like VLAN, watch NETDEV_UNREGISTER events and delete themselves, but DSA cannot just vanish and go poof, at most it can unbind itself from the switch devices, but that must happen strictly earlier compared to when the DSA master unregisters its net_device, so reacting on the NETDEV_UNREGISTER event is way too late.

It seems that it is a pretty established pattern to have a driver's ->shutdown hook redirect to its ->remove hook, so the same code is executed regardless of whether the driver is unbound from the device, or the system is just shutting down. As Florian puts it, it is quite a big hammer for bcmgenet to unregister its net_device during shutdown, but having a common code path with the driver unbind helps ensure it is well tested.
So DSA, for better or for worse, has to live with that and engage in an arms race of implementing the ->shutdown hook too, from all individual drivers, and do something sane when paired with masters that unregister their net_device there. The only sane thing to do, of course, is to unlink from the master.

However, complications arise really quickly.

The pattern of redirecting ->shutdown to ->remove is not unique to bcmgenet or even to net_device drivers. In fact, SPI controllers do it too (see dspi_shutdown -> dspi_remove), and presumably, I2C controllers and MDIO controllers do it too (this is something I have not researched too deeply, but even if this is not the case today, it is certainly plausible to happen in the future, and must be taken into consideration).

Since DSA switches might be SPI devices, I2C devices, MDIO devices, the insane implication is that for the exact same DSA switch device, we might have both ->shutdown and ->remove getting called.

So we need to do something with that insane environment. The pattern I've come up with is "if this, then not that", so if either ->shutdown or ->remove gets called, we set the device's drvdata to NULL, and in the other hook, we check whether the drvdata is NULL and just do nothing. This is probably not necessary for platform devices, just for devices on buses, but I would really insist for consistency among drivers, because when code is copy-pasted, it is not always copy-pasted from the best sources.

So depending on whether the DSA switch's ->remove or ->shutdown will get called first, we cannot really guarantee even for the same driver if rebooting will result in the same code path on all platforms. But nonetheless, we need to do something minimally reasonable on ->shutdown too to fix the bug. Of course, the ->remove will do more (a full teardown of the tree, with all data structures freed, and this is why the bug was not caught for so long).
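The "if this, then not that" pattern can be tried out in userspace. The sketch below is a minimal mock, not kernel code: `struct mock_device`, `struct mock_switch`, `mock_remove` and `mock_shutdown` are hypothetical names invented for illustration, and the counters only stand in for the real teardown work (`dsa_unregister_switch` and `dsa_switch_shutdown` in the drivers touched by this commit).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct mock_switch {
	int remove_calls;	/* counts full teardowns performed by ->remove */
	int shutdown_calls;	/* counts light shutdowns performed by ->shutdown */
};

struct mock_device {
	void *drvdata;		/* plays the role of the device's drvdata pointer */
};

/* Models a driver's ->remove hook. */
static void mock_remove(struct mock_device *dev)
{
	struct mock_switch *sw;

	assert(dev != NULL);
	sw = dev->drvdata;
	if (!sw)		/* the other hook already ran: do nothing */
		return;

	sw->remove_calls++;	/* stands in for the full teardown */
	dev->drvdata = NULL;	/* tell the other hook to do nothing */
}

/* Models a driver's ->shutdown hook. */
static void mock_shutdown(struct mock_device *dev)
{
	struct mock_switch *sw;

	assert(dev != NULL);
	sw = dev->drvdata;
	if (!sw)		/* the other hook already ran: do nothing */
		return;

	sw->shutdown_calls++;	/* stands in for the minimal shutdown work */
	dev->drvdata = NULL;	/* tell the other hook to do nothing */
}
```

Calling `mock_shutdown` and then `mock_remove` on the same mock device (or the reverse) leaves exactly one of the two counters incremented, which is the property the per-driver changes in this commit rely on.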
The new ->shutdown method is kept separate from dsa_unregister_switch not because we couldn't have unregistered the switch, but simply in the interest of doing something quick and to the point.

The big question is: does the DSA switch's ->shutdown get called earlier than the DSA master's ->shutdown? If not, there is still a risk that we might still trigger the WARN_ON in unregister_netdevice that says we are attempting to unregister a net_device which has uppers. That's no good. Although the reference to the master net_device won't physically go away even if DSA's ->shutdown comes afterwards, remember we have a dev_hold on it.

The answer to that question lies in this comment above device_link_add:

 * A side effect of the link creation is re-ordering of dpm_list and the
 * devices_kset list by moving the consumer device and all devices depending
 * on it to the ends of these lists (that does not happen to devices that have
 * not been registered when this function is called).

so the fact that DSA uses device_link_add towards its master is not exactly for nothing. device_shutdown() walks devices_kset from the back, so this is our guarantee that DSA's shutdown happens before the master's shutdown.

Fixes: 2f1e8ea ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings")
Link: https://lore.kernel.org/netdev/[email protected]/
Reported-by: Lino Sanfilippo <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Tested-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
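The ordering argument in the commit message (a device link moves the consumer to the end of devices_kset, and device_shutdown() walks that list from the back) can be mimicked with a small userspace model. This is a rough sketch under simplifying assumptions: `struct mock_kset`, `mock_register`, `mock_link_add` and `mock_shutdown_order` are hypothetical names, the list is a plain array, and only the consumer itself is moved, whereas the quoted kernel comment says devices depending on the consumer are moved as well.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEVS 8

/* Hypothetical model of an ordered device list; not the kernel's kset. */
struct mock_kset {
	const char *devs[MAX_DEVS];
	int count;
};

/* Append a device in registration order. */
static void mock_register(struct mock_kset *k, const char *name)
{
	assert(k->count < MAX_DEVS);
	k->devs[k->count++] = name;
}

/* Model the re-ordering side effect: move the consumer to the end of the list. */
static void mock_link_add(struct mock_kset *k, const char *consumer)
{
	int i, j;

	for (i = 0; i < k->count; i++) {
		if (strcmp(k->devs[i], consumer) != 0)
			continue;
		for (j = i; j < k->count - 1; j++)
			k->devs[j] = k->devs[j + 1];
		k->devs[k->count - 1] = consumer;
		return;
	}
}

/* Walk the list from the back and record the order in which devices shut down. */
static int mock_shutdown_order(const struct mock_kset *k, const char **out)
{
	int i, n = 0;

	for (i = k->count - 1; i >= 0; i--)
		out[n++] = k->devs[i];
	return n;
}
```

In this model, if a switch is registered before its master, the master would be shut down first; after `mock_link_add` names the switch as the consumer, the switch comes first in the recorded shutdown order.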
1 parent cf95799 commit 0650bf5

30 files changed, 457 additions and 24 deletions

drivers/net/dsa/b53/b53_mdio.c

Lines changed: 19 additions & 2 deletions
@@ -351,9 +351,25 @@ static int b53_mdio_probe(struct mdio_device *mdiodev)
 static void b53_mdio_remove(struct mdio_device *mdiodev)
 {
 	struct b53_device *dev = dev_get_drvdata(&mdiodev->dev);
-	struct dsa_switch *ds = dev->ds;

-	dsa_unregister_switch(ds);
+	if (!dev)
+		return;
+
+	b53_switch_remove(dev);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void b53_mdio_shutdown(struct mdio_device *mdiodev)
+{
+	struct b53_device *dev = dev_get_drvdata(&mdiodev->dev);
+
+	if (!dev)
+		return;
+
+	b53_switch_shutdown(dev);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }

 static const struct of_device_id b53_of_match[] = {
@@ -373,6 +389,7 @@ MODULE_DEVICE_TABLE(of, b53_of_match);
 static struct mdio_driver b53_mdio_driver = {
 	.probe = b53_mdio_probe,
 	.remove = b53_mdio_remove,
+	.shutdown = b53_mdio_shutdown,
 	.mdiodrv.driver = {
 		.name = "bcm53xx",
 		.of_match_table = b53_of_match,

drivers/net/dsa/b53/b53_mmap.c

Lines changed: 13 additions & 0 deletions
@@ -316,9 +316,21 @@ static int b53_mmap_remove(struct platform_device *pdev)
 	if (dev)
 		b53_switch_remove(dev);

+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }

+static void b53_mmap_shutdown(struct platform_device *pdev)
+{
+	struct b53_device *dev = platform_get_drvdata(pdev);
+
+	if (dev)
+		b53_switch_shutdown(dev);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static const struct of_device_id b53_mmap_of_table[] = {
 	{ .compatible = "brcm,bcm3384-switch" },
 	{ .compatible = "brcm,bcm6328-switch" },
@@ -331,6 +343,7 @@ MODULE_DEVICE_TABLE(of, b53_mmap_of_table);
 static struct platform_driver b53_mmap_driver = {
 	.probe = b53_mmap_probe,
 	.remove = b53_mmap_remove,
+	.shutdown = b53_mmap_shutdown,
 	.driver = {
 		.name = "b53-switch",
 		.of_match_table = b53_mmap_of_table,

drivers/net/dsa/b53/b53_priv.h

Lines changed: 5 additions & 0 deletions
@@ -228,6 +228,11 @@ static inline void b53_switch_remove(struct b53_device *dev)
 	dsa_unregister_switch(dev->ds);
 }

+static inline void b53_switch_shutdown(struct b53_device *dev)
+{
+	dsa_switch_shutdown(dev->ds);
+}
+
 #define b53_build_op(type_op_size, val_type) \
 static inline int b53_##type_op_size(struct b53_device *dev, u8 page, \
 				     u8 reg, val_type val) \

drivers/net/dsa/b53/b53_spi.c

Lines changed: 13 additions & 0 deletions
@@ -321,9 +321,21 @@ static int b53_spi_remove(struct spi_device *spi)
 	if (dev)
 		b53_switch_remove(dev);

+	spi_set_drvdata(spi, NULL);
+
 	return 0;
 }

+static void b53_spi_shutdown(struct spi_device *spi)
+{
+	struct b53_device *dev = spi_get_drvdata(spi);
+
+	if (dev)
+		b53_switch_shutdown(dev);
+
+	spi_set_drvdata(spi, NULL);
+}
+
 static const struct of_device_id b53_spi_of_match[] = {
 	{ .compatible = "brcm,bcm5325" },
 	{ .compatible = "brcm,bcm5365" },
@@ -344,6 +356,7 @@ static struct spi_driver b53_spi_driver = {
 	},
 	.probe = b53_spi_probe,
 	.remove = b53_spi_remove,
+	.shutdown = b53_spi_shutdown,
 };

 module_spi_driver(b53_spi_driver);

drivers/net/dsa/b53/b53_srab.c

Lines changed: 19 additions & 2 deletions
@@ -629,17 +629,34 @@ static int b53_srab_probe(struct platform_device *pdev)
 static int b53_srab_remove(struct platform_device *pdev)
 {
 	struct b53_device *dev = platform_get_drvdata(pdev);
-	struct b53_srab_priv *priv = dev->priv;

-	b53_srab_intr_set(priv, false);
+	if (!dev)
+		return 0;
+
+	b53_srab_intr_set(dev->priv, false);
 	b53_switch_remove(dev);

+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }

+static void b53_srab_shutdown(struct platform_device *pdev)
+{
+	struct b53_device *dev = platform_get_drvdata(pdev);
+
+	if (!dev)
+		return;
+
+	b53_switch_shutdown(dev);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static struct platform_driver b53_srab_driver = {
 	.probe = b53_srab_probe,
 	.remove = b53_srab_remove,
+	.shutdown = b53_srab_shutdown,
 	.driver = {
 		.name = "b53-srab-switch",
 		.of_match_table = b53_srab_of_match,

drivers/net/dsa/bcm_sf2.c

Lines changed: 12 additions & 0 deletions
@@ -1512,6 +1512,9 @@ static int bcm_sf2_sw_remove(struct platform_device *pdev)
 {
 	struct bcm_sf2_priv *priv = platform_get_drvdata(pdev);

+	if (!priv)
+		return 0;
+
 	priv->wol_ports_mask = 0;
 	/* Disable interrupts */
 	bcm_sf2_intr_disable(priv);
@@ -1523,13 +1526,18 @@ static int bcm_sf2_sw_remove(struct platform_device *pdev)
 	if (priv->type == BCM7278_DEVICE_ID)
 		reset_control_assert(priv->rcdev);

+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }

 static void bcm_sf2_sw_shutdown(struct platform_device *pdev)
 {
 	struct bcm_sf2_priv *priv = platform_get_drvdata(pdev);

+	if (!priv)
+		return;
+
 	/* For a kernel about to be kexec'd we want to keep the GPHY on for a
 	 * successful MDIO bus scan to occur. If we did turn off the GPHY
 	 * before (e.g: port_disable), this will also power it back on.
@@ -1538,6 +1546,10 @@ static void bcm_sf2_sw_shutdown(struct platform_device *pdev)
 	 */
 	if (priv->hw_params.num_gphy == 1)
 		bcm_sf2_gphy_enable_set(priv->dev->ds, true);
+
+	dsa_switch_shutdown(priv->dev->ds);
+
+	platform_set_drvdata(pdev, NULL);
 }

 #ifdef CONFIG_PM_SLEEP

drivers/net/dsa/dsa_loop.c

Lines changed: 21 additions & 1 deletion
@@ -340,10 +340,29 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
 static void dsa_loop_drv_remove(struct mdio_device *mdiodev)
 {
 	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
-	struct dsa_loop_priv *ps = ds->priv;
+	struct dsa_loop_priv *ps;
+
+	if (!ds)
+		return;
+
+	ps = ds->priv;

 	dsa_unregister_switch(ds);
 	dev_put(ps->netdev);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void dsa_loop_drv_shutdown(struct mdio_device *mdiodev)
+{
+	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
+
+	if (!ds)
+		return;
+
+	dsa_switch_shutdown(ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }

 static struct mdio_driver dsa_loop_drv = {
@@ -352,6 +371,7 @@ static struct mdio_driver dsa_loop_drv = {
 	},
 	.probe = dsa_loop_drv_probe,
 	.remove = dsa_loop_drv_remove,
+	.shutdown = dsa_loop_drv_shutdown,
 };

 #define NUM_FIXED_PHYS (DSA_LOOP_NUM_PORTS - 2)

drivers/net/dsa/lan9303-core.c

Lines changed: 6 additions & 0 deletions
@@ -1379,6 +1379,12 @@ int lan9303_remove(struct lan9303 *chip)
 }
 EXPORT_SYMBOL(lan9303_remove);

+void lan9303_shutdown(struct lan9303 *chip)
+{
+	dsa_switch_shutdown(chip->ds);
+}
+EXPORT_SYMBOL(lan9303_shutdown);
+
 MODULE_AUTHOR("Juergen Borleis <[email protected]>");
 MODULE_DESCRIPTION("Core driver for SMSC/Microchip LAN9303 three port ethernet switch");
 MODULE_LICENSE("GPL v2");

drivers/net/dsa/lan9303.h

Lines changed: 1 addition & 0 deletions
@@ -10,3 +10,4 @@ extern const struct lan9303_phy_ops lan9303_indirect_phy_ops;

 int lan9303_probe(struct lan9303 *chip, struct device_node *np);
 int lan9303_remove(struct lan9303 *chip);
+void lan9303_shutdown(struct lan9303 *chip);

drivers/net/dsa/lan9303_i2c.c

Lines changed: 20 additions & 4 deletions
@@ -67,13 +67,28 @@ static int lan9303_i2c_probe(struct i2c_client *client,

 static int lan9303_i2c_remove(struct i2c_client *client)
 {
-	struct lan9303_i2c *sw_dev;
+	struct lan9303_i2c *sw_dev = i2c_get_clientdata(client);

-	sw_dev = i2c_get_clientdata(client);
 	if (!sw_dev)
-		return -ENODEV;
+		return 0;
+
+	lan9303_remove(&sw_dev->chip);
+
+	i2c_set_clientdata(client, NULL);
+
+	return 0;
+}
+
+static void lan9303_i2c_shutdown(struct i2c_client *client)
+{
+	struct lan9303_i2c *sw_dev = i2c_get_clientdata(client);
+
+	if (!sw_dev)
+		return;
+
+	lan9303_shutdown(&sw_dev->chip);

-	return lan9303_remove(&sw_dev->chip);
+	i2c_set_clientdata(client, NULL);
 }

 /*-------------------------------------------------------------------------*/
@@ -97,6 +112,7 @@ static struct i2c_driver lan9303_i2c_driver = {
 	},
 	.probe = lan9303_i2c_probe,
 	.remove = lan9303_i2c_remove,
+	.shutdown = lan9303_i2c_shutdown,
 	.id_table = lan9303_i2c_id,
 };
 module_i2c_driver(lan9303_i2c_driver);

drivers/net/dsa/lan9303_mdio.c

Lines changed: 15 additions & 0 deletions
@@ -138,6 +138,20 @@ static void lan9303_mdio_remove(struct mdio_device *mdiodev)
 		return;

 	lan9303_remove(&sw_dev->chip);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void lan9303_mdio_shutdown(struct mdio_device *mdiodev)
+{
+	struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev);
+
+	if (!sw_dev)
+		return;
+
+	lan9303_shutdown(&sw_dev->chip);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }

 /*-------------------------------------------------------------------------*/
@@ -155,6 +169,7 @@ static struct mdio_driver lan9303_mdio_driver = {
 	},
 	.probe = lan9303_mdio_probe,
 	.remove = lan9303_mdio_remove,
+	.shutdown = lan9303_mdio_shutdown,
 };
 mdio_module_driver(lan9303_mdio_driver);

drivers/net/dsa/lantiq_gswip.c

Lines changed: 18 additions & 0 deletions
@@ -2184,6 +2184,9 @@ static int gswip_remove(struct platform_device *pdev)
 	struct gswip_priv *priv = platform_get_drvdata(pdev);
 	int i;

+	if (!priv)
+		return 0;
+
 	/* disable the switch */
 	gswip_mdio_mask(priv, GSWIP_MDIO_GLOB_ENABLE, 0, GSWIP_MDIO_GLOB);

@@ -2197,9 +2200,23 @@ static int gswip_remove(struct platform_device *pdev)
 	for (i = 0; i < priv->num_gphy_fw; i++)
 		gswip_gphy_fw_remove(priv, &priv->gphy_fw[i]);

+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }

+static void gswip_shutdown(struct platform_device *pdev)
+{
+	struct gswip_priv *priv = platform_get_drvdata(pdev);
+
+	if (!priv)
+		return;
+
+	dsa_switch_shutdown(priv->ds);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static const struct gswip_hw_info gswip_xrx200 = {
 	.max_ports = 7,
 	.cpu_port = 6,
@@ -2223,6 +2240,7 @@ MODULE_DEVICE_TABLE(of, gswip_of_match);
 static struct platform_driver gswip_driver = {
 	.probe = gswip_probe,
 	.remove = gswip_remove,
+	.shutdown = gswip_shutdown,
 	.driver = {
 		.name = "gswip",
 		.of_match_table = gswip_of_match,

drivers/net/dsa/microchip/ksz8795_spi.c

Lines changed: 10 additions & 1 deletion
@@ -94,15 +94,24 @@ static int ksz8795_spi_remove(struct spi_device *spi)
 	if (dev)
 		ksz_switch_remove(dev);

+	spi_set_drvdata(spi, NULL);
+
 	return 0;
 }

 static void ksz8795_spi_shutdown(struct spi_device *spi)
 {
 	struct ksz_device *dev = spi_get_drvdata(spi);

-	if (dev && dev->dev_ops->shutdown)
+	if (!dev)
+		return;
+
+	if (dev->dev_ops->shutdown)
 		dev->dev_ops->shutdown(dev);
+
+	dsa_switch_shutdown(dev->ds);
+
+	spi_set_drvdata(spi, NULL);
 }

 static const struct of_device_id ksz8795_dt_ids[] = {
