Skip to content

USB can not be built as a module #203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
yann-morin-1998 opened this issue Jan 27, 2013 · 3 comments
Closed

USB can not be built as a module #203

yann-morin-1998 opened this issue Jan 27, 2013 · 3 comments

Comments

@yann-morin-1998
Copy link
Contributor

When USB is configured as a module, the build fails with:

  LD      init/built-in.o
arch/arm/mach-bcm2708/built-in.o: In function `bcm2708_init':
:(.init.text+0x300): undefined reference to `fiq_fix_enable'

That's because 'fiq_fix_enable' is declared in drivers/usb/host/dwc_otg/dwc_otg_driver.c and the symbol is not available if dwc_otg is compiled as a module, so it can not be used from arch/arm/mach-bcm2708/bcm2708.c in this case.

@licaon-kter
Copy link

see this: #112

@popcornmix
Copy link
Collaborator

Duplicate.

@yann-morin-1998
Copy link
Contributor Author

Indeed, I missed it before filling. I'll go comment there.
Thank you. :-)

popcornmix pushed a commit that referenced this issue Nov 22, 2014
…fuel gauge unbind

commit bdbe814 upstream.

The charger manager obtained reference to fuel gauge power supply in probe
with power_supply_get_by_name() for later usage. However if fuel gauge
driver was removed and re-added then this reference would point to old
power supply (from driver which was removed).

This lead to accessing old (and probably invalid) memory which could be
observed with:
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/unbind
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/bind
$ cat /sys/devices/virtual/power_supply/battery/capacity
[  240.480084] INFO: task cat:1393 blocked for more than 120 seconds.
[  240.484799]       Not tainted 3.17.0-next-20141007-00028-ge60b6dd79570 #203
[  240.491782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.499589] cat             D c0469530     0  1393      1 0x00000000
[  240.505947] [<c0469530>] (__schedule) from [<c0469d3c>] (schedule_preempt_disabled+0x14/0x20)
[  240.514449] [<c0469d3c>] (schedule_preempt_disabled) from [<c046af08>] (mutex_lock_nested+0x1bc/0x458)
[  240.523736] [<c046af08>] (mutex_lock_nested) from [<c0287a98>] (regmap_read+0x30/0x60)
[  240.531647] [<c0287a98>] (regmap_read) from [<c032238c>] (max17042_get_property+0x2e8/0x350)
[  240.540055] [<c032238c>] (max17042_get_property) from [<c03247d8>] (charger_get_property+0x264/0x348)
[  240.549252] [<c03247d8>] (charger_get_property) from [<c0320764>] (power_supply_show_property+0x48/0x1e0)
[  240.558808] [<c0320764>] (power_supply_show_property) from [<c027308c>] (dev_attr_show+0x1c/0x48)
[  240.567664] [<c027308c>] (dev_attr_show) from [<c0141fb0>] (sysfs_kf_seq_show+0x84/0x104)
[  240.575814] [<c0141fb0>] (sysfs_kf_seq_show) from [<c0140b18>] (kernfs_seq_show+0x24/0x28)
[  240.584061] [<c0140b18>] (kernfs_seq_show) from [<c0104574>] (seq_read+0x1b0/0x484)
[  240.591702] [<c0104574>] (seq_read) from [<c00e1e24>] (vfs_read+0x88/0x144)
[  240.598640] [<c00e1e24>] (vfs_read) from [<c00e1f20>] (SyS_read+0x40/0x8c)
[  240.605507] [<c00e1f20>] (SyS_read) from [<c000e760>] (ret_fast_syscall+0x0/0x48)
[  240.612952] 4 locks held by cat/1393:
[  240.616589]  #0:  (&p->lock){+.+.+.}, at: [<c01043f4>] seq_read+0x30/0x484
[  240.623414]  #1:  (&of->mutex){+.+.+.}, at: [<c01417dc>] kernfs_seq_start+0x1c/0x8c
[  240.631086]  #2:  (s_active#31){++++.+}, at: [<c01417e4>] kernfs_seq_start+0x24/0x8c
[  240.638777]  #3:  (&map->mutex){+.+...}, at: [<c0287a98>] regmap_read+0x30/0x60

The charger-manager should get reference to fuel gauge power supply on
each use of get_property callback. The thermal zone 'tzd' field of
power supply should not be used because of the same reason.

Additionally this change solves also the issue with nested
thermal_zone_get_temp() calls and related false lockdep positive for
deadlock for thermal zone's mutex [1]. When fuel gauge is used as source of
temperature then the charger manager forwards its get_temp calls to fuel
gauge thermal zone. So actually different mutexes are used (one for
charger manager thermal zone and second for fuel gauge thermal zone) but
for lockdep this is one class of mutex.

The recursion is removed by retrieving temperature through power
supply's get_property().

In case external thermal zone is used ('cm-thermal-zone' property is
present in DTS) the recursion does not exist. Charger manager simply
exports POWER_SUPPLY_PROP_TEMP_AMBIENT property (instead of
POWER_SUPPLY_PROP_TEMP) thus no thermal zone is created for this power
supply.

[1] https://lkml.org/lkml/2014/10/6/309

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: 3bb3dbb ("power_supply: Add initial Charger-Manager driver")
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
popcornmix pushed a commit that referenced this issue Dec 20, 2014
…fuel gauge unbind

commit bdbe814 upstream.

The charger manager obtained reference to fuel gauge power supply in probe
with power_supply_get_by_name() for later usage. However if fuel gauge
driver was removed and re-added then this reference would point to old
power supply (from driver which was removed).

This lead to accessing old (and probably invalid) memory which could be
observed with:
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/unbind
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/bind
$ cat /sys/devices/virtual/power_supply/battery/capacity
[  240.480084] INFO: task cat:1393 blocked for more than 120 seconds.
[  240.484799]       Not tainted 3.17.0-next-20141007-00028-ge60b6dd79570 #203
[  240.491782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.499589] cat             D c0469530     0  1393      1 0x00000000
[  240.505947] [<c0469530>] (__schedule) from [<c0469d3c>] (schedule_preempt_disabled+0x14/0x20)
[  240.514449] [<c0469d3c>] (schedule_preempt_disabled) from [<c046af08>] (mutex_lock_nested+0x1bc/0x458)
[  240.523736] [<c046af08>] (mutex_lock_nested) from [<c0287a98>] (regmap_read+0x30/0x60)
[  240.531647] [<c0287a98>] (regmap_read) from [<c032238c>] (max17042_get_property+0x2e8/0x350)
[  240.540055] [<c032238c>] (max17042_get_property) from [<c03247d8>] (charger_get_property+0x264/0x348)
[  240.549252] [<c03247d8>] (charger_get_property) from [<c0320764>] (power_supply_show_property+0x48/0x1e0)
[  240.558808] [<c0320764>] (power_supply_show_property) from [<c027308c>] (dev_attr_show+0x1c/0x48)
[  240.567664] [<c027308c>] (dev_attr_show) from [<c0141fb0>] (sysfs_kf_seq_show+0x84/0x104)
[  240.575814] [<c0141fb0>] (sysfs_kf_seq_show) from [<c0140b18>] (kernfs_seq_show+0x24/0x28)
[  240.584061] [<c0140b18>] (kernfs_seq_show) from [<c0104574>] (seq_read+0x1b0/0x484)
[  240.591702] [<c0104574>] (seq_read) from [<c00e1e24>] (vfs_read+0x88/0x144)
[  240.598640] [<c00e1e24>] (vfs_read) from [<c00e1f20>] (SyS_read+0x40/0x8c)
[  240.605507] [<c00e1f20>] (SyS_read) from [<c000e760>] (ret_fast_syscall+0x0/0x48)
[  240.612952] 4 locks held by cat/1393:
[  240.616589]  #0:  (&p->lock){+.+.+.}, at: [<c01043f4>] seq_read+0x30/0x484
[  240.623414]  #1:  (&of->mutex){+.+.+.}, at: [<c01417dc>] kernfs_seq_start+0x1c/0x8c
[  240.631086]  #2:  (s_active#31){++++.+}, at: [<c01417e4>] kernfs_seq_start+0x24/0x8c
[  240.638777]  #3:  (&map->mutex){+.+...}, at: [<c0287a98>] regmap_read+0x30/0x60

The charger-manager should get reference to fuel gauge power supply on
each use of get_property callback. The thermal zone 'tzd' field of
power supply should not be used because of the same reason.

Additionally this change solves also the issue with nested
thermal_zone_get_temp() calls and related false lockdep positive for
deadlock for thermal zone's mutex [1]. When fuel gauge is used as source of
temperature then the charger manager forwards its get_temp calls to fuel
gauge thermal zone. So actually different mutexes are used (one for
charger manager thermal zone and second for fuel gauge thermal zone) but
for lockdep this is one class of mutex.

The recursion is removed by retrieving temperature through power
supply's get_property().

In case external thermal zone is used ('cm-thermal-zone' property is
present in DTS) the recursion does not exist. Charger manager simply
exports POWER_SUPPLY_PROP_TEMP_AMBIENT property (instead of
POWER_SUPPLY_PROP_TEMP) thus no thermal zone is created for this power
supply.

[1] https://lkml.org/lkml/2014/10/6/309

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: 3bb3dbb ("power_supply: Add initial Charger-Manager driver")
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit to anholt/linux that referenced this issue Apr 8, 2017
When manually overwriting the HWS, rather than assume irq_seqno_barrier
does the right thing, we can explicitly flush the cacheline instead.
This avoids us calling the engine->irq_seqno_barrier() from an illegal
context:

[ 1472.651797] BUG: scheduling while atomic: migration/0/11/0x00000002
[ 1472.651807] Modules linked in: ctr ccm arc4 snd_hda_codec_hdmi bnep rfcomm iwldvm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_pcm dm_multipath snd_hwdep intel_powerclamp coretemp snd_seq_midi crct10dif_pclmul snd_seq_midi_event crc32_pclmul iwlwifi ghash_clmulni_intel btusb snd_rawmidi btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel cryptd glue_helper bluetooth snd_seq cfg80211 snd_timer snd_seq_device intel_ips binfmt_misc snd mei_me soundcore mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers e1000e drm ahci libahci
[ 1472.651897] CPU: 0 PID: 11 Comm: migration/0 Tainted: G     U          4.11.0-rc1+ raspberrypi#203
[ 1472.651899] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010
[ 1472.651900] Call Trace:
[ 1472.651913]  dump_stack+0x63/0x90
[ 1472.651922]  __schedule_bug+0x5d/0x6b
[ 1472.651930]  __schedule+0x46a/0x5f0
[ 1472.651934]  schedule+0x38/0x90
[ 1472.651938]  schedule_hrtimeout_range_clock+0x85/0x110
[ 1472.651945]  ? hrtimer_init+0x10/0x10
[ 1472.651949]  schedule_hrtimeout_range+0xe/0x10
[ 1472.651952]  usleep_range+0x4d/0x60
[ 1472.652037]  gen5_seqno_barrier+0x13/0x20 [i915]
[ 1472.652101]  intel_engine_init_global_seqno+0xd7/0x160 [i915]
[ 1472.652160]  __i915_gem_set_wedged_BKL+0xa0/0x180 [i915]
[ 1472.652166]  multi_cpu_stop+0xbb/0xe0
[ 1472.652170]  ? cpu_stop_queue_work+0x90/0x90
[ 1472.652174]  cpu_stopper_thread+0x82/0x110
[ 1472.652179]  smpboot_thread_fn+0x137/0x190
[ 1472.652184]  kthread+0xf7/0x130
[ 1472.652187]  ? sort_range+0x20/0x20
[ 1472.652191]  ? kthread_park+0x90/0x90
[ 1472.652195]  ret_from_fork+0x2c/0x40

Testcase: igt/gem_eio #ilk
Signed-off-by: Chris Wilson <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
popcornmix pushed a commit that referenced this issue Jul 9, 2018
[ Upstream commit f0f59a2 ]

Dasd uses completion_data from struct request to store per request
private data - this is problematic since this member is part of a
union which is also used by IO schedulers.
Let the block layer maintain space for per request data behind each
struct request.

Fixes crashes on block layer timeouts like this one:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:0000000001308007 R3:00000000fffc8007 S:00000000fffcc000 P:000000000000013d
Oops: 0004 ilc:2 [#1] PREEMPT SMP
Modules linked in: [...]
CPU: 0 PID: 1480 Comm: kworker/0:2H Not tainted 4.17.0-rc4-00046-gaa3bcd43b5af #203
Hardware name: IBM 3906 M02 702 (LPAR)
Workqueue: kblockd blk_mq_timeout_work
Krnl PSW : 0000000067ac406b 00000000b6960308 (do_raw_spin_trylock+0x30/0x78)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000c00 0000000000000000 0000000000000000 0000000000000001
           0000000000b9d3c8 0000000000000000 0000000000000001 00000000cf9639d8
           0000000000000000 0700000000000000 0000000000000000 000000000099f09e
           0000000000000000 000000000076e9d0 000000006247bb08 000000006247bae0
Krnl Code: 00000000001c159c: b90400c2           lgr     %r12,%r2
           00000000001c15a0: a7180000           lhi     %r1,0
          #00000000001c15a4: 583003a4           l       %r3,932
          >00000000001c15a8: ba132000           cs      %r1,%r3,0(%r2)
           00000000001c15ac: a7180001           lhi     %r1,1
           00000000001c15b0: a784000b           brc     8,1c15c6
           00000000001c15b4: c0e5004e72aa       brasl   %r14,b8fb08
           00000000001c15ba: 1812               lr      %r1,%r2
Call Trace:
([<0700000000000000>] 0x700000000000000)
 [<0000000000b9d3d2>] _raw_spin_lock_irqsave+0x7a/0xb8
 [<000000000099f09e>] dasd_times_out+0x46/0x278
 [<000000000076ea6e>] blk_mq_terminate_expired+0x9e/0x108
 [<000000000077497a>] bt_for_each+0x102/0x130
 [<0000000000774e54>] blk_mq_queue_tag_busy_iter+0x74/0xd8
 [<000000000076fea0>] blk_mq_timeout_work+0x260/0x320
 [<0000000000169dd4>] process_one_work+0x3bc/0x708
 [<000000000016a382>] worker_thread+0x262/0x408
 [<00000000001723a8>] kthread+0x160/0x178
 [<0000000000b9e73a>] kernel_thread_starter+0x6/0xc
 [<0000000000b9e734>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<0000000000b9d3cc>] _raw_spin_lock_irqsave+0x74/0xb8

Kernel panic - not syncing: Fatal exception: panic_on_oops

Signed-off-by: Sebastian Ott <[email protected]>
Reviewed-by: Stefan Haberland <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
popcornmix pushed a commit that referenced this issue Mar 22, 2021
…before setting skb ownership

commit e940e08 upstream.

There are two ref count variables controlling the free()ing of a socket:
- struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
- struct sock::sk_wmem_alloc - which accounts the memory allocated by
  the skbs in the send path.

In case there are still TX skbs on the fly and the socket() is closed,
the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack
clones an "echo" skb, calls sock_hold() on the original socket and
references it. This produces the following back trace:

| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134
| refcount_t: addition on 0; use-after-free.
| Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E)
| CPU: 0 PID: 280 Comm: test_can.sh Tainted: G            E     5.11.0-04577-gf8ff6603c617 #203
| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
| Backtrace:
| [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7:00000000 r6:600f0113 r5:00000000 r4:81441220
| [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8)
| [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90
| [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2
| [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540
| [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50)
| [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c)
| [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600
| [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00
| [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600
| [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400
| [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)

To fix this problem, only set skb ownership to sockets which have still
a ref count > 0.

Fixes: 0ae89be ("can: add destructor for self generated skbs")
Cc: Oliver Hartkopp <[email protected]>
Cc: Andre Naujoks <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Oliver Hartkopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
popcornmix pushed a commit that referenced this issue Mar 22, 2021
…before setting skb ownership

commit e940e08 upstream.

There are two ref count variables controlling the free()ing of a socket:
- struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
- struct sock::sk_wmem_alloc - which accounts the memory allocated by
  the skbs in the send path.

In case there are still TX skbs on the fly and the socket() is closed,
the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack
clones an "echo" skb, calls sock_hold() on the original socket and
references it. This produces the following back trace:

| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134
| refcount_t: addition on 0; use-after-free.
| Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E)
| CPU: 0 PID: 280 Comm: test_can.sh Tainted: G            E     5.11.0-04577-gf8ff6603c617 #203
| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
| Backtrace:
| [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7:00000000 r6:600f0113 r5:00000000 r4:81441220
| [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8)
| [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90
| [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2
| [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540
| [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50)
| [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c)
| [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600
| [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00
| [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600
| [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400
| [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)

To fix this problem, only set skb ownership to sockets which have still
a ref count > 0.

Fixes: 0ae89be ("can: add destructor for self generated skbs")
Cc: Oliver Hartkopp <[email protected]>
Cc: Andre Naujoks <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Oliver Hartkopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
alsutton pushed a commit to alsutton/rpi-linux that referenced this issue Apr 20, 2021
…before setting skb ownership

commit e940e08 upstream.

There are two ref count variables controlling the free()ing of a socket:
- struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
- struct sock::sk_wmem_alloc - which accounts the memory allocated by
  the skbs in the send path.

In case there are still TX skbs on the fly and the socket() is closed,
the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack
clones an "echo" skb, calls sock_hold() on the original socket and
references it. This produces the following back trace:

| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134
| refcount_t: addition on 0; use-after-free.
| Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E)
| CPU: 0 PID: 280 Comm: test_can.sh Tainted: G            E     5.11.0-04577-gf8ff6603c617 raspberrypi#203
| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
| Backtrace:
| [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7:00000000 r6:600f0113 r5:00000000 r4:81441220
| [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8)
| [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90
| [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2
| [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540
| [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50)
| [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c)
| [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600
| [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00
| [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600
| [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400
| [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)

To fix this problem, only set skb ownership to sockets which have still
a ref count > 0.

Fixes: 0ae89be ("can: add destructor for self generated skbs")
Cc: Oliver Hartkopp <[email protected]>
Cc: Andre Naujoks <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Oliver Hartkopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants