Skip to content

CPU stepping not working #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
coveritytest opened this issue Nov 4, 2015 · 0 comments
Open

CPU stepping not working #1

coveritytest opened this issue Nov 4, 2015 · 0 comments

Comments

@coveritytest
Copy link

Unlike in branch odroidxu3-3.10.y the CPU stepping is not working. This means, that the thermal_core shuts down the machine, instead of throttling the frequencies to lower values. In branch odroidxu3-3.10.y this is working correctly.

mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 6, 2016
In commit 3d83811 ("iio: pressure: bmp280: add power management")

For some reason the code in the runtime suspend/resume hooks
got wrong (I suspect in the ambition to cut down boilerplate)
and it seems it was tested without CONFIG_PM and crashes like
so for me:

Unable to handle kernel NULL pointer dereference at virtual address 0000000c
pgd = c0204000
[0000000c] *pgd=00000000
Internal error: Oops: 5 [tobetter#1] PREEMPT SMP ARM
Modules linked in:
CPU: 1 PID: 89 Comm: kworker/1:2 Not tainted
  4.7.0-03348-g90dc3680458a-dirty torvalds#99
Hardware name: Generic DT based system
Workqueue: pm pm_runtime_work
task: df3c6300 ti: dec8a000 task.ti: dec8a000
PC is at regulator_disable+0x0/0x6c
LR is at bmp280_runtime_suspend+0x3c/0xa4

Dereferencing the BMP280 state container properly fixes the problem,
sorry for screwing up.

Fixes: 3d83811 ("iio: pressure: bmp280: add power management")
Signed-off-by: Linus Walleij <[email protected]>
Tested-by: Jarkko Nikula <[email protected]>
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 6, 2016
This patch leverages 'struct pci_host_bridge' from the PCI subsystem
in order to free the pci_controller only after the last reference to
its devices is dropped (avoiding an oops in pcibios_release_device()
if the last reference is dropped after pcibios_free_controller()).

The patch relies on pci_host_bridge.release_fn() (and .release_data),
which is called automatically by the PCI subsystem when the root bus
is released (i.e., the last reference is dropped).  Those fields are
set via pci_set_host_bridge_release() (e.g. in the platform-specific
implementation of pcibios_root_bridge_prepare()).

It introduces the 'pcibios_free_controller_deferred()' .release_fn()
and it expects .release_data to hold a pointer to the pci_controller.

The function implictly calls 'pcibios_free_controller()', so an user
must *NOT* explicitly call it if using the new _deferred() callback.

The functionality is enabled for pseries (although it isn't platform
specific, and may be used by cxl).

Details on not-so-elegant design choices:

 - Use 'pci_host_bridge.release_data' field as pointer to associated
   'struct pci_controller' so *not* to 'pci_bus_to_host(bridge->bus)'
   in pcibios_free_controller_deferred().

   That's because pci_remove_root_bus() sets 'host_bridge->bus = NULL'
   (so, if the last reference is released after pci_remove_root_bus()
   runs, which eventually reaches pcibios_free_controller_deferred(),
   that would hit a null pointer dereference).

   The cxl/vphb.c code calls pci_remove_root_bus(), and the cxl folks
   are interested in this fix.

Test-case tobetter#1 (hold references)

  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.0
  <...> /sys/block/sdaa -> ../devices/pci0021:01/0021:01:00.0/<...>

  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.1
  <...> /sys/block/sdab -> ../devices/pci0021:01/0021:01:00.1/<...>

  # cat >/dev/sdaa & pid1=$!
  # cat >/dev/sdab & pid2=$!

  # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
  Validating PHB DLPAR capability...yes.
  [  594.306719] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
  [  594.306738] pci_hp_remove_devices:    Removing 0021:01:00.0...
  ...
  [  598.236381] pci_hp_remove_devices:    Removing 0021:01:00.1...
  ...
  [  611.972077] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  611.972140] rpadlpar_io: slot PHB 33 removed

  # kill -9 $pid1
  # kill -9 $pid2
  [  632.918088] pcibios_free_controller_deferred: domain 33, dynamic 1

Test-case tobetter#2 (don't hold references)

  # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
  Validating PHB DLPAR capability...yes.
  [  916.357363] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
  [  916.357386] pci_hp_remove_devices:    Removing 0021:01:00.0...
  ...
  [  920.566527] pci_hp_remove_devices:    Removing 0021:01:00.1...
  ...
  [  933.955873] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  933.955977] pcibios_free_controller_deferred: domain 33, dynamic 1
  [  933.955999] rpadlpar_io: slot PHB 33 removed

Suggested-By: Gavin Shan <[email protected]>
Signed-off-by: Mauricio Faria de Oliveira <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Andrew Donnellan <[email protected]>
Tested-by: Andrew Donnellan <[email protected]> # cxl
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 6, 2016
We observed a kernel oops when running a PPC guest with config NR_CPUS=4
and qemu option "-smp cores=1,threads=8":

[   30.634781] Unable to handle kernel paging request for data at
address 0xc00000014192eb17
[   30.636173] Faulting instruction address: 0xc00000000003e5cc
[   30.637069] Oops: Kernel access of bad area, sig: 11 [tobetter#1]
[   30.637877] SMP NR_CPUS=4 NUMA pSeries
[   30.638471] Modules linked in:
[   30.638949] CPU: 3 PID: 27 Comm: migration/3 Not tainted
4.7.0-07963-g9714b26 tobetter#1
[   30.640059] task: c00000001e29c600 task.stack: c00000001e2a8000
[   30.640956] NIP: c00000000003e5cc LR: c00000000003e550 CTR:
0000000000000000
[   30.642001] REGS: c00000001e2ab8e0 TRAP: 0300   Not tainted
(4.7.0-07963-g9714b26)
[   30.643139] MSR: 8000000102803033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22004084  XER: 00000000
[   30.644583] CFAR: c000000000009e98 DAR: c00000014192eb17 DSISR: 40000000 SOFTE: 0
GPR00: c00000000140a6b8 c00000001e2abb60 c0000000016dd300 0000000000000003
GPR04: 0000000000000000 0000000000000004 c0000000016e5920 0000000000000008
GPR08: 0000000000000004 c00000014192eb17 0000000000000000 0000000000000020
GPR12: c00000000140a6c0 c00000000ffffc00 c0000000000d3ea8 c00000001e005680
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000001e6b3a00 0000000000000000 0000000000000001
GPR24: c00000001ff85138 c00000001ff85130 000000001eb6f000 0000000000000001
GPR28: 0000000000000000 c0000000017014e0 0000000000000000 0000000000000018
[   30.653882] NIP [c00000000003e5cc] __cpu_disable+0xcc/0x190
[   30.654713] LR [c00000000003e550] __cpu_disable+0x50/0x190
[   30.655528] Call Trace:
[   30.655893] [c00000001e2abb60] [c00000000003e550] __cpu_disable+0x50/0x190 (unreliable)
[   30.657280] [c00000001e2abbb0] [c0000000000aca0c] take_cpu_down+0x5c/0x100
[   30.658365] [c00000001e2abc10] [c000000000163918] multi_cpu_stop+0x1a8/0x1e0
[   30.659617] [c00000001e2abc60] [c000000000163cc0] cpu_stopper_thread+0xf0/0x1d0
[   30.660737] [c00000001e2abd20] [c0000000000d8d70] smpboot_thread_fn+0x290/0x2a0
[   30.661879] [c00000001e2abd80] [c0000000000d3fa8] kthread+0x108/0x130
[   30.662876] [c00000001e2abe30] [c000000000009968] ret_from_kernel_thread+0x5c/0x74
[   30.664017] Instruction dump:
[   30.664477] 7bde1f24 38a00000 787f1f24 3b600001 39890008 7d204b78 7d05e214 7d0b07b4
[   30.665642] 796b1f24 7d26582a 7d204a14 7d29f214 <7d4048a8> 7d4a3878 7d4049ad 40c2fff4
[   30.666854] ---[ end trace 32643b7195717741 ]---

The reason of this is that in __cpu_disable(), when we try to set the
cpu_sibling_mask or cpu_core_mask of the sibling CPUs of the disabled
one, we don't check whether the current configuration employs those
sibling CPUs(hw threads). And if a CPU is not employed by a
configuration, the percpu structures cpu_{sibling,core}_mask are not
allocated, therefore accessing those cpumasks will result in problems as
above.

This patch fixes this problem by adding an addition check on whether the
id is no less than nr_cpu_ids in the sibling CPU iteration code.

Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 6, 2016
of_mm_gpiochip_add_data() calls mm_gc->save_regs() before
setting the data. Therefore ->save_regs() cannot use
gpiochip_get_data()

[    0.275940] Unable to handle kernel paging request for data at address 0x00000130
[    0.283120] Faulting instruction address: 0xc01b44cc
[    0.288175] Oops: Kernel access of bad area, sig: 11 [tobetter#1]
[    0.293343] PREEMPT CMPC885
[    0.296141] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-g65124df-dirty torvalds#68
[    0.304131] task: c6074000 ti: c6080000 task.ti: c6080000
[    0.309459] NIP: c01b44cc LR: c0011720 CTR: c0011708
[    0.314372] REGS: c6081d90 TRAP: 0300   Not tainted  (4.7.0-g65124df-dirty)
[    0.322267] MSR: 00009032 <EE,ME,IR,DR,RI>  CR: 24000028  XER: 20000000
[    0.328813] DAR: 00000130 DSISR: c0000000
GPR00: c01b6d0c c6081e40 c6074000 c6017000 c9028000 c601d028 c6081dd8 00000000
GPR08: c601d028 00000000 ffffffff 00000001 24000044 00000000 c0002790 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c05643b0 00000083
GPR24: c04a1a6c c0560000 c04a8308 c04c6480 c0012498 c6017000 c7ffcc78 c6017000
[    0.360806] NIP [c01b44cc] gpiochip_get_data+0x4/0xc
[    0.365684] LR [c0011720] cpm1_gpio16_save_regs+0x18/0x44
[    0.370972] Call Trace:
[    0.373451] [c6081e50] [c01b6d0c] of_mm_gpiochip_add_data+0x70/0xdc
[    0.379624] [c6081e70] [c00124c0] cpm_init_par_io+0x28/0x118
[    0.385238] [c6081e80] [c04a8ac0] do_one_initcall+0xb0/0x17c
[    0.390819] [c6081ef0] [c04a8cbc] kernel_init_freeable+0x130/0x1dc
[    0.396924] [c6081f30] [c00027a4] kernel_init+0x14/0x110
[    0.402177] [c6081f40] [c000b424] ret_from_kernel_thread+0x5c/0x64
[    0.408233] Instruction dump:
[    0.411168] 4182fafc 3f80c040 48234c6 3bc0fff0 3b9c5ed0 4bfffaf4 81290020 712a0004
[    0.418825] 4182fb34 48234c51 4bfffb2c 81230004 <80690130> 4e800020 7c0802a6 9421ffe0
[    0.426763] ---[ end trace fe4113ee21d72ffa ]---

fixes: e65078f ("powerpc: sysdev: cpm1: use gpiochip data pointer")
fixes: a14a2d4 ("powerpc: cpm_common: use gpiochip data pointer")
Cc: [email protected]
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 6, 2016
Userspace can begin and suspend a transaction within the signal
handler which means they might enter sys_rt_sigreturn() with the
processor in suspended state.

sys_rt_sigreturn() wants to restore process context (which may have
been in a transaction before signal delivery). To do this it must
restore TM SPRS. To achieve this, any transaction initiated within the
signal frame must be discarded in order to be able to restore TM SPRs
as TM SPRs can only be manipulated non-transactionally..
>From the PowerPC ISA:
  TM Bad Thing Exception [Category: Transactional Memory]
   An attempt is made to execute a mtspr targeting a TM register in
   other than Non-transactional state.

Not doing so results in a TM Bad Thing:
[12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
[12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
[12045.221540] Oops: Unrecoverable exception, sig: 6 [tobetter#1]
[12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
[12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
 nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
 ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm
 uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure
 scsi_transport_sas bnx2x ipr mdio libcrc32c
[12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 tobetter#34
[12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
[12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
[12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700   Not tainted (4.7.0)
[12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]> CR: 28444280  XER: 20000000
[12045.222625] CFAR: c0000000000163b8 SOFTE: 0 PACATMSCRATCH: 900000014280f033
GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000
GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000
GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0
[12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
[12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
[12045.223630] Call Trace:
[12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
[12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
[12045.223806] Instruction dump:
[12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
[12045.223955] 4e800020 e80304a8 7c0023a e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
[12045.224074] ---[ end trace cb8002ee240bae76 ]---

It isn't clear exactly if there is really a use case for userspace
returning with a suspended transaction, however, doing so doesn't (on
its own) constitute a bad frame. As such, this patch simply discards
the transactional state of the context calling the sigreturn and
continues.

Reported-by: Laurent Dufour <[email protected]>
Signed-off-by: Cyril Bur <[email protected]>
Tested-by: Laurent Dufour <[email protected]>
Reviewed-by: Laurent Dufour <[email protected]>
Acked-by: Simon Guo <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 6, 2016
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:

1) "copy_from_user() buffer size is too small" compile warning/error

   This is a static warning which happens when object size and copy size
   are both const, and copy size > object size.  I didn't see any false
   positives for this one.  So the function warning attribute seems to
   be working fine here.

   Note this scenario is always a bug and so I think it should be
   changed to *always* be an error, regardless of
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

2) "copy_from_user() buffer size is not provably correct" compile warning

   This is another static warning which happens when I enable
   __compiletime_object_size() for new compilers (and
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS).  It happens when object size
   is const, but copy size is *not*.  In this case there's no way to
   compare the two at build time, so it gives the warning.  (Note the
   warning is a byproduct of the fact that gcc has no way of knowing
   whether the overflow function will be called, so the call isn't dead
   code and the warning attribute is activated.)

   So this warning seems to only indicate "this is an unusual pattern,
   maybe you should check it out" rather than "this is a bug".

   I get 102(!) of these warnings with allyesconfig and the
   __compiletime_object_size() gcc check removed.  I don't know if there
   are any real bugs hiding in there, but from looking at a small
   sample, I didn't see any.  According to Kees, it does sometimes find
   real bugs.  But the false positive rate seems high.

3) "Buffer overflow detected" runtime warning

   This is a runtime warning where object size is const, and copy size >
   object size.

All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:

  2fb0815 ("gcc4: disable __compiletime_object_size for GCC 4.6+")

That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size().  But in fact,
__compiletime_object_size() seems to be working fine.  The false
positives were instead triggered by tobetter#2 above.  (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)

So remove warning tobetter#2 to get rid of all the false positives, and re-enable
warnings tobetter#1 and tobetter#3 by reverting the above commit.

Furthermore, since tobetter#1 is a real bug which is detected at compile time,
upgrade it to always be an error.

Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.

Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Byungchul Park <[email protected]>
Cc: Nilay Vaish <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
pxa_timer wants to be able to call clk_enable() etc on this clock,
but our clk_enable() implementation expects non-NULL enable/disable
operations.  Provide these dummy implementations.

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0204000
[00000000] *pgd=00000000
Internal error: Oops: 80000005 [tobetter#1] ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.8.0-rc2+ torvalds#887
Hardware name: Intel-Assabet
task: c0644590 task.stack: c0640000
PC is at 0x0
LR is at clk_enable+0x40/0x58
pc : [<00000000>]    lr : [<c021b178>]    psr: 600000d3
sp : c0641f60  ip : c0641f4c  fp : c0641f74
r10: c1ffc7a0  r9 : 6901b118  r8 : 00000001
r7 : c0639a34  r6 : 0000001b  r5 : a00000d3  r4 : c0645d70
r3 : c0645d78  r2 : 00000001  r1 : c0641ef0  r0 : c0645d70
Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
Control: c020717f  Table: c020717f  DAC: 00000053
Process swapper (pid: 0, stack limit = 0xc0640188)
Stack: (0xc0641f60 to 0xc0642000)
1f60: 00384000 c08762e4 c0641f98 c0641f7 c063308c c021b144 00000000 00000000
1f80: 00000000 c0660b20 ffffffff c0641fa8 c0641f9c c06220ec c0633058 c0641fb8
1fa0: c0641fac c061f114 c06220dc c0641ff4 c0641fbc c061bb68 c061f0fc ffffffff
1fc0: ffffffff 00000000 c061b6cc c0639a34 c0660cd4 c0642038 c0639a30 c0645434
1fe0: c0204000 c06380f8 00000000 c0641ff8 c0208048 c061b954 00000000 00000000
Backtrace:
[<c021b138>] (clk_enable) from [<c063308c>] (pxa_timer_nodt_init+0x40/0x120)
 r5:c08762e4 r4:00384000
[<c063304c>] (pxa_timer_nodt_init) from [<c06220ec>] (sa1100_timer_init+0x1c/0x20)
 r6:ffffffff r5:c0660b20 r4:00000000
[<c06220d0>] (sa1100_timer_init) from [<c061f114>] (time_init+0x24/0x2c)
[<c061f0f0>] (time_init) from [<c061bb68>] (start_kernel+0x220/0x42c)
[<c061b948>] (start_kernel) from [<c0208048>] (0xc0208048)
 r10:c06380f8 r8:c0204000 r7:c0645434 r6:c0639a30 r5:c0642038 r4:c0660cd4
Code: bad PC value
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!

Fixes: ee3a402 ("ARM: 8250/1: sa1100: provide OSTIMER0 clock for pxa_timer")
Acked-by: Dmitry Eremin-Solenikov <[email protected]>
Signed-off-by: Russell King <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
I got this:

    divide error: 0000 [tobetter#1] PREEMPT SMP KASAN
    CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ torvalds#189
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff8801120a9580 task.stack: ffff8801120b0000
    RIP: 0010:[<ffffffff82c8bd9a>]  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
    RSP: 0018:ffff88011aa87da8  EFLAGS: 00010006
    RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001
    RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048
    R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00
    R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000
    FS:  00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0
    Stack:
     0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76
     ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0
     00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0
    Call Trace:
     <IRQ>
     [<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00
     [<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130
     [<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0
     [<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0
     [<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0
     [<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0
     [<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0
     [<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0
     [<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0
     <EOI>
     [<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60
     [<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670
     [<ffffffff82c87015>] snd_timer_continue+0x45/0x80
     [<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830
     [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
     [<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0
     [<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
     [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
     [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
     [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
     [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
     [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
     [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
     [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
     [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00
    RIP  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
     RSP <ffff88011aa87da8>
    ---[ end trace 6aa380f756a21074 ]---

The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a
completely new/unused timer -- it will have ->sticks == 0, which causes a
divide by 0 in snd_hrtimer_callback().

Signed-off-by: Vegard Nossum <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
I hit this with syzkaller:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [tobetter#1] PREEMPT SMP KASAN
    CPU: 0 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ torvalds#190
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff88011278d600 task.stack: ffff8801120c0000
    RIP: 0010:[<ffffffff82c8ba07>]  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
    RSP: 0018:ffff8801120c7a60  EFLAGS: 00010006
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000007
    RDX: 0000000000000009 RSI: 1ffff10023483091 RDI: 0000000000000048
    RBP: ffff8801120c7a78 R08: ffff88011a5cf768 R09: ffff88011a5ba790
    R10: 0000000000000002 R11: ffffed00234b9ef1 R12: ffff880114843980
    R13: ffffffff84213c00 R14: ffff880114843ab0 R15: 0000000000000286
    FS:  00007f72958f3700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000603001 CR3: 00000001126ab000 CR4: 00000000000006f0
    Stack:
     ffff880114843980 ffff880111eb2dc0 ffff880114843a34 ffff8801120c7ad0
     ffffffff82c81ab1 0000000000000000 ffffffff842138e0 0000000100000000
     ffff880111eb2dd0 ffff880111eb2dc0 0000000000000001 ffff880111eb2dc0
    Call Trace:
     [<ffffffff82c81ab1>] snd_timer_start1+0x331/0x670
     [<ffffffff82c85bfd>] snd_timer_start+0x5d/0xa0
     [<ffffffff82c8795e>] snd_timer_user_ioctl+0x88e/0x2830
     [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
     [<ffffffff8132762f>] ? put_prev_entity+0x108f/0x21a0
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
     [<ffffffff813510af>] ? cpuacct_account_field+0x12f/0x1a0
     [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
     [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
     [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
     [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
     [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
     [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
     [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
     [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: c7 c7 c4 b9 c8 82 48 89 d9 4c 89 ee e8 63 88 7f fe e8 7e 46 7b fe 48 8d 7b 48 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 84 c0 7e 65 80 7b 48 00 74 0e e8 52 46
    RIP  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
     RSP <ffff8801120c7a60>
    ---[ end trace 5955b08db7f2b029 ]---

This can happen if snd_hrtimer_open() fails to allocate memory and
returns an error, which is currently not checked by snd_timer_open():

    ioctl(SNDRV_TIMER_IOCTL_SELECT)
     - snd_timer_user_tselect()
	- snd_timer_close()
	   - snd_hrtimer_close()
	      - (struct snd_timer *) t->private_data = NULL
        - snd_timer_open()
           - snd_hrtimer_open()
              - kzalloc() fails; t->private_data is still NULL

    ioctl(SNDRV_TIMER_IOCTL_START)
     - snd_timer_user_start()
	- snd_timer_start()
	   - snd_timer_start1()
	      - snd_hrtimer_start()
		- t->private_data == NULL // boom

Signed-off-by: Vegard Nossum <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
If we hit the error path, we have never called drm_encoder_init() and so
have nothing to cleanup. Doing so hits a null dereference:

[   10.066261] BUG: unable to handle kernel NULL pointer dereference at 00000104
[   10.066273] IP: [<c16054b4>] mutex_lock+0xa/0x15
[   10.066287] *pde = 00000000
[   10.066295] Oops: 0002 [tobetter#1]
[   10.066302] Modules linked in: i915(+) video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm iTCO_wdt iTCO_vendor_support ppdev evdev snd_intel8x0 snd_ac97_codec ac97_bus psmouse snd_pcm snd_timer snd pcspkr uhci_hcd ehci_pci soundcore sr_mod ehci_hcd serio_raw i2c_i801 usbcore i2c_smbus cdrom lpc_ich mfd_core rng_core e100 mii floppy parport_pc parport acpi_cpufreq button processor usb_common eeprom lm85 hwmon_vid autofs4
[   10.066378] CPU: 0 PID: 132 Comm: systemd-udevd Not tainted 4.8.0-rc3-00013-gef0e1ea tobetter#34
[   10.066389] Hardware name: MicroLink                               /D865GLC                        , BIOS BF86510A.86A.0077.P25.0508040031 08/04/2005
[   10.066401] task: f62db800 task.stack: f5970000
[   10.066409] EIP: 0060:[<c16054b4>] EFLAGS: 00010286 CPU: 0
[   10.066417] EIP is at mutex_lock+0xa/0x15
[   10.066424] EAX: 00000104 EBX: 00000104 ECX: 00000000 EDX: 80000000
[   10.066432] ESI: 00000000 EDI: 00000104 EBP: f5be8000 ESP: f5971b58
[   10.066439]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[   10.066446] CR0: 80050033 CR2: 00000104 CR3: 35945000 CR4: 000006d0
[   10.066453] Stack:
[   10.066459]  f503d740 f824dddf 00000000 f61170c0 f61170c0 f82371ae f850f40e 00000001
[   10.066476]  f61170c0 f5971bcc f5be8000 f9c2d401 00000001 f8236fcc 00000001 00000000
[   10.066491]  f5144014 f5be8104 00000008 f9c5267c 00000007 f61170c0 f5144400 f9c4ff00
[   10.066507] Call Trace:
[   10.066526]  [<f824dddf>] ? drm_modeset_lock_all+0x27/0xb3 [drm]
[   10.066545]  [<f82371ae>] ? drm_encoder_cleanup+0x1a/0x132 [drm]
[   10.066559]  [<f850f40e>] ? drm_atomic_helper_connector_reset+0x3f/0x5c [drm_kms_helper]
[   10.066644]  [<f9c2d401>] ? intel_dvo_init+0x569/0x788 [i915]
[   10.066663]  [<f8236fcc>] ? drm_encoder_init+0x43/0x20b [drm]
[   10.066734]  [<f9bf1fce>] ? intel_modeset_init+0x1436/0x17dd [i915]
[   10.066791]  [<f9b37636>] ? i915_driver_load+0x85a/0x15d3 [i915]
[   10.066846]  [<f9b3603d>] ? i915_driver_open+0x5/0x5 [i915]
[   10.066857]  [<c14af4d0>] ? firmware_map_add_entry.part.2+0xc/0xc
[   10.066868]  [<c1343daf>] ? pci_device_probe+0x8e/0x11c
[   10.066878]  [<c140cec8>] ? driver_probe_device+0x1db/0x62e
[   10.066888]  [<c120c010>] ? kernfs_new_node+0x29/0x9c
[   10.066897]  [<c13438e0>] ? pci_match_device+0xd9/0x161
[   10.066905]  [<c120c48b>] ? kernfs_create_dir_ns+0x42/0x88
[   10.066914]  [<c140d401>] ? __driver_attach+0xe6/0x11b
[   10.066924]  [<c1303b13>] ? kobject_add_internal+0x1bb/0x44f
[   10.066933]  [<c140d31b>] ? driver_probe_device+0x62e/0x62e
[   10.066941]  [<c140a2d2>] ? bus_for_each_dev+0x46/0x7f
[   10.066950]  [<c140c502>] ? driver_attach+0x1a/0x34
[   10.066958]  [<c140d31b>] ? driver_probe_device+0x62e/0x62e
[   10.066966]  [<c140b758>] ? bus_add_driver+0x217/0x32a
[   10.066975]  [<f8403000>] ? 0xf8403000
[   10.066982]  [<c140de27>] ? driver_register+0x5f/0x108
[   10.066991]  [<c1000493>] ? do_one_initcall+0x49/0x1f6
[   10.067000]  [<c1082299>] ? pick_next_task_fair+0x14b/0x2a3
[   10.067008]  [<c1603c8d>] ? __schedule+0x15c/0x4fe
[   10.067016]  [<c1604104>] ? preempt_schedule_common+0x19/0x3c
[   10.067027]  [<c11051de>] ? do_init_module+0x17/0x230
[   10.067035]  [<c1604139>] ? _cond_resched+0x12/0x1a
[   10.067044]  [<c116f9aa>] ? kmem_cache_alloc+0x8f/0x11f
[   10.067052]  [<c11051de>] ? do_init_module+0x17/0x230
[   10.067060]  [<c11703dd>] ? kfree+0x137/0x203
[   10.067068]  [<c110523d>] ? do_init_module+0x76/0x230
[   10.067078]  [<c10cadf3>] ? load_module+0x2a39/0x333f
[   10.067087]  [<c10cb8b2>] ? SyS_finit_module+0x96/0xd5
[   10.067096]  [<c1132231>] ? vm_mmap_pgoff+0x79/0xa0
[   10.067105]  [<c1001e96>] ? do_fast_syscall_32+0xb5/0x1b0
[   10.067114]  [<c16086a6>] ? sysenter_past_esp+0x47/0x75
[   10.067121] Code: c8 f7 76 c1 e8 8e cc d2 ff e9 45 fe ff ff 66 90 66 90 66 90 66 90 90 ff 00 7f 05 e8 4e 0c 00 00 c3 53 89 c3 e8 75 ec ff ff 89 d8 <ff> 08 79 05 e8 fa 0a 00 00 5b c3 53 89 c3 85 c0 74 1b 8b 03 83
[   10.067180] EIP: [<c16054b4>] mutex_lock+0xa/0x15 SS:ESP 0068:f5971b58
[   10.067190] CR2: 0000000000000104
[   10.067222] ---[ end trace 049f1f09da45a856 ]---

Reported-by: Meelis Roos <[email protected]>
Fixes: 580d8ed ("drm/i915: Give encoders useful names")
Reviewed-by: David Weinehall <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: [email protected]
Signed-off-by: Jani Nikula <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8f76aa0)
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
On handling amsdu on rx path, get the rx_status from htt context. Without this
fix, we are seeing warnings when running DBDC traffic like this.

WARNING: CPU: 0 PID: 0 at net/mac80211/rx.c:4105 ieee80211_rx_napi+0x88/0x7d8 [mac80211]()

[ 1715.878248] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.21 tobetter#1
[ 1715.878273] [<c001d3f4>] (unwind_backtrace) from [<c001a4b0>] (show_stack+0x10/0x14)
[ 1715.878293] [<c001a4b0>] (show_stack) from [<c01bee64>] (dump_stack+0x70/0xbc)
[ 1715.878315] [<c01bee64>] (dump_stack) from [<c002a61c>] (warn_slowpath_common+0x64/0x88)
[ 1715.878339] [<c002a61c>] (warn_slowpath_common) from [<c002a6d0>] (warn_slowpath_null+0x18/0x20)
[ 1715.878395] [<c002a6d0>] (warn_slowpath_null) from [<bf4caa98>] (ieee80211_rx_napi+0x88/0x7d8 [mac80211])
[ 1715.878474] [<bf4caa98>] (ieee80211_rx_napi [mac80211]) from [<bf568658>] (ath10k_htt_t2h_msg_handler+0xb48/0xbfc [ath10k_core])
[ 1715.878535] [<bf568658>] (ath10k_htt_t2h_msg_handler [ath10k_core]) from [<bf568708>] (ath10k_htt_t2h_msg_handler+0xbf8/0xbfc [ath10k_core])
[ 1715.878597] [<bf568708>] (ath10k_htt_t2h_msg_handler [ath10k_core]) from [<bf569160>] (ath10k_htt_txrx_compl_task+0xa54/0x1170 [ath10k_core])
[ 1715.878639] [<bf569160>] (ath10k_htt_txrx_compl_task [ath10k_core]) from [<c002db14>] (tasklet_action+0xb4/0x130)
[ 1715.878659] [<c002db14>] (tasklet_action) from [<c002d110>] (__do_softirq+0xe0/0x210)
[ 1715.878678] [<c002d110>] (__do_softirq) from [<c002d4b4>] (irq_exit+0x84/0xe0)
[ 1715.878700] [<c002d4b4>] (irq_exit) from [<c005a544>] (__handle_domain_irq+0x98/0xd0)
[ 1715.878722] [<c005a544>] (__handle_domain_irq) from [<c00085f4>] (gic_handle_irq+0x38/0x5c)
[ 1715.878741] [<c00085f4>] (gic_handle_irq) from [<c0009680>] (__irq_svc+0x40/0x74)
[ 1715.878753] Exception stack(0xc05f9f50 to 0xc05f9f98)
[ 1715.878767] 9f40: ffffffed 00000000 00399e1e c000a220
[ 1715.878786] 9f60: 00000000 c05f6780 c05f8000 00000000 c05f5db8 ffffffed c05f8000 c04d1980
[ 1715.878802] 9f80: 00000000 c05f9f98 c0018110 c0018114 60000013 ffffffff
[ 1715.878822] [<c0009680>] (__irq_svc) from [<c0018114>] (arch_cpu_idle+0x2c/0x50)
[ 1715.878844] [<c0018114>] (arch_cpu_idle) from [<c00530d4>] (cpu_startup_entry+0x108/0x234)
[ 1715.878866] [<c00530d4>] (cpu_startup_entry) from [<c05c7be0>] (start_kernel+0x33c/0x3b8)
[ 1715.878879] ---[ end trace 6d5e1cc0fef8ed6a ]---
[ 1715.878899] ------------[ cut here ]------------

Fixes: 1823566 ("ath10k: cleanup amsdu processing for rx indication")
Signed-off-by: Ashok Raj Nagarajan <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
The warning was seen on AR5416 chip, which invoke ath9k_hw_gio_get()
before the GPIO initialized correctly.

    WARNING: CPU: 1 PID: 1159 at ~/drivers/net/wireless/ath/ath9k/hw.c:2776 ath9k_hw_gpio_get+0x148/0x1a0 [ath9k_hw]
    ...
    CPU: 1 PID: 1159 Comm: systemd-udevd Not tainted 4.7.0-rc7-aptosid-amd64 tobetter#1 aptosid 4.7~rc7-1~git92.slh.3
    Hardware name:                  /DH67CL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
      0000000000000286 00000000f912d633 ffffffff81290fd3 0000000000000000
      0000000000000000 ffffffff81063fd4 ffff88040c6dc018 0000000000000000
      0000000000000002 0000000000000000 0000000000000100 ffff88040c6dc018
    Call Trace:
      [<ffffffff81290fd3>] ? dump_stack+0x5c/0x79
      [<ffffffff81063fd4>] ? __warn+0xb4/0xd0
      [<ffffffffa0668fb8>] ? ath9k_hw_gpio_get+0x148/0x1a0 [ath9k_hw]

Signed-off-by: Miaoqing Pan <[email protected]>
Reported-by: Stefan Lippers-Hollmann <[email protected]>
Tested-by: Stefan Lippers-Hollmann <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
According to the CI test machines, SNB also uses the
GEN7_PCODE_MIN_FREQ_TABLE_GT_RATIO_OUT_OF_RANGE value to report a bad
GEN6_PCODE_MIN_FREQ_TABLE request.

[  157.744641] WARNING: CPU: 5 PID: 9238 at
drivers/gpu/drm/i915/intel_pm.c:7760 sandybridge_pcode_write+0x141/0x200 [i915]
[  157.744642] Missing switch case (16) in gen6_check_mailbox_status
[  157.744642] Modules linked in: snd_hda_intel i915 ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me lpc_ich snd_pcm mei broadcom bcm_phy_lib tg3 ptp pps_core [last unloaded: vgem]
[  157.744658] CPU: 5 PID: 9238 Comm: drv_hangman Tainted: G     U  W 4.8.0-rc3-CI-CI_DRM_1589+ tobetter#1
[  157.744658] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
[  157.744659]  0000000000000000 ffff88011f093a98 ffffffff81426415 ffff88011f093ae8
[  157.744662]  0000000000000000 ffff88011f093ad8 ffffffff8107d2a6 00001e50810d3c9f
[  157.744663]  ffff880128680000 0000000000000008 0000000000000000 ffff88012868a650
[  157.744665] Call Trace:
[  157.744669]  [<ffffffff81426415>] dump_stack+0x67/0x92
[  157.744672]  [<ffffffff8107d2a6>] __warn+0xc6/0xe0
[  157.744673]  [<ffffffff8107d30a>] warn_slowpath_fmt+0x4a/0x50
[  157.744685]  [<ffffffffa0029831>] sandybridge_pcode_write+0x141/0x200 [i915]
[  157.744697]  [<ffffffffa002a88a>] intel_enable_gt_powersave+0x64a/0x1330 [i915]
[  157.744712]  [<ffffffffa006b4cb>] ? i9xx_emit_request+0x1b/0x80 [i915]
[  157.744725]  [<ffffffffa0055ed3>] __i915_add_request+0x1e3/0x370 [i915]
[  157.744738]  [<ffffffffa00428bd>] i915_gem_do_execbuffer.isra.16+0xced/0x1b80 [i915]
[  157.744740]  [<ffffffff811a232e>] ? __might_fault+0x3e/0x90
[  157.744752]  [<ffffffffa0043b72>] i915_gem_execbuffer2+0xc2/0x2a0 [i915]
[  157.744753]  [<ffffffff815485b7>] drm_ioctl+0x207/0x4c0
[  157.744765]  [<ffffffffa0043ab0>] ? i915_gem_execbuffer+0x360/0x360 [i915]
[  157.744767]  [<ffffffff810ea4ad>] ?  debug_lockdep_rcu_enabled+0x1d/0x20
[  157.744769]  [<ffffffff811fe09e>] do_vfs_ioctl+0x8e/0x680
[  157.744770]  [<ffffffff811a2377>] ? __might_fault+0x87/0x90
[  157.744771]  [<ffffffff811a232e>] ? __might_fault+0x3e/0x90
[  157.744773]  [<ffffffff810d3df2>] ?  trace_hardirqs_on_caller+0x122/0x1b0
[  157.744774]  [<ffffffff811fe6cc>] SyS_ioctl+0x3c/0x70
[  157.744776]  [<ffffffff8180fe69>] entry_SYSCALL_64_fastpath+0x1c/0xac

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97491
Fixes: 8766050 ("drm/i915/gen6+: Interpret mailbox error flags")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Lyude <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: [email protected]
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Mika Kuoppala <[email protected]>
(cherry picked from commit 7850d1c)
Signed-off-by: Jani Nikula <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
When a user timer instance is continued without the explicit start
beforehand, the system gets eventually zero-division error like:

  divide error: 0000 [tobetter#1] SMP DEBUG_PAGEALLOC KASAN
  CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ tobetter#8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
   task: ffff88003c9b2280 task.stack: ffff880027280000
   RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns include/linux/ktime.h:195
   RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
  Call Trace:
   <IRQ>
   [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
   [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 kernel/time/hrtimer.c:1302
   [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
   [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 arch/x86/kernel/apic/apic.c:933
   [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:957
   [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:487
   <EOI>
   .....

Although a similar issue was spotted and a fix patch was merged in
commit [6b760bb: ALSA: timer: fix division by zero after
SNDRV_TIMER_IOCTL_CONTINUE], it seems covering only a part of
iceberg.

In this patch, we fix the issue a bit more drastically.  Basically the
continue of an uninitialized timer is supposed to be a fresh start, so
we do it for user timers.  For the direct snd_timer_continue() call,
there is no way to pass the initial tick value, so we kick out for the
uninitialized case.

Reported-by: Dmitry Vyukov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
When a seq-virmidi driver is initialized, it registers a rawmidi
instance with its callback to create an associated seq kernel client.
Currently it's done throughly in rawmidi's register_mutex context.
Recently it was found that this may lead to a deadlock another rawmidi
device that is being attached with the sequencer is accessed, as both
open with the same register_mutex.  This was actually triggered by
syzkaller, as Dmitry Vyukov reported:

======================================================
 [ INFO: possible circular locking dependency detected ]
 4.8.0-rc1+ tobetter#11 Not tainted
 -------------------------------------------------------
 syz-executor/7154 is trying to acquire lock:
  (register_mutex#5){+.+.+.}, at: [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341

 but task is already holding lock:
  (&grp->list_mutex){++++.+}, at: [<ffffffff850138bb>] check_and_subscribe_port+0x5b/0x5c0 sound/core/seq/seq_ports.c:495

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> tobetter#1 (&grp->list_mutex){++++.+}:
    [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
    [<ffffffff863f6199>] down_read+0x49/0xc0 kernel/locking/rwsem.c:22
    [<     inline     >] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:681
    [<ffffffff85005c5e>] snd_seq_deliver_event+0x35e/0x890 sound/core/seq/seq_clientmgr.c:822
    [<ffffffff85006e96>] > snd_seq_kernel_client_dispatch+0x126/0x170 sound/core/seq/seq_clientmgr.c:2418
    [<ffffffff85012c52>] snd_seq_system_broadcast+0xb2/0xf0 sound/core/seq/seq_system.c:101
    [<ffffffff84fff70a>] snd_seq_create_kernel_client+0x24a/0x330 sound/core/seq/seq_clientmgr.c:2297
    [<     inline     >] snd_virmidi_dev_attach_seq sound/core/seq/seq_virmidi.c:383
    [<ffffffff8502d29f>] snd_virmidi_dev_register+0x29f/0x750 sound/core/seq/seq_virmidi.c:450
    [<ffffffff84fd208c>] snd_rawmidi_dev_register+0x30c/0xd40 sound/core/rawmidi.c:1645
    [<ffffffff84f816d3>] __snd_device_register.part.0+0x63/0xc0 sound/core/device.c:164
    [<     inline     >] __snd_device_register sound/core/device.c:162
    [<ffffffff84f8235d>] snd_device_register_all+0xad/0x110 sound/core/device.c:212
    [<ffffffff84f7546f>] snd_card_register+0xef/0x6c0 sound/core/init.c:749
    [<ffffffff85040b7f>] snd_virmidi_probe+0x3ef/0x590 sound/drivers/virmidi.c:123
    [<ffffffff833ebf7b>] platform_drv_probe+0x8b/0x170 drivers/base/platform.c:564
    ......

 -> #0 (register_mutex#5){+.+.+.}:
    [<     inline     >] check_prev_add kernel/locking/lockdep.c:1829
    [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1939
    [<     inline     >] validate_chain kernel/locking/lockdep.c:2266
    [<ffffffff814791f4>] __lock_acquire+0x4d44/0x4d80 kernel/locking/lockdep.c:3335
    [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
    [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:521
    [<ffffffff863f0ef1>] mutex_lock_nested+0xb1/0xa20 kernel/locking/mutex.c:621
    [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341
    [<ffffffff8502e7c7>] midisynth_subscribe+0xf7/0x350 sound/core/seq/seq_midi.c:188
    [<     inline     >] subscribe_port sound/core/seq/seq_ports.c:427
    [<ffffffff85013cc7>] check_and_subscribe_port+0x467/0x5c0 sound/core/seq/seq_ports.c:510
    [<ffffffff85015da9>] snd_seq_port_connect+0x2c9/0x500 sound/core/seq/seq_ports.c:579
    [<ffffffff850079b8>] snd_seq_ioctl_subscribe_port+0x1d8/0x2b0 sound/core/seq/seq_clientmgr.c:1480
    [<ffffffff84ffe9e4>] snd_seq_do_ioctl+0x184/0x1e0 sound/core/seq/seq_clientmgr.c:2225
    [<ffffffff84ffeae8>] snd_seq_kernel_client_ctl+0xa8/0x110 sound/core/seq/seq_clientmgr.c:2440
    [<ffffffff85027664>] snd_seq_oss_midi_open+0x3b4/0x610 sound/core/seq/oss/seq_oss_midi.c:375
    [<ffffffff85023d67>] snd_seq_oss_synth_setup_midi+0x107/0x4c0 sound/core/seq/oss/seq_oss_synth.c:281
    [<ffffffff8501b0a8>] snd_seq_oss_open+0x748/0x8d0 sound/core/seq/oss/seq_oss_init.c:274
    [<ffffffff85019d8a>] odev_open+0x6a/0x90 sound/core/seq/oss/seq_oss.c:138
    [<ffffffff84f7040f>] soundcore_open+0x30f/0x640 sound/sound_core.c:639
    ......

 other info that might help us debug this:

 Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&grp->list_mutex);
                                lock(register_mutex#5);
                                lock(&grp->list_mutex);
   lock(register_mutex#5);

 *** DEADLOCK ***
======================================================

The fix is to simply move the registration parts in
snd_rawmidi_dev_register() to the outside of the register_mutex lock.
The lock is needed only to manage the linked list, and it's not
necessarily to cover the whole initialization process.

Reported-by: Dmitry Vyukov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
The resent conversion of the cpu hotplug support in the uncore driver
introduced a regression due to the way the callbacks are invoked at
initialization time.

The old code called the prepare/starting/online function on each online cpu
as a block. The new code registers the hotplug callbacks in the core for
each state. The core invokes the callbacks at each registration on all
online cpus.

The code implicitely relied on the prepare/starting/online callbacks being
called as combo on a particular cpu, which was not obvious and completely
undocumented.

The resulting subtle wreckage happens due to the way how the uncore code
manages shared data structures for cpus which share an uncore resource in
hardware. The sharing is determined in the cpu starting callback, but the
prepare callback allocates per cpu data for the upcoming cpu because
potential sharing is unknown at this point. If the starting callback finds
a online cpu which shares the hardware resource it takes a refcount on the
percpu data of that cpu and puts the own data structure into a
'free_at_online' pointer of that shared data structure. The online callback
frees that.

With the old model this worked because in a starting callback only one non
unused structure (the one of the starting cpu) was available. The new code
allocates the data structures for all cpus when the prepare callback is
registered.

Now the starting function iterates through all online cpus and looks for a
data structure (skipping its own) which has a matching hardware id. The id
member of the data structure is initialized to 0, but the hardware id can
be 0 as well. The resulting wreckage is:

  CPU0 finds a matching id on CPU1, takes a refcount on CPU1 data and puts
  its own data structure into CPU1s data structure to be freed.

  CPU1 skips CPU0 because the data structure is its allegedly unsued own.
  It finds a matching id on CPU2, takes a refcount on CPU1 data and puts
  its own data structure into CPU2s data structure to be freed.

  ....

Now the online callbacks are invoked.

  CPU0 has a pointer to CPU1s data and frees the original CPU0 data. So
  far so good.

  CPU1 has a pointer to CPU2s data and frees the original CPU1 data, which
  is still referenced by CPU0 ---> Booom

So there are two issues to be solved here:

1) The id field must be initialized at allocation time to a value which
   cannot be a valid hardware id, i.e. -1

   This prevents the above scenario, but now CPU1 and CPU2 both stick their
   own data structure into the free_at_online pointer of CPU0. So we leak
   CPU1s data structure.

2) Fix the memory leak described in tobetter#1

   Instead of having a single pointer, use a hlist to enqueue the
   superflous data structures which are then freed by the first cpu
   invoking the online callback.

Ideally we should know the sharing _before_ invoking the prepare callback,
but that's way beyond the scope of this bug fix.

[ tglx: Rewrote changelog ]

Fixes: 96b2bd3 ("perf/x86/amd/uncore: Convert to hotplug state machine")
Reported-and-tested-by: Eric Sandeen <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Cc: Borislav Petkov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
On lubbock board, the probe of the driver crashes by dereferencing very
early a platform_data structure which is not set, in
pxa2xx_configure_sockets().

The stack fixed is :
[    0.244353] SA1111 Microprocessor Companion Chip: silicon revision 1, metal revision 1
[    0.256321] sa1111 sa1111: Providing IRQ336-390
[    0.340899] clocksource: Switched to clocksource oscr0
[    0.472263] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[    0.480469] pgd = c0004000
[    0.483432] [00000004] *pgd=00000000
[    0.487105] Internal error: Oops: f5 [tobetter#1] ARM
[    0.491497] Modules linked in:
[    0.494650] CPU: 0 PID: 1 Comm: swapper Not tainted 4.8.0-rc3-00080-g1aaa68426f0c-dirty #2068
[    0.503229] Hardware name: Intel DBPXA250 Development Platform (aka Lubbock)
[    0.510344] task: c3e42000 task.stack: c3e44000
[    0.514984] PC is at pxa2xx_configure_sockets+0x4/0x24 (drivers/pcmcia/pxa2xx_base.c:227)
[    0.520193] LR is at pcmcia_lubbock_init+0x1c/0x38
[    0.525079] pc : [<c0247c30>]    lr : [<c02479b0>]    psr: a0000053
[    0.525079] sp : c3e45e70  ip : 100019ff  fp : 00000000
[    0.536651] r10: c0828900  r9 : c0434838  r8 : 00000000
[    0.541953] r7 : c0820700  r6 : c0857b30  r5 : c3ec1400  r4 : c0820758
[    0.548549] r3 : 00000000  r2 : 0000000c  r1 : c3c09c40  r0 : c3ec1400
[    0.555154] Flags: NzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
[    0.562450] Control: 0000397f  Table: a0004000  DAC: 00000053
[    0.568257] Process swapper (pid: 1, stack limit = 0xc3e44190)
[    0.574154] Stack: (0xc3e45e70 to 0xc3e46000)
[    0.578610] 5e60:                                     c4849800 00000000 c3ec1400 c024769c
[    0.586928] 5e80: 00000000 c3ec140c c3c0ee0c c3ec1400 c3ec1434 c020c41 c3ec1400 c3ec1434
[    0.595244] 5ea0: c0820700 c080b408 c0828900 c020c5f8 00000000 c0820700 c020c578 c020ac5c
[    0.603560] 5ec0: c3e687cc c3e71e10 c0820700 00000000 c3c02de0 c020bae4 c03c62f7 c03c62f7
[    0.611872] 5ee0: c3e68780 c0820700 c042e034 00000000 c043c440 c020cdec c080b408 00000005
[    0.620188] 5f00: c042e034 c00096c0 c0034440 c01c730c 20000053 ffffffff 00000000 00000000
[    0.628502] 5f20: 00000000 c3ffcb87 c3ffcb90 c00346ac c3e66ba0 c03f7914 00000092 00000005
[    0.636811] 5f40: 00000005 c03f847c 00000091 c03f847c 00000000 00000005 c0434828 00000005
[    0.645125] 5f60: c043482c 00000092 c043c440 c0828900 c0434838 c0418d2c 00000005 00000005
[    0.653430] 5f80: 00000000 c041858c 00000000 c032e9f0 00000000 00000000 00000000 00000000
[    0.661729] 5fa0: 00000000 c032e9f8 00000000 c000f0f0 00000000 00000000 00000000 00000000
[    0.670020] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    0.678311] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    0.686673] (pxa2xx_configure_sockets) from pcmcia_lubbock_init (/drivers/pcmcia/sa1111_lubbock.c:161)
[    0.696026] (pcmcia_lubbock_init) from pcmcia_probe (/drivers/pcmcia/sa1111_generic.c:213)
[    0.704358] (pcmcia_probe) from driver_probe_device (/drivers/base/dd.c:378 /drivers/base/dd.c:499)
[    0.712848] (driver_probe_device) from __driver_attach (/./include/linux/device.h:983 /drivers/base/dd.c:733)
[    0.721414] (__driver_attach) from bus_for_each_dev (/drivers/base/bus.c:313)
[    0.729723] (bus_for_each_dev) from bus_add_driver (/drivers/base/bus.c:708)
[    0.738036] (bus_add_driver) from driver_register (/drivers/base/driver.c:169)
[    0.746185] (driver_register) from do_one_initcall (/init/main.c:778)
[    0.754561] (do_one_initcall) from kernel_init_freeable (/init/main.c:843 /init/main.c:851 /init/main.c:869 /init/main.c:1016)
[    0.763409] (kernel_init_freeable) from kernel_init (/init/main.c:944)
[    0.771660] (kernel_init) from ret_from_fork (/arch/arm/kernel/entry-common.S:119)
[ 0.779347] Code: c03c6305 c03c631e c03c632e e5903048 (e993000c)
All code
========
   0:	c03c6305 	eorsgt	r6, ip, r5, lsl tobetter#6
   4:	c03c631e 	eorsgt	r6, ip, lr, lsl r3
   8:	c03c632e 	eorsgt	r6, ip, lr, lsr tobetter#6
   c:	e5903048 	ldr	r3, [r0, torvalds#72]	; 0x48
  10:*	e993000c 	ldmib	r3, {r2, r3}		<-- trapping instruction

Signed-off-by: Robert Jarzmik <[email protected]>
Signed-off-by: Russell King <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
rsc_lookup steals the passed-in memory to avoid doing an allocation of
its own, so we can't just pass in a pointer to memory that someone else
is using.

If we really want to avoid allocation there then maybe we should
preallocate somwhere, or reference count these handles.

For now we should revert.

On occasion I see this on my server:

kernel: kernel BUG at /home/cel/src/linux/linux-2.6/mm/slub.c:3851!
kernel: invalid opcode: 0000 [tobetter#1] SMP
kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd btrfs xor iTCO_wdt iTCO_vendor_support raid6_pq pcspkr i2c_i801 i2c_smbus lpc_ich mfd_core mei_me sg mei shpchp wmi ioatdma ipmi_si ipmi_msghandler acpi_pad acpi_power_meter rpcrdma ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb mlx4_core ahci libahci libata ptp pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 7 PID: 145 Comm: kworker/7:2 Not tainted 4.8.0-rc4-00006-g9d06b0b tobetter#15
kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
kernel: Workqueue: events do_cache_clean [sunrpc]
kernel: task: ffff8808541d8000 task.stack: ffff880854344000
kernel: RIP: 0010:[<ffffffff811e7075>]  [<ffffffff811e7075>] kfree+0x155/0x180
kernel: RSP: 0018:ffff880854347d70  EFLAGS: 00010246
kernel: RAX: ffffea0020fe7660 RBX: ffff88083f9db064 RCX: 146ff0f9d5ec5600
kernel: RDX: 000077ff80000000 RSI: ffff880853f01500 RDI: ffff88083f9db064
kernel: RBP: ffff880854347d88 R08: ffff8808594ee000 R09: ffff88087fdd8780
kernel: R10: 0000000000000000 R11: ffffea0020fe76c0 R12: ffff880853f01500
kernel: R13: ffffffffa013cf76 R14: ffffffffa013cff0 R15: ffffffffa04253a0
kernel: FS:  0000000000000000(0000) GS:ffff88087fdc0000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fed60b020c3 CR3: 0000000001c06000 CR4: 00000000001406e0
kernel: Stack:
kernel: ffff8808589f2f00 ffff880853f01500 0000000000000001 ffff880854347da0
kernel: ffffffffa013cf76 ffff8808589f2f00 ffff880854347db8 ffffffffa013d006
kernel: ffff8808589f2f20 ffff880854347e00 ffffffffa0406f60 0000000057c7044f
kernel: Call Trace:
kernel: [<ffffffffa013cf76>] rsc_free+0x16/0x90 [auth_rpcgss]
kernel: [<ffffffffa013d006>] rsc_put+0x16/0x30 [auth_rpcgss]
kernel: [<ffffffffa0406f60>] cache_clean+0x2e0/0x300 [sunrpc]
kernel: [<ffffffffa04073ee>] do_cache_clean+0xe/0x70 [sunrpc]
kernel: [<ffffffff8109a70f>] process_one_work+0x1ff/0x3b0
kernel: [<ffffffff8109b15c>] worker_thread+0x2bc/0x4a0
kernel: [<ffffffff8109aea0>] ? rescuer_thread+0x3a0/0x3a0
kernel: [<ffffffff810a0ba4>] kthread+0xe4/0xf0
kernel: [<ffffffff8169c47f>] ret_from_fork+0x1f/0x40
kernel: [<ffffffff810a0ac0>] ? kthread_stop+0x110/0x110
kernel: Code: f7 ff ff eb 3b 65 8b 05 da 30 e2 7e 89 c0 48 0f a3 05 a0 38 b8 00 0f 92 c0 84 c0 0f 85 d1 fe ff ff 0f 1f 44 00 00 e9 f5 fe ff ff <0f> 0b 49 8b 03 31 f6 f6 c4 40 0f 85 62 ff ff ff e9 61 ff ff ff
kernel: RIP  [<ffffffff811e7075>] kfree+0x155/0x180
kernel: RSP <ffff880854347d70>
kernel: ---[ end trace 3fdec044969def26 ]---

It seems to be most common after a server reboot where a client has been
using a Kerberos mount, and reconnects to continue its workload.

Signed-off-by: Chuck Lever <[email protected]>
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
af_iucv socket programs with HiperSockets as transport make use of the qdio
completion queue. Running such an af_iucv socket program may result in a
crash:

[90341.677709] Oops: 0038 ilc:2 [tobetter#1] SMP
[90341.677743] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.6.0-20160720.0.0e86ec7.5e62689.fc23.s390xperformance tobetter#1
[90341.677744] Hardware name: IBM              2964 N96              703              (LPAR)
[90341.677746] task: 00000000edb79f00 ti: 00000000edb84000 task.ti: 00000000edb84000
[90341.677748] Krnl PSW : 0704d00180000000 000000000075bc50 (qeth_qdio_input_handler+0x258/0x4e0)
[90341.677756]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: 000003d10391e900 0000000000000001 00000000e61e6000 0000000000000005
[90341.677759]            0000000000a9e6ec 5420040001a77400 0000000000000001 000000000000006f
[90341.677761]            00000000e0d83f00 0000000000000003 0000000000000010 5420040001a77400
[90341.677784]            000000007ba8b000 0000000000943fd0 000000000075bc4e 00000000ed3b3c10
[90341.677793] Krnl Code: 000000000075bc42: e320cc180004        lg      %r2,3096(%r12)
           000000000075bc48: c0e5ffffc5cc       brasl   %r14,7547e0
          #000000000075bc4e: 1816               lr      %r1,%r6
          >000000000075bc50: ba19b008           cs      %r1,%r9,8(%r11)
           000000000075bc54: ec180041017e       cij     %r1,1,8,75bcd6
           000000000075bc5a: 5810b008           l       %r1,8(%r11)
           000000000075bc5e: ec16005c027e       cij     %r1,2,6,75bd16
           000000000075bc64: 5090b008           st      %r9,8(%r11)
[90341.677807] Call Trace:
[90341.677810] ([<000000000075bbc0>] qeth_qdio_input_handler+0x1c8/0x4e0)
[90341.677812] ([<000000000070efbc>] qdio_kick_handler+0x124/0x2a8)
[90341.677814] ([<0000000000713570>] __tiqdio_inbound_processing+0xf0/0xcd0)
[90341.677818] ([<0000000000143312>] tasklet_action+0x92/0x120)
[90341.677823] ([<00000000008b6e72>] __do_softirq+0x112/0x308)
[90341.677824] ([<0000000000142bce>] irq_exit+0xd6/0xf8)
[90341.677829] ([<000000000010b1d2>] do_IRQ+0x6a/0x88)
[90341.677830] ([<00000000008b6322>] io_int_handler+0x112/0x220)
[90341.677832] ([<0000000000102b2e>] enabled_wait+0x56/0xa8)
[90341.677833] ([<0000000000000000>]           (null))
[90341.677835] ([<0000000000102e32>] arch_cpu_idle+0x32/0x48)
[90341.677838] ([<000000000018a126>] cpu_startup_entry+0x266/0x2b0)
[90341.677841] ([<0000000000113b38>] smp_start_secondary+0x100/0x110)
[90341.677843] ([<00000000008b68a6>] restart_int_handler+0x62/0x78)
[90341.677845] ([<00000000008b6588>] psw_idle+0x3c/0x40)
[90341.677846] Last Breaking-Event-Address:
[90341.677848]  [<00000000007547ec>] qeth_dbf_longtext+0xc/0xc0
[90341.677849]
[90341.677850] Kernel panic - not syncing: Fatal exception in interrupt

qeth_qdio_cq_handler() analyzes SBALs on this completion queue, but does
not observe the limit of 16 SBAL elements per SBAL. This patch adds the
additional check to process not more than 16 SBAL elements.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
Disable creation of a UDP socket for ipv6 when
CONFIG_IPV6 is not enabeld. Since udp_sock_create6()
returns 0 when CONFIG_IPV6 is not set

[   46.888632] IP: [<c220705a>] setup_udp_tunnel_sock+0x6/0x4f
[   46.891355] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[   46.893918] Oops: 0002 [tobetter#1] PREEMPT
[   46.896014] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-rc4-00001-g8700e3e tobetter#1
[   46.900280] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[   46.904905] task: cf06c040 ti: cf05e000 task.ti: cf05e000
[   46.907854] EIP: 0060:[<c220705a>] EFLAGS: 00210246 CPU: 0
[   46.911137] EIP is at setup_udp_tunnel_sock+0x6/0x4f
[   46.914070] EAX: 00000044 EBX: 00000001 ECX: cf05fef0 EDX: ca8142e0
[   46.917236] ESI: c2c4505b EDI: cf05fef0 EBP: cf05fed0 ESP: cf05fed0
[   46.919836]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[   46.922046] CR0: 80050033 CR2: 000001fc CR3: 02cec000 CR4: 000006b0
[   46.924550] Stack:
[   46.926014]  cf05ff10 c1fd4657 ca8142e0 0000000a 00000000 00000000 0000b712 00000008
[   46.931274]  00000000 6bb5bd01 c1fd48de 00000000 00000000 cf05ff1c 00000000 00000000
[   46.936122]  cf05ff1c c1fd4bdf 00000000 cf05ff28 c2c4507b ffffffff cf05ff88 c2bf1c74
[   46.942350] Call Trace:
[   46.944403]  [<c1fd4657>] rxe_setup_udp_tunnel+0x8f/0x99
[   46.947689]  [<c1fd48de>] ? net_to_rxe+0x4e/0x4e
[   46.950567]  [<c1fd4bdf>] rxe_net_init+0xe/0xa4
[   46.953147]  [<c2c4507b>] rxe_module_init+0x20/0x4c
[   46.955448]  [<c2bf1c74>] do_one_initcall+0x89/0x113
[   46.957797]  [<c2bf15eb>] ? set_debug_rodata+0xf/0xf
[   46.959966]  [<c2bf1dbc>] ? kernel_init_freeable+0xbe/0x15b
[   46.962262]  [<c2bf1ddc>] kernel_init_freeable+0xde/0x15b
[   46.964418]  [<c232eb54>] kernel_init+0x8/0xd0
[   46.966618]  [<c2333122>] ret_from_kernel_thread+0xe/0x24
[   46.969592]  [<c232eb4c>] ? rest_init+0x6f/0x6f

Fixes: 8700e3e ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
Since commit 4d4c474 ("perf/x86/intel/bts: Fix BTS PMI detection")
my box goes boom on boot:

| .... node  #0, CPUs:      tobetter#1 tobetter#2 tobetter#3 tobetter#4 tobetter#5 tobetter#6 tobetter#7
| BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
| IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
| Call Trace:
|  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
|  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40

This happens because the code introduced in this commit dereferences the
debug store pointer unconditionally. The debug store is not guaranteed to
be available, so a NULL pointer check as on other places is required.

Fixes: 4d4c474 ("perf/x86/intel/bts: Fix BTS PMI detection")
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Reviewed-by: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
While running a compile on arm64, I hit a memory exposure

usercopy: kernel memory exposure attempt detected from fffffc0000f3b1a8 (buffer_head) (1 bytes)
------------[ cut here ]------------
kernel BUG at mm/usercopy.c:75!
Internal error: Oops - BUG: 0 [tobetter#1] SMP
Modules linked in: ip6t_rpfilter ip6t_REJECT
nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp
llc ebtable_nat ip6table_security ip6table_raw ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
iptable_security iptable_raw iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
ebtable_filter ebtables ip6table_filter ip6_tables vfat fat xgene_edac
xgene_enet edac_core i2c_xgene_slimpro i2c_core at803x realtek xgene_dma
mdio_xgene gpio_dwapb gpio_xgene_sb xgene_rng mailbox_xgene_slimpro nfsd
auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sdhci_of_arasan
sdhci_pltfm sdhci mmc_core xhci_plat_hcd gpio_keys
CPU: 0 PID: 19744 Comm: updatedb Tainted: G        W 4.8.0-rc3-threadinfo+ tobetter#1
Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016
task: fffffe03df944c00 task.stack: fffffe00d128c000
PC is at __check_object_size+0x70/0x3f0
LR is at __check_object_size+0x70/0x3f0
...
[<fffffc00082b4280>] __check_object_size+0x70/0x3f0
[<fffffc00082cdc30>] filldir64+0x158/0x1a0
[<fffffc0000f327e8>] __fat_readdir+0x4a0/0x558 [fat]
[<fffffc0000f328d4>] fat_readdir+0x34/0x40 [fat]
[<fffffc00082cd8f8>] iterate_dir+0x190/0x1e0
[<fffffc00082cde58>] SyS_getdents64+0x88/0x120
[<fffffc0008082c70>] el0_svc_naked+0x24/0x28

fffffc0000f3b1a8 is a module address. Modules may have compiled in
strings which could get copied to userspace. In this instance, it
looks like "." which matches with a size of 1 byte. Extend the
is_vmalloc_addr check to be is_vmalloc_or_module_addr to cover
all possible cases.

Signed-off-by: Laura Abbott <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
mhaehnel pushed a commit to mhaehnel/linux that referenced this issue Sep 26, 2016
The wq_numa_init() function makes a private CPU to node map by calling
cpu_to_node() early in the boot process, before the non-boot CPUs are
brought online.  Since the default implementation of cpu_to_node()
returns zero for CPUs that have never been brought online, the
workqueue system's view is that *all* CPUs are on node zero.

When the unbound workqueue for a non-zero node is created, the
tsk_cpus_allowed() for the worker threads is the empty set because
there are, in the view of the workqueue system, no CPUs on non-zero
nodes.  The code in try_to_wake_up() using this empty cpumask ends up
using the cpumask empty set value of NR_CPUS as an index into the
per-CPU area pointer array, and gets garbage as it is one past the end
of the array.  This results in:

[    0.881970] Unable to handle kernel paging request at virtual address fffffb1008b926a4
[    1.970095] pgd = fffffc00094b0000
[    1.973530] [fffffb1008b926a4] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[    1.982610] Internal error: Oops: 96000004 [tobetter#1] SMP
[    1.987541] Modules linked in:
[    1.990631] CPU: 48 PID: 295 Comm: cpuhp/48 Tainted: G        W       4.8.0-rc6-preempt-vol+ tobetter#9
[    1.999435] Hardware name: Cavium ThunderX CN88XX board (DT)
[    2.005159] task: fffffe0fe89cc300 task.stack: fffffe0fe8b8c000
[    2.011158] PC is at try_to_wake_up+0x194/0x34c
[    2.015737] LR is at try_to_wake_up+0x150/0x34c
[    2.020318] pc : [<fffffc00080e7468>] lr : [<fffffc00080e7424>] pstate: 600000c5
[    2.027803] sp : fffffe0fe8b8fb10
[    2.031149] x29: fffffe0fe8b8fb10 x28: 0000000000000000
[    2.036522] x27: fffffc0008c63bc8 x26: 0000000000001000
[    2.041896] x25: fffffc0008c63c80 x24: fffffc0008bfb200
[    2.047270] x23: 00000000000000c0 x22: 0000000000000004
[    2.052642] x21: fffffe0fe89d25bc x20: 0000000000001000
[    2.058014] x19: fffffe0fe89d1d00 x18: 0000000000000000
[    2.063386] x17: 0000000000000000 x16: 0000000000000000
[    2.068760] x15: 0000000000000018 x14: 0000000000000000
[    2.074133] x13: 0000000000000000 x12: 0000000000000000
[    2.079505] x11: 0000000000000000 x10: 0000000000000000
[    2.084879] x9 : 0000000000000000 x8 : 0000000000000000
[    2.090251] x7 : 0000000000000040 x6 : 0000000000000000
[    2.095621] x5 : ffffffffffffffff x4 : 0000000000000000
[    2.100991] x3 : 0000000000000000 x2 : 0000000000000000
[    2.106364] x1 : fffffc0008be4c24 x0 : ffffff0ffffada80
[    2.111737]
[    2.113236] Process cpuhp/48 (pid: 295, stack limit = 0xfffffe0fe8b8c020)
[    2.120102] Stack: (0xfffffe0fe8b8fb10 to 0xfffffe0fe8b90000)
[    2.125914] fb00:                                   fffffe0fe8b8fb80 fffffc00080e7648
.
.
.
[    2.442859] Call trace:
[    2.445327] Exception stack(0xfffffe0fe8b8f940 to 0xfffffe0fe8b8fa70)
[    2.451843] f940: fffffe0fe89d1d00 0000040000000000 fffffe0fe8b8fb10 fffffc00080e7468
[    2.459767] f960: fffffe0fe8b8f980 fffffc00080e4958 ffffff0ff91ab200 fffffc00080e4b64
[    2.467690] f980: fffffe0fe8b8f9d0 fffffc00080e515c fffffe0fe8b8fa80 0000000000000000
[    2.475614] f9a0: fffffe0fe8b8f9d0 fffffc00080e58e4 fffffe0fe8b8fa80 0000000000000000
[    2.483540] f9c0: fffffe0fe8d10000 0000000000000040 fffffe0fe8b8fa50 fffffc00080e5ac4
[    2.491465] f9e0: ffffff0ffffada80 fffffc0008be4c24 0000000000000000 0000000000000000
[    2.499387] fa00: 0000000000000000 ffffffffffffffff 0000000000000000 0000000000000040
[    2.507309] fa20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    2.515233] fa40: 0000000000000000 0000000000000000 0000000000000000 0000000000000018
[    2.523156] fa60: 0000000000000000 0000000000000000
[    2.528089] [<fffffc00080e7468>] try_to_wake_up+0x194/0x34c
[    2.533723] [<fffffc00080e7648>] wake_up_process+0x28/0x34
[    2.539275] [<fffffc00080d3764>] create_worker+0x110/0x19c
[    2.544824] [<fffffc00080d69dc>] alloc_unbound_pwq+0x3cc/0x4b0
[    2.550724] [<fffffc00080d6bcc>] wq_update_unbound_numa+0x10c/0x1e4
[    2.557066] [<fffffc00080d7d78>] workqueue_online_cpu+0x220/0x28c
[    2.563234] [<fffffc00080bd288>] cpuhp_invoke_callback+0x6c/0x168
[    2.569398] [<fffffc00080bdf74>] cpuhp_up_callbacks+0x44/0xe4
[    2.575210] [<fffffc00080be194>] cpuhp_thread_fun+0x13c/0x148
[    2.581027] [<fffffc00080dfbac>] smpboot_thread_fn+0x19c/0x1a8
[    2.586929] [<fffffc00080dbd64>] kthread+0xdc/0xf0
[    2.591776] [<fffffc0008083380>] ret_from_fork+0x10/0x50
[    2.597147] Code: b00057e 91304021 91005021 b8626822 (b8606821)
[    2.603464] ---[ end trace 58c0cd36b88802bc ]---
[    2.608138] Kernel panic - not syncing: Fatal exception

Fix by moving call to numa_store_cpu_info() for all CPUs into
smp_prepare_cpus(), which happens before wq_numa_init().  Since
smp_store_cpu_info() now contains only a single function call,
simplify by removing the function and out-lining its contents.

Suggested-by: Robert Richter <[email protected]>
Fixes: 1a2db30 ("arm64, numa: Add NUMA support for arm64 platforms.")
Cc: <[email protected]> # 4.7.x-
Signed-off-by: David Daney <[email protected]>
Reviewed-by: Robert Richter <[email protected]>
Tested-by: Yisheng Xie <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
tobetter pushed a commit that referenced this issue Oct 6, 2016
A discrepancy between cpu_online_mask and cpuset's effective_cpus
mask is inevitable during hotplug since cpuset defers updating of
effective_cpus mask using a workqueue, during which time nothing
prevents the system from more hotplug operations.  For that reason
guarantee_online_cpus() walks up the cpuset hierarchy until it finds
an intersection under the assumption that top cpuset's effective_cpus
mask intersects with cpu_online_mask even with such a race occurring.

However a sequence of CPU hotplugs can open a time window, during which
none of the effective CPUs in the top cpuset intersect with
cpu_online_mask.

For example when there are 4 possible CPUs 0-3 and only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 > cpu2/online.
   CPU hotplug notifier woke up hotplug work but not yet scheduled.
      [0,2]                     [0]

   echo 0 > cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ torvalds#98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   <snip>
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
   [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
   [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
   [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
   [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

The top cpuset's effective_cpus are guaranteed to be identical to
cpu_online_mask eventually.  Hence fall back to cpu_online_mask when
there is no intersection between top cpuset's effective_cpus and
cpu_online_mask.

Signed-off-by: Joonwoo Park <[email protected]>
Acked-by: Li Zefan <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]> # 3.17+
Signed-off-by: Tejun Heo <[email protected]>
tobetter pushed a commit that referenced this issue Oct 6, 2016
…_route

Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid which was copied from in_skb's portid.
Since the skb is new the portid is 0 at that point so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl two times.
Also since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&mrt->ipmr_expire_timer))){+.-...}, at: [<ffffffff810fbbf5>] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [<ffffffff8161e005>] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ torvalds#179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412]  <IRQ>  [<ffffffff813a7804>] dump_stack+0x85/0xc1
[ 7858.215662]  [<ffffffff810a4a72>] ___might_sleep+0x192/0x250
[ 7858.215868]  [<ffffffff810a4b9f>] __might_sleep+0x6f/0x100
[ 7858.216072]  [<ffffffff8165bea3>] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [<ffffffff815a7a5f>] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [<ffffffff8157474b>] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [<ffffffff815a9a0c>] netlink_unicast+0x19c/0x260
[ 7858.216900]  [<ffffffff81573c70>] rtnl_unicast+0x20/0x30
[ 7858.217128]  [<ffffffff8161cd39>] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [<ffffffff8161e06f>] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [<ffffffff810fbc95>] call_timer_fn+0xa5/0x350
[ 7858.218192]  [<ffffffff810fbbf5>] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [<ffffffff810fde10>] run_timer_softirq+0x260/0x640
[ 7858.218865]  [<ffffffff8166379b>] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [<ffffffff816637c8>] __do_softirq+0xe8/0x54f
[ 7858.219269]  [<ffffffff8107a948>] irq_exit+0xb8/0xc0
[ 7858.219463]  [<ffffffff81663452>] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [<ffffffff816625bc>] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897]  <EOI>  [<ffffffff81055f16>] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [<ffffffff810d64dd>] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [<ffffffff810298e3>] default_idle+0x23/0x190
[ 7858.220574]  [<ffffffff8102a20f>] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [<ffffffff810c9f8c>] default_idle_call+0x4c/0x60
[ 7858.221016]  [<ffffffff810ca33b>] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [<ffffffff8164f995>] rest_init+0x135/0x140
[ 7858.221469]  [<ffffffff81f83014>] start_kernel+0x50e/0x51b
[ 7858.221670]  [<ffffffff81f82120>] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [<ffffffff81f8243f>] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [<ffffffff81f8257c>] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e90 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
tobetter pushed a commit that referenced this issue Oct 6, 2016
While the driver is probing the adapter, an error may occur before the
netdev structure is allocated and attached to pci_dev. In this case,
not only netdev isn't available, but the tg3 private structure is also
not available as it is just math from the NULL pointer, so dereferences
must be skipped.

The following trace is seen when the error is triggered:

  [1.402247] Unable to handle kernel paging request for data at address 0x00001a99
  [1.402410] Faulting instruction address: 0xc0000000007e33f8
  [1.402450] Oops: Kernel access of bad area, sig: 11 [#1]
  [1.402481] SMP NR_CPUS=2048 NUMA PowerNV
  [1.402513] Modules linked in:
  [1.402545] CPU: 0 PID: 651 Comm: eehd Not tainted 4.4.0-36-generic torvalds#55-Ubuntu
  [1.402591] task: c000001fe4e42a20 ti: c000001fe4e88000 task.ti: c000001fe4e88000
  [1.402742] NIP: c0000000007e33f8 LR: c0000000007e3164 CTR: c000000000595ea0
  [1.402787] REGS: c000001fe4e8b790 TRAP: 0300   Not tainted  (4.4.0-36-generic)
  [1.402832] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28000422  XER: 20000000
  [1.403058] CFAR: c000000000008468 DAR: 0000000000001a99 DSISR: 42000000 SOFTE: 1
  GPR00: c0000000007e3164 c000001fe4e8ba10 c0000000015c5e00 0000000000000000
  GPR04: 0000000000000001 0000000000000000 0000000000000039 0000000000000299
  GPR08: 0000000000000000 0000000000000001 c000001fe4e88000 0000000000000006
  GPR12: 0000000000000000 c00000000fb40000 c0000000000e6558 c000003ca1bffd00
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000d52768
  GPR24: c000000000d52740 0000000000000100 c000003ca1b52000 0000000000000002
  GPR28: 0000000000000900 0000000000000000 c00000000152a0c0 c000003ca1b52000
  [1.404226] NIP [c0000000007e33f8] tg3_io_error_detected+0x308/0x340
  [1.404265] LR [c0000000007e3164] tg3_io_error_detected+0x74/0x340

This patch avoids the NULL pointer dereference by moving the access after
the netdev NULL pointer check on tg3_io_error_detected(). Also, we add a
check for netdev being NULL on tg3_io_resume() [suggested by Michael Chan].

Fixes: 0486a06 ("tg3: prevent ifup/ifdown during PCI error recovery")
Fixes: dfc8f37 ("net/tg3: Release IRQs on permanent error")
Tested-by: Guilherme G. Piccoli <[email protected]>
Signed-off-by: Milton Miller <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
Acked-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
tobetter pushed a commit that referenced this issue Oct 6, 2016
…age_cache_page()

Antonio reports the following crash when using fuse under memory pressure:

  kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: all of them
  CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic torvalds#55-Ubuntu
  Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
  task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
  RIP: shadow_lru_isolate+0x181/0x190
  Call Trace:
    __list_lru_walk_one.isra.3+0x8f/0x130
    list_lru_walk_one+0x23/0x30
    scan_shadow_nodes+0x34/0x50
    shrink_slab.part.40+0x1ed/0x3d0
    shrink_zone+0x2ca/0x2e0
    kswapd+0x51e/0x990
    kthread+0xd8/0xf0
    ret_from_fork+0x3f/0x70

which corresponds to the following sanity check in the shadow node
tracking:

  BUG_ON(node->count & RADIX_TREE_COUNT_MASK);

The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages in them, and this (somewhat obscure)
line checks whether there are real pages left that would interfere with
reclaim of the radix tree node under memory pressure.

While discussing ways how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw raw radix_tree_insert(), not accounting for the replacement
page.  Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.

To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert().  This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.

Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.

Fixes: 449dd69 ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Johannes Weiner <[email protected]>
Reported-by: Antonio SJ Musumeci <[email protected]>
Debugged-by: Miklos Szeredi <[email protected]>
Cc: <[email protected]>	[3.15+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
bkrepo pushed a commit to bkrepo/linux that referenced this issue Oct 28, 2016
[ Upstream commit d3e6952 ]

I ran into this:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [tobetter#1] PREEMPT SMP KASAN
    CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ tobetter#19
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000
    RIP: 0010:[<ffffffff82bbf066>]  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
    RSP: 0018:ffff880111747bb8  EFLAGS: 00010286
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358
    RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048
    RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000
    R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000
    FS:  00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0
    Stack:
     0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220
     ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232
     ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000
    Call Trace:
     [<ffffffff82bca542>] irda_connect+0x562/0x1190
     [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0
     [<ffffffff825b4489>] SyS_connect+0x9/0x10
     [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
     [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74
    RIP  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
     RSP <ffff880111747bb8>
    ---[ end trace 4cda2588bc055b30 ]---

The problem is that irda_open_tsap() can fail and leave self->tsap = NULL,
and then irttp_connect_request() almost immediately dereferences it.

Cc: [email protected]
Signed-off-by: Vegard Nossum <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
bkrepo pushed a commit to bkrepo/linux that referenced this issue Oct 28, 2016
commit bd975d1 upstream.

The secmech hmac(md5) structures are present in the TCP_Server_Info
struct and can be shared among multiple CIFS sessions.  However, the
server mutex is not currently held when these structures are allocated
and used, which can lead to a kernel crashes, as in the scenario below:

mount.cifs(8) tobetter#1				mount.cifs(8) tobetter#2

Is secmech.sdeschmaccmd5 allocated?
// false

						Is secmech.sdeschmaccmd5 allocated?
						// false

secmech.hmacmd = crypto_alloc_shash..
secmech.sdeschmaccmd5 = kzalloc..
sdeschmaccmd5->shash.tfm = &secmec.hmacmd;

						secmech.sdeschmaccmd5 = kzalloc
						// sdeschmaccmd5->shash.tfm
						// not yet assigned

crypto_shash_update()
 deref NULL sdeschmaccmd5->shash.tfm

 Unable to handle kernel paging request at virtual address 00000030
 epc   : 8027ba34 crypto_shash_update+0x38/0x158
 ra    : 8020f2e8 setup_ntlmv2_rsp+0x4bc/0xa84
 Call Trace:
  crypto_shash_update+0x38/0x158
  setup_ntlmv2_rsp+0x4bc/0xa84
  build_ntlmssp_auth_blob+0xbc/0x34c
  sess_auth_rawntlmssp_authenticate+0xac/0x248
  CIFS_SessSetup+0xf0/0x178
  cifs_setup_session+0x4c/0x84
  cifs_get_smb_ses+0x2c8/0x314
  cifs_mount+0x38c/0x76c
  cifs_do_mount+0x98/0x440
  mount_fs+0x20/0xc0
  vfs_kern_mount+0x58/0x138
  do_mount+0x1e8/0xccc
  SyS_mount+0x88/0xd4
  syscall_common+0x30/0x54

Fix this by locking the srv_mutex around the code which uses these
hmac(md5) structures.  All the other secmech algos already have similar
locking.

Fixes: 95dc8dd ("Limit allocation of crypto mechanisms to dialect which requires")
Signed-off-by: Rabin Vincent <[email protected]>
Acked-by: Sachin Prabhu <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit f1aff4b upstream.

The blammed commit copied to argv the size of the reallocated argv,
instead of the size of the old_argv, thus reading and copying from
past the old_argv allocated memory.

Following BUG_ON was hit:
[    3.038929][    T1] kernel BUG at lib/string_helpers.c:1040!
[    3.039147][    T1] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
...
[    3.056489][    T1] Call trace:
[    3.056591][    T1]  __fortify_panic+0x10/0x18 (P)
[    3.056773][    T1]  dm_split_args+0x20c/0x210
[    3.056942][    T1]  dm_table_add_target+0x13c/0x360
[    3.057132][    T1]  table_load+0x110/0x3ac
[    3.057292][    T1]  dm_ctl_ioctl+0x424/0x56c
[    3.057457][    T1]  __arm64_sys_ioctl+0xa8/0xec
[    3.057634][    T1]  invoke_syscall+0x58/0x10c
[    3.057804][    T1]  el0_svc_common+0xa8/0xdc
[    3.057970][    T1]  do_el0_svc+0x1c/0x28
[    3.058123][    T1]  el0_svc+0x50/0xac
[    3.058266][    T1]  el0t_64_sync_handler+0x60/0xc4
[    3.058452][    T1]  el0t_64_sync+0x1b0/0x1b4
[    3.058620][    T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000)
[    3.058897][    T1] ---[ end trace 0000000000000000 ]---
[    3.059083][    T1] Kernel panic - not syncing: Oops - BUG: Fatal exception

Fix it by copying the size of src, and not the size of dst, as it was.

Fixes: 5a2a6c4 ("dm: always update the array size in realloc_argv on success")
Cc: [email protected]
Signed-off-by: Tudor Ambarus <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit f93431c upstream.

Resurrect ubsan overflow checks and ubsan report this warning,
fix it by change the variable [length] type to size_t.

UBSAN: signed-integer-overflow in net/ipv6/ip6_output.c:1489:19
2147479552 + 8567 cannot be represented in type 'int'
CPU: 0 PID: 253 Comm: err Not tainted 5.16.0+ #1
Hardware name: linux,dummy-virt (DT)
Call trace:
  dump_backtrace+0x214/0x230
  show_stack+0x30/0x78
  dump_stack_lvl+0xf8/0x118
  dump_stack+0x18/0x30
  ubsan_epilogue+0x18/0x60
  handle_overflow+0xd0/0xf0
  __ubsan_handle_add_overflow+0x34/0x44
  __ip6_append_data.isra.48+0x1598/0x1688
  ip6_append_data+0x128/0x260
  udpv6_sendmsg+0x680/0xdd0
  inet6_sendmsg+0x54/0x90
  sock_sendmsg+0x70/0x88
  ____sys_sendmsg+0xe8/0x368
  ___sys_sendmsg+0x98/0xe0
  __sys_sendmmsg+0xf4/0x3b8
  __arm64_sys_sendmmsg+0x34/0x48
  invoke_syscall+0x64/0x160
  el0_svc_common.constprop.4+0x124/0x300
  do_el0_svc+0x44/0xc8
  el0_svc+0x3c/0x1e8
  el0t_64_sync_handler+0x88/0xb0
  el0t_64_sync+0x16c/0x170

Changes since v1:
-Change the variable [length] type to unsigned, as Eric Dumazet suggested.
Changes since v2:
-Don't change exthdrlen type in ip6_make_skb, as Paolo Abeni suggested.
Changes since v3:
-Don't change ulen type in udpv6_sendmsg and l2tp_ip6_sendmsg, as
Jakub Kicinski suggested.

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Yufen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Conflict due to f37a4cc ("udp6: pass flow in ip6_make_skb
  together with cork") not in the tree ]
Signed-off-by: Abdelkareem Abdelsaamad <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 4b8eeed ]

When removing LAG device from bridge, NETDEV_CHANGEUPPER event is
triggered. Driver finds the lower devices (PFs) to flush all the
offloaded entries. And mlx5_lag_is_shared_fdb is checked, it returns
false if one of PF is unloaded. In such case,
mlx5_esw_bridge_lag_rep_get() and its caller return NULL, instead of
the alive PF, and the flush is skipped.

Besides, the bridge fdb entry's lastuse is updated in mlx5 bridge
event handler. But this SWITCHDEV_FDB_ADD_TO_BRIDGE event can be
ignored in this case because the upper interface for bond is deleted,
and the entry will never be aged because lastuse is never updated.

To make things worse, as the entry is alive, mlx5 bridge workqueue
keeps sending that event, which is then handled by kernel bridge
notifier. It causes the following crash when accessing the passed bond
netdev which is already destroyed.

To fix this issue, remove such checks. LAG state is already checked in
commit 15f8f16 ("net/mlx5: Bridge, verify LAG state when adding
bond to bridge"), driver still need to skip offload if LAG becomes
invalid state after initialization.

 Oops: stack segment: 0000 [#1] SMP
 CPU: 3 UID: 0 PID: 23695 Comm: kworker/u40:3 Tainted: G           OE      6.11.0_mlnx #1
 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5_bridge_wq mlx5_esw_bridge_update_work [mlx5_core]
 RIP: 0010:br_switchdev_event+0x2c/0x110 [bridge]
 Code: 44 00 00 48 8b 02 48 f7 00 00 02 00 00 74 69 41 54 55 53 48 83 ec 08 48 8b a8 08 01 00 00 48 85 ed 74 4a 48 83 fe 02 48 89 d3 <4c> 8b 65 00 74 23 76 49 48 83 fe 05 74 7e 48 83 fe 06 75 2f 0f b7
 RSP: 0018:ffffc900092cfda0 EFLAGS: 00010297
 RAX: ffff888123bfe000 RBX: ffffc900092cfe08 RCX: 00000000ffffffff
 RDX: ffffc900092cfe08 RSI: 0000000000000001 RDI: ffffffffa0c585f0
 RBP: 6669746f6e690a30 R08: 0000000000000000 R09: ffff888123ae92c8
 R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888123ae9c60
 R13: 0000000000000001 R14: ffffc900092cfe08 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f15914c8734 CR3: 0000000002830005 CR4: 0000000000770ef0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __die_body+0x1a/0x60
  ? die+0x38/0x60
  ? do_trap+0x10b/0x120
  ? do_error_trap+0x64/0xa0
  ? exc_stack_segment+0x33/0x50
  ? asm_exc_stack_segment+0x22/0x30
  ? br_switchdev_event+0x2c/0x110 [bridge]
  ? sched_balance_newidle.isra.149+0x248/0x390
  notifier_call_chain+0x4b/0xa0
  atomic_notifier_call_chain+0x16/0x20
  mlx5_esw_bridge_update+0xec/0x170 [mlx5_core]
  mlx5_esw_bridge_update_work+0x19/0x40 [mlx5_core]
  process_scheduled_works+0x81/0x390
  worker_thread+0x106/0x250
  ? bh_worker+0x110/0x110
  kthread+0xb7/0xe0
  ? kthread_park+0x80/0x80
  ret_from_fork+0x2d/0x50
  ? kthread_park+0x80/0x80
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fixes: ff9b752 ("net/mlx5: Bridge, support LAG")
Signed-off-by: Jianbo Liu <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 5363ee9 ]

Filesystems can write to disk from page reclaim with __GFP_FS
set. Marc found a case where scsi_realloc_sdev_budget_map() ends up in
page reclaim with GFP_KERNEL, where it could try to take filesystem
locks again, leading to a deadlock.

WARNING: possible circular locking dependency detected
6.13.0 #1 Not tainted
------------------------------------------------------
kswapd0/70 is trying to acquire lock:
ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0

but task is already holding lock:
ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760

The full lockdep splat can be found in Marc's report:

https://lkml.org/lkml/2025/1/24/1101

Avoid the potential deadlock by doing the allocation with GFP_NOIO, which
prevents both filesystem and block layer recursion.

Reported-by: Marc Aurèle La France <[email protected]>
Signed-off-by: Rik van Riel <[email protected]>
Link: https://lore.kernel.org/r/20250129104525.0ae8421e@fangorn
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit 654b33a upstream.

Fix race between rmmod and /proc/XXX's inode instantiation.

The bug is that pde->proc_ops don't belong to /proc, it belongs to a
module, therefore dereferencing it after /proc entry has been registered
is a bug unless use_pde/unuse_pde() pair has been used.

use_pde/unuse_pde can be avoided (2 atomic ops!) because pde->proc_ops
never changes so information necessary for inode instantiation can be
saved _before_ proc_register() in PDE itself and used later, avoiding
pde->proc_ops->...  dereference.

      rmmod                         lookup
sys_delete_module
                         proc_lookup_de
			   pde_get(de);
			   proc_get_inode(dir->i_sb, de);
  mod->exit()
    proc_remove
      remove_proc_subtree
       proc_entry_rundown(de);
  free_module(mod);

                               if (S_ISREG(inode->i_mode))
	                         if (de->proc_ops->proc_read_iter)
                           --> As module is already freed, will trigger UAF

BUG: unable to handle page fault for address: fffffbfff80a702b
PGD 817fc4067 P4D 817fc4067 PUD 817fc0067 PMD 102ef4067 PTE 0
Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 26 UID: 0 PID: 2667 Comm: ls Tainted: G
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:proc_get_inode+0x302/0x6e0
RSP: 0018:ffff88811c837998 EFLAGS: 00010a06
RAX: dffffc0000000000 RBX: ffffffffc0538140 RCX: 0000000000000007
RDX: 1ffffffff80a702b RSI: 0000000000000001 RDI: ffffffffc0538158
RBP: ffff8881299a6000 R08: 0000000067bbe1e5 R09: 1ffff11023906f20
R10: ffffffffb560ca07 R11: ffffffffb2b43a58 R12: ffff888105bb78f0
R13: ffff888100518048 R14: ffff8881299a6004 R15: 0000000000000001
FS:  00007f95b9686840(0000) GS:ffff8883af100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff80a702b CR3: 0000000117dd2000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 proc_lookup_de+0x11f/0x2e0
 __lookup_slow+0x188/0x350
 walk_component+0x2ab/0x4f0
 path_lookupat+0x120/0x660
 filename_lookup+0x1ce/0x560
 vfs_statx+0xac/0x150
 __do_sys_newstat+0x96/0x110
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

[[email protected]: don't do 2 atomic ops on the common path]
Link: https://lkml.kernel.org/r/3d25ded0-1739-447e-812b-e34da7990dcf@p183
Fixes: 778f3dd ("Fix procfs compat_ioctl regression")
Signed-off-by: Ye Bin <[email protected]>
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Al Viro <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit 4676741 upstream.

This fixes the following crash:

==================================================================
BUG: KASAN: slab-use-after-free in rtsx_usb_ms_poll_card+0x159/0x200 [rtsx_usb_ms]
Read of size 8 at addr ffff888136335380 by task kworker/6:0/140241

CPU: 6 UID: 0 PID: 140241 Comm: kworker/6:0 Kdump: loaded Tainted: G            E      6.14.0-rc6+ #1
Tainted: [E]=UNSIGNED_MODULE
Hardware name: LENOVO 30FNA1V7CW/1057, BIOS S0EKT54A 07/01/2024
Workqueue: events rtsx_usb_ms_poll_card [rtsx_usb_ms]
Call Trace:
 <TASK>
 dump_stack_lvl+0x51/0x70
 print_address_description.constprop.0+0x27/0x320
 ? rtsx_usb_ms_poll_card+0x159/0x200 [rtsx_usb_ms]
 print_report+0x3e/0x70
 kasan_report+0xab/0xe0
 ? rtsx_usb_ms_poll_card+0x159/0x200 [rtsx_usb_ms]
 rtsx_usb_ms_poll_card+0x159/0x200 [rtsx_usb_ms]
 ? __pfx_rtsx_usb_ms_poll_card+0x10/0x10 [rtsx_usb_ms]
 ? __pfx___schedule+0x10/0x10
 ? kick_pool+0x3b/0x270
 process_one_work+0x357/0x660
 worker_thread+0x390/0x4c0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0x190/0x1d0
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2d/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 161446:
 kasan_save_stack+0x20/0x40
 kasan_save_track+0x10/0x30
 __kasan_kmalloc+0x7b/0x90
 __kmalloc_noprof+0x1a7/0x470
 memstick_alloc_host+0x1f/0xe0 [memstick]
 rtsx_usb_ms_drv_probe+0x47/0x320 [rtsx_usb_ms]
 platform_probe+0x60/0xe0
 call_driver_probe+0x35/0x120
 really_probe+0x123/0x410
 __driver_probe_device+0xc7/0x1e0
 driver_probe_device+0x49/0xf0
 __device_attach_driver+0xc6/0x160
 bus_for_each_drv+0xe4/0x160
 __device_attach+0x13a/0x2b0
 bus_probe_device+0xbd/0xd0
 device_add+0x4a5/0x760
 platform_device_add+0x189/0x370
 mfd_add_device+0x587/0x5e0
 mfd_add_devices+0xb1/0x130
 rtsx_usb_probe+0x28e/0x2e0 [rtsx_usb]
 usb_probe_interface+0x15c/0x460
 call_driver_probe+0x35/0x120
 really_probe+0x123/0x410
 __driver_probe_device+0xc7/0x1e0
 driver_probe_device+0x49/0xf0
 __device_attach_driver+0xc6/0x160
 bus_for_each_drv+0xe4/0x160
 __device_attach+0x13a/0x2b0
 rebind_marked_interfaces.isra.0+0xcc/0x110
 usb_reset_device+0x352/0x410
 usbdev_do_ioctl+0xe5c/0x1860
 usbdev_ioctl+0xa/0x20
 __x64_sys_ioctl+0xc5/0xf0
 do_syscall_64+0x59/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 161506:
 kasan_save_stack+0x20/0x40
 kasan_save_track+0x10/0x30
 kasan_save_free_info+0x36/0x60
 __kasan_slab_free+0x34/0x50
 kfree+0x1fd/0x3b0
 device_release+0x56/0xf0
 kobject_cleanup+0x73/0x1c0
 rtsx_usb_ms_drv_remove+0x13d/0x220 [rtsx_usb_ms]
 platform_remove+0x2f/0x50
 device_release_driver_internal+0x24b/0x2e0
 bus_remove_device+0x124/0x1d0
 device_del+0x239/0x530
 platform_device_del.part.0+0x19/0xe0
 platform_device_unregister+0x1c/0x40
 mfd_remove_devices_fn+0x167/0x170
 device_for_each_child_reverse+0xc9/0x130
 mfd_remove_devices+0x6e/0xa0
 rtsx_usb_disconnect+0x2e/0xd0 [rtsx_usb]
 usb_unbind_interface+0xf3/0x3f0
 device_release_driver_internal+0x24b/0x2e0
 proc_disconnect_claim+0x13d/0x220
 usbdev_do_ioctl+0xb5e/0x1860
 usbdev_ioctl+0xa/0x20
 __x64_sys_ioctl+0xc5/0xf0
 do_syscall_64+0x59/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x20/0x40
 kasan_record_aux_stack+0x85/0x90
 insert_work+0x29/0x100
 __queue_work+0x34a/0x540
 call_timer_fn+0x2a/0x160
 expire_timers+0x5f/0x1f0
 __run_timer_base.part.0+0x1b6/0x1e0
 run_timer_softirq+0x8b/0xe0
 handle_softirqs+0xf9/0x360
 __irq_exit_rcu+0x114/0x130
 sysvec_apic_timer_interrupt+0x72/0x90
 asm_sysvec_apic_timer_interrupt+0x16/0x20

Second to last potentially related work creation:
 kasan_save_stack+0x20/0x40
 kasan_record_aux_stack+0x85/0x90
 insert_work+0x29/0x100
 __queue_work+0x34a/0x540
 call_timer_fn+0x2a/0x160
 expire_timers+0x5f/0x1f0
 __run_timer_base.part.0+0x1b6/0x1e0
 run_timer_softirq+0x8b/0xe0
 handle_softirqs+0xf9/0x360
 __irq_exit_rcu+0x114/0x130
 sysvec_apic_timer_interrupt+0x72/0x90
 asm_sysvec_apic_timer_interrupt+0x16/0x20

The buggy address belongs to the object at ffff888136335000
 which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 896 bytes inside of
 freed 2048-byte region [ffff888136335000, ffff888136335800)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x136330
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f5(slab)
raw: 0017ffffc0000040 ffff888100042f00 ffffea000417a000 dead000000000002
raw: 0000000000000000 0000000000080008 00000000f5000000 0000000000000000
head: 0017ffffc0000040 ffff888100042f00 ffffea000417a000 dead000000000002
head: 0000000000000000 0000000000080008 00000000f5000000 0000000000000000
head: 0017ffffc0000003 ffffea0004d8cc01 ffffffffffffffff 0000000000000000
head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888136335280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888136335300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888136335380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888136335400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888136335480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Fixes: 6827ca5 ("memstick: rtsx_usb_ms: Support runtime power management")
Signed-off-by: Luo Qiu <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit efdde3d ]

There is case as below could trigger kernel dump:
Use U-Boot to start remote processor(rproc) with resource table
published to a fixed address by rproc. After Kernel boots up,
stop the rproc, load a new firmware which doesn't have resource table
,and start rproc.

When starting rproc with a firmware not have resource table,
`memcpy(loaded_table, rproc->cached_table, rproc->table_sz)` will
trigger dump, because rproc->cache_table is set to NULL during the last
stop operation, but rproc->table_sz is still valid.

This issue is found on i.MX8MP and i.MX9.

Dump as below:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
  ESR = 0x0000000096000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x04: level 0 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000000010af63000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 UID: 0 PID: 1060 Comm: sh Not tainted 6.14.0-rc7-next-20250317-dirty #38
Hardware name: NXP i.MX8MPlus EVK board (DT)
pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __pi_memcpy_generic+0x110/0x22c
lr : rproc_start+0x88/0x1e0
Call trace:
 __pi_memcpy_generic+0x110/0x22c (P)
 rproc_boot+0x198/0x57c
 state_store+0x40/0x104
 dev_attr_store+0x18/0x2c
 sysfs_kf_write+0x7c/0x94
 kernfs_fop_write_iter+0x120/0x1cc
 vfs_write+0x240/0x378
 ksys_write+0x70/0x108
 __arm64_sys_write+0x1c/0x28
 invoke_syscall+0x48/0x10c
 el0_svc_common.constprop.0+0xc0/0xe0
 do_el0_svc+0x1c/0x28
 el0_svc+0x30/0xcc
 el0t_64_sync_handler+0x10c/0x138
 el0t_64_sync+0x198/0x19c

Clear rproc->table_sz to address the issue.

Fixes: 9dc9507 ("remoteproc: Properly deal with the resource table when detaching")
Signed-off-by: Peng Fan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit d19d734 ]

With UBSAN_ARRAY_BOUNDS=y, I'm hitting the below panic due to
dereferencing `ctx->clk_data.hws` before setting
`ctx->clk_data.num = nr_clks`. Move that up to fix the crash.

  UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP
  <snip>
  Call trace:
   samsung_clk_init+0x110/0x124 (P)
   samsung_clk_init+0x48/0x124 (L)
   samsung_cmu_register_one+0x3c/0xa0
   exynos_arm64_register_cmu+0x54/0x64
   __gs101_cmu_top_of_clk_init_declare+0x28/0x60
   ...

Fixes: e620a1e ("drivers/clk: convert VL struct to struct_size")
Signed-off-by: Will McVicker <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit a1ecb30 ]

Commit 467f432 ("RDMA/core: Split port and device counter sysfs
attributes") accidentally almost exposed hw counters to non-init net
namespaces. It didn't expose them fully, as an attempt to read any of
those counters leads to a crash like this one:

[42021.807566] BUG: kernel NULL pointer dereference, address: 0000000000000028
[42021.814463] #PF: supervisor read access in kernel mode
[42021.819549] #PF: error_code(0x0000) - not-present page
[42021.824636] PGD 0 P4D 0
[42021.827145] Oops: 0000 [#1] SMP PTI
[42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump: loaded Tainted: G S      W I        XXX
[42021.841697] Hardware name: XXX
[42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff ff <48> 8b 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48
[42021.873931] RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287
[42021.879108] RAX: ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
[42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI: ffff940c7517aef0
[42021.893230] RBP: ffff97fe90f03e70 R08: ffff94085f1aa000 R09: 0000000000000000
[42021.900294] R10: ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530
[42021.907355] R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
[42021.914418] FS:  00007fda1a3b9700(0000) GS:ffff94453fb80000(0000) knlGS:0000000000000000
[42021.922423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42021.928130] CR2: 0000000000000028 CR3: 00000042dcfb8003 CR4: 00000000003726f0
[42021.935194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[42021.949324] Call Trace:
[42021.951756]  <TASK>
[42021.953842]  [<ffffffff86c58674>] ? show_regs+0x64/0x70
[42021.959030]  [<ffffffff86c58468>] ? __die+0x78/0xc0
[42021.963874]  [<ffffffff86c9ef75>] ? page_fault_oops+0x2b5/0x3b0
[42021.969749]  [<ffffffff87674b92>] ? exc_page_fault+0x1a2/0x3c0
[42021.975549]  [<ffffffff87801326>] ? asm_exc_page_fault+0x26/0x30
[42021.981517]  [<ffffffffc0775680>] ? __pfx_show_hw_stats+0x10/0x10 [ib_core]
[42021.988482]  [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.995438]  [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
[42022.000803]  [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
[42022.006508]  [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
[42022.011954]  [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0
[42022.017058]  [<ffffffff869f50ee>] ksys_read+0x6e/0xe0
[42022.022073]  [<ffffffff8766f1ca>] do_syscall_64+0x6a/0xa0
[42022.027441]  [<ffffffff8780013b>] entry_SYSCALL_64_after_hwframe+0x78/0xe2

The problem can be reproduced using the following steps:
  ip netns add foo
  ip netns exec foo bash
  cat /sys/class/infiniband/mlx4_0/hw_counters/*

The panic occurs because of casting the device pointer into an
ib_device pointer using container_of() in hw_stat_device_show() is
wrong and leads to a memory corruption.

However the real problem is that hw counters should never been exposed
outside of the non-init net namespace.

Fix this by saving the index of the corresponding attribute group
(it might be 1 or 2 depending on the presence of driver-specific
attributes) and zeroing the pointer to hw_counters group for compat
devices during the initialization.

With this fix applied hw_counters are not available in a non-init
net namespace:
  find /sys/class/infiniband/mlx4_0/ -name hw_counters
    /sys/class/infiniband/mlx4_0/ports/1/hw_counters
    /sys/class/infiniband/mlx4_0/ports/2/hw_counters
    /sys/class/infiniband/mlx4_0/hw_counters

  ip netns add foo
  ip netns exec foo bash
  find /sys/class/infiniband/mlx4_0/ -name hw_counters

Fixes: 467f432 ("RDMA/core: Split port and device counter sysfs attributes")
Signed-off-by: Roman Gushchin <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: Maher Sanalla <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://patch.msgid.link/[email protected]
Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 5ed3b0c ]

When cur_qp isn't NULL, in order to avoid fetching the QP from
the radix tree again we check if the next cqe QP is identical to
the one we already have.

The bug however is that we are checking if the QP is identical by
checking the QP number inside the CQE against the QP number inside the
mlx5_ib_qp, but that's wrong since the QP number from the CQE is from
FW so it should be matched against mlx5_core_qp which is our FW QP
number.

Otherwise we could use the wrong QP when handling a CQE which could
cause the kernel trace below.

This issue is mainly noticeable over QPs 0 & 1, since for now they are
the only QPs in our driver whereas the QP number inside mlx5_ib_qp
doesn't match the QP number inside mlx5_core_qp.

BUG: kernel NULL pointer dereference, address: 0000000000000012
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP
 CPU: 0 UID: 0 PID: 7927 Comm: kworker/u62:1 Not tainted 6.14.0-rc3+ torvalds#189
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
 RIP: 0010:mlx5_ib_poll_cq+0x4c7/0xd90 [mlx5_ib]
 Code: 03 00 00 8d 58 ff 21 cb 66 39 d3 74 39 48 c7 c7 3c 89 6e a0 0f b7 db e8 b7 d2 b3 e0 49 8b 86 60 03 00 00 48 c7 c7 4a 89 6e a0 <0f> b7 5c 98 02 e8 9f d2 b3 e0 41 0f b7 86 78 03 00 00 83 e8 01 21
 RSP: 0018:ffff88810511bd60 EFLAGS: 00010046
 RAX: 0000000000000010 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff88885fa1b3c0 RDI: ffffffffa06e894a
 RBP: 00000000000000b0 R08: 0000000000000000 R09: ffff88810511bc10
 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88810d593000
 R13: ffff88810e579108 R14: ffff888105146000 R15: 00000000000000b0
 FS:  0000000000000000(0000) GS:ffff88885fa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000012 CR3: 00000001077e6001 CR4: 0000000000370eb0
 Call Trace:
  <TASK>
  ? __die+0x20/0x60
  ? page_fault_oops+0x150/0x3e0
  ? exc_page_fault+0x74/0x130
  ? asm_exc_page_fault+0x22/0x30
  ? mlx5_ib_poll_cq+0x4c7/0xd90 [mlx5_ib]
  __ib_process_cq+0x5a/0x150 [ib_core]
  ib_cq_poll_work+0x31/0x90 [ib_core]
  process_one_work+0x169/0x320
  worker_thread+0x288/0x3a0
  ? work_busy+0xb0/0xb0
  kthread+0xd7/0x1f0
  ? kthreads_online_cpu+0x130/0x130
  ? kthreads_online_cpu+0x130/0x130
  ret_from_fork+0x2d/0x50
  ? kthreads_online_cpu+0x130/0x130
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fixes: e126ba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Patrisious Haddad <[email protected]>
Reviewed-by: Edward Srouji <[email protected]>
Link: https://patch.msgid.link/4ada09d41f1e36db62c44a9b25c209ea5f054316.1741875692.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 23f0080 ]

Commit 30aad41 ("net/core: Add support for getting VF GUIDs")
added support for getting VF port and node GUIDs in netlink ifinfo
messages, but their size was not taken into consideration in the
function that allocates the netlink message, causing the following
warning when a netlink message is filled with many VF port and node
GUIDs:
 # echo 64 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
 # ip link show dev ib0
 RTNETLINK answers: Message too long
 Cannot send link get request: Message too long

Kernel warning:

 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1930 at net/core/rtnetlink.c:4151 rtnl_getlink+0x586/0x5a0
 Modules linked in: xt_conntrack xt_MASQUERADE nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay mlx5_ib macsec mlx5_core tls rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm iw_cm ib_ipoib fuse ib_cm ib_core
 CPU: 2 UID: 0 PID: 1930 Comm: ip Not tainted 6.14.0-rc2+ #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:rtnl_getlink+0x586/0x5a0
 Code: cb 82 e8 3d af 0a 00 4d 85 ff 0f 84 08 ff ff ff 4c 89 ff 41 be ea ff ff ff e8 66 63 5b ff 49 c7 07 80 4f cb 82 e9 36 fc ff ff <0f> 0b e9 16 fe ff ff e8 de a0 56 00 66 66 2e 0f 1f 84 00 00 00 00
 RSP: 0018:ffff888113557348 EFLAGS: 00010246
 RAX: 00000000ffffffa6 RBX: ffff88817e87aa34 RCX: dffffc0000000000
 RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffff88817e87afb8
 RBP: 0000000000000009 R08: ffffffff821f44aa R09: 0000000000000000
 R10: ffff8881260f79a8 R11: ffff88817e87af00 R12: ffff88817e87aa00
 R13: ffffffff8563d300 R14: 00000000ffffffa6 R15: 00000000ffffffff
 FS:  00007f63a5dbf280(0000) GS:ffff88881ee00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f63a5ba4493 CR3: 00000001700fe002 CR4: 0000000000772eb0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __warn+0xa5/0x230
  ? rtnl_getlink+0x586/0x5a0
  ? report_bug+0x22d/0x240
  ? handle_bug+0x53/0xa0
  ? exc_invalid_op+0x14/0x50
  ? asm_exc_invalid_op+0x16/0x20
  ? skb_trim+0x6a/0x80
  ? rtnl_getlink+0x586/0x5a0
  ? __pfx_rtnl_getlink+0x10/0x10
  ? rtnetlink_rcv_msg+0x1e5/0x860
  ? __pfx___mutex_lock+0x10/0x10
  ? rcu_is_watching+0x34/0x60
  ? __pfx_lock_acquire+0x10/0x10
  ? stack_trace_save+0x90/0xd0
  ? filter_irq_stacks+0x1d/0x70
  ? kasan_save_stack+0x30/0x40
  ? kasan_save_stack+0x20/0x40
  ? kasan_save_track+0x10/0x30
  rtnetlink_rcv_msg+0x21c/0x860
  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
  ? arch_stack_walk+0x9e/0xf0
  ? rcu_is_watching+0x34/0x60
  ? lock_acquire+0xd5/0x410
  ? rcu_is_watching+0x34/0x60
  netlink_rcv_skb+0xe0/0x210
  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
  ? __pfx_netlink_rcv_skb+0x10/0x10
  ? rcu_is_watching+0x34/0x60
  ? __pfx___netlink_lookup+0x10/0x10
  ? lock_release+0x62/0x200
  ? netlink_deliver_tap+0xfd/0x290
  ? rcu_is_watching+0x34/0x60
  ? lock_release+0x62/0x200
  ? netlink_deliver_tap+0x95/0x290
  netlink_unicast+0x31f/0x480
  ? __pfx_netlink_unicast+0x10/0x10
  ? rcu_is_watching+0x34/0x60
  ? lock_acquire+0xd5/0x410
  netlink_sendmsg+0x369/0x660
  ? lock_release+0x62/0x200
  ? __pfx_netlink_sendmsg+0x10/0x10
  ? import_ubuf+0xb9/0xf0
  ? __import_iovec+0x254/0x2b0
  ? lock_release+0x62/0x200
  ? __pfx_netlink_sendmsg+0x10/0x10
  ____sys_sendmsg+0x559/0x5a0
  ? __pfx_____sys_sendmsg+0x10/0x10
  ? __pfx_copy_msghdr_from_user+0x10/0x10
  ? rcu_is_watching+0x34/0x60
  ? do_read_fault+0x213/0x4a0
  ? rcu_is_watching+0x34/0x60
  ___sys_sendmsg+0xe4/0x150
  ? __pfx____sys_sendmsg+0x10/0x10
  ? do_fault+0x2cc/0x6f0
  ? handle_pte_fault+0x2e3/0x3d0
  ? __pfx_handle_pte_fault+0x10/0x10
  ? preempt_count_sub+0x14/0xc0
  ? __down_read_trylock+0x150/0x270
  ? __handle_mm_fault+0x404/0x8e0
  ? __pfx___handle_mm_fault+0x10/0x10
  ? lock_release+0x62/0x200
  ? __rcu_read_unlock+0x65/0x90
  ? rcu_is_watching+0x34/0x60
  __sys_sendmsg+0xd5/0x150
  ? __pfx___sys_sendmsg+0x10/0x10
  ? __up_read+0x192/0x480
  ? lock_release+0x62/0x200
  ? __rcu_read_unlock+0x65/0x90
  ? rcu_is_watching+0x34/0x60
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f63a5b13367
 Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
 RSP: 002b:00007fff8c726bc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 0000000067b687c2 RCX: 00007f63a5b13367
 RDX: 0000000000000000 RSI: 00007fff8c726c30 RDI: 0000000000000004
 RBP: 00007fff8c726cb8 R08: 0000000000000000 R09: 0000000000000034
 R10: 00007fff8c726c7c R11: 0000000000000246 R12: 0000000000000001
 R13: 0000000000000000 R14: 00007fff8c726cd0 R15: 00007fff8c726cd0
  </TASK>
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>] 0x0
 hardirqs last disabled at (0): [<ffffffff813f9e58>] copy_process+0xd08/0x2830
 softirqs last  enabled at (0): [<ffffffff813f9e58>] copy_process+0xd08/0x2830
 softirqs last disabled at (0): [<0000000000000000>] 0x0
 ---[ end trace 0000000000000000 ]---

Thus, when calculating ifinfo message size, take VF GUIDs sizes into
account when supported.

Fixes: 30aad41 ("net/core: Add support for getting VF GUIDs")
Signed-off-by: Mark Zhang <[email protected]>
Reviewed-by: Maher Sanalla <[email protected]>
Signed-off-by: Mark Bloch <[email protected]>
Reviewed-by: Sabrina Dubroca <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 8c1624b ]

nvme_tcp_poll() may race with the send path error handler because
it may complete the request while it is actively being polled for
completion, resulting in a UAF panic [1]:

We should make sure to stop polling when we see an error when
trying to read from the socket. Hence make sure to propagate the
error so that the block layer breaks the polling cycle.

[1]:
--
[35665.692310] nvme nvme2: failed to send request -13
[35665.702265] nvme nvme2: unsupported pdu type (3)
[35665.702272] BUG: kernel NULL pointer dereference, address: 0000000000000000
[35665.702542] nvme nvme2: queue 1 receive failed:  -22
[35665.703209] #PF: supervisor write access in kernel mode
[35665.703213] #PF: error_code(0x0002) - not-present page
[35665.703214] PGD 8000003801cce067 P4D 8000003801cce067 PUD 37e6f79067 PMD 0
[35665.703220] Oops: 0002 [#1] SMP PTI
[35665.703658] nvme nvme2: starting error recovery
[35665.705809] Hardware name: Inspur aaabbb/YZMB-00882-104, BIOS 4.1.26 09/22/2022
[35665.705812] Workqueue: kblockd blk_mq_requeue_work
[35665.709172] RIP: 0010:_raw_spin_lock+0xc/0x30
[35665.715788] Call Trace:
[35665.716201]  <TASK>
[35665.716613]  ? show_trace_log_lvl+0x1c1/0x2d9
[35665.717049]  ? show_trace_log_lvl+0x1c1/0x2d9
[35665.717457]  ? blk_mq_request_bypass_insert+0x2c/0xb0
[35665.717950]  ? __die_body.cold+0x8/0xd
[35665.718361]  ? page_fault_oops+0xac/0x140
[35665.718749]  ? blk_mq_start_request+0x30/0xf0
[35665.719144]  ? nvme_tcp_queue_rq+0xc7/0x170 [nvme_tcp]
[35665.719547]  ? exc_page_fault+0x62/0x130
[35665.719938]  ? asm_exc_page_fault+0x22/0x30
[35665.720333]  ? _raw_spin_lock+0xc/0x30
[35665.720723]  blk_mq_request_bypass_insert+0x2c/0xb0
[35665.721101]  blk_mq_requeue_work+0xa5/0x180
[35665.721451]  process_one_work+0x1e8/0x390
[35665.721809]  worker_thread+0x53/0x3d0
[35665.722159]  ? process_one_work+0x390/0x390
[35665.722501]  kthread+0x124/0x150
[35665.722849]  ? set_kthread_struct+0x50/0x50
[35665.723182]  ret_from_fork+0x1f/0x30

Reported-by: Zhang Guanghui <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
…cal section

[ Upstream commit 85b2b9c ]

A circular lock dependency splat has been seen involving down_trylock():

  ======================================================
  WARNING: possible circular locking dependency detected
  6.12.0-41.el10.s390x+debug
  ------------------------------------------------------
  dd/32479 is trying to acquire lock:
  0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90

  but task is already holding lock:
  000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0

  the existing dependency chain (in reverse order) is:
  -> #4 (&zone->lock){-.-.}-{2:2}:
  -> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
  -> #2 (&rq->__lock){-.-.}-{2:2}:
  -> #1 (&p->pi_lock){-.-.}-{2:2}:
  -> #0 ((console_sem).lock){-.-.}-{2:2}:

The console_sem -> pi_lock dependency is due to calling try_to_wake_up()
while holding the console_sem raw_spinlock. This dependency can be broken
by using wake_q to do the wakeup instead of calling try_to_wake_up()
under the console_sem lock. This will also make the semaphore's
raw_spinlock become a terminal lock without taking any further locks
underneath it.

The hrtimer_bases.lock is a raw_spinlock while zone->lock is a
spinlock. The hrtimer_bases.lock -> zone->lock dependency happens via
the debug_objects_fill_pool() helper function in the debugobjects code.

  -> #4 (&zone->lock){-.-.}-{2:2}:
         __lock_acquire+0xe86/0x1cc0
         lock_acquire.part.0+0x258/0x630
         lock_acquire+0xb8/0xe0
         _raw_spin_lock_irqsave+0xb4/0x120
         rmqueue_bulk+0xac/0x8f0
         __rmqueue_pcplist+0x580/0x830
         rmqueue_pcplist+0xfc/0x470
         rmqueue.isra.0+0xdec/0x11b0
         get_page_from_freelist+0x2ee/0xeb0
         __alloc_pages_noprof+0x2c2/0x520
         alloc_pages_mpol_noprof+0x1fc/0x4d0
         alloc_pages_noprof+0x8c/0xe0
         allocate_slab+0x320/0x460
         ___slab_alloc+0xa58/0x12b0
         __slab_alloc.isra.0+0x42/0x60
         kmem_cache_alloc_noprof+0x304/0x350
         fill_pool+0xf6/0x450
         debug_object_activate+0xfe/0x360
         enqueue_hrtimer+0x34/0x190
         __run_hrtimer+0x3c8/0x4c0
         __hrtimer_run_queues+0x1b2/0x260
         hrtimer_interrupt+0x316/0x760
         do_IRQ+0x9a/0xe0
         do_irq_async+0xf6/0x160

Normally a raw_spinlock to spinlock dependency is not legitimate
and will be warned if CONFIG_PROVE_RAW_LOCK_NESTING is enabled,
but debug_objects_fill_pool() is an exception as it explicitly
allows this dependency for non-PREEMPT_RT kernel without causing
PROVE_RAW_LOCK_NESTING lockdep splat. As a result, this dependency is
legitimate and not a bug.

Anyway, semaphore is the only locking primitive left that is still
using try_to_wake_up() to do wakeup inside critical section, all the
other locking primitives had been migrated to use wake_q to do wakeup
outside of the critical section. It is also possible that there are
other circular locking dependencies involving printk/console_sem or
other existing/new semaphores lurking somewhere which may show up in
the future. Let just do the migration now to wake_q to avoid headache
like this.

Reported-by: [email protected]
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 1b755d8 ]

When handling multiple NFTA_TUNNEL_KEY_OPTS_GENEVE attributes, the
parsing logic should place every geneve_opt structure one by one
compactly. Hence, when deciding the next geneve_opt position, the
pointer addition should be in units of char *.

However, the current implementation erroneously does type conversion
before the addition, which will lead to heap out-of-bounds write.

[    6.989857] ==================================================================
[    6.990293] BUG: KASAN: slab-out-of-bounds in nft_tunnel_obj_init+0x977/0xa70
[    6.990725] Write of size 124 at addr ffff888005f18974 by task poc/178
[    6.991162]
[    6.991259] CPU: 0 PID: 178 Comm: poc-oob-write Not tainted 6.1.132 #1
[    6.991655] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[    6.992281] Call Trace:
[    6.992423]  <TASK>
[    6.992586]  dump_stack_lvl+0x44/0x5c
[    6.992801]  print_report+0x184/0x4be
[    6.993790]  kasan_report+0xc5/0x100
[    6.994252]  kasan_check_range+0xf3/0x1a0
[    6.994486]  memcpy+0x38/0x60
[    6.994692]  nft_tunnel_obj_init+0x977/0xa70
[    6.995677]  nft_obj_init+0x10c/0x1b0
[    6.995891]  nf_tables_newobj+0x585/0x950
[    6.996922]  nfnetlink_rcv_batch+0xdf9/0x1020
[    6.998997]  nfnetlink_rcv+0x1df/0x220
[    6.999537]  netlink_unicast+0x395/0x530
[    7.000771]  netlink_sendmsg+0x3d0/0x6d0
[    7.001462]  __sock_sendmsg+0x99/0xa0
[    7.001707]  ____sys_sendmsg+0x409/0x450
[    7.002391]  ___sys_sendmsg+0xfd/0x170
[    7.003145]  __sys_sendmsg+0xea/0x170
[    7.004359]  do_syscall_64+0x5e/0x90
[    7.005817]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[    7.006127] RIP: 0033:0x7ec756d4e407
[    7.006339] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 faf
[    7.007364] RSP: 002b:00007ffed5d46760 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[    7.007827] RAX: ffffffffffffffda RBX: 00007ec756cc4740 RCX: 00007ec756d4e407
[    7.008223] RDX: 0000000000000000 RSI: 00007ffed5d467f0 RDI: 0000000000000003
[    7.008620] RBP: 00007ffed5d468a0 R08: 0000000000000000 R09: 0000000000000000
[    7.009039] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[    7.009429] R13: 00007ffed5d478b0 R14: 00007ec756ee5000 R15: 00005cbd4e655cb8

Fix this bug with correct pointer addition and conversion in parse
and dump code.

Fixes: 925d844 ("netfilter: nft_tunnel: add support for geneve opts")
Signed-off-by: Lin Ma <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit b27055a ]

struct geneve_opt uses 5 bit length for each single option, which
means every vary size option should be smaller than 128 bytes.

However, all current related Netlink policies cannot promise this
length condition and the attacker can exploit a exact 128-byte size
option to *fake* a zero length option and confuse the parsing logic,
further achieve heap out-of-bounds read.

One example crash log is like below:

[    3.905425] ==================================================================
[    3.905925] BUG: KASAN: slab-out-of-bounds in nla_put+0xa9/0xe0
[    3.906255] Read of size 124 at addr ffff888005f291cc by task poc/177
[    3.906646]
[    3.906775] CPU: 0 PID: 177 Comm: poc-oob-read Not tainted 6.1.132 #1
[    3.907131] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[    3.907784] Call Trace:
[    3.907925]  <TASK>
[    3.908048]  dump_stack_lvl+0x44/0x5c
[    3.908258]  print_report+0x184/0x4be
[    3.909151]  kasan_report+0xc5/0x100
[    3.909539]  kasan_check_range+0xf3/0x1a0
[    3.909794]  memcpy+0x1f/0x60
[    3.909968]  nla_put+0xa9/0xe0
[    3.910147]  tunnel_key_dump+0x945/0xba0
[    3.911536]  tcf_action_dump_1+0x1c1/0x340
[    3.912436]  tcf_action_dump+0x101/0x180
[    3.912689]  tcf_exts_dump+0x164/0x1e0
[    3.912905]  fw_dump+0x18b/0x2d0
[    3.913483]  tcf_fill_node+0x2ee/0x460
[    3.914778]  tfilter_notify+0xf4/0x180
[    3.915208]  tc_new_tfilter+0xd51/0x10d0
[    3.918615]  rtnetlink_rcv_msg+0x4a2/0x560
[    3.919118]  netlink_rcv_skb+0xcd/0x200
[    3.919787]  netlink_unicast+0x395/0x530
[    3.921032]  netlink_sendmsg+0x3d0/0x6d0
[    3.921987]  __sock_sendmsg+0x99/0xa0
[    3.922220]  __sys_sendto+0x1b7/0x240
[    3.922682]  __x64_sys_sendto+0x72/0x90
[    3.922906]  do_syscall_64+0x5e/0x90
[    3.923814]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[    3.924122] RIP: 0033:0x7e83eab84407
[    3.924331] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 faf
[    3.925330] RSP: 002b:00007ffff505e370 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[    3.925752] RAX: ffffffffffffffda RBX: 00007e83eaafa740 RCX: 00007e83eab84407
[    3.926173] RDX: 00000000000001a8 RSI: 00007ffff505e3c0 RDI: 0000000000000003
[    3.926587] RBP: 00007ffff505f460 R08: 00007e83eace1000 R09: 000000000000000c
[    3.926977] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffff505f3c0
[    3.927367] R13: 00007ffff505f5c8 R14: 00007e83ead1b000 R15: 00005d4fbbe6dcb8

Fix these issues by enforing correct length condition in related
policies.

Fixes: 925d844 ("netfilter: nft_tunnel: add support for geneve opts")
Fixes: 4ece477 ("lwtunnel: add options setting and dumping for geneve")
Fixes: 0ed5269 ("net/sched: add tunnel option support to act_tunnel_key")
Fixes: 0a6e777 ("net/sched: allow flower to match tunnel options")
Signed-off-by: Lin Ma <[email protected]>
Reviewed-by: Xin Long <[email protected]>
Acked-by: Cong Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit d5e2067 upstream.

Mounting a corrupted filesystem with directory which contains '.' dir
entry with rec_len == block size results in out-of-bounds read (later
on, when the corrupted directory is removed).

ext4_empty_dir() assumes every ext4 directory contains at least '.'
and '..' as directory entries in the first data block. It first loads
the '.' dir entry, performs sanity checks by calling ext4_check_dir_entry()
and then uses its rec_len member to compute the location of '..' dir
entry (in ext4_next_entry). It assumes the '..' dir entry fits into the
same data block.

If the rec_len of '.' is precisely one block (4KB), it slips through the
sanity checks (it is considered the last directory entry in the data
block) and leaves "struct ext4_dir_entry_2 *de" point exactly past the
memory slot allocated to the data block. The following call to
ext4_check_dir_entry() on new value of de then dereferences this pointer
which results in out-of-bounds mem access.

Fix this by extending __ext4_check_dir_entry() to check for '.' dir
entries that reach the end of data block. Make sure to ignore the phony
dir entries for checksum (by checking name_len for non-zero).

Note: This is reported by KASAN as use-after-free in case another
structure was recently freed from the slot past the bound, but it is
really an OOB read.

This issue was found by syzkaller tool.

Call Trace:
[   38.594108] BUG: KASAN: slab-use-after-free in __ext4_check_dir_entry+0x67e/0x710
[   38.594649] Read of size 2 at addr ffff88802b41a004 by task syz-executor/5375
[   38.595158]
[   38.595288] CPU: 0 UID: 0 PID: 5375 Comm: syz-executor Not tainted 6.14.0-rc7 #1
[   38.595298] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[   38.595304] Call Trace:
[   38.595308]  <TASK>
[   38.595311]  dump_stack_lvl+0xa7/0xd0
[   38.595325]  print_address_description.constprop.0+0x2c/0x3f0
[   38.595339]  ? __ext4_check_dir_entry+0x67e/0x710
[   38.595349]  print_report+0xaa/0x250
[   38.595359]  ? __ext4_check_dir_entry+0x67e/0x710
[   38.595368]  ? kasan_addr_to_slab+0x9/0x90
[   38.595378]  kasan_report+0xab/0xe0
[   38.595389]  ? __ext4_check_dir_entry+0x67e/0x710
[   38.595400]  __ext4_check_dir_entry+0x67e/0x710
[   38.595410]  ext4_empty_dir+0x465/0x990
[   38.595421]  ? __pfx_ext4_empty_dir+0x10/0x10
[   38.595432]  ext4_rmdir.part.0+0x29a/0xd10
[   38.595441]  ? __dquot_initialize+0x2a7/0xbf0
[   38.595455]  ? __pfx_ext4_rmdir.part.0+0x10/0x10
[   38.595464]  ? __pfx___dquot_initialize+0x10/0x10
[   38.595478]  ? down_write+0xdb/0x140
[   38.595487]  ? __pfx_down_write+0x10/0x10
[   38.595497]  ext4_rmdir+0xee/0x140
[   38.595506]  vfs_rmdir+0x209/0x670
[   38.595517]  ? lookup_one_qstr_excl+0x3b/0x190
[   38.595529]  do_rmdir+0x363/0x3c0
[   38.595537]  ? __pfx_do_rmdir+0x10/0x10
[   38.595544]  ? strncpy_from_user+0x1ff/0x2e0
[   38.595561]  __x64_sys_unlinkat+0xf0/0x130
[   38.595570]  do_syscall_64+0x5b/0x180
[   38.595583]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: ac27a0e ("[PATCH] ext4: initial copy of files from ext3")
Signed-off-by: Jakub Acs <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Mahmoud Adam <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit b61e69b ]

syzbot report a deadlock in diFree. [1]

When calling "ioctl$LOOP_SET_STATUS64", the offset value passed in is 4,
which does not match the mounted loop device, causing the mapping of the
mounted loop device to be invalidated.

When creating the directory and creating the inode of iag in diReadSpecial(),
read the page of fixed disk inode (AIT) in raw mode in read_metapage(), the
metapage data it returns is corrupted, which causes the nlink value of 0 to be
assigned to the iag inode when executing copy_from_dinode(), which ultimately
causes a deadlock when entering diFree().

To avoid this, first check the nlink value of dinode before setting iag inode.

[1]
WARNING: possible recursive locking detected
6.12.0-rc7-syzkaller-00212-g4a5df3796467 #0 Not tainted
--------------------------------------------
syz-executor301/5309 is trying to acquire lock:
ffff888044548920 (&(imap->im_aglock[index])){+.+.}-{3:3}, at: diFree+0x37c/0x2fb0 fs/jfs/jfs_imap.c:889

but task is already holding lock:
ffff888044548920 (&(imap->im_aglock[index])){+.+.}-{3:3}, at: diAlloc+0x1b6/0x1630

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(imap->im_aglock[index]));
  lock(&(imap->im_aglock[index]));

 *** DEADLOCK ***

 May be due to missing lock nesting notation

5 locks held by syz-executor301/5309:
 #0: ffff8880422a4420 (sb_writers#9){.+.+}-{0:0}, at: mnt_want_write+0x3f/0x90 fs/namespace.c:515
 #1: ffff88804755b390 (&type->i_mutex_dir_key#6/1){+.+.}-{3:3}, at: inode_lock_nested include/linux/fs.h:850 [inline]
 #1: ffff88804755b390 (&type->i_mutex_dir_key#6/1){+.+.}-{3:3}, at: filename_create+0x260/0x540 fs/namei.c:4026
 #2: ffff888044548920 (&(imap->im_aglock[index])){+.+.}-{3:3}, at: diAlloc+0x1b6/0x1630
 #3: ffff888044548890 (&imap->im_freelock){+.+.}-{3:3}, at: diNewIAG fs/jfs/jfs_imap.c:2460 [inline]
 #3: ffff888044548890 (&imap->im_freelock){+.+.}-{3:3}, at: diAllocExt fs/jfs/jfs_imap.c:1905 [inline]
 #3: ffff888044548890 (&imap->im_freelock){+.+.}-{3:3}, at: diAllocAG+0x4b7/0x1e50 fs/jfs/jfs_imap.c:1669
 #4: ffff88804755a618 (&jfs_ip->rdwrlock/1){++++}-{3:3}, at: diNewIAG fs/jfs/jfs_imap.c:2477 [inline]
 #4: ffff88804755a618 (&jfs_ip->rdwrlock/1){++++}-{3:3}, at: diAllocExt fs/jfs/jfs_imap.c:1905 [inline]
 #4: ffff88804755a618 (&jfs_ip->rdwrlock/1){++++}-{3:3}, at: diAllocAG+0x869/0x1e50 fs/jfs/jfs_imap.c:1669

stack backtrace:
CPU: 0 UID: 0 PID: 5309 Comm: syz-executor301 Not tainted 6.12.0-rc7-syzkaller-00212-g4a5df3796467 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_deadlock_bug+0x483/0x620 kernel/locking/lockdep.c:3037
 check_deadlock kernel/locking/lockdep.c:3089 [inline]
 validate_chain+0x15e2/0x5920 kernel/locking/lockdep.c:3891
 __lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 diFree+0x37c/0x2fb0 fs/jfs/jfs_imap.c:889
 jfs_evict_inode+0x32d/0x440 fs/jfs/inode.c:156
 evict+0x4e8/0x9b0 fs/inode.c:725
 diFreeSpecial fs/jfs/jfs_imap.c:552 [inline]
 duplicateIXtree+0x3c6/0x550 fs/jfs/jfs_imap.c:3022
 diNewIAG fs/jfs/jfs_imap.c:2597 [inline]
 diAllocExt fs/jfs/jfs_imap.c:1905 [inline]
 diAllocAG+0x17dc/0x1e50 fs/jfs/jfs_imap.c:1669
 diAlloc+0x1d2/0x1630 fs/jfs/jfs_imap.c:1590
 ialloc+0x8f/0x900 fs/jfs/jfs_inode.c:56
 jfs_mkdir+0x1c5/0xba0 fs/jfs/namei.c:225
 vfs_mkdir+0x2f9/0x4f0 fs/namei.c:4257
 do_mkdirat+0x264/0x3a0 fs/namei.c:4280
 __do_sys_mkdirat fs/namei.c:4295 [inline]
 __se_sys_mkdirat fs/namei.c:4293 [inline]
 __x64_sys_mkdirat+0x87/0xa0 fs/namei.c:4293
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=355da3b3a74881008e8f
Signed-off-by: Edward Adam Davis <[email protected]>
Signed-off-by: Dave Kleikamp <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 27b9180 ]

With the device instance lock, there is now a possibility of a deadlock:

[    1.211455] ============================================
[    1.211571] WARNING: possible recursive locking detected
[    1.211687] 6.14.0-rc5-01215-g032756b4ca7a-dirty #5 Not tainted
[    1.211823] --------------------------------------------
[    1.211936] ip/184 is trying to acquire lock:
[    1.212032] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_set_allmulti+0x4e/0xb0
[    1.212207]
[    1.212207] but task is already holding lock:
[    1.212332] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[    1.212487]
[    1.212487] other info that might help us debug this:
[    1.212626]  Possible unsafe locking scenario:
[    1.212626]
[    1.212751]        CPU0
[    1.212815]        ----
[    1.212871]   lock(&dev->lock);
[    1.212944]   lock(&dev->lock);
[    1.213016]
[    1.213016]  *** DEADLOCK ***
[    1.213016]
[    1.213143]  May be due to missing lock nesting notation
[    1.213143]
[    1.213294] 3 locks held by ip/184:
[    1.213371]  #0: ffffffff838b53e0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x1b/0xa0
[    1.213543]  #1: ffffffff84e5fc70 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x37/0xa0
[    1.213727]  #2: ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[    1.213895]
[    1.213895] stack backtrace:
[    1.213991] CPU: 0 UID: 0 PID: 184 Comm: ip Not tainted 6.14.0-rc5-01215-g032756b4ca7a-dirty #5
[    1.213993] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[    1.213994] Call Trace:
[    1.213995]  <TASK>
[    1.213996]  dump_stack_lvl+0x8e/0xd0
[    1.214000]  print_deadlock_bug+0x28b/0x2a0
[    1.214020]  lock_acquire+0xea/0x2a0
[    1.214027]  __mutex_lock+0xbf/0xd40
[    1.214038]  dev_set_allmulti+0x4e/0xb0 # real_dev->flags & IFF_ALLMULTI
[    1.214040]  vlan_dev_open+0xa5/0x170 # ndo_open on vlandev
[    1.214042]  __dev_open+0x145/0x270
[    1.214046]  __dev_change_flags+0xb0/0x1e0
[    1.214051]  netif_change_flags+0x22/0x60 # IFF_UP vlandev
[    1.214053]  dev_change_flags+0x61/0xb0 # for each device in group from dev->vlan_info
[    1.214055]  vlan_device_event+0x766/0x7c0 # on netdevsim0
[    1.214058]  notifier_call_chain+0x78/0x120
[    1.214062]  netif_open+0x6d/0x90
[    1.214064]  dev_open+0x5b/0xb0 # locks netdevsim0
[    1.214066]  bond_enslave+0x64c/0x1230
[    1.214075]  do_set_master+0x175/0x1e0 # on netdevsim0
[    1.214077]  do_setlink+0x516/0x13b0
[    1.214094]  rtnl_newlink+0xaba/0xb80
[    1.214132]  rtnetlink_rcv_msg+0x440/0x490
[    1.214144]  netlink_rcv_skb+0xeb/0x120
[    1.214150]  netlink_unicast+0x1f9/0x320
[    1.214153]  netlink_sendmsg+0x346/0x3f0
[    1.214157]  __sock_sendmsg+0x86/0xb0
[    1.214160]  ____sys_sendmsg+0x1c8/0x220
[    1.214164]  ___sys_sendmsg+0x28f/0x2d0
[    1.214179]  __x64_sys_sendmsg+0xef/0x140
[    1.214184]  do_syscall_64+0xec/0x1d0
[    1.214190]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[    1.214191] RIP: 0033:0x7f2d1b4a7e56

Device setup:

     netdevsim0 (down)
     ^        ^
  bond        netdevsim1.100@netdevsim1 allmulticast=on (down)

When we enslave the lower device (netdevsim0) which has a vlan, we
propagate vlan's allmuti/promisc flags during ndo_open. This causes
(re)locking on of the real_dev.

Propagate allmulti/promisc on flags change, not on the open. There
is a slight semantics change that vlans that are down now propagate
the flags, but this seems unlikely to result in the real issues.

Reproducer:

  echo 0 1 > /sys/bus/netdevsim/new_device

  dev_path=$(ls -d /sys/bus/netdevsim/devices/netdevsim0/net/*)
  dev=$(echo $dev_path | rev | cut -d/ -f1 | rev)

  ip link set dev $dev name netdevsim0
  ip link set dev netdevsim0 up

  ip link add link netdevsim0 name netdevsim0.100 type vlan id 100
  ip link set dev netdevsim0.100 allmulticast on down
  ip link add name bond1 type bond mode 802.3ad
  ip link set dev netdevsim0 down
  ip link set dev netdevsim0 master bond1
  ip link set dev bond1 up
  ip link show

Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/Z9CfXjLMKn6VLG5d@mini-arch/T/#m15ba130f53227c883e79fb969687d69d670337a0
Signed-off-by: Stanislav Fomichev <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 378677e ]

After ieee80211_do_stop() SKB from vif's txq could still be processed.
Indeed another concurrent vif schedule_and_wake_txq call could cause
those packets to be dequeued (see ieee80211_handle_wake_tx_queue())
without checking the sdata current state.

Because vif.drv_priv is now cleared in this function, this could lead to
driver crash.

For example in ath12k, ahvif is store in vif.drv_priv. Thus if
ath12k_mac_op_tx() is called after ieee80211_do_stop(), ahvif->ah can be
NULL, leading the ath12k_warn(ahvif->ah,...) call in this function to
trigger the NULL deref below.

  Unable to handle kernel paging request at virtual address dfffffc000000001
  KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
  batman_adv: bat0: Interface deactivated: brbh1337
  Mem abort info:
    ESR = 0x0000000096000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x04: level 0 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  [dfffffc000000001] address between user and kernel address ranges
  Internal error: Oops: 0000000096000004 [#1] SMP
  CPU: 1 UID: 0 PID: 978 Comm: lbd Not tainted 6.13.0-g633f875b8f1e torvalds#114
  Hardware name: HW (DT)
  pstate: 10000005 (nzcV daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : ath12k_mac_op_tx+0x6cc/0x29b8 [ath12k]
  lr : ath12k_mac_op_tx+0x174/0x29b8 [ath12k]
  sp : ffffffc086ace450
  x29: ffffffc086ace450 x28: 0000000000000000 x27: 1ffffff810d59ca4
  x26: ffffff801d05f7c0 x25: 0000000000000000 x24: 000000004000001e
  x23: ffffff8009ce4926 x22: ffffff801f9c0800 x21: ffffff801d05f7f0
  x20: ffffff8034a19f40 x19: 0000000000000000 x18: ffffff801f9c0958
  x17: ffffff800bc0a504 x16: dfffffc000000000 x15: ffffffc086ace4f8
  x14: ffffff801d05f83c x13: 0000000000000000 x12: ffffffb003a0bf03
  x11: 0000000000000000 x10: ffffffb003a0bf02 x9 : ffffff8034a19f40
  x8 : ffffff801d05f818 x7 : 1ffffff0069433dc x6 : ffffff8034a19ee0
  x5 : ffffff801d05f7f0 x4 : 0000000000000000 x3 : 0000000000000001
  x2 : 0000000000000000 x1 : dfffffc000000000 x0 : 0000000000000008
  Call trace:
   ath12k_mac_op_tx+0x6cc/0x29b8 [ath12k] (P)
   ieee80211_handle_wake_tx_queue+0x16c/0x260
   ieee80211_queue_skb+0xeec/0x1d20
   ieee80211_tx+0x200/0x2c8
   ieee80211_xmit+0x22c/0x338
   __ieee80211_subif_start_xmit+0x7e8/0xc60
   ieee80211_subif_start_xmit+0xc4/0xee0
   __ieee80211_subif_start_xmit_8023.isra.0+0x854/0x17a0
   ieee80211_subif_start_xmit_8023+0x124/0x488
   dev_hard_start_xmit+0x160/0x5a8
   __dev_queue_xmit+0x6f8/0x3120
   br_dev_queue_push_xmit+0x120/0x4a8
   __br_forward+0xe4/0x2b0
   deliver_clone+0x5c/0xd0
   br_flood+0x398/0x580
   br_dev_xmit+0x454/0x9f8
   dev_hard_start_xmit+0x160/0x5a8
   __dev_queue_xmit+0x6f8/0x3120
   ip6_finish_output2+0xc28/0x1b60
   __ip6_finish_output+0x38c/0x638
   ip6_output+0x1b4/0x338
   ip6_local_out+0x7c/0xa8
   ip6_send_skb+0x7c/0x1b0
   ip6_push_pending_frames+0x94/0xd0
   rawv6_sendmsg+0x1a98/0x2898
   inet_sendmsg+0x94/0xe0
   __sys_sendto+0x1e4/0x308
   __arm64_sys_sendto+0xc4/0x140
   do_el0_svc+0x110/0x280
   el0_svc+0x20/0x60
   el0t_64_sync_handler+0x104/0x138
   el0t_64_sync+0x154/0x158

To avoid that, empty vif's txq at ieee80211_do_stop() so no packet could
be dequeued after ieee80211_do_stop() (new packets cannot be queued
because SDATA_STATE_RUNNING is cleared at this point).

Fixes: ba8c3d6 ("mac80211: add an intermediate software queue implementation")
Signed-off-by: Remi Pommarel <[email protected]>
Link: https://patch.msgid.link/ff7849e268562456274213c0476e09481a48f489.1742833382.git.repk@triplefau.lt
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit 424eafe upstream.

When i2c-cros-ec-tunnel and the EC driver are built-in, the EC parent
device will not be found, leading to NULL pointer dereference.

That can also be reproduced by unbinding the controller driver and then
loading i2c-cros-ec-tunnel module (or binding the device).

[  271.991245] BUG: kernel NULL pointer dereference, address: 0000000000000058
[  271.998215] #PF: supervisor read access in kernel mode
[  272.003351] #PF: error_code(0x0000) - not-present page
[  272.008485] PGD 0 P4D 0
[  272.011022] Oops: Oops: 0000 [#1] SMP NOPTI
[  272.015207] CPU: 0 UID: 0 PID: 3859 Comm: insmod Tainted: G S                  6.15.0-rc1-00004-g44722359ed83 #30 PREEMPT(full)  3c7fb39a552e7d949de2ad921a7d6588d3a4fdc5
[  272.030312] Tainted: [S]=CPU_OUT_OF_SPEC
[  272.034233] Hardware name: HP Berknip/Berknip, BIOS Google_Berknip.13434.356.0 05/17/2021
[  272.042400] RIP: 0010:ec_i2c_probe+0x2b/0x1c0 [i2c_cros_ec_tunnel]
[  272.048577] Code: 1f 44 00 00 41 57 41 56 41 55 41 54 53 48 83 ec 10 65 48 8b 05 06 a0 6c e7 48 89 44 24 08 4c 8d 7f 10 48 8b 47 50 4c 8b 60 78 <49> 83 7c 24 58 00 0f 84 2f 01 00 00 48 89 fb be 30 06 00 00 4c 9
[  272.067317] RSP: 0018:ffffa32082a03940 EFLAGS: 00010282
[  272.072541] RAX: ffff969580b6a810 RBX: ffff969580b68c10 RCX: 0000000000000000
[  272.079672] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff969580b68c00
[  272.086804] RBP: 00000000fffffdfb R08: 0000000000000000 R09: 0000000000000000
[  272.093936] R10: 0000000000000000 R11: ffffffffc0600000 R12: 0000000000000000
[  272.101067] R13: ffffffffa666fbb8 R14: ffffffffc05b5528 R15: ffff969580b68c10
[  272.108198] FS:  00007b930906fc40(0000) GS:ffff969603149000(0000) knlGS:0000000000000000
[  272.116282] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  272.122024] CR2: 0000000000000058 CR3: 000000012631c000 CR4: 00000000003506f0
[  272.129155] Call Trace:
[  272.131606]  <TASK>
[  272.133709]  ? acpi_dev_pm_attach+0xdd/0x110
[  272.137985]  platform_probe+0x69/0xa0
[  272.141652]  really_probe+0x152/0x310
[  272.145318]  __driver_probe_device+0x77/0x110
[  272.149678]  driver_probe_device+0x1e/0x190
[  272.153864]  __driver_attach+0x10b/0x1e0
[  272.157790]  ? driver_attach+0x20/0x20
[  272.161542]  bus_for_each_dev+0x107/0x150
[  272.165553]  bus_add_driver+0x15d/0x270
[  272.169392]  driver_register+0x65/0x110
[  272.173232]  ? cleanup_module+0xa80/0xa80 [i2c_cros_ec_tunnel 3a00532f3f4af4a9eade753f86b0f8dd4e4e5698]
[  272.182617]  do_one_initcall+0x110/0x350
[  272.186543]  ? security_kernfs_init_security+0x49/0xd0
[  272.191682]  ? __kernfs_new_node+0x1b9/0x240
[  272.195954]  ? security_kernfs_init_security+0x49/0xd0
[  272.201093]  ? __kernfs_new_node+0x1b9/0x240
[  272.205365]  ? kernfs_link_sibling+0x105/0x130
[  272.209810]  ? kernfs_next_descendant_post+0x1c/0xa0
[  272.214773]  ? kernfs_activate+0x57/0x70
[  272.218699]  ? kernfs_add_one+0x118/0x160
[  272.222710]  ? __kernfs_create_file+0x71/0xa0
[  272.227069]  ? sysfs_add_bin_file_mode_ns+0xd6/0x110
[  272.232033]  ? internal_create_group+0x453/0x4a0
[  272.236651]  ? __vunmap_range_noflush+0x214/0x2d0
[  272.241355]  ? __free_frozen_pages+0x1dc/0x420
[  272.245799]  ? free_vmap_area_noflush+0x10a/0x1c0
[  272.250505]  ? load_module+0x1509/0x16f0
[  272.254431]  do_init_module+0x60/0x230
[  272.258181]  __se_sys_finit_module+0x27a/0x370
[  272.262627]  do_syscall_64+0x6a/0xf0
[  272.266206]  ? do_syscall_64+0x76/0xf0
[  272.269956]  ? irqentry_exit_to_user_mode+0x79/0x90
[  272.274836]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
[  272.279887] RIP: 0033:0x7b9309168d39
[  272.283466] Code: 5b 41 5c 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d af 40 0c 00 f7 d8 64 89 01 8
[  272.302210] RSP: 002b:00007fff50f1a288 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  272.309774] RAX: ffffffffffffffda RBX: 000058bf9b50f6d0 RCX: 00007b9309168d39
[  272.316905] RDX: 0000000000000000 RSI: 000058bf6c103a77 RDI: 0000000000000003
[  272.324036] RBP: 00007fff50f1a2e0 R08: 00007fff50f19218 R09: 0000000021ec4150
[  272.331166] R10: 000058bf9b50f7f0 R11: 0000000000000246 R12: 0000000000000000
[  272.338296] R13: 00000000fffffffe R14: 0000000000000000 R15: 000058bf6c103a77
[  272.345428]  </TASK>
[  272.347617] Modules linked in: i2c_cros_ec_tunnel(+)
[  272.364585] gsmi: Log Shutdown Reason 0x03

Returning -EPROBE_DEFER will allow the device to be bound once the
controller is bound, in the case of built-in drivers.

Fixes: 9d230c9 ("i2c: ChromeOS EC tunnel driver")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Cc: <[email protected]> # v3.16+
Signed-off-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit 8ec0fbb upstream.

Fix an oops in ttm_bo_delayed_delete which results from dererencing a
dangling pointer:

Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP
CPU: 4 UID: 0 PID: 1082 Comm: kworker/u65:2 Not tainted 6.14.0-rc4-00267-g505460b44513-dirty torvalds#216
Hardware name: LENOVO 82N6/LNVNB161216, BIOS GKCN65WW 01/16/2024
Workqueue: ttm ttm_bo_delayed_delete [ttm]
RIP: 0010:dma_resv_iter_first_unlocked+0x55/0x290
Code: 31 f6 48 c7 c7 00 2b fa aa e8 97 bd 52 ff e8 a2 c1 53 00 5a 85 c0 74 48 e9 88 01 00 00 4c 89 63 20 4d 85 e4 0f 84 30 01 00 00 <41> 8b 44 24 10 c6 43 2c 01 48 89 df 89 43 28 e8 97 fd ff ff 4c 8b
RSP: 0018:ffffbf9383473d60 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffffbf9383473d88 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffbf9383473d78 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 6b6b6b6b6b6b6b6b
R13: ffffa003bbf78580 R14: ffffa003a6728040 R15: 00000000000383cc
FS:  0000000000000000(0000) GS:ffffa00991c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000758348024dd0 CR3: 000000012c259000 CR4: 0000000000f50ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die_body.cold+0x19/0x26
 ? die_addr+0x3d/0x70
 ? exc_general_protection+0x159/0x460
 ? asm_exc_general_protection+0x27/0x30
 ? dma_resv_iter_first_unlocked+0x55/0x290
 dma_resv_wait_timeout+0x56/0x100
 ttm_bo_delayed_delete+0x69/0xb0 [ttm]
 process_one_work+0x217/0x5c0
 worker_thread+0x1c8/0x3d0
 ? apply_wqattrs_cleanup.part.0+0xc0/0xc0
 kthread+0x10b/0x240
 ? kthreads_online_cpu+0x140/0x140
 ret_from_fork+0x40/0x70
 ? kthreads_online_cpu+0x140/0x140
 ret_from_fork_asm+0x11/0x20
 </TASK>

The cause of this is:

- drm_prime_gem_destroy calls dma_buf_put(dma_buf) which releases the
  reference to the shared dma_buf. The reference count is 0, so the
  dma_buf is destroyed, which in turn decrements the corresponding
  amdgpu_bo reference count to 0, and the amdgpu_bo is destroyed -
  calling drm_gem_object_release then dma_resv_fini (which destroys the
  reservation object), then finally freeing the amdgpu_bo.

- nouveau_bo obj->bo.base.resv is now a dangling pointer to the memory
  formerly allocated to the amdgpu_bo.

- nouveau_gem_object_del calls ttm_bo_put(&nvbo->bo) which calls
  ttm_bo_release, which schedules ttm_bo_delayed_delete.

- ttm_bo_delayed_delete runs and dereferences the dangling resv pointer,
  resulting in a general protection fault.

Fix this by moving the drm_prime_gem_destroy call from
nouveau_gem_object_del to nouveau_bo_del_ttm. This ensures that it will
be run after ttm_bo_delayed_delete.

Signed-off-by: Chris Bainbridge <[email protected]>
Suggested-by: Christian König <[email protected]>
Fixes: 22b33e8 ("nouveau: add PRIME support")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3937
Cc: [email protected]
Signed-off-by: Danilo Krummrich <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit b61badd upstream.

[  +0.000021] BUG: KASAN: slab-use-after-free in drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000027] Read of size 8 at addr ffff8881b8605f88 by task amd_pci_unplug/2147

[  +0.000023] CPU: 6 PID: 2147 Comm: amd_pci_unplug Not tainted 6.10.0+ #1
[  +0.000016] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[  +0.000016] Call Trace:
[  +0.000008]  <TASK>
[  +0.000009]  dump_stack_lvl+0x76/0xa0
[  +0.000017]  print_report+0xce/0x5f0
[  +0.000017]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000019]  ? srso_return_thunk+0x5/0x5f
[  +0.000015]  ? kasan_complete_mode_report_info+0x72/0x200
[  +0.000016]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000019]  kasan_report+0xbe/0x110
[  +0.000015]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000023]  __asan_report_load8_noabort+0x14/0x30
[  +0.000014]  drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000020]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? __kasan_check_write+0x14/0x30
[  +0.000016]  ? __pfx_drm_sched_entity_flush+0x10/0x10 [gpu_sched]
[  +0.000020]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? __kasan_check_write+0x14/0x30
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? enable_work+0x124/0x220
[  +0.000015]  ? __pfx_enable_work+0x10/0x10
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? free_large_kmalloc+0x85/0xf0
[  +0.000016]  drm_sched_entity_destroy+0x18/0x30 [gpu_sched]
[  +0.000020]  amdgpu_vce_sw_fini+0x55/0x170 [amdgpu]
[  +0.000735]  ? __kasan_check_read+0x11/0x20
[  +0.000016]  vce_v4_0_sw_fini+0x80/0x110 [amdgpu]
[  +0.000726]  amdgpu_device_fini_sw+0x331/0xfc0 [amdgpu]
[  +0.000679]  ? mutex_unlock+0x80/0xe0
[  +0.000017]  ? __pfx_amdgpu_device_fini_sw+0x10/0x10 [amdgpu]
[  +0.000662]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? __kasan_check_write+0x14/0x30
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? mutex_unlock+0x80/0xe0
[  +0.000016]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
[  +0.000663]  drm_minor_release+0xc9/0x140 [drm]
[  +0.000081]  drm_release+0x1fd/0x390 [drm]
[  +0.000082]  __fput+0x36c/0xad0
[  +0.000018]  __fput_sync+0x3c/0x50
[  +0.000014]  __x64_sys_close+0x7d/0xe0
[  +0.000014]  x64_sys_call+0x1bc6/0x2680
[  +0.000014]  do_syscall_64+0x70/0x130
[  +0.000014]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? irqentry_exit_to_user_mode+0x60/0x190
[  +0.000015]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? irqentry_exit+0x43/0x50
[  +0.000012]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? exc_page_fault+0x7c/0x110
[  +0.000015]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000014] RIP: 0033:0x7ffff7b14f67
[  +0.000013] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff
[  +0.000026] RSP: 002b:00007fffffffe378 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[  +0.000019] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67
[  +0.000014] RDX: 0000000000000000 RSI: 00007ffff7f6f47a RDI: 0000000000000003
[  +0.000014] RBP: 00007fffffffe3a0 R08: 0000555555569890 R09: 0000000000000000
[  +0.000014] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8
[  +0.000013] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040
[  +0.000020]  </TASK>

[  +0.000016] Allocated by task 383 on cpu 7 at 26.880319s:
[  +0.000014]  kasan_save_stack+0x28/0x60
[  +0.000008]  kasan_save_track+0x18/0x70
[  +0.000007]  kasan_save_alloc_info+0x38/0x60
[  +0.000007]  __kasan_kmalloc+0xc1/0xd0
[  +0.000007]  kmalloc_trace_noprof+0x180/0x380
[  +0.000007]  drm_sched_init+0x411/0xec0 [gpu_sched]
[  +0.000012]  amdgpu_device_init+0x695f/0xa610 [amdgpu]
[  +0.000658]  amdgpu_driver_load_kms+0x1a/0x120 [amdgpu]
[  +0.000662]  amdgpu_pci_probe+0x361/0xf30 [amdgpu]
[  +0.000651]  local_pci_probe+0xe7/0x1b0
[  +0.000009]  pci_device_probe+0x248/0x890
[  +0.000008]  really_probe+0x1fd/0x950
[  +0.000008]  __driver_probe_device+0x307/0x410
[  +0.000007]  driver_probe_device+0x4e/0x150
[  +0.000007]  __driver_attach+0x223/0x510
[  +0.000006]  bus_for_each_dev+0x102/0x1a0
[  +0.000007]  driver_attach+0x3d/0x60
[  +0.000006]  bus_add_driver+0x2ac/0x5f0
[  +0.000006]  driver_register+0x13d/0x490
[  +0.000008]  __pci_register_driver+0x1ee/0x2b0
[  +0.000007]  llc_sap_close+0xb0/0x160 [llc]
[  +0.000009]  do_one_initcall+0x9c/0x3e0
[  +0.000008]  do_init_module+0x241/0x760
[  +0.000008]  load_module+0x51ac/0x6c30
[  +0.000006]  __do_sys_init_module+0x234/0x270
[  +0.000007]  __x64_sys_init_module+0x73/0xc0
[  +0.000006]  x64_sys_call+0xe3/0x2680
[  +0.000006]  do_syscall_64+0x70/0x130
[  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[  +0.000015] Freed by task 2147 on cpu 6 at 160.507651s:
[  +0.000013]  kasan_save_stack+0x28/0x60
[  +0.000007]  kasan_save_track+0x18/0x70
[  +0.000007]  kasan_save_free_info+0x3b/0x60
[  +0.000007]  poison_slab_object+0x115/0x1c0
[  +0.000007]  __kasan_slab_free+0x34/0x60
[  +0.000007]  kfree+0xfa/0x2f0
[  +0.000007]  drm_sched_fini+0x19d/0x410 [gpu_sched]
[  +0.000012]  amdgpu_fence_driver_sw_fini+0xc4/0x2f0 [amdgpu]
[  +0.000662]  amdgpu_device_fini_sw+0x77/0xfc0 [amdgpu]
[  +0.000653]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
[  +0.000655]  drm_minor_release+0xc9/0x140 [drm]
[  +0.000071]  drm_release+0x1fd/0x390 [drm]
[  +0.000071]  __fput+0x36c/0xad0
[  +0.000008]  __fput_sync+0x3c/0x50
[  +0.000007]  __x64_sys_close+0x7d/0xe0
[  +0.000007]  x64_sys_call+0x1bc6/0x2680
[  +0.000007]  do_syscall_64+0x70/0x130
[  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[  +0.000014] The buggy address belongs to the object at ffff8881b8605f80
               which belongs to the cache kmalloc-64 of size 64
[  +0.000020] The buggy address is located 8 bytes inside of
               freed 64-byte region [ffff8881b8605f80, ffff8881b8605fc0)

[  +0.000028] The buggy address belongs to the physical page:
[  +0.000011] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b8605
[  +0.000008] anon flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[  +0.000007] page_type: 0xffffefff(slab)
[  +0.000009] raw: 0017ffffc0000000 ffff8881000428c0 0000000000000000 dead000000000001
[  +0.000006] raw: 0000000000000000 0000000000200020 00000001ffffefff 0000000000000000
[  +0.000006] page dumped because: kasan: bad access detected

[  +0.000012] Memory state around the buggy address:
[  +0.000011]  ffff8881b8605e80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  +0.000015]  ffff8881b8605f00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[  +0.000015] >ffff8881b8605f80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  +0.000013]                       ^
[  +0.000011]  ffff8881b8606000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
[  +0.000014]  ffff8881b8606080: fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb
[  +0.000013] ==================================================================

The issue reproduced on VG20 during the IGT pci_unplug test.
The root cause of the issue is that the function drm_sched_fini is called before drm_sched_entity_kill.
In drm_sched_fini, the drm_sched_rq structure is freed, but this structure is later accessed by
each entity within the run queue, leading to invalid memory access.
To resolve this, the order of cleanup calls is updated:

    Before:
        amdgpu_fence_driver_sw_fini
        amdgpu_device_ip_fini

    After:
        amdgpu_device_ip_fini
        amdgpu_fence_driver_sw_fini

This updated order ensures that all entities in the IPs are cleaned up first, followed by proper
cleanup of the schedulers.

Additional Investigation:

During debugging, another issue was identified in the amdgpu_vce_sw_fini function. The vce.vcpu_bo
buffer must be freed only as the final step in the cleanup process to prevent any premature
access during earlier cleanup stages.

v2: Using Christian suggestion call drm_sched_entity_destroy before drm_sched_fini.

Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Signed-off-by: Vitaly Prosyak <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
[Minor context change fixed.]
Signed-off-by: Bin Lan <[email protected]>
Signed-off-by: He Zhe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit d328c09 upstream.

Skip SMB sessions that are being teared down
(e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show()
to avoid use-after-free in @SES.

This fixes the following GPF when reading from /proc/fs/cifs/DebugData
while mounting and umounting

  [ 816.251274] general protection fault, probably for non-canonical
  address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI
  ...
  [  816.260138] Call Trace:
  [  816.260329]  <TASK>
  [  816.260499]  ? die_addr+0x36/0x90
  [  816.260762]  ? exc_general_protection+0x1b3/0x410
  [  816.261126]  ? asm_exc_general_protection+0x26/0x30
  [  816.261502]  ? cifs_debug_tcon+0xbd/0x240 [cifs]
  [  816.261878]  ? cifs_debug_tcon+0xab/0x240 [cifs]
  [  816.262249]  cifs_debug_data_proc_show+0x516/0xdb0 [cifs]
  [  816.262689]  ? seq_read_iter+0x379/0x470
  [  816.262995]  seq_read_iter+0x118/0x470
  [  816.263291]  proc_reg_read_iter+0x53/0x90
  [  816.263596]  ? srso_alias_return_thunk+0x5/0x7f
  [  816.263945]  vfs_read+0x201/0x350
  [  816.264211]  ksys_read+0x75/0x100
  [  816.264472]  do_syscall_64+0x3f/0x90
  [  816.264750]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  [  816.265135] RIP: 0033:0x7fd5e669d381

Cc: [email protected]
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
[ This patch removed lock/unlock operation due to ses_lock is
not present in v5.15 and not ported yet. ses->status is protected
by a global lock, cifs_tcp_ses_lock, in v5.15. ]
Signed-off-by: Xiangyu Chen <[email protected]>
Signed-off-by: He Zhe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit 4bdec0d upstream.

Neither SMB3.0 or SMB3.02 supports encryption negotiate context, so
when SMB2_GLOBAL_CAP_ENCRYPTION flag is set in the negotiate response,
the client uses AES-128-CCM as the default cipher.  See MS-SMB2
3.3.5.4.

Commit b0abcd6 ("smb: client: fix UAF in async decryption") added
a @server->cipher_type check to conditionally call
smb3_crypto_aead_allocate(), but that check would always be false as
@server->cipher_type is unset for SMB3.02.

Fix the following KASAN splat by setting @server->cipher_type for
SMB3.02 as well.

mount.cifs //srv/share /mnt -o vers=3.02,seal,...

BUG: KASAN: null-ptr-deref in crypto_aead_setkey+0x2c/0x130
Read of size 8 at addr 0000000000000020 by task mount.cifs/1095
CPU: 1 UID: 0 PID: 1095 Comm: mount.cifs Not tainted 6.12.0 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41
04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x5d/0x80
 ? crypto_aead_setkey+0x2c/0x130
 kasan_report+0xda/0x110
 ? crypto_aead_setkey+0x2c/0x130
 crypto_aead_setkey+0x2c/0x130
 crypt_message+0x258/0xec0 [cifs]
 ? __asan_memset+0x23/0x50
 ? __pfx_crypt_message+0x10/0x10 [cifs]
 ? mark_lock+0xb0/0x6a0
 ? hlock_class+0x32/0xb0
 ? mark_lock+0xb0/0x6a0
 smb3_init_transform_rq+0x352/0x3f0 [cifs]
 ? lock_acquire.part.0+0xf4/0x2a0
 smb_send_rqst+0x144/0x230 [cifs]
 ? __pfx_smb_send_rqst+0x10/0x10 [cifs]
 ? hlock_class+0x32/0xb0
 ? smb2_setup_request+0x225/0x3a0 [cifs]
 ? __pfx_cifs_compound_last_callback+0x10/0x10 [cifs]
 compound_send_recv+0x59b/0x1140 [cifs]
 ? __pfx_compound_send_recv+0x10/0x10 [cifs]
 ? __create_object+0x5e/0x90
 ? hlock_class+0x32/0xb0
 ? do_raw_spin_unlock+0x9a/0xf0
 cifs_send_recv+0x23/0x30 [cifs]
 SMB2_tcon+0x3ec/0xb30 [cifs]
 ? __pfx_SMB2_tcon+0x10/0x10 [cifs]
 ? lock_acquire.part.0+0xf4/0x2a0
 ? __pfx_lock_release+0x10/0x10
 ? do_raw_spin_trylock+0xc6/0x120
 ? lock_acquire+0x3f/0x90
 ? _get_xid+0x16/0xd0 [cifs]
 ? __pfx_SMB2_tcon+0x10/0x10 [cifs]
 ? cifs_get_smb_ses+0xcdd/0x10a0 [cifs]
 cifs_get_smb_ses+0xcdd/0x10a0 [cifs]
 ? __pfx_cifs_get_smb_ses+0x10/0x10 [cifs]
 ? cifs_get_tcp_session+0xaa0/0xca0 [cifs]
 cifs_mount_get_session+0x8a/0x210 [cifs]
 dfs_mount_share+0x1b0/0x11d0 [cifs]
 ? __pfx___lock_acquire+0x10/0x10
 ? __pfx_dfs_mount_share+0x10/0x10 [cifs]
 ? lock_acquire.part.0+0xf4/0x2a0
 ? find_held_lock+0x8a/0xa0
 ? hlock_class+0x32/0xb0
 ? lock_release+0x203/0x5d0
 cifs_mount+0xb3/0x3d0 [cifs]
 ? do_raw_spin_trylock+0xc6/0x120
 ? __pfx_cifs_mount+0x10/0x10 [cifs]
 ? lock_acquire+0x3f/0x90
 ? find_nls+0x16/0xa0
 ? smb3_update_mnt_flags+0x372/0x3b0 [cifs]
 cifs_smb3_do_mount+0x1e2/0xc80 [cifs]
 ? __pfx_vfs_parse_fs_string+0x10/0x10
 ? __pfx_cifs_smb3_do_mount+0x10/0x10 [cifs]
 smb3_get_tree+0x1bf/0x330 [cifs]
 vfs_get_tree+0x4a/0x160
 path_mount+0x3c1/0xfb0
 ? kasan_quarantine_put+0xc7/0x1d0
 ? __pfx_path_mount+0x10/0x10
 ? kmem_cache_free+0x118/0x3e0
 ? user_path_at+0x74/0xa0
 __x64_sys_mount+0x1a6/0x1e0
 ? __pfx___x64_sys_mount+0x10/0x10
 ? mark_held_locks+0x1a/0x90
 do_syscall_64+0xbb/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Cc: Tom Talpey <[email protected]>
Reported-by: Jianhong Yin <[email protected]>
Cc: [email protected] # v6.12
Fixes: b0abcd6 ("smb: client: fix UAF in async decryption")
Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
Signed-off-by: Steve French <[email protected]>
[Commit b0abcd6 ("smb: client: fix UAF in async decryption")
fixes CVE-2024-50047 but brings NULL-pointer dereferebce. So
commit 4bdec0d ("smb: client: fix NULL ptr deref in crypto_aead_setkey()")
should be backported too.]
Signed-off-by: Jianqi Ren <[email protected]>
Signed-off-by: He Zhe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit d63527e ]

syzbot reported:

tipc: Node number set to 1055423674
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 3 UID: 0 PID: 6017 Comm: kworker/3:5 Not tainted 6.15.0-rc1-syzkaller-00246-g900241a5cc15 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Workqueue: events tipc_net_finalize_work
RIP: 0010:tipc_mon_reinit_self+0x11c/0x210 net/tipc/monitor.c:719
...
RSP: 0018:ffffc9000356fb68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003ee87cba
RDX: 0000000000000000 RSI: ffffffff8dbc56a7 RDI: ffff88804c2cc010
RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000007
R13: fffffbfff2111097 R14: ffff88804ead8000 R15: ffff88804ead9010
FS:  0000000000000000(0000) GS:ffff888097ab9000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000f720eb00 CR3: 000000000e182000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tipc_net_finalize+0x10b/0x180 net/tipc/net.c:140
 process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238
 process_scheduled_works kernel/workqueue.c:3319 [inline]
 worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400
 kthread+0x3c2/0x780 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>
...
RIP: 0010:tipc_mon_reinit_self+0x11c/0x210 net/tipc/monitor.c:719
...
RSP: 0018:ffffc9000356fb68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003ee87cba
RDX: 0000000000000000 RSI: ffffffff8dbc56a7 RDI: ffff88804c2cc010
RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000007
R13: fffffbfff2111097 R14: ffff88804ead8000 R15: ffff88804ead9010
FS:  0000000000000000(0000) GS:ffff888097ab9000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000f720eb00 CR3: 000000000e182000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

There is a racing condition between workqueue created when enabling
bearer and another thread created when disabling bearer right after
that as follow:

enabling_bearer                          | disabling_bearer
---------------                          | ----------------
tipc_disc_timeout()                      |
{                                        | bearer_disable()
 ...                                     | {
 schedule_work(&tn->work);               |  tipc_mon_delete()
 ...                                     |  {
}                                        |   ...
                                         |   write_lock_bh(&mon->lock);
                                         |   mon->self = NULL;
                                         |   write_unlock_bh(&mon->lock);
                                         |   ...
                                         |  }
tipc_net_finalize_work()                 | }
{                                        |
 ...                                     |
 tipc_net_finalize()                     |
 {                                       |
  ...                                    |
  tipc_mon_reinit_self()                 |
  {                                      |
   ...                                   |
   write_lock_bh(&mon->lock);            |
   mon->self->addr = tipc_own_addr(net); |
   write_unlock_bh(&mon->lock);          |
   ...                                   |
  }                                      |
  ...                                    |
 }                                       |
 ...                                     |
}                                        |

'mon->self' is set to NULL in disabling_bearer thread and dereferenced
later in enabling_bearer thread.

This commit fixes this issue by validating 'mon->self' before assigning
node address to it.

Reported-by: [email protected]
Fixes: 46cb01e ("tipc: update mon's self addr when node addr generated")
Signed-off-by: Tung Nguyen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 461f24b ]

Intel Merrifield SoC uses these endpoints for tracing and they cannot
be re-allocated if being used because the side band flow control signals
are hard wired to certain endpoints:

• 1 High BW Bulk IN (IN#1) (RTIT)
• 1 1KB BW Bulk IN (IN#8) + 1 1KB BW Bulk OUT (Run Control) (OUT#8)

In device mode, since RTIT (EP#1) and EXI/RunControl (EP#8) uses
External Buffer Control (EBC) mode, these endpoints are to be mapped to
EBC mode (to be done by EXI target driver). Additionally TRB for RTIT
and EXI are maintained in STM (System Trace Module) unit and the EXI
target driver will as well configure the TRB location for EP #1 IN
and EP#8 (IN and OUT). Since STM/PTI and EXI hardware blocks manage
these endpoints and interface to OTG3 controller through EBC interface,
there is no need to enable any events (such as XferComplete etc)
for these end points.

Signed-off-by: Andy Shevchenko <[email protected]>
Tested-by: Ferry Toth <[email protected]>
Acked-by: Thinh Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
[ Upstream commit 9b04461 ]

Running lib_ubsan.ko on arm64 (without CONFIG_UBSAN_TRAP) panics the
kernel:

[   31.616546] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: test_ubsan_out_of_bounds+0x158/0x158 [test_ubsan]
[   31.646817] CPU: 3 UID: 0 PID: 179 Comm: insmod Not tainted 6.15.0-rc2 #1 PREEMPT
[   31.648153] Hardware name: linux,dummy-virt (DT)
[   31.648970] Call trace:
[   31.649345]  show_stack+0x18/0x24 (C)
[   31.650960]  dump_stack_lvl+0x40/0x84
[   31.651559]  dump_stack+0x18/0x24
[   31.652264]  panic+0x138/0x3b4
[   31.652812]  __ktime_get_real_seconds+0x0/0x10
[   31.653540]  test_ubsan_load_invalid_value+0x0/0xa8 [test_ubsan]
[   31.654388]  init_module+0x24/0xff4 [test_ubsan]
[   31.655077]  do_one_initcall+0xd4/0x280
[   31.655680]  do_init_module+0x58/0x2b4

That happens because the test corrupts other data in the stack:
400:   d5384108        mrs     x8, sp_el0
404:   f9426d08        ldr     x8, [x8, torvalds#1240]
408:   f85f83a9        ldur    x9, [x29, #-8]
40c:   eb09011f        cmp     x8, x9
410:   54000301        b.ne    470 <test_ubsan_out_of_bounds+0x154>  // b.any

As there is no guarantee the compiler will order the local variables
as declared in the module:
        volatile char above[4] = { }; /* Protect surrounding memory. */
        volatile int arr[4];
        volatile char below[4] = { }; /* Protect surrounding memory. */

There is another problem where the out-of-bound index is 5 which is larger
than the extra surrounding memory for protection.

So, use a struct to enforce the ordering, and fix the index to be 4.
Also, remove some of the volatiles and rely on OPTIMIZER_HIDE_VAR()

Signed-off-by: Mostafa Saleh <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit fee4d17 upstream.

Commit a595138 ("arm64: errata: Add newer ARM cores to the
spectre_bhb_loop_affected() lists") added some additional CPUs to the
Spectre-BHB workaround, including some new arrays for designs that
require new 'k' values for the workaround to be effective.

Unfortunately, the new arrays omitted the sentinel entry and so
is_midr_in_range_list() will walk off the end when it doesn't find a
match. With UBSAN enabled, this leads to a crash during boot when
is_midr_in_range_list() is inlined (which was more common prior to
c8c2647 ("arm64: Make  _midr_in_range_list() an exported
function")):

 |  Internal error: aarch64 BRK: 00000000f2000001 [#1] PREEMPT SMP
 |  pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 |  pc : spectre_bhb_loop_affected+0x28/0x30
 |  lr : is_spectre_bhb_affected+0x170/0x190
 | [...]
 |  Call trace:
 |   spectre_bhb_loop_affected+0x28/0x30
 |   update_cpu_capabilities+0xc0/0x184
 |   init_cpu_features+0x188/0x1a4
 |   cpuinfo_store_boot_cpu+0x4c/0x60
 |   smp_prepare_boot_cpu+0x38/0x54
 |   start_kernel+0x8c/0x478
 |   __primary_switched+0xc8/0xd4
 |  Code: 6b09011f 54000061 52801080 d65f03c0 (d4200020)
 |  ---[ end trace 0000000000000000 ]---
 |  Kernel panic - not syncing: aarch64 BRK: Fatal exception

Add the missing sentinel entries.

Cc: Lee Jones <[email protected]>
Cc: James Morse <[email protected]>
Cc: Doug Anderson <[email protected]>
Cc: Shameer Kolothum <[email protected]>
Cc: <[email protected]>
Reported-by: Greg Kroah-Hartman <[email protected]>
Fixes: a595138 ("arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists")
Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Lee Jones <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 14, 2025
commit f1aff4b upstream.

The blammed commit copied to argv the size of the reallocated argv,
instead of the size of the old_argv, thus reading and copying from
past the old_argv allocated memory.

Following BUG_ON was hit:
[    3.038929][    T1] kernel BUG at lib/string_helpers.c:1040!
[    3.039147][    T1] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
...
[    3.056489][    T1] Call trace:
[    3.056591][    T1]  __fortify_panic+0x10/0x18 (P)
[    3.056773][    T1]  dm_split_args+0x20c/0x210
[    3.056942][    T1]  dm_table_add_target+0x13c/0x360
[    3.057132][    T1]  table_load+0x110/0x3ac
[    3.057292][    T1]  dm_ctl_ioctl+0x424/0x56c
[    3.057457][    T1]  __arm64_sys_ioctl+0xa8/0xec
[    3.057634][    T1]  invoke_syscall+0x58/0x10c
[    3.057804][    T1]  el0_svc_common+0xa8/0xdc
[    3.057970][    T1]  do_el0_svc+0x1c/0x28
[    3.058123][    T1]  el0_svc+0x50/0xac
[    3.058266][    T1]  el0t_64_sync_handler+0x60/0xc4
[    3.058452][    T1]  el0t_64_sync+0x1b0/0x1b4
[    3.058620][    T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000)
[    3.058897][    T1] ---[ end trace 0000000000000000 ]---
[    3.059083][    T1] Kernel panic - not syncing: Oops - BUG: Fatal exception

Fix it by copying the size of src, and not the size of dst, as it was.

Fixes: 5a2a6c4 ("dm: always update the array size in realloc_argv on success")
Cc: [email protected]
Signed-off-by: Tudor Ambarus <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
tobetter pushed a commit that referenced this issue May 20, 2025
commit ae08d55 upstream.

When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
available, the kernel crashes:

Oops - illegal instruction [#1]
    [snip]
epc : set_tagged_addr_ctrl+0x112/0x15a
 ra : set_tagged_addr_ctrl+0x74/0x15a
epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
    [snip]
status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
    set_tagged_addr_ctrl+0x112/0x15a
    __riscv_sys_prctl+0x352/0x73c
    do_trap_ecall_u+0x17c/0x20c
    andle_exception+0x150/0x15c

Fix it by checking if Supm is available.

Fixes: 09d6775 ("riscv: Add support for userspace pointer masking")
Signed-off-by: Nam Cao <[email protected]>
Cc: [email protected]
Reviewed-by: Samuel Holland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant