Skip to content

see if we can get kernel src rpm #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 16 commits into from
Closed

Conversation

bmastbergen
Copy link
Owner

No description provided.

pvts-mat and others added 6 commits July 30, 2025 16:57
jira VULN-8164
cve CVE-2023-6040
commit-author Phil Sutter <[email protected]>
commit f1082dd

An nftables family is merely a hollow container, its family just a
number and such not reliant on compile-time options other than nftables
support itself. Add an artificial check so attempts at using a family
the kernel can't support fail as early as possible. This helps user
space detect kernels which lack e.g. NFPROTO_INET.

	Signed-off-by: Phil Sutter <[email protected]>
	Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit f1082dd)
	Signed-off-by: Marcin Wcisło <[email protected]>
jira VULN-6585
cve CVE-2023-3777
commit-author Pablo Neira Ayuso <[email protected]>
commit 6eaf41e

Skip bound chain when flushing table rules, the rule that owns this
chain releases these objects.

Otherwise, the following warning is triggered:

  WARNING: CPU: 2 PID: 1217 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
  CPU: 2 PID: 1217 Comm: chain-flush Not tainted 6.1.39 #1
  RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]

Fixes: d0e2c7d ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
	Reported-by: Kevin Rich <[email protected]>
	Signed-off-by: Pablo Neira Ayuso <[email protected]>
	Signed-off-by: Florian Westphal <[email protected]>
(cherry picked from commit 6eaf41e)
	Signed-off-by: Anmol Jain <[email protected]>
jira VULN-334
cve CVE-2023-28464
commit-author ZhengHan Wang <[email protected]>
commit a85fb91

syzbot reports a slab use-after-free in hci_conn_hash_flush [1].
After releasing an object using hci_conn_del_sysfs in the
hci_conn_cleanup function, releasing the same object again
using the hci_dev_put and hci_conn_put functions causes a double free.
Here's a simplified flow:

hci_conn_del_sysfs:
  hci_dev_put
    put_device
      kobject_put
        kref_put
          kobject_release
            kobject_cleanup
              kfree_const
                kfree(name)

hci_dev_put:
  ...
    kfree(name)

hci_conn_put:
  put_device
    ...
      kfree(name)

This patch drop the hci_dev_put and hci_conn_put function
call in hci_conn_cleanup function, because the object is
freed in hci_conn_del_sysfs function.

This patch also fixes the refcounting in hci_conn_add_sysfs() and
hci_conn_del_sysfs() to take into account device_add() failures.

This fixes CVE-2023-28464.

Link: https://syzkaller.appspot.com/bug?id=1bb51491ca5df96a5f724899d1dbb87afda61419 [1]

	Signed-off-by: ZhengHan Wang <[email protected]>
Co-developed-by: Luiz Augusto von Dentz <[email protected]>
	Signed-off-by: Luiz Augusto von Dentz <[email protected]>
(cherry picked from commit a85fb91)
	Signed-off-by: Anmol Jain <[email protected]>
jira VULN-6760
cve CVE-2023-5717
commit-author Peter Zijlstra <[email protected]>
commit 32671e3
upstream-diff This patch causes kABI breakage due to a change in the
struct perf_event layout after adding the group_generation field.
Hence, to preserve kABI compatibility, use RH_KABI_EXTEND macro
to safely append the new field without affecting the existing layout.
Also, add an upstream patch 28a6c6e ("perf/core: Fix potential NULL deref")
which fixes a NULL pointer deref issue in the existing CVE fix.

Because group consistency is non-atomic between parent (filedesc) and children
(inherited) events, it is possible for PERF_FORMAT_GROUP read() to try and sum
non-matching counter groups -- with non-sensical results.

Add group_generation to distinguish the case where a parent group removes and
adds an event and thus has the same number, but a different configuration of
events as inherited groups.

This became a problem when commit fa8c269 ("perf/core: Invert
perf_read_group() loops") flipped the order of child_list and sibling_list.
Previously it would iterate the group (sibling_list) first, and for each
sibling traverse the child_list. In this order, only the group composition of
the parent is relevant. By flipping the order the group composition of the
child (inherited) events becomes an issue and the mis-match in group
composition becomes evident.

That said; even prior to this commit, while reading of a group that is not
equally inherited was not broken, it still made no sense.

(Ab)use ECHILD as error return to indicate issues with child process group
composition.

Fixes: fa8c269 ("perf/core: Invert perf_read_group() loops")
	Reported-by: Budimir Markovic <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
(cherry picked from commit 32671e3)
	Signed-off-by: Shreeya Patel <[email protected]>
jira VULN-6760
cve-bf CVE-2023-5717
commit-author Peter Zijlstra <[email protected]>
commit a71ef31

Smatch is awesome.

Fixes: 32671e3 ("perf: Disallow mis-matched inherited group reads")
	Reported-by: Dan Carpenter <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
	Signed-off-by: Ingo Molnar <[email protected]>
(cherry picked from commit a71ef31)
	Signed-off-by: Shreeya Patel <[email protected]>
jira VULN-70868
cve CVE-2025-38052
commit-author Wang Liang <[email protected]>
commit e279024

Syzbot reported a slab-use-after-free with the following call trace:

  ==================================================================
  BUG: KASAN: slab-use-after-free in tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
  Read of size 8 at addr ffff88807a733000 by task kworker/1:0/25

  Call Trace:
   kasan_report+0xd9/0x110 mm/kasan/report.c:601
   tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
   crypto_request_complete include/crypto/algapi.h:266
   aead_request_complete include/crypto/internal/aead.h:85
   cryptd_aead_crypt+0x3b8/0x750 crypto/cryptd.c:772
   crypto_request_complete include/crypto/algapi.h:266
   cryptd_queue_worker+0x131/0x200 crypto/cryptd.c:181
   process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231

  Allocated by task 8355:
   kzalloc_noprof include/linux/slab.h:778
   tipc_crypto_start+0xcc/0x9e0 net/tipc/crypto.c:1466
   tipc_init_net+0x2dd/0x430 net/tipc/core.c:72
   ops_init+0xb9/0x650 net/core/net_namespace.c:139
   setup_net+0x435/0xb40 net/core/net_namespace.c:343
   copy_net_ns+0x2f0/0x670 net/core/net_namespace.c:508
   create_new_namespaces+0x3ea/0xb10 kernel/nsproxy.c:110
   unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
   ksys_unshare+0x419/0x970 kernel/fork.c:3323
   __do_sys_unshare kernel/fork.c:3394

  Freed by task 63:
   kfree+0x12a/0x3b0 mm/slub.c:4557
   tipc_crypto_stop+0x23c/0x500 net/tipc/crypto.c:1539
   tipc_exit_net+0x8c/0x110 net/tipc/core.c:119
   ops_exit_list+0xb0/0x180 net/core/net_namespace.c:173
   cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
   process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231

After freed the tipc_crypto tx by delete namespace, tipc_aead_encrypt_done
may still visit it in cryptd_queue_worker workqueue.

I reproduce this issue by:
  ip netns add ns1
  ip link add veth1 type veth peer name veth2
  ip link set veth1 netns ns1
  ip netns exec ns1 tipc bearer enable media eth dev veth1
  ip netns exec ns1 tipc node set key this_is_a_master_key master
  ip netns exec ns1 tipc bearer disable media eth dev veth1
  ip netns del ns1

The key of reproduction is that, simd_aead_encrypt is interrupted, leading
to crypto_simd_usable() return false. Thus, the cryptd_queue_worker is
triggered, and the tipc_crypto tx will be visited.

  tipc_disc_timeout
    tipc_bearer_xmit_skb
      tipc_crypto_xmit
        tipc_aead_encrypt
          crypto_aead_encrypt
            // encrypt()
            simd_aead_encrypt
              // crypto_simd_usable() is false
              child = &ctx->cryptd_tfm->base;

  simd_aead_encrypt
    crypto_aead_encrypt
      // encrypt()
      cryptd_aead_encrypt_enqueue
        cryptd_aead_enqueue
          cryptd_enqueue_request
            // trigger cryptd_queue_worker
            queue_work_on(smp_processor_id(), cryptd_wq, &cpu_queue->work)

Fix this by holding net reference count before encrypt.

	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=55c12726619ff85ce1f6
Fixes: fc1b6d6 ("tipc: introduce TIPC encryption & authentication")
	Signed-off-by: Wang Liang <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit e279024)
	Signed-off-by: Brett Mastbergen <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 2, 2025
pert script tests fails with segmentation fault as below:

  92: perf script tests:
  --- start ---
  test child forked, pid 103769
  DB test
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB /tmp/perf-test-script.7rbftEpOzX/perf.data (9 samples) ]
  /usr/libexec/perf-core/tests/shell/script.sh: line 35:
  103780 Segmentation fault      (core dumped)
  perf script -i "${perfdatafile}" -s "${db_test}"
  --- Cleaning up ---
  ---- end(-1) ----
  92: perf script tests                                               : FAILED!

Backtrace pointed to :
	#0  0x0000000010247dd0 in maps.machine ()
	#1  0x00000000101d178c in db_export.sample ()
	#2  0x00000000103412c8 in python_process_event ()
	#3  0x000000001004eb28 in process_sample_event ()
	#4  0x000000001024fcd0 in machines.deliver_event ()
	#5  0x000000001025005c in perf_session.deliver_event ()
	#6  0x00000000102568b0 in __ordered_events__flush.part.0 ()
	ctrliq#7  0x0000000010251618 in perf_session.process_events ()
	ctrliq#8  0x0000000010053620 in cmd_script ()
	ctrliq#9  0x00000000100b5a28 in run_builtin ()
	ctrliq#10 0x00000000100b5f94 in handle_internal_command ()
	ctrliq#11 0x0000000010011114 in main ()

Further investigation reveals that this occurs in the `perf script tests`,
because it uses `db_test.py` script. This script sets `perf_db_export_mode = True`.

With `perf_db_export_mode` enabled, if a sample originates from a hypervisor,
perf doesn't set maps for "[H]" sample in the code. Consequently, `al->maps` remains NULL
when `maps__machine(al->maps)` is called from `db_export__sample`.

As al->maps can be NULL in case of Hypervisor samples , use thread->maps
because even for Hypervisor sample, machine should exist.
If we don't have machine for some reason, return -1 to avoid segmentation fault.

Reported-by: Disha Goel <[email protected]>
Signed-off-by: Aditya Bodkhe <[email protected]>
Reviewed-by: Adrian Hunter <[email protected]>
Tested-by: Disha Goel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Adrian Hunter <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 2, 2025
Without the change `perf `hangs up on charaster devices. On my system
it's enough to run system-wide sampler for a few seconds to get the
hangup:

    $ perf record -a -g --call-graph=dwarf
    $ perf report
    # hung

`strace` shows that hangup happens on reading on a character device
`/dev/dri/renderD128`

    $ strace -y -f -p 2780484
    strace: Process 2780484 attached
    pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached

It's call trace descends into `elfutils`:

    $ gdb -p 2780484
    (gdb) bt
    #0  0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
        at ../sysdeps/unix/sysv/linux/pread64.c:25
    #1  0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
    #2  0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #3  0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #4  0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
       from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #5  0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
        at util/dso.h:537
    #6  0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
    ctrliq#7  frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
    ctrliq#8  0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    ctrliq#9  0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    ctrliq#10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    ctrliq#11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    ctrliq#12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
        thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
        best_effort=best_effort@entry=false) at util/thread.h:152
    ctrliq#13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
        sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
    ctrliq#14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
        max_stack=127, symbols=true) at util/machine.c:2920
    ctrliq#15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
        sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
        at util/machine.c:2970
    ctrliq#16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
        sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
    ctrliq#17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
        evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
    ctrliq#18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
        arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
    ctrliq#19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
        evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
    ctrliq#20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
        file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
    ctrliq#21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
    ctrliq#22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
    ctrliq#23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
        file_path=0x105ffbf0 "perf.data") at util/session.c:1419
    ctrliq#24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
    --Type <RET> for more, q to quit, c to continue without paging--
    quit
        prog=prog@entry=0x7fff9df81220) at util/session.c:2132
    ctrliq#25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
        at util/session.c:2181
    ctrliq#26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
    ctrliq#27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
    ctrliq#28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
    ctrliq#29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
    ctrliq#30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
        at perf.c:351
    ctrliq#31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
    ctrliq#32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
    ctrliq#33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556

The hangup happens because nothing in` perf` or `elfutils` checks if a
mapped file is easily readable.

The change conservatively skips all non-regular files.

Signed-off-by: Sergei Trofimovich <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 2, 2025
Symbolize stack traces by creating a live machine. Add this
functionality to dump_stack and switch dump_stack users to use
it. Switch TUI to use it. Add stack traces to the child test function
which can be useful to diagnose blocked code.

Example output:
```
$ perf test -vv PERF_RECORD_
...
  7: PERF_RECORD_* events & perf_sample fields:
  7: PERF_RECORD_* events & perf_sample fields                       : Running (1 active)
^C
Signal (2) while running tests.
Terminating tests with the same signal
Internal test harness failure. Completing any started tests:
:  7: PERF_RECORD_* events & perf_sample fields:

---- unexpected signal (2) ----
    #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
    #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
    #2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
    #3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
    #4 0x7fc12fef1393 in __nanosleep nanosleep.c:26
    #5 0x7fc12ff02d68 in __sleep sleep.c:55
    #6 0x55788c63196b in test__PERF_RECORD perf-record.c:0
    ctrliq#7 0x55788c620fb0 in run_test_child builtin-test.c:0
    ctrliq#8 0x55788c5bd18d in start_command run-command.c:127
    ctrliq#9 0x55788c621ef3 in __cmd_test builtin-test.c:0
    ctrliq#10 0x55788c6225bf in cmd_test ??:0
    ctrliq#11 0x55788c5afbd0 in run_builtin perf.c:0
    ctrliq#12 0x55788c5afeeb in handle_internal_command perf.c:0
    ctrliq#13 0x55788c52b383 in main ??:0
    ctrliq#14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
    ctrliq#15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
    ctrliq#16 0x55788c52b9d1 in _start ??:0

---- unexpected signal (2) ----
    #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
    #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
    #2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45
    #3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26
    #4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36
    #5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0
    #6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
    ctrliq#7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
    ctrliq#8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
    ctrliq#9 0x7fc12fef1393 in __nanosleep nanosleep.c:26
    ctrliq#10 0x7fc12ff02d68 in __sleep sleep.c:55
    ctrliq#11 0x55788c63196b in test__PERF_RECORD perf-record.c:0
    ctrliq#12 0x55788c620fb0 in run_test_child builtin-test.c:0
    ctrliq#13 0x55788c5bd18d in start_command run-command.c:127
    ctrliq#14 0x55788c621ef3 in __cmd_test builtin-test.c:0
    ctrliq#15 0x55788c6225bf in cmd_test ??:0
    ctrliq#16 0x55788c5afbd0 in run_builtin perf.c:0
    ctrliq#17 0x55788c5afeeb in handle_internal_command perf.c:0
    ctrliq#18 0x55788c52b383 in main ??:0
    ctrliq#19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
    ctrliq#20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
    ctrliq#21 0x55788c52b9d1 in _start ??:0
  7: PERF_RECORD_* events & perf_sample fields                       : Skip (permissions)
```

Signed-off-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 2, 2025
Calling perf top with branch filters enabled on Intel CPU's
with branch counters logging (A.K.A LBR event logging [1]) support
results in a segfault.

$ perf top  -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter
...
Thread 27 "perf" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffafff76c0 (LWP 949003)]
perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
653			*width = env->cpu_pmu_caps ? env->br_cntr_width :
(gdb) bt
 #0  perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
 #1  0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345
 #2  0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389
 #3  0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422
 #4  0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850
 #5  0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737
 #6  0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359
 ctrliq#7  0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845
 ctrliq#8  0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211
 ctrliq#9  0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245
 ctrliq#10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
 ctrliq#11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342
 ctrliq#12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120
 ctrliq#13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448
 ctrliq#14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

The cause is that perf_env__find_br_cntr_info tries to access a
null pointer pmu_caps in the perf_env struct. A similar issue exists
for homogeneous core systems which use the cpu_pmu_caps structure.

Fix this by populating cpu_pmu_caps and pmu_caps structures with
values from sysfs when calling perf top with branch stack sampling
enabled.

[1], LBR event logging introduced here:
https://lore.kernel.org/all/[email protected]/

Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Thomas Falcon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 2, 2025
…terface

[ Upstream commit a90b2a1 ]

collect_md property on xfrm interfaces can only be set on device creation,
thus xfrmi_changelink() should fail when called on such interfaces.

The check to enforce this was done only in the case where the xi was
returned from xfrmi_locate() which doesn't look for the collect_md
interface, and thus the validation was never reached.

Calling changelink would thus errornously place the special interface xi
in the xfrmi_net->xfrmi hash, but since it also exists in the
xfrmi_net->collect_md_xfrmi pointer it would lead to a double free when
the net namespace was taken down [1].

Change the check to use the xi from netdev_priv which is available earlier
in the function to prevent changes in xfrm collect_md interfaces.

[1] resulting oops:
[    8.516540] kernel BUG at net/core/dev.c:12029!
[    8.516552] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[    8.516559] CPU: 0 UID: 0 PID: 12 Comm: kworker/u80:0 Not tainted 6.15.0-virtme #5 PREEMPT(voluntary)
[    8.516565] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[    8.516569] Workqueue: netns cleanup_net
[    8.516579] RIP: 0010:unregister_netdevice_many_notify+0x101/0xab0
[    8.516590] Code: 90 0f 0b 90 48 8b b0 78 01 00 00 48 8b 90 80 01 00 00 48 89 56 08 48 89 32 4c 89 80 78 01 00 00 48 89 b8 80 01 00 00 eb ac 90 <0f> 0b 48 8b 45 00 4c 8d a0 88 fe ff ff 48 39 c5 74 5c 41 80 bc 24
[    8.516593] RSP: 0018:ffffa93b8006bd30 EFLAGS: 00010206
[    8.516598] RAX: ffff98fe4226e000 RBX: ffffa93b8006bd58 RCX: ffffa93b8006bc60
[    8.516601] RDX: 0000000000000004 RSI: 0000000000000000 RDI: dead000000000122
[    8.516603] RBP: ffffa93b8006bdd8 R08: dead000000000100 R09: ffff98fe4133c100
[    8.516605] R10: 0000000000000000 R11: 00000000000003d2 R12: ffffa93b8006be00
[    8.516608] R13: ffffffff96c1a510 R14: ffffffff96c1a510 R15: ffffa93b8006be00
[    8.516615] FS:  0000000000000000(0000) GS:ffff98fee73b7000(0000) knlGS:0000000000000000
[    8.516619] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.516622] CR2: 00007fcd2abd0700 CR3: 000000003aa40000 CR4: 0000000000752ef0
[    8.516625] PKRU: 55555554
[    8.516627] Call Trace:
[    8.516632]  <TASK>
[    8.516635]  ? rtnl_is_locked+0x15/0x20
[    8.516641]  ? unregister_netdevice_queue+0x29/0xf0
[    8.516650]  ops_undo_list+0x1f2/0x220
[    8.516659]  cleanup_net+0x1ad/0x2e0
[    8.516664]  process_one_work+0x160/0x380
[    8.516673]  worker_thread+0x2aa/0x3c0
[    8.516679]  ? __pfx_worker_thread+0x10/0x10
[    8.516686]  kthread+0xfb/0x200
[    8.516690]  ? __pfx_kthread+0x10/0x10
[    8.516693]  ? __pfx_kthread+0x10/0x10
[    8.516697]  ret_from_fork+0x82/0xf0
[    8.516705]  ? __pfx_kthread+0x10/0x10
[    8.516709]  ret_from_fork_asm+0x1a/0x30
[    8.516718]  </TASK>

Fixes: abc340b ("xfrm: interface: support collect metadata mode")
Reported-by: Lonial Con <[email protected]>
Signed-off-by: Eyal Birger <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
jira VULN-8190
cve CVE-2024-26598
commit-author Oliver Upton <[email protected]>
commit ad362fe

There is a potential UAF scenario in the case of an LPI translation
cache hit racing with an operation that invalidates the cache, such
as a DISCARD ITS command. The root of the problem is that
vgic_its_check_cache() does not elevate the refcount on the vgic_irq
before dropping the lock that serializes refcount changes.

Have vgic_its_check_cache() raise the refcount on the returned vgic_irq
and add the corresponding decrement after queueing the interrupt.

	Cc: [email protected]
	Signed-off-by: Oliver Upton <[email protected]>
	Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit ad362fe)
	Signed-off-by: Marcin Wcisło <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 5, 2025
profile allocation is wrongly setting the number of entries on the
rules vector before any ruleset is assigned. If profile allocation
fails between ruleset allocation and assigning the first ruleset,
free_ruleset() will be called with a null pointer resulting in an
oops.

[  107.350226] kernel BUG at mm/slub.c:545!
[  107.350912] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  107.351447] CPU: 1 UID: 0 PID: 27 Comm: ksoftirqd/1 Not tainted 6.14.6-hwe-rlee287-dev+ #5
[  107.353279] Hardware name:[   107.350218] -QE-----------[ cutMU here ]--------- Ub---
[  107.3502untu26] kernel BUG a 24t mm/slub.c:545.!04 P
[  107.350912]C ( Oops: invalid oi4pcode: 0000 [#1]40 PREEMPT SMP NOPFXTI
 + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  107.356054] RIP: 0010:__slab_free+0x152/0x340
[  107.356444] Code: 00 4c 89 ff e8 0f ac df 00 48 8b 14 24 48 8b 4c 24 20 48 89 44 24 08 48 8b 03 48 c1 e8 09 83 e0 01 88 44 24 13 e9 71 ff ff ff <0f> 0b 41 f7 44 24 08 87 04 00 00 75 b2 eb a8 41 f7 44 24 08 87 04
[  107.357856] RSP: 0018:ffffad4a800fbbb0 EFLAGS: 00010246
[  107.358937] RAX: ffff97ebc2a88e70 RBX: ffffd759400aa200 RCX: 0000000000800074
[  107.359976] RDX: ffff97ebc2a88e60 RSI: ffffd759400aa200 RDI: ffffad4a800fbc20
[  107.360600] RBP: ffffad4a800fbc50 R08: 0000000000000001 R09: ffffffff86f02cf2
[  107.361254] R10: 0000000000000000 R11: 0000000000000000 R12: ffff97ecc0049400
[  107.361934] R13: ffff97ebc2a88e60 R14: ffff97ecc0049400 R15: 0000000000000000
[  107.362597] FS:  0000000000000000(0000) GS:ffff97ecfb200000(0000) knlGS:0000000000000000
[  107.363332] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  107.363784] CR2: 000061c9545ac000 CR3: 0000000047aa6000 CR4: 0000000000750ef0
[  107.364331] PKRU: 55555554
[  107.364545] Call Trace:
[  107.364761]  <TASK>
[  107.364931]  ? local_clock+0x15/0x30
[  107.365219]  ? srso_alias_return_thunk+0x5/0xfbef5
[  107.365593]  ? kfree_sensitive+0x32/0x70
[  107.365900]  kfree+0x29d/0x3a0
[  107.366144]  ? srso_alias_return_thunk+0x5/0xfbef5
[  107.366510]  ? local_clock_noinstr+0xe/0xd0
[  107.366841]  ? srso_alias_return_thunk+0x5/0xfbef5
[  107.367209]  kfree_sensitive+0x32/0x70
[  107.367502]  aa_free_profile.part.0+0xa2/0x400
[  107.367850]  ? rcu_do_batch+0x1e6/0x5e0
[  107.368148]  aa_free_profile+0x23/0x60
[  107.368438]  label_free_switch+0x4c/0x80
[  107.368751]  label_free_rcu+0x1c/0x50
[  107.369038]  rcu_do_batch+0x1e8/0x5e0
[  107.369324]  ? rcu_do_batch+0x157/0x5e0
[  107.369626]  rcu_core+0x1b0/0x2f0
[  107.369888]  rcu_core_si+0xe/0x20
[  107.370156]  handle_softirqs+0x9b/0x3d0
[  107.370460]  ? smpboot_thread_fn+0x26/0x210
[  107.370790]  run_ksoftirqd+0x3a/0x70
[  107.371070]  smpboot_thread_fn+0xf9/0x210
[  107.371383]  ? __pfx_smpboot_thread_fn+0x10/0x10
[  107.371746]  kthread+0x10d/0x280
[  107.372010]  ? __pfx_kthread+0x10/0x10
[  107.372310]  ret_from_fork+0x44/0x70
[  107.372655]  ? __pfx_kthread+0x10/0x10
[  107.372974]  ret_from_fork_asm+0x1a/0x30
[  107.373316]  </TASK>
[  107.373505] Modules linked in: af_packet_diag mptcp_diag tcp_diag udp_diag raw_diag inet_diag snd_seq_dummy snd_hrtimer snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore qrtr binfmt_misc intel_rapl_msr intel_rapl_common kvm_amd ccp kvm irqbypass polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd i2c_piix4 i2c_smbus input_leds joydev sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci dmi_sysfs qemu_fw_cfg ip_tables x_tables autofs4 hid_generic usbhid hid psmouse serio_raw floppy bochs pata_acpi
[  107.379086] ---[ end trace 0000000000000000 ]---

Don't set the count until a ruleset is actually allocated and
guard against free_ruleset() being called with a null pointer.

Reported-by: Ryan Lee <[email protected]>
Fixes: 217af7e ("apparmor: refactor profile rules and attachments")
Signed-off-by: John Johansen <[email protected]>
jainanmol84 and others added 5 commits August 6, 2025 21:46
jira VULN-8056
cve CVE-2023-1652
commit-author Xingyuan Mo <[email protected]>
commit e6cf91b

If signal_pending() returns true, schedule_timeout() will not be executed,
causing the waiting task to remain in the wait queue.
Fixed by adding a call to finish_wait(), which ensures that the waiting
task will always be removed from the wait queue.

Fixes: f4e44b3 ("NFSD: delay unmount source's export after inter-server copy completed.")
	Signed-off-by: Xingyuan Mo <[email protected]>
	Reviewed-by: Jeff Layton <[email protected]>
	Signed-off-by: Chuck Lever <[email protected]>
(cherry picked from commit e6cf91b)
	Signed-off-by: Anmol Jain <[email protected]>
jira VULN-41185
cve CVE-2024-56601
commit-author Ignat Korchagin <[email protected]>
commit 9365fa5

sock_init_data() attaches the allocated sk object to the provided sock
object. If inet_create() fails later, the sk object is freed, but the
sock object retains the dangling pointer, which may create use-after-free
later.

Clear the sk pointer in the sock object on error.

	Signed-off-by: Ignat Korchagin <[email protected]>
	Reviewed-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Eric Dumazet <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 9365fa5)
	Signed-off-by: Anmol Jain <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 13, 2025
JIRA: https://issues.redhat.com/browse/RHEL-15711
Upstream status: https://git.kernel.org/pub/scm/virt/kvm/kvm.git

Intel TDX protects guest VM's from malicious host and certain physical
attacks.  TDX introduces a new operation mode, Secure Arbitration Mode
(SEAM) to isolate and protect guest VM's.  A TDX guest VM runs in SEAM and,
unlike VMX, direct control and interaction with the guest by the host VMM
is not possible.  Instead, Intel TDX Module, which also runs in SEAM,
provides a SEAMCALL API.

The SEAMCALL that provides the ability to enter a guest is TDH.VP.ENTER.
The TDX Module processes TDH.VP.ENTER, and enters the guest via VMX
VMLAUNCH/VMRESUME instructions.  When a guest VM-exit requires host VMM
interaction, the TDH.VP.ENTER SEAMCALL returns to the host VMM (KVM).

Add tdh_vp_enter() to wrap the SEAMCALL invocation of TDH.VP.ENTER;
tdh_vp_enter() needs to be noinstr because VM entry in KVM is noinstr
as well, which is for two reasons:
* marking the area as CT_STATE_GUEST via guest_state_enter_irqoff() and
  guest_state_exit_irqoff()
* IRET must be avoided between VM-exit and NMI handling, in order to
  avoid prematurely releasing the NMI inhibit.

TDH.VP.ENTER is different from other SEAMCALLs in several ways: it
uses more arguments, and after it returns some host state may need to be
restored.  Therefore tdh_vp_enter() uses __seamcall_saved_ret() instead of
__seamcall_ret(); since it is the only caller of __seamcall_saved_ret(),
it can be made noinstr also.

TDH.VP.ENTER arguments are passed through General Purpose Registers (GPRs).
For the special case of the TD guest invoking TDG.VP.VMCALL, nearly any GPR
can be used, as well as XMM0 to XMM15. Notably, RBP is not used, and Linux
mandates the TDX Module feature NO_RBP_MOD, which is enforced elsewhere.
Additionally, XMM registers are not required for the existing Guest
Hypervisor Communication Interface and are handled by existing KVM code
should they be modified by the guest.

There are 2 input formats and 5 output formats for TDH.VP.ENTER arguments.
Input #1 : Initial entry or following a previous async. TD Exit
Input #2 : Following a previous TDCALL(TDG.VP.VMCALL)
Output #1 : On Error (No TD Entry)
Output #2 : Async. Exits with a VMX Architectural Exit Reason
Output #3 : Async. Exits with a non-VMX TD Exit Status
Output #4 : Async. Exits with Cross-TD Exit Details
Output #5 : On TDCALL(TDG.VP.VMCALL)

Currently, to keep things simple, the wrapper function does not attempt
to support different formats, and just passes all the GPRs that could be
used.  The GPR values are held by KVM in the area set aside for guest
GPRs.  KVM code uses the guest GPR area (vcpu->arch.regs[]) to set up for
or process results of tdh_vp_enter().

Therefore changing tdh_vp_enter() to use more complex argument formats
would also alter the way KVM code interacts with tdh_vp_enter().

Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Message-ID: <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit 69e23fa)
Signed-off-by: Paolo Bonzini <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 13, 2025
JIRA: https://issues.redhat.com/browse/RHEL-95318

commit c5b6aba
Author: Maxim Levitsky <[email protected]>
Date:   Mon May 12 14:04:02 2025 -0400

    locking/mutex: implement mutex_trylock_nested

    Despite the fact that several lockdep-related checks are skipped when
    calling trylock* versions of the locking primitives, for example
    mutex_trylock, each time the mutex is acquired, a held_lock is still
    placed onto the lockdep stack by __lock_acquire() which is called
    regardless of whether the trylock* or regular locking API was used.

    This means that if the caller successfully acquires more than
    MAX_LOCK_DEPTH locks of the same class, even when using mutex_trylock,
    lockdep will still complain that the maximum depth of the held lock stack
    has been reached and disable itself.

    For example, the following error currently occurs in the ARM version
    of KVM, once the code tries to lock all vCPUs of a VM configured with more
    than MAX_LOCK_DEPTH vCPUs, a situation that can easily happen on modern
    systems, where having more than 48 CPUs is common, and it's also common to
    run VMs that have vCPU counts approaching that number:

    [  328.171264] BUG: MAX_LOCK_DEPTH too low!
    [  328.175227] turning off the locking correctness validator.
    [  328.180726] Please attach the output of /proc/lock_stat to the bug report
    [  328.187531] depth: 48  max: 48!
    [  328.190678] 48 locks held by qemu-kvm/11664:
    [  328.194957]  #0: ffff800086de5ba0 (&kvm->lock){+.+.}-{3:3}, at: kvm_ioctl_create_device+0x174/0x5b0
    [  328.204048]  #1: ffff0800e78800b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.212521]  #2: ffff07ffeee51e98 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.220991]  #3: ffff0800dc7d80b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.229463]  #4: ffff07ffe0c980b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.237934]  #5: ffff0800a3883c78 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.246405]  #6: ffff07fffbe480b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0

    Luckily, in all instances that require locking all vCPUs, the
    'kvm->lock' is taken a priori, and that fact makes it possible to use
    the little known feature of lockdep, called a 'nest_lock', to avoid this
    warning and subsequent lockdep self-disablement.

    The action of 'nested lock' being provided to lockdep's lock_acquire(),
    causes the lockdep to detect that the top of the held lock stack contains
    a lock of the same class and then increment its reference counter instead
    of pushing a new held_lock item onto that stack.

    See __lock_acquire for more information.

    Signed-off-by: Maxim Levitsky <[email protected]>
    Acked-by: Peter Zijlstra (Intel) <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Paolo Bonzini <[email protected]>

Signed-off-by: Maxim Levitsky <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 13, 2025
JIRA: https://issues.redhat.com/browse/RHEL-95318

commit b586c5d
Author: Maxim Levitsky <[email protected]>
Date:   Mon May 12 14:04:06 2025 -0400

    KVM: arm64: use kvm_trylock_all_vcpus when locking all vCPUs

    Use kvm_trylock_all_vcpus instead of a custom implementation when locking
    all vCPUs of a VM, to avoid triggering a lockdep warning, in the case in
    which the VM is configured to have more than MAX_LOCK_DEPTH vCPUs.

    This fixes the following false lockdep warning:

    [  328.171264] BUG: MAX_LOCK_DEPTH too low!
    [  328.175227] turning off the locking correctness validator.
    [  328.180726] Please attach the output of /proc/lock_stat to the bug report
    [  328.187531] depth: 48  max: 48!
    [  328.190678] 48 locks held by qemu-kvm/11664:
    [  328.194957]  #0: ffff800086de5ba0 (&kvm->lock){+.+.}-{3:3}, at: kvm_ioctl_create_device+0x174/0x5b0
    [  328.204048]  #1: ffff0800e78800b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.212521]  #2: ffff07ffeee51e98 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.220991]  #3: ffff0800dc7d80b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.229463]  #4: ffff07ffe0c980b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.237934]  #5: ffff0800a3883c78 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
    [  328.246405]  #6: ffff07fffbe480b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0

    Suggested-by: Paolo Bonzini <[email protected]>
    Signed-off-by: Maxim Levitsky <[email protected]>
    Acked-by: Marc Zyngier <[email protected]>
    Acked-by: Peter Zijlstra (Intel) <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Paolo Bonzini <[email protected]>

Signed-off-by: Maxim Levitsky <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
github-actions bot pushed a commit that referenced this pull request Aug 13, 2025
…e probe

JIRA: https://issues.redhat.com/browse/RHEL-90130

commit dcaeeb8
Author: Antonios Salios <[email protected]>
Date:   Fri Apr 25 13:17:45 2025 +0200

    can: m_can: m_can_class_allocate_dev(): initialize spin lock on device probe

    The spin lock tx_handling_spinlock in struct m_can_classdev is not
    being initialized. This leads the following spinlock bad magic
    complaint from the kernel, eg. when trying to send CAN frames with
    cansend from can-utils:

    | BUG: spinlock bad magic on CPU#0, cansend/95
    |  lock: 0xff60000002ec1010, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
    | CPU: 0 UID: 0 PID: 95 Comm: cansend Not tainted 6.15.0-rc3-00032-ga79be02bba5c #5 NONE
    | Hardware name: MachineWare SIM-V (DT)
    | Call Trace:
    | [<ffffffff800133e0>] dump_backtrace+0x1c/0x24
    | [<ffffffff800022f2>] show_stack+0x28/0x34
    | [<ffffffff8000de3e>] dump_stack_lvl+0x4a/0x68
    | [<ffffffff8000de70>] dump_stack+0x14/0x1c
    | [<ffffffff80003134>] spin_dump+0x62/0x6e
    | [<ffffffff800883ba>] do_raw_spin_lock+0xd0/0x142
    | [<ffffffff807a6fcc>] _raw_spin_lock_irqsave+0x20/0x2c
    | [<ffffffff80536dba>] m_can_start_xmit+0x90/0x34a
    | [<ffffffff806148b0>] dev_hard_start_xmit+0xa6/0xee
    | [<ffffffff8065b730>] sch_direct_xmit+0x114/0x292
    | [<ffffffff80614e2a>] __dev_queue_xmit+0x3b0/0xaa8
    | [<ffffffff8073b8fa>] can_send+0xc6/0x242
    | [<ffffffff8073d1c0>] raw_sendmsg+0x1a8/0x36c
    | [<ffffffff805ebf06>] sock_write_iter+0x9a/0xee
    | [<ffffffff801d06ea>] vfs_write+0x184/0x3a6
    | [<ffffffff801d0a88>] ksys_write+0xa0/0xc0
    | [<ffffffff801d0abc>] __riscv_sys_write+0x14/0x1c
    | [<ffffffff8079ebf8>] do_trap_ecall_u+0x168/0x212
    | [<ffffffff807a830a>] handle_exception+0x146/0x152

    Initializing the spin lock in m_can_class_allocate_dev solves that
    problem.

    Fixes: 1fa80e2 ("can: m_can: Introduce a tx_fifo_in_flight counter")
    Signed-off-by: Antonios Salios <[email protected]>
    Reviewed-by: Vincent Mailhol <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Reviewed-by: Markus Schneider-Pargmann <[email protected]>
    Signed-off-by: Marc Kleine-Budde <[email protected]>

Signed-off-by: Radu Rendec <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants