Skip to content

Bpf/optimized usdt ci #9219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 215 commits into
base: bpf-next_base
Choose a base branch
from

Conversation

olsajiri
Copy link
Contributor

test

Peter Zijlstra and others added 30 commits June 11, 2025 11:20
Due to the weird Makefile setup of sched the various files do not
compile as stand alone units. The new generation of editors are trying
to do just this -- mostly to offer fancy things like completions but
also better syntax highlighting and code navigation.

Specifically, I've been playing around with neovim and clangd.

Setting up clangd on the kernel source is a giant pain in the arse
(this really should be improved), but once you do manage, you run into
dumb stuff like the above.

Fix up the scheduler files to at least pretend to work.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Tested-by: Juri Lelli <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Correct "l" to "vl_i".

Signed-off-by: wang wei <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
irq_domain_create_simple() takes a fwnode as the first argument. It can be
extracted from struct device using the dev_fwnode() helper instead of using
of_node with of_fwnode_handle().

So use the dev_fwnode() helper.

Signed-off-by: Jiri Slaby (SUSE) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Make the logic easier to follow:

  - Remove the final return statement, which is never reached, and move the
    actual walk-terminating return statement out of the do-while loop.

  - Remove the else-clause to reduce indentation. If a non-lonely group is
    encountered during the walk, the loop is immediately terminated with a
    return statement anyway; no need for an else.

Signed-off-by: Petr Tesarik <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
…ogroup.[ch]

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…ck.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

	#if CONFIG_FOO
	...
	#else /* !CONFIG_FOO: */
	...
	#endif /* !CONFIG_FOO */

 - Apply this simplification:

	-#if defined(CONFIG_FOO)
	+#ifdef CONFIG_FOO

 - Fix whitespace noise.

 - Use vertical alignment to better visualize nested #ifdef blocks,
   where appropriate:

	#ifdef CONFIG_FOO
	# ifdef CONFIG_BAR
	...
	# endif
	#endif

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…freq_schedutil.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…pri.h

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…time.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…dline.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…ug.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…davg.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…t.[ch]

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…ed.h

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…ts.[ch]

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…calls.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…ology.c

 - Use the standard #ifdef marker format for larger blocks,
   where appropriate:

        #if CONFIG_FOO
        ...
        #else /* !CONFIG_FOO: */
        ...
        #endif /* !CONFIG_FOO */

 - Fix whitespace noise and other inconsistencies.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
…t_domain and sched_domains_mutex

Simplify the scheduler by making CONFIG_SMP=y primitives and data
structures unconditional.

Unconditionally build kernel/sched/topology.c and the main sched-domains
locking primitives.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Simplify the scheduler by making CONFIG_SMP=y primitives and data
structures unconditional.

Introduce transitory wrappers for functionality not yet converted to SMP.

Note that this patch is pretty large, because there's no clear separation
between various aspects of the SMP scheduler, it's basically a huge block
of #ifdef CONFIG_SMP. A fair amount of it has to be switched on for it to
boot and work on UP systems.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Simplify the scheduler by making the CONFIG_SMP=y primitives
of is_percpu_thread() and scheduler_ipi() unconditional.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Simplify the scheduler by making CONFIG_SMP=y data structure
of rq->hrtick_csd unconditional.

Adjust hrtick_start() accordingly, which was split due to the
::hrtick_csd asymmetry and use the SMP version there too.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Simplify the scheduler by making CONFIG_SMP=y logic within
try_to_wake_up() unconditional.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Simplify the scheduler by making CONFIG_SMP=y code in
__task_needs_rq_lock() unconditional.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Simplify the scheduler by making CONFIG_SMP=y code in wake_up_new_task()
unconditional.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Simplify the scheduler making CONFIG_SMP=y sched_exec()
code unconditional.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shrikanth Hegde <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
olsajiri added 17 commits June 28, 2025 11:51
Adding nbytes argument to uprobe_write and related functions as
preparation for writing whole instructions in following changes.

Also renaming opcode arguments to insn, which seems to fit better.

Acked-by: Andrii Nakryiko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
…code

The uprobe_write has special path to restore the original page when we
write original instruction back. This happens when uprobe_write detects
that we want to write anything else but breakpoint instruction.

Moving the detection away and passing it to uprobe_write as argument,
so it's possible to write different instructions (other than just
breakpoint and rest).

Acked-by: Andrii Nakryiko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Making update_ref_ctr call in uprobe_write conditional based
on do_ref_ctr argument. This way we can use uprobe_write for
instruction update without doing ref_ctr_offset update.

Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Adding support to add special mapping for user space trampoline with
following functions:

  uprobe_trampoline_get - find or add uprobe_trampoline
  uprobe_trampoline_put - remove or destroy uprobe_trampoline

The user space trampoline is exported as arch specific user space special
mapping through tramp_mapping, which is initialized in following changes
with new uprobe syscall.

The uprobe trampoline needs to be callable/reachable from the probed address,
so while searching for available address we use is_reachable_by_call function
to decide if the uprobe trampoline is callable from the probe address.

All uprobe_trampoline objects are stored in uprobes_state object and are
cleaned up when the process mm_struct goes down. Adding new arch hooks
for that, because this change is x86_64 specific.

Locking is provided by callers in following changes.

Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Adding new uprobe syscall that calls uprobe handlers for given
'breakpoint' address.

The idea is that the 'breakpoint' address calls the user space
trampoline which executes the uprobe syscall.

The syscall handler reads the return address of the initial call
to retrieve the original 'breakpoint' address. With this address
we find the related uprobe object and call its consumers.

Adding the arch_uprobe_trampoline_mapping function that provides
uprobe trampoline mapping. This mapping is backed with one global
page initialized at __init time and shared by the all the mapping
instances.

We do not allow to execute uprobe syscall if the caller is not
from uprobe trampoline mapping.

The uprobe syscall ensures the consumer (bpf program) sees registers
values in the state before the trampoline was called.

Acked-by: Andrii Nakryiko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Putting together all the previously added pieces to support optimized
uprobes on top of 5-byte nop instruction.

The current uprobe execution goes through following:

  - installs breakpoint instruction over original instruction
  - exception handler hit and calls related uprobe consumers
  - and either simulates original instruction or does out of line single step
    execution of it
  - returns to user space

The optimized uprobe path does following:

  - checks the original instruction is 5-byte nop (plus other checks)
  - adds (or uses existing) user space trampoline with uprobe syscall
  - overwrites original instruction (5-byte nop) with call to user space
    trampoline
  - the user space trampoline executes uprobe syscall that calls related uprobe
    consumers
  - trampoline returns back to next instruction

This approach won't speed up all uprobes as it's limited to using nop5 as
original instruction, but we plan to use nop5 as USDT probe instruction
(which currently uses single byte nop) and speed up the USDT probes.

The arch_uprobe_optimize triggers the uprobe optimization and is called after
first uprobe hit. I originally had it called on uprobe installation but then
it clashed with elf loader, because the user space trampoline was added in a
place where loader might need to put elf segments, so I decided to do it after
first uprobe hit when loading is done.

The uprobe is un-optimized in arch specific set_orig_insn call.

The instruction overwrite is x86 arch specific and needs to go through 3 updates:
(on top of nop5 instruction)

  - write int3 into 1st byte
  - write last 4 bytes of the call instruction
  - update the call instruction opcode

And cleanup goes though similar reverse stages:

  - overwrite call opcode with breakpoint (int3)
  - write last 4 bytes of the nop5 instruction
  - write the nop5 first instruction byte

We do not unmap and release uprobe trampoline when it's no longer needed,
because there's no easy way to make sure none of the threads is still
inside the trampoline. But we do not waste memory, because there's just
single page for all the uprobe trampoline mappings.

We do waste frame on page mapping for every 4GB by keeping the uprobe
trampoline page mapped, but that seems ok.

We take the benefit from the fact that set_swbp and set_orig_insn are
called under mmap_write_lock(mm), so we can use the current instruction
as the state the uprobe is in - nop5/breakpoint/call trampoline -
and decide the needed action (optimize/un-optimize) based on that.

Attaching the speed up from benchs/run_bench_uprobes.sh script:

current:
        usermode-count :  152.604 ± 0.044M/s
        syscall-count  :   13.359 ± 0.042M/s
-->     uprobe-nop     :    3.229 ± 0.002M/s
        uprobe-push    :    3.086 ± 0.004M/s
        uprobe-ret     :    1.114 ± 0.004M/s
        uprobe-nop5    :    1.121 ± 0.005M/s
        uretprobe-nop  :    2.145 ± 0.002M/s
        uretprobe-push :    2.070 ± 0.001M/s
        uretprobe-ret  :    0.931 ± 0.001M/s
        uretprobe-nop5 :    0.957 ± 0.001M/s

after the change:
        usermode-count :  152.448 ± 0.244M/s
        syscall-count  :   14.321 ± 0.059M/s
        uprobe-nop     :    3.148 ± 0.007M/s
        uprobe-push    :    2.976 ± 0.004M/s
        uprobe-ret     :    1.068 ± 0.003M/s
-->     uprobe-nop5    :    7.038 ± 0.007M/s
        uretprobe-nop  :    2.109 ± 0.004M/s
        uretprobe-push :    2.035 ± 0.001M/s
        uretprobe-ret  :    0.908 ± 0.001M/s
        uretprobe-nop5 :    3.377 ± 0.009M/s

I see bit more speed up on Intel (above) compared to AMD. The big nop5
speed up is partly due to emulating nop5 and partly due to optimization.

The key speed up we do this for is the USDT switch from nop to nop5:
        uprobe-nop     :    3.148 ± 0.007M/s
        uprobe-nop5    :    7.038 ± 0.007M/s

Acked-by: Andrii Nakryiko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Importing usdt.h from libbpf/usdt project.

Suggested-by: Andrii Nakryiko <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Adding __test_uprobe_syscall with non x86_64 stub to execute all the tests,
so we don't need to keep adding non x86_64 stub functions for new tests.

Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
…multi

Renaming uprobe_syscall_executed prog to test_uretprobe_multi
to fit properly in the following changes that add more programs.

Plus adding pid filter and increasing executed variable.

Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Adding tests for optimized uprobe/usdt probes.

Checking that we get expected trampoline and attached bpf programs
get executed properly.

Signed-off-by: Jiri Olsa <[email protected]>
Adding test that makes sure parallel execution of the uprobe and
attach/detach of optimized uprobe on it works properly.

By default the test runs for 500ms, which is adjustable by using
BPF_SELFTESTS_UPROBE_SYSCALL_RACE_MSEC env variable.

Signed-off-by: Jiri Olsa <[email protected]>
Make sure that calling uprobe syscall from outside uprobe trampoline
results in sigill signal.

Signed-off-by: Jiri Olsa <[email protected]>
Adding optimized usdt variant for basic usdt test to check that
usdt arguments are properly passed in optimized code path.

Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Changing uretprobe_regs_trigger to allow the test for both
uprobe and uretprobe and renaming it to uprobe_regs_equal.

We check that both uprobe and uretprobe probes (bpf programs)
see expected registers with few exceptions.

Signed-off-by: Jiri Olsa <[email protected]>
…robe

Changing the test_uretprobe_regs_change test to test both uprobe
and uretprobe by adding entry consumer handler to the testmod
and making it to change one of the registers.

Making sure that changed values both uprobe and uretprobe handlers
propagate to the user space.

Signed-off-by: Jiri Olsa <[email protected]>
Adding uprobe as another exception to the seccomp filter alongside
with the uretprobe syscall.

Same as the uretprobe the uprobe syscall is installed by kernel as
replacement for the breakpoint exception and is limited to x86_64
arch and isn't expected to ever be supported in i386.

Cc: Kees Cook <[email protected]>
Cc: Eyal Birger <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Adding uprobe checks into the current uretprobe tests.

All the related tests are now executed with attached uprobe
or uretprobe or without any probe.

Renaming the test fixture to uprobe, because it seems better.

Cc: Kees Cook <[email protected]>
Cc: Eyal Birger <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
@olsajiri olsajiri force-pushed the bpf/optimized_usdt_ci branch 4 times, most recently from e337b63 to 8745062 Compare July 1, 2025 12:00
@olsajiri olsajiri force-pushed the bpf/optimized_usdt_ci branch from 8745062 to 201471b Compare July 1, 2025 14:15
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 2917a29 to cbe2f3c Compare July 1, 2025 19:46
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 2 times, most recently from a6d5592 to 9362e73 Compare July 1, 2025 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.