Use compiler builtins to detect "simple common cases" in pp_add, pp_subtract, and pp_multiply #23503

t-a-k · 2025-07-29T16:38:37Z

This will hopefully make the code faster and smaller, and allow more cases to be handled as "simple common cases".

Note that this change uses the HAS_BUILTIN_{ADD,SUB,MUL}_OVERFLOW macros, which are already defined in config.h but do not seem to be used by existing code.

The behavior should be the same before and after this change.

This set of changes requires a perldelta entry, and it is included.
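For readers unfamiliar with the builtins, the idea can be sketched as below. This is only an illustration, not the code in the patch: the helper name `iv_add_overflow` is hypothetical, `int64_t` stands in for IV, and the compiler check stands in for the HAS_BUILTIN_ADD_OVERFLOW macro from config.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not the name used in the patch).
 * Returns true if a + b does not fit in int64_t. */
static bool iv_add_overflow(int64_t a, int64_t b, int64_t *result)
{
#if defined(__GNUC__) || defined(__clang__)
    /* Assumption: stands in for "#ifdef HAS_BUILTIN_ADD_OVERFLOW".
     * The builtin stores the wrapped sum and reports overflow. */
    return __builtin_add_overflow(a, b, result);
#else
    /* Portable fallback: test the bounds before adding, so the
     * signed addition itself can never overflow. */
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b))
        return true;
    *result = a + b;
    return false;
#endif
}
```

A caller would take the fast path when the helper returns false and fall through to the general (NV or UV) handling otherwise. Note that only the builtin branch writes `*result` on overflow.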

t/op/64bitint.t: Add tests to exercise "simple common cases". Note that these tests should pass even before this change.

tonycoz · 2025-07-30T01:03:26Z

This breaks Win32, which doesn't enable the __builtin_add_... etc builtins even for gcc.

I suspect it's due to long being 32 bits even on 64-bit Win32, but I haven't tried to debug it.

… in UV

If the C compiler does not provide __builtin_mul_overflow, S_uv_mul_overflow is implemented with a fallback "long multiplication" algorithm, but it had a bug: the elemental multiplications were done in unsigned long precision instead of UV precision. This leads to wrong results when unsigned long is narrower than UV (for example, -Duse64bitint on a 32-bit platform).
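A minimal sketch of such a "long multiplication" fallback follows, assuming a 64-bit UV represented by `uint64_t`. The function name `uv_mul_overflow_sketch` is hypothetical and this is not the code from the patch. The point it illustrates is that every half-word value is held in the full-width type, so no partial product is truncated when unsigned long is only 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fallback: returns true if a * b does not fit in 64 bits. */
static bool uv_mul_overflow_sketch(uint64_t a, uint64_t b, uint64_t *result)
{
    const uint64_t mask = 0xFFFFFFFFu;
    /* Keep the halves in uint64_t, not unsigned long: each partial
     * product below needs up to 64 bits. */
    uint64_t ah = a >> 32, al = a & mask;
    uint64_t bh = b >> 32, bl = b & mask;

    /* ah * bh contributes at 2^64 and above. */
    if (ah != 0 && bh != 0)
        return true;

    /* At most one of these terms is nonzero, so the sum cannot wrap. */
    uint64_t cross = ah * bl + al * bh;
    if (cross > mask)            /* cross << 32 would exceed 64 bits */
        return true;

    uint64_t low = al * bl;
    uint64_t sum = (cross << 32) + low;
    if (sum < low)               /* carry out of the final addition */
        return true;

    *result = sum;
    return false;
}
```

If the halves were declared as a 32-bit unsigned long instead, `al * bl` would silently lose its upper 32 bits, which matches the symptom described above.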

t-a-k · 2025-07-31T08:40:50Z

> This breaks Win32, which doesn't enable the __builtin_add_... etc builtins even for gcc.
>
> I suspect it's due to long being 32 bits even on 64-bit Win32, but I haven't tried to debug it.

Thank you for your comment. My patch has fallback code (similar to the code used before this change) for compilers without overflow-checking builtins, but that fallback had a bug. I was able to reproduce similar symptoms with ./Configure ... -Duse64bitint -Ud_builtin_mul_overflow on 32-bit x86 Linux. I've pushed a commit to fix this.

(intended to be squashed before merge)


tonycoz · 2025-08-13T05:12:31Z

inline.h

+#  ifndef IV_MUL_OVERFLOW_IS_EXPENSIVE
+/* Strict overflow check for IV multiplication is generally expensive
+ * when IV is a multi-word integer.  */
+#    define IV_MUL_OVERFLOW_IS_EXPENSIVE (IVSIZE > LONGSIZE)


I don't think this is a reasonable test - if we enable the builtins for GCC on Win32 x86-64 this will be false even though it is a 64-bit platform.

Testing against PTRSIZE is probably better since that better matches the platform word size.

Thank you for pointing this out. I've pushed a commit that changes the test to compare against PTRSIZE instead.
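To make the difference concrete, the two candidate tests can be modelled as plain functions over type sizes in bytes. This is a sketch only: the function names are hypothetical, and the sizes in the usage note are the usual values for each data model rather than anything read from config.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Original test: IVSIZE > LONGSIZE */
static bool expensive_by_long(size_t ivsize, size_t longsize)
{
    return ivsize > longsize;
}

/* Suggested test: IVSIZE > PTRSIZE */
static bool expensive_by_ptr(size_t ivsize, size_t ptrsize)
{
    return ivsize > ptrsize;
}
```

With a 64-bit IV, LP64 (long 8, pointer 8) gives false for both tests, and ILP32 with -Duse64bitint (long 4, pointer 4) gives true for both. Only LLP64, as on Win32 x86-64 (long 4, pointer 8), separates them: the long-based test reports "expensive" although the machine word is 64 bits, while the pointer-based test does not.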

tonycoz · 2025-08-13T05:37:46Z

inline.h

+
+/* Define IV_*_OVERFLOW_IS_EXPENSIVE below to nonzero value
+ * if strict overflow checks are too expensive
+ * (for example, for CPUs that has no hardware overflow detection flag).


RISC-V doesn't have a hardware overflow flag (or any classic carry, zero, etc. flags), so the compiler generates more complex code:

    bool my_chk_add(long a, long b, long *result) {
        return __builtin_add_overflow(a, b, result);
    }

    ; RISC-V
    my_chk_add(long, long, long*):
            add     a5,a0,a1
            slt     a0,a5,a0
            slti    a1,a1,0
            sub     a0,a1,a0
            sd      a5,0(a2)
            snez    a0,a0
            ret

    ; amd64
    "my_chk_add(long, long, long*)":
            add     rdi, rsi
            mov     QWORD PTR [rdx], rdi
            seto    al
            ret

    ; arm64
    my_chk_add(long, long, long*):
            adds    x1, x0, x1
            str     x1, [x2]
            cset    w0, vs
            ret

(no action needed here)
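For reference, the RISC-V sequence above corresponds to a comparison-based check that can be written in C roughly as follows. This is a sketch with a hypothetical function name. It assumes that converting an out-of-range unsigned long to long wraps, which is implementation-defined in C but is what GCC and Clang do.

```c
#include <stdbool.h>

/* Hypothetical flag-free overflow check for signed addition. */
static bool add_overflow_by_compare(long a, long b, long *result)
{
    /* Add in unsigned arithmetic so the wraparound is well defined. */
    long sum = (long)((unsigned long)a + (unsigned long)b);
    *result = sum;
    /* Without overflow, the sum is below a exactly when b is negative.
     * Overflow happens when the two conditions disagree
     * (slt, slti, sub and snez in the RISC-V listing). */
    return (b < 0) != (sum < a);
}
```

On amd64 and arm64 the same information comes for free from the overflow flag, which is why their listings are shorter.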

tonycoz · 2025-08-13T06:05:07Z

inline.h

+#  endif
+
+#  if defined(I_STDCKDINT) && !IV_ADD_SUB_OVERFLOW_IS_EXPENSIVE
+/* XXX Preparation for upcoming C23, but I_STDCKDINT is not yet tested */


Modern clang has stdckdint.h

I know, but it would require patches to Configure and so on (and I currently have no test environment with modern Clang or GCC 14+), so I intend to make a separate PR to enable it.
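As a rough idea of what using the C23 header could look like, the sketch below probes for stdckdint.h with `__has_include` and otherwise falls back to the builtin or to a manual check. The helper name `checked_sub` is hypothetical, and the `__has_include` probe stands in for a Configure-provided I_STDCKDINT. It is not the mechanism used in this PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: detect the header at compile time instead of via Configure. */
#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif

/* Hypothetical helper: returns true if a - b does not fit in int64_t. */
static bool checked_sub(int64_t a, int64_t b, int64_t *result)
{
#if defined(ckd_sub)
    /* C23 form: the result pointer comes first. */
    return ckd_sub(result, a, b);
#elif defined(__GNUC__) || defined(__clang__)
    /* GCC/Clang builtin: the result pointer comes last. */
    return __builtin_sub_overflow(a, b, result);
#else
    /* Portable check performed before subtracting. */
    if ((b > 0 && a < INT64_MIN + b) || (b < 0 && a > INT64_MAX + b))
        return true;
    *result = a - b;
    return false;
#endif
}
```

The argument order differs between `ckd_sub(result, a, b)` and `__builtin_sub_overflow(a, b, result)`, which is easy to get wrong when wrapping both behind one macro.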


t-a-k added 2 commits August 6, 2025 02:13

inline.h: Comments fixed and added

1e6ea20

(intended to be squashed before merge)

inline.h: Add comment for the cast that might appear redundant at fir…

f318498

…st glance. (intended to be squashed before merge)

tonycoz reviewed Aug 13, 2025


Make default IV_MUL_OVERFLOW_IS_EXPENSIVE reasonable on LLP64 platforms

b2a723a

This will affect Win32 x86-64. Thanks to @tonycoz for figuring this out. (intended to be squashed before merge)
