Skip to content

[CIR][CIRGen][Builtin][X86] Lower mm_prefetch #1675

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 3, 2025

Conversation

RiverDave
Copy link
Collaborator

@RiverDave RiverDave commented Jun 10, 2025

Couple of things I have questions about:

  1. I duplicated function getIntValueFromConstOp from CIRGenBuiltinAArch64.cpp. I was wondering if that's correct or if there's a place where we can avoid that duplication.

  2. For the tests related to mm_prefetch im not sure if it'd be correct to define them in a file eg: sse-builtins.c like it's currently done in the codegen lib.

  3. I'm also aware we can emit a call for a PreFetchOp would that be required in this case?

related: #1414, #1404 (A PR was previously opened but It was not resolved)

@RiverDave RiverDave changed the title [CIR][CIRGen][Builtin][X86] Lower m_prefetch [CIR][CIRGen][Builtin][X86] Lower mm_prefetch Jun 10, 2025
@RiverDave
Copy link
Collaborator Author

RiverDave commented Jun 13, 2025

Okay, been experimenting with the tests for a while. in the test when including:

  • flag: -ffreestanding
  • header: x86intrin.h

Causes the intrinsic call operation:

cir.llvm.intrinsic "prefetch" {{%.*}} : (!cir.ptr<!void>, !s32i, !s32i, !s32i) -> !void

To transform in a proper prefetch operation:

    cir.prefetch(%2 : !cir.ptr<!void>) locality(0) read loc(#loc6)

I assume we prefer the latter. Is that right?

@RiverDave
Copy link
Collaborator Author

cc. @bcardosolopes

@bcardosolopes
Copy link
Member

@RiverDave sorry, somehow I missed this.

I assume we prefer the latter. Is that right?

cir.prefetch is preferred. The question is whether it leads to the same LLVM IR output without differences and if we need to change cir.prefetch to accommodate. Have you find any issues while trying to do so?

@RiverDave RiverDave force-pushed the dave/lower-prefetch branch from d250cc8 to 0347332 Compare July 2, 2025 23:47
@RiverDave
Copy link
Collaborator Author

@RiverDave sorry, somehow I missed this.

I assume we prefer the latter. Is that right?

cir.prefetch is preferred. The question is whether it leads to the same LLVM IR output without differences and if we need to change cir.prefetch to accommodate. Have you find any issues while trying to do so?

Two things:

  1. I've revised this PR and moved it's test case to its respective test file as in OG and included the headers that again, was present in OG.

  2. They seem to be equivalent indeed:

raw intrinsic call:

  cir.func dso_local @test_mm_prefetch(%arg0: !cir.ptr<!s8i> loc(fused[#loc3, #loc4])) extra(#fn_attr) {
    %0 = cir.alloca !cir.ptr<!s8i>, !cir.ptr<!cir.ptr<!s8i>>, ["p", init] {alignment = 8 : i64} loc(#loc10)
    cir.store %arg0, %0 : !cir.ptr<!s8i>, !cir.ptr<!cir.ptr<!s8i>> loc(#loc5)
    %1 = cir.load align(8) %0 : !cir.ptr<!cir.ptr<!s8i>>, !cir.ptr<!s8i> loc(#loc6)
    %2 = cir.const #cir.int<0> : !s32i loc(#loc7)
    %3 = cir.cast(bitcast, %1 : !cir.ptr<!s8i>), !cir.ptr<!void> loc(#loc6)
    %4 = cir.const #cir.int<0> : !s32i loc(#loc8)
    %5 = cir.const #cir.int<0> : !s32i loc(#loc8)
    %6 = cir.const #cir.int<1> : !s32i loc(#loc8)
    %7 = cir.llvm.intrinsic "prefetch" %3, %4, %5, %6 : (!cir.ptr<!void>, !s32i, !s32i, !s32i) -> !void loc(#loc8)
    cir.return loc(#loc2)
  } loc(#loc9)

PrefetchOp:

  cir.func dso_local @_Z16test_mm_prefetchPKc(%arg0: !cir.ptr<!u8i> loc(fused[#loc3, #loc4])) extra(#fn_attr) {
    %0 = cir.alloca !cir.ptr<!u8i>, !cir.ptr<!cir.ptr<!u8i>>, ["p", init] {alignment = 8 : i64} loc(#loc8)
    cir.store %arg0, %0 : !cir.ptr<!u8i>, !cir.ptr<!cir.ptr<!u8i>> loc(#loc5)
    %1 = cir.load align(8) %0 : !cir.ptr<!cir.ptr<!u8i>>, !cir.ptr<!u8i> loc(#loc6)
    %2 = cir.cast(bitcast, %1 : !cir.ptr<!u8i>), !cir.ptr<!void> loc(#loc6)
    cir.prefetch(%2 : !cir.ptr<!void>) locality(0) read loc(#loc6)
    cir.return loc(#loc2)
  } loc(#loc7)

I have one question though, Not just in this, but in other PR's, I've noted functions that have some trailing/leading characters as shown: _Z16test_mm_prefetchPKc what is the reason behind this? why is this not presented without the header I included?

@RiverDave RiverDave force-pushed the dave/lower-prefetch branch from 0347332 to f12216e Compare July 2, 2025 23:49
@RiverDave RiverDave force-pushed the dave/lower-prefetch branch from f12216e to 0be9b4f Compare July 2, 2025 23:49
@bcardosolopes
Copy link
Member

I've noted functions that have some trailing/leading characters as shown: _Z16test_mm_prefetchPKc what is the reason behind this?

It only happens in C++ - due to C++ mangling. If you cat the file to c++filt -n you gonna get the "original" names.

@bcardosolopes bcardosolopes merged commit c0fb3b0 into llvm:main Jul 3, 2025
8 of 9 checks passed
@RiverDave RiverDave deleted the dave/lower-prefetch branch July 6, 2025 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants