Use ATTACH maps for array-sections/subscripts on pointers. #1

abhinavgaba · 2025-07-16T14:02:39Z

This is the initial clang change to support using ATTACH map-type for pointer-attachment.

This builds upon the following:

[Offload] Introduce ATTACH map-type support for pointer attachment. llvm/llvm-project#149036
[Clang][OpenMP] Capture mapped pointers on target by reference. llvm/llvm-project#145454

For example, for the following:

  int *p;
  #pragma omp target enter data map(p[1:10])

The following maps are now emitted by clang:

  (A)
  &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
  &p, &p[1], sizeof(p), ATTACH

Previously, the two possible maps emitted by clang were:

  (B)
  &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM

  (C)
  &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ

(B) does not perform any pointer attachment, while (C) also maps the
pointer p, both of which are incorrect.

With this change, we are using ATTACH-style maps, like (A), for cases where the expression has a base-pointer. For example:

  int *p, **pp;
  S *ps, **pps;
  ... map(p[0])
  ... map(p[10:20])
  ... map(*p)
  ... map(([20])p)
  ... map(ps->a)
  ... map(pps->p->a)
  ... map(pp[0][0])
  ... map(*(pp + 10)[0])

We also group mapping of clauses with the same base decl in the order of the increasing complexity of their base-pointers, e.g. for something like:

  S **spp;
  map(spp[0][0], spp[0][0].a), // attach-ptr: spp[0]
  map(spp[0]),                // attach-ptr: spp
  map(spp),                   // attach-ptr: N/A

We first map spp, then spp[0] then spp[0][0] and spp[0][0].a.

This allows us to also group "struct" allocation based on their attach pointers.

Cases that need handling:

When a class member like p is a base-pointer in a map from a member function within the same class, p is not being privatized, instead, we still try to create an implicit map of this[0:1], and access p through that, which is incorrect.

 struct S { int *p;
 void f1() {
   #pragma omp target data map(p[0:1])
      printf("%p %p\n", &p, p);
 }

Attach-style maps for declare mappers. That should be a separate PR.
use_device_addr clause does not work properly, because we don't have a proper component-list set-up for it, just one component, so we cannot find the proper attach-ptr. For use_device_addr, we should match existing maps whose attach-ptr matches the attach-ptr of the use_device_addr operand.
use_device_ptr handling has some issues too. Need debugging.
Other issues that haven't been found yet.

Some tests still haven't been updated. These include:

  Clang :: OpenMP/copy-gaps-1.cpp
  Clang :: OpenMP/copy-gaps-6.cpp
  Clang :: OpenMP/map_struct_ordering.cpp
  Clang :: OpenMP/target_data_use_device_addr_codegen.cpp
  Clang :: OpenMP/target_data_use_device_ptr_codegen.cpp
  Clang :: OpenMP/target_enter_data_codegen.cpp
  Clang :: OpenMP/target_enter_data_depend_codegen.cpp
  Clang :: OpenMP/target_exit_data_codegen.cpp
  Clang :: OpenMP/target_exit_data_depend_codegen.cpp
  Clang :: OpenMP/target_map_codegen_18c.cpp
  Clang :: OpenMP/target_map_codegen_18d.cpp
  Clang :: OpenMP/target_map_codegen_28.cpp
  Clang :: OpenMP/target_map_codegen_29.cpp
  Clang :: OpenMP/target_map_codegen_31.cpp
  Clang :: OpenMP/target_map_codegen_hold.cpp
  Clang :: OpenMP/target_map_deref_array_codegen.cpp
  Clang :: OpenMP/target_map_member_expr_codegen.cpp
  Clang :: OpenMP/target_update_codegen.cpp
  Clang :: OpenMP/target_update_depend_codegen.cpp

abhinavgaba · 2025-07-16T14:03:51Z

offload/libomptarget/interface.cpp

The libomptarget code will disappear from this PR once llvm#149036 is merged.

abhinavgaba · 2025-07-23T13:31:19Z

clang/lib/CodeGen/CGOpenMPRuntime.cpp

      const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
      ArrayRef<OMPClauseMappableExprCommon::MappableExprComponentListRef>
-          OverlappedElements = {},
-      bool AreBothBasePtrAndPteeMapped = false) const {


AreBothBaseptrAndPteeMapped was used to decide to use PTR_AND_OBJ maps for something like map(p, p[0]). We don't do that now, since we map them independently, and attach them separately.

…lvm#161666) This flag is really convinient in most cases. It's easy to figure out what value to pass for most cases. However, it can sometimes match too many times, like for template functions that has non-decuded (aka. explicitly specified) template parameters - because they don't appear in the parameter list, thus they are not accounted for in the current logic. It would be nice to improve `getFunctionName` but I'd say to just settle on using USRs. So this PR enables passing USRs to the flag, while keeping previous behavior.

Also remove the gratuitous spaces after "," that break strict CSV compliance. As the debug function name does not uniquely identify an entry point, add the main-TU name and the USR values for each entry point snapshot to reduce the likelihood of collisions between declarations across large projects. While adding a filename to each row increases the file size substantially, the difference in size for the compressed is acceptable. I evaluated it on our set of 200+ open source C and C++ projects with 3M entry points, and got the following results when adding these two columns: - Raw CSV file increased from 530MB to 1.1GB - Compressed file (XZ) increased from 54 MB to 78 MB -- CPP-7098 --------- Co-authored-by: Balazs Benics <[email protected]>

…checkers (llvm#161664) Previously, when using `-analyze-function` to target a specific function, the analyzer would incorrectly report "Every top-level function was skipped" even when the function was successfully analyzed by syntax-only checkers. This happened because `NumFunctionsAnalyzed` only counted path-sensitive analysis, not syntax-only analysis. The misuse detection logic would see 0 functions analyzed and incorrectly conclude the function wasn't found.

revert this patch due to failure in unittests/Support, e.g. https://lab.llvm.org/buildbot/#/builders/33/builds/24178/steps/6/logs/FAIL__LLVM-Unit__SupportTests_61 This reverts commit ffc503e.

…atch legalized types (llvm#161802) When running with `-debug`, print a note when the replacement types (during a `ConversionPatternRewriter::replaceOp`) do not match the legalized types of the current type converter. That's not an API violation, but it could indicate a bug in user code. Example output: ``` [dialect-conversion:1] ** Replace : 'test.multiple_1_to_n_replacement'(0x56b745f99470) [dialect-conversion:1] Note: Replacing op result of type f16 with value(s) of type (f16, f16), but the legalized type(s) is/are (f16) ```

… msvc (llvm#161811) Clang 20 (and early 21 versions; let's hope it can be fixed before the later versions before such versions become relevant for libcxx CI) have got an issue with its intrinsics headers, where they use unreserved names, that users are allowed to override. See llvm#161808 for the issue report. This only crops up in the MSVC build configurations, as recent versions of some MSVC/UCRT headers include `<intrin.h>`, which ends up pulling in most intrinsics headers, exposing this issue in the Clang headers. This should unblock llvm#161736 from being merged.

…m#161830)

The i16/i32 shuffle intrinsics (`pshufw`, `pshuflw`, `pshufhw`, `pshufd`) currently cannot be used in constant expressions. This patch adds support in both bytecode interpreter (InterpBuiltin.cpp) and constant evaluator (ExprConstant.cpp) for pshuf intrinsics, enabling their use in constant expressions. ## Intrinsics covered - `_mm_shuffle_pi16` (MMX `pshufw`) - `_mm_shufflelo_epi16` / `_mm_shufflehi_epi16` - `_mm_shuffle_epi32` - Their AVX2/AVX512 vector-width variants - Masked and maskz forms (handled indirectly via `__builtin_ia32_select*`) Fixes llvm#156611

Replace mul and mul_u ops with a neg operation if their second operand is a splat value -1. Apply the optimization also for mul_u ops if their first operand is a splat value -1 due to their commutativity.

``` /home/david.spickett/llvm-project/llvm/lib/Analysis/HashRecognize.cpp:100:54: warning: comparison of integers of different signs: 'typename iterator_traits<ilist_iterator_w_bits<node_options<Instruction, true, false, void, true, BasicBlock>, false, false>>::difference_type' (aka 'int') and 'size_type' (aka 'unsigned int') [-Wsign-compare] 100 | return std::distance(Latch->begin(), Latch->end()) != Visited.size(); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ ``` By using Latch->size() instead.

1. Fixed Android setjmp issue. The root cause is that TSan initializes before longjmp_xor_key is set up. During __libc_init_vdso, a call to strcmp triggers TSan initialization, which occurs before __libc_init_setjmp_cookie. The solution is to call InitializeLongjmpXorKey on the first use of longjmp_xor_key. Additionally, correct LONG_JMP_SP_ENV_SLOT by following the bionic source code. 2. Skip thr object range check on Android. On Android, thr is allocated on the heap, causing the check to fail. 3. Disable intercepting clone on Android. pthread_create internally calls clone. Disabling the interception of clone resolves the issue in most scenarios. 4. Use a workaround to recover the thr pointer stored in TLS_SLOT_SANITIZER slot, whose value was modified by Skia. This PR solved the issue from NDK android/ndk#1041. Test project: https://github.com/bytedance/android_tsan_sample/

…pr (llvm#162114) This was missed in the earlier f16c/fp16 constexpr patches

…lvm#162042) We don't have FPR isel patterns for G_LOAD/STORE so force to the GPR register bank.

Fix the build configuration that has OnDiskCAS disabled.

…compilations. (llvm#161486) The driver-generated -cc1 command-lines for C++ named module inputs introduce some command-line options which affect the canonical module build command (and therefore the context hash). This resets those options.

…ind (llvm#161230) Inspired by this comment llvm#157930 (comment) (and long-standing issues related to finding nanobind/pybind in the right place), this PR moves to using `FetchContent_Declare` to get the nanobind dependency. This is pretty standard (see e.g., [IREE](https://github.com/iree-org/iree/blob/cf60359b7443b0e92e15fb6ffc011525dc40e772/CMakeLists.txt#L842-L848)). This PR also removes pybind which has been deprecated for almost a year (llvm#117922) and which isn't compatible (for whatever reason) with `FetchContent_Declare`. --------- Co-authored-by: Jacques Pienaar <[email protected]>

…untime_vers (llvm#162062) `DW_AT_APPLE_major_runtime_version` doesn't exist. This should be `DW_AT_APPLE_major_runtime_vers`.

…vm#154656) As discussed before, this PR adds the basic infrastructure/boiler plate for loop ordering strategies to be implemented. If this looks ok, I wanted to also mention some of the heuristics that I would implement next, if they sound reasonable to you guys: - Parallel first : prioritize parallel loops over reduction loops - Dense outer : prioritize the most dense loops first - Sparse outer : the opposite, potentially useful in some cases? There is another that I am considering, stride/memory aware, which would prioritize loops with better stride patterns (like sequential or linear). Not sure how well this carries over to Sparse Tensor though. Are there any ideas/heuristics that I should definitely try to implement? As we discussed, I will try to incrementally add heuristics. Sorry for the delay on my end, and thank you so much for the feedback! --------- Co-authored-by: Aart Bik <[email protected]>

…lvm#153337) Fixes clangd/clangd#2290

…62119) I don't think we need them after AST is destroyed. If I am wrong we should see crashes immediately, as pointers de-referenced unchecked.

This simplifies and cleans up the vector list initialization behavior. This simplifies the work we do in SemaInit by just relying on SemaHLSL's initialization list flattening. This change fixes some outstanding limitations by supporting structure to vector initialization, but re-introduces HLSL's limitations around overload resolution in initializers. --------- Co-authored-by: Helena Kotas <[email protected]>

…59667) The frontend currently opens the path provided via `-fprofile-instrument-use-path=` to learn the kind of the instrumentation data and set the `CodeGenOptions::ProfileUse` value. This happens during command-line parsing, where we don't have a correctly configured VFS yet, so the behavior is quite different from all other frontend inputs. We need to move this logic out of the frontend command line parsing logic somewhere where we do have the configured VFS. The complication is that the `ProfileUse` flag is being used to set preprocessor macros, and there isn't a great place between command line parsing and preprocessor initialization to perform this logic. This PR solves the issue by deducing the kind of instrumentation data right in the driver and then passing it via a new flag to the frontend. This shouldn't change observable behavior of Clang on the driver level, and only affects the frontend command line interface, which is an implementation detail anyway.

Summary: This check is unnecessarily restrictive and currently incorrectly fires for any size less than eight bytes. Just remove it, we do sanity checks elsewhere and at some point need to trust the ABI.

… Android (llvm#161596) In Bionic, pthread_detach calls pthread_join, so the thread has already been consumed by the pthread_detach interceptor. https://android.googlesource.com/platform/bionic/+/refs/heads/android16-release/libc/bionic/pthread_detach.cpp

AMDGPU documentation states op_sel[3] can be used in v_interp_p2_f16 in GFX9.

llvm#162100) … type. Our modeling of cast expressions now results in the potential presence of fields from other types derived from the SourceLocation's type. Also add some additional debug logging and new tests, and update an existing test to reflect real usage of synthetic field callbacks.

…pport/math folder. (llvm#161992) Part of llvm#147386 in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450

… NFC (llvm#162122)

…oleans (llvm#161293) Logical booleans in LLVM are represented by select statements - e.g. the statement ``` A && B ``` is represented as ``` select i1 %A, i1 %B, i1 false ``` When LLVM folds two of the same logical booleans into a logical boolean and a bitwise boolean (e.g. `A && B && C` -> `A && (B & C)`), the first logical boolean is a select statement that retains the original condition from the first logical boolean of the original statement. This means that the new select statement has the branch weights as the original select statement. Tracking issue: llvm#147390

Originally, DXContainer `isValid...` functions were defined din the header. That cause a problem, that made this specific file huge, since every function also includes a `def` file. To fix this problem, this PR moves the function to the cpp equivalent file. Closes: [158162](llvm#158162)

Implement support for `subcommands` in OptTable to attain feature parity with `cl`. Design overview: https://discourse.llvm.org/t/subcommand-feature-support-in-llvm-opttable/88098 Issue: llvm#108307

…61747) Handle branch weights for `indirectbr` simplification: if we drop branches that aren't taken, we just need to remove the corresponding branch weight (which is presumably 0). Issue llvm#147390

This allows to define multiple interface methods with the same name but different arguments.

…rom pointer to reference. (llvm#161785) Also adds an instance of `AttachPtrExprComparator` to the `MappableExprHandler` class, so that it can be reused for multiple comparisons. This was extracted out of llvm#153683 to make that PR more focused on the functional changes.

…et.h. (llvm#161791) The clang changes that use this map-type are in llvm#153683.

…62141)

…on-using-attach-maptype

abhinavgaba commented Jul 16, 2025

View reviewed changes

offload/libomptarget/interface.cpp

Copy link

Owner Author

abhinavgaba Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The libomptarget code will disappear from this PR once llvm#149036 is merged.

abhinavgaba mentioned this pull request Jul 17, 2025

[Offload] Introduce ATTACH map-type support for pointer attachment. llvm/llvm-project#149036

Merged

abhinavgaba changed the title ~~[WIP] Use ATTACH maps for array-sections/subscripts on pointers.~~ Use ATTACH maps for array-sections/subscripts on pointers. Jul 22, 2025

This was referenced Jul 22, 2025

[OpenMP] Mapping of 'middle' structures chained through '->' does not work llvm/llvm-project#141042

Open

[Clang][OpenMP] Capture mapped pointers on target by reference. llvm/llvm-project#145454

Open

abhinavgaba commented Jul 23, 2025

View reviewed changes

steakhal and others added 10 commits October 3, 2025 16:24

Revert "[LLVM] Add GNU make jobserver support (llvm#145131)"

f8f6c0b

revert this patch due to failure in unittests/Support, e.g. https://lab.llvm.org/buildbot/#/builders/33/builds/24178/steps/6/logs/FAIL__LLVM-Unit__SupportTests_61 This reverts commit ffc503e.

[lldb][Lanugage][NFC] Adapt Language::ForEach to IterationAction (llv…

3d81008

…m#161830)

Fold SVE mul and mul_u to neg during isel (llvm#160828)

f4a39a8

Replace mul and mul_u ops with a neg operation if their second operand is a splat value -1. Apply the optimization also for mul_u ops if their first operand is a splat value -1 due to their commutativity.

airpfei and others added 29 commits October 6, 2025 12:51

[X86] avx512fp16intrin.h - allow _mm512_cvtsh_h to be used in constex…

ec0db66

…pr (llvm#162114) This was missed in the earlier f16c/fp16 constexpr patches

[RISCV][GISel] Force atomic G_LOAD/STORE to the GPR register bank. (l…

141964b

…lvm#162042) We don't have FPR isel patterns for G_LOAD/STORE so force to the GPR register bank.

Re-land llvm#161264: [CAS] Add OnDiskDataAllocator (llvm#162112)

2aff3c6

Fix the build configuration that has OnDiskCAS disabled.

[lldb][docs] DW_AT_APPLE_major_runtime_version -> DW_AT_APPLE_major_r…

8889377

…untime_vers (llvm#162062) `DW_AT_APPLE_major_runtime_version` doesn't exist. This should be `DW_AT_APPLE_major_runtime_vers`.

[clangd] Retrieve documentation for class member instance from index (l…

255db37

…lvm#153337) Fixes clangd/clangd#2290

[NFC][AST] Clean up XRayFilter and NoSanitizeL before backend (llvm#1…

68a7196

…62119) I don't think we need them after AST is destroyed. If I am wrong we should see crashes immediately, as pointers de-referenced unchecked.

[Offload] Remove check on kernel argument sizes (llvm#162121)

8763812

Summary: This check is unnecessarily restrictive and currently incorrectly fires for any size less than eight bytes. Just remove it, we do sanity checks elsewhere and at some point need to trust the ABI.

[AMDGPU][MC] GFX9 - allow op_sel in v_interp_p2_f16 (llvm#150712)

6419046

AMDGPU documentation states op_sel[3] can be used in v_interp_p2_f16 in GFX9.

[gn build] Port 2aff3c6

bf8c2f2

[RISCV] Add "MIPS" to name of MIPS-specific RISCVSubtarget functions.…

bf0a6ae

… NFC (llvm#162122)

[llvm] Add subcommand support for OptTable (llvm#155026)

fdbd17d

Implement support for `subcommands` in OptTable to attain feature parity with `cl`. Design overview: https://discourse.llvm.org/t/subcommand-feature-support-in-llvm-opttable/88098 Issue: llvm#108307

[SimplifyCFG][profcheck] Profile propagation for indirectbr (llvm#1…

a7b3826

…61747) Handle branch weights for `indirectbr` simplification: if we drop branches that aren't taken, we just need to remove the corresponding branch weight (which is presumably 0). Issue llvm#147390

[profcheck] Update exclusion list to reflect fixes (llvm#161943)

199811d

[MLIR][ODS] Add support for overloading interface methods (llvm#161828)

842622b

This allows to define multiple interface methods with the same name but different arguments.

[OpenMP] Add ATTACH map-type to OMPConstants.h, to align with omptarg…

fc7cd65

…et.h. (llvm#161791) The clang changes that use this map-type are in llvm#153683.

[profcheck] Re-add 2 tests prematurely removed in llvm#161943 (llvm#1…

91a3cd7

…62141)

Merge remote-tracking branch 'upstream/main' into map-ptr-array-secti…

2f7b421

…on-using-attach-maptype

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use ATTACH maps for array-sections/subscripts on pointers. #1

Use ATTACH maps for array-sections/subscripts on pointers. #1

Uh oh!

abhinavgaba commented Jul 16, 2025 •

edited

Loading

Uh oh!

abhinavgaba Jul 16, 2025

Uh oh!

abhinavgaba Jul 23, 2025

Uh oh!

Uh oh!

Use ATTACH maps for array-sections/subscripts on pointers. #1

Are you sure you want to change the base?

Use ATTACH maps for array-sections/subscripts on pointers. #1

Uh oh!

Conversation

abhinavgaba commented Jul 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abhinavgaba Jul 16, 2025

Choose a reason for hiding this comment

Uh oh!

abhinavgaba Jul 23, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

abhinavgaba commented Jul 16, 2025 •

edited

Loading