Skip to content

[Flang][OpenMP] Remove use of non reference values from MapInfoOp #72444

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 24, 2023

Conversation

TIFitis
Copy link
Member

@TIFitis TIFitis commented Nov 15, 2023

This patch removes the val field from the MapInfoOp.

Previously when lowering TargetOp, the bounds information for the BoxValues were also being mapped. Instead these ops are now cloned inside the target region to prevent mapping of non reference typed values.

@llvmbot
Copy link
Member

llvmbot commented Nov 15, 2023

@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-flang-openmp
@llvm/pr-subscribers-flang-fir-hlfir
@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir

Author: Akash Banerjee (TIFitis)

Changes

This patch removes the val field from the MapInfoOp.

Previously when lowering TargetOp, the bounds information for the BoxValues were also being mapped. Instead these ops are now cloned inside the target region to prevent mapping of non reference typed values.


Patch is 43.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72444.diff

10 Files Affected:

  • (modified) flang/lib/Lower/OpenMP.cpp (+74-87)
  • (modified) flang/test/Lower/OpenMP/FIR/array-bounds.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/FIR/target.f90 (+56-18)
  • (modified) flang/test/Lower/OpenMP/array-bounds.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/target.f90 (+43-28)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+6-9)
  • (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (-9)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+12-29)
  • (modified) mlir/test/Dialect/OpenMP/invalid.mlir (-36)
  • (modified) openmp/libomptarget/test/offloading/fortran/constant-arr-index.f90 (+3-3)
diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index 00f16026d9b4bb8..30d2be33357c8c0 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -28,6 +28,7 @@
 #include "flang/Semantics/tools.h"
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Dialect/SCF/IR/SCF.h"
+#include "mlir/Transforms/RegionUtils.h"
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
 #include "llvm/Support/CommandLine.h"
 
@@ -1709,9 +1710,9 @@ static mlir::omp::MapInfoOp
 createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
                 mlir::Value baseAddr, std::stringstream &name,
                 mlir::SmallVector<mlir::Value> bounds, uint64_t mapType,
-                mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy,
-                bool isVal = false) {
-  mlir::Value val, varPtr, varPtrPtr;
+                mlir::omp::VariableCaptureKind mapCaptureType,
+                mlir::Type retTy) {
+  mlir::Value varPtr, varPtrPtr;
   mlir::TypeAttr varType;
 
   if (auto boxTy = baseAddr.getType().dyn_cast<fir::BaseBoxType>()) {
@@ -1719,16 +1720,12 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
     retTy = baseAddr.getType();
   }
 
-  if (isVal)
-    val = baseAddr;
-  else
-    varPtr = baseAddr;
-
-  if (auto ptrType = llvm::dyn_cast<mlir::omp::PointerLikeType>(retTy))
-    varType = mlir::TypeAttr::get(ptrType.getElementType());
+  varPtr = baseAddr;
+  varType = mlir::TypeAttr::get(
+      llvm::cast<mlir::omp::PointerLikeType>(retTy).getElementType());
 
   mlir::omp::MapInfoOp op = builder.create<mlir::omp::MapInfoOp>(
-      loc, retTy, val, varPtr, varType, varPtrPtr, bounds,
+      loc, retTy, varPtr, varType, varPtrPtr, bounds,
       builder.getIntegerAttr(builder.getIntegerType(64, false), mapType),
       builder.getAttr<mlir::omp::VariableCaptureKindAttr>(mapCaptureType),
       builder.getStringAttr(name.str()));
@@ -2489,21 +2486,20 @@ static void genBodyOfTargetOp(
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
   mlir::Region &region = targetOp.getRegion();
 
-  firOpBuilder.createBlock(&region, {}, mapSymTypes, mapSymLocs);
+  auto *regionBlock =
+      firOpBuilder.createBlock(&region, {}, mapSymTypes, mapSymLocs);
 
   unsigned argIndex = 0;
-  unsigned blockArgsIndex = mapSymbols.size();
-
-  // The block arguments contain the map_operands followed by the bounds in
-  // order. This returns a vector containing the next 'n' block arguments for
-  // the bounds.
-  auto extractBoundArgs = [&](auto n) {
-    llvm::SmallVector<mlir::Value> argExtents;
-    while (n--) {
-      argExtents.push_back(fir::getBase(region.getArgument(blockArgsIndex)));
-      blockArgsIndex++;
+
+  // Clones the `bounds` and returns them.
+  auto cloneBounds = [&](llvm::ArrayRef<mlir::Value> bounds) {
+    llvm::SmallVector<mlir::Value> clonedBounds;
+    for (mlir::Value bound : bounds) {
+      mlir::Operation *clonedOp = bound.getDefiningOp()->clone();
+      regionBlock->push_back(clonedOp);
+      clonedBounds.emplace_back(clonedOp->getResult(0));
     }
-    return argExtents;
+    return clonedBounds;
   };
 
   // Bind the symbols to their corresponding block arguments.
@@ -2512,34 +2508,35 @@ static void genBodyOfTargetOp(
     fir::ExtendedValue extVal = converter.getSymbolExtendedValue(*sym);
     extVal.match(
         [&](const fir::BoxValue &v) {
-          converter.bindSymbol(
-              *sym,
-              fir::BoxValue(arg, extractBoundArgs(v.getLBounds().size()),
-                            v.getExplicitParameters(), v.getExplicitExtents()));
+          converter.bindSymbol(*sym,
+                               fir::BoxValue(arg, cloneBounds(v.getLBounds()),
+                                             v.getExplicitParameters(),
+                                             v.getExplicitExtents()));
         },
         [&](const fir::MutableBoxValue &v) {
           converter.bindSymbol(
-              *sym,
-              fir::MutableBoxValue(arg, extractBoundArgs(v.getLBounds().size()),
-                                   v.getMutableProperties()));
+              *sym, fir::MutableBoxValue(arg, cloneBounds(v.getLBounds()),
+                                         v.getMutableProperties()));
         },
         [&](const fir::ArrayBoxValue &v) {
           converter.bindSymbol(
-              *sym,
-              fir::ArrayBoxValue(arg, extractBoundArgs(v.getExtents().size()),
-                                 extractBoundArgs(v.getLBounds().size()),
-                                 v.getSourceBox()));
+              *sym, fir::ArrayBoxValue(arg, cloneBounds(v.getExtents()),
+                                       cloneBounds(v.getLBounds()),
+                                       v.getSourceBox()));
         },
         [&](const fir::CharArrayBoxValue &v) {
+          mlir::Operation *clonedLen = v.getLen().getDefiningOp()->clone();
+          regionBlock->push_back(clonedLen);
           converter.bindSymbol(
-              *sym,
-              fir::CharArrayBoxValue(arg, extractBoundArgs(1).front(),
-                                     extractBoundArgs(v.getExtents().size()),
-                                     extractBoundArgs(v.getLBounds().size())));
+              *sym, fir::CharArrayBoxValue(arg, clonedLen->getResult(0),
+                                           cloneBounds(v.getExtents()),
+                                           cloneBounds(v.getLBounds())));
         },
         [&](const fir::CharBoxValue &v) {
-          converter.bindSymbol(
-              *sym, fir::CharBoxValue(arg, extractBoundArgs(1).front()));
+          mlir::Operation *clonedLen = v.getLen().getDefiningOp()->clone();
+          regionBlock->push_back(clonedLen);
+          converter.bindSymbol(*sym,
+                               fir::CharBoxValue(arg, clonedLen->getResult(0)));
         },
         [&](const fir::UnboxedValue &v) { converter.bindSymbol(*sym, arg); },
         [&](const auto &) {
@@ -2549,6 +2546,43 @@ static void genBodyOfTargetOp(
     argIndex++;
   }
 
+  // Check if cloning the bounds introduced any dependency on the outer region.
+  // If so, then either clone them as well if trivial, or else add them to the
+  // map and block_argument lists.
+  llvm::SetVector<mlir::Value> valuesDefinedAbove;
+  mlir::getUsedValuesDefinedAbove(region, valuesDefinedAbove);
+  while (!valuesDefinedAbove.empty()) {
+    for (mlir::Value val : valuesDefinedAbove) {
+      if (fir::isa_trivial(val.getType())) {
+        mlir::Operation *clonedVal = val.getDefiningOp()->clone();
+        regionBlock->push_front(clonedVal);
+        val.replaceUsesWithIf(
+            clonedVal->getResult(0), [regionBlock](mlir::OpOperand &use) {
+              return use.getOwner()->getBlock() == regionBlock;
+            });
+      } else {
+        llvm::SmallVector<mlir::Value> bounds;
+        std::stringstream name;
+        auto savedIP = firOpBuilder.getInsertionPoint();
+        firOpBuilder.setInsertionPoint(targetOp);
+        mlir::Value mapOp = createMapInfoOp(
+            firOpBuilder, val.getLoc(), val, name, bounds,
+            static_cast<
+                std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
+                llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT),
+            mlir::omp::VariableCaptureKind::ByCopy, val.getType());
+        targetOp.getMapOperandsMutable().append(mapOp);
+        mlir::Value clonedVal = region.addArgument(val.getType(), val.getLoc());
+        val.replaceUsesWithIf(clonedVal, [regionBlock](mlir::OpOperand &use) {
+          return use.getOwner()->getBlock() == regionBlock;
+        });
+        firOpBuilder.setInsertionPoint(regionBlock, savedIP);
+      }
+    }
+    valuesDefinedAbove.clear();
+    mlir::getUsedValuesDefinedAbove(region, valuesDefinedAbove);
+  }
+
   // Insert dummy instruction to remember the insertion position. The
   // marker will be deleted since there are not uses.
   // In the HLFIR flow there are hlfir.declares inserted above while
@@ -2671,53 +2705,6 @@ genTargetOp(Fortran::lower::AbstractConverter &converter,
   };
   Fortran::lower::pft::visitAllSymbols(eval, captureImplicitMap);
 
-  // Add the bounds and extents for box values to mapOperands
-  auto addMapInfoForBounds = [&](const auto &bounds) {
-    for (auto &val : bounds) {
-      mapSymLocs.push_back(val.getLoc());
-      mapSymTypes.push_back(val.getType());
-
-      llvm::SmallVector<mlir::Value> bounds;
-      std::stringstream name;
-
-      mlir::Value mapOp = createMapInfoOp(
-          converter.getFirOpBuilder(), val.getLoc(), val, name, bounds,
-          static_cast<
-              std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
-              llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT),
-          mlir::omp::VariableCaptureKind::ByCopy, val.getType(), true);
-      mapOperands.push_back(mapOp);
-    }
-  };
-
-  for (const Fortran::semantics::Symbol *sym : mapSymbols) {
-    fir::ExtendedValue extVal = converter.getSymbolExtendedValue(*sym);
-    extVal.match(
-        [&](const fir::BoxValue &v) { addMapInfoForBounds(v.getLBounds()); },
-        [&](const fir::MutableBoxValue &v) {
-          addMapInfoForBounds(v.getLBounds());
-        },
-        [&](const fir::ArrayBoxValue &v) {
-          addMapInfoForBounds(v.getExtents());
-          addMapInfoForBounds(v.getLBounds());
-        },
-        [&](const fir::CharArrayBoxValue &v) {
-          llvm::SmallVector<mlir::Value> len;
-          len.push_back(v.getLen());
-          addMapInfoForBounds(len);
-          addMapInfoForBounds(v.getExtents());
-          addMapInfoForBounds(v.getLBounds());
-        },
-        [&](const fir::CharBoxValue &v) {
-          llvm::SmallVector<mlir::Value> len;
-          len.push_back(v.getLen());
-          addMapInfoForBounds(len);
-        },
-        [&](const auto &) {
-          // Nothing to do for non-box values.
-        });
-  }
-
   auto targetOp = converter.getFirOpBuilder().create<mlir::omp::TargetOp>(
       currentLocation, ifClauseOperand, deviceOperand, threadLimitOperand,
       nowaitAttr, mapOperands);
diff --git a/flang/test/Lower/OpenMP/FIR/array-bounds.f90 b/flang/test/Lower/OpenMP/FIR/array-bounds.f90
index 02b5ebcee022675..8acc9738821464f 100644
--- a/flang/test/Lower/OpenMP/FIR/array-bounds.f90
+++ b/flang/test/Lower/OpenMP/FIR/array-bounds.f90
@@ -16,7 +16,7 @@
 !ALL:  %[[BOUNDS1:.*]] = omp.bounds   lower_bound(%[[C5]] : index) upper_bound(%[[C6]] : index) stride(%[[C4]] : index) start_idx(%[[C4]] : index)
 !ALL:  %[[MAP1:.*]] = omp.map_info var_ptr(%[[WRITE]] : !fir.ref<!fir.array<10xi32>>, !fir.array<10xi32>)   map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS1]]) -> !fir.ref<!fir.array<10xi32>> {name = "sp_write(2:5)"}
 !ALL:  %[[MAP2:.*]] = omp.map_info var_ptr(%[[ITER]] : !fir.ref<i32>, i32)   map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref<i32> {name = "i"}
-!ALL: omp.target map_entries(%[[MAP0]] -> %{{.*}}, %[[MAP1]] -> %{{.*}}, %[[MAP2]] -> %{{.*}}, %{{.*}} -> %{{.*}}, %{{.*}} -> %{{.*}} : !fir.ref<!fir.array<10xi32>>, !fir.ref<!fir.array<10xi32>>, !fir.ref<i32>, index, index) {
+!ALL: omp.target map_entries(%[[MAP0]] -> %{{.*}}, %[[MAP1]] -> %{{.*}}, %[[MAP2]] -> %{{.*}} : !fir.ref<!fir.array<10xi32>>, !fir.ref<!fir.array<10xi32>>, !fir.ref<i32>) {
 
 subroutine read_write_section()
     integer :: sp_read(10) = (/1,2,3,4,5,6,7,8,9,10/)
@@ -64,7 +64,7 @@ end subroutine assumed_shape_array
 !ALL: %[[BOUNDS:.*]]  = omp.bounds   lower_bound(%[[C1]] : index) upper_bound(%[[C2]] : index) stride(%[[C0]] : index) start_idx(%[[C0]] : index)
 !ALL: %[[MAP:.*]] = omp.map_info var_ptr(%[[ARG0]] : !fir.ref<!fir.array<?xi32>>, !fir.array<?xi32>)   map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.ref<!fir.array<?xi32>> {name = "arr_read_write(2:5)"}
 !ALL: %[[MAP2:.*]] = omp.map_info var_ptr(%[[ALLOCA]] : !fir.ref<i32>, i32)   map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref<i32> {name = "i"}
-!ALL: omp.target map_entries(%[[MAP]] -> %{{.*}}, %[[MAP2]] -> %{{.*}}, %{{.*}} -> %{{.*}} : !fir.ref<!fir.array<?xi32>>, !fir.ref<i32>, index) {
+!ALL: omp.target map_entries(%[[MAP]] -> %{{.*}}, %[[MAP2]] -> %{{.*}} : !fir.ref<!fir.array<?xi32>>, !fir.ref<i32>) {
 
         subroutine assumed_size_array(arr_read_write)
             integer, intent(inout) :: arr_read_write(*)
diff --git a/flang/test/Lower/OpenMP/FIR/target.f90 b/flang/test/Lower/OpenMP/FIR/target.f90
index cf3b6fff0cb7498..20de9726360bb25 100644
--- a/flang/test/Lower/OpenMP/FIR/target.f90
+++ b/flang/test/Lower/OpenMP/FIR/target.f90
@@ -189,8 +189,8 @@ subroutine omp_target
    integer :: a(1024)
    !CHECK: %[[BOUNDS:.*]] = omp.bounds   lower_bound({{.*}}) upper_bound({{.*}}) extent({{.*}}) stride({{.*}}) start_idx({{.*}})
    !CHECK: %[[MAP:.*]] = omp.map_info var_ptr(%[[VAL_0]] : !fir.ref<!fir.array<1024xi32>>, !fir.array<1024xi32>)   map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.ref<!fir.array<1024xi32>> {name = "a"}
-   !CHECK: omp.target   map_entries(%[[MAP]] -> %[[ARG_0:.*]], %{{.*}} -> %{{.*}} : !fir.ref<!fir.array<1024xi32>>, index) {
-   !CHECK: ^bb0(%[[ARG_0]]: !fir.ref<!fir.array<1024xi32>>, %{{.*}}: index):
+   !CHECK: omp.target   map_entries(%[[MAP]] -> %[[ARG_0:.*]] : !fir.ref<!fir.array<1024xi32>>) {
+   !CHECK: ^bb0(%[[ARG_0]]: !fir.ref<!fir.array<1024xi32>>):
    !$omp target map(tofrom: a)
       !CHECK: %[[VAL_1:.*]] = arith.constant 10 : i32
       !CHECK: %[[VAL_2:.*]] = arith.constant 1 : i64
@@ -213,8 +213,8 @@ subroutine omp_target_implicit
    !CHECK: %[[VAL_0:.*]] = fir.alloca !fir.array<1024xi32> {bindc_name = "a", uniq_name = "_QFomp_target_implicitEa"}
    integer :: a(1024)
    !CHECK: %[[MAP:.*]] = omp.map_info var_ptr(%[[VAL_0]] : !fir.ref<!fir.array<1024xi32>>, !fir.array<1024xi32>)   map_clauses(implicit, tofrom) capture(ByRef) bounds(%{{.*}}) -> !fir.ref<!fir.array<1024xi32>> {name = "a"}
-   !CHECK: omp.target   map_entries(%[[MAP]] -> %[[ARG_0:.*]], %{{.*}} -> %{{.*}} : !fir.ref<!fir.array<1024xi32>>, index) {
-   !CHECK: ^bb0(%[[ARG_0]]: !fir.ref<!fir.array<1024xi32>>, %{{.*}}: index):
+   !CHECK: omp.target   map_entries(%[[MAP]] -> %[[ARG_0:.*]] : !fir.ref<!fir.array<1024xi32>>) {
+   !CHECK: ^bb0(%[[ARG_0]]: !fir.ref<!fir.array<1024xi32>>):
    !$omp target
       !CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG_0]], %{{.*}} : (!fir.ref<!fir.array<1024xi32>>, i64) -> !fir.ref<i32>
       a(1) = 10
@@ -249,22 +249,60 @@ end subroutine omp_target_implicit_nested
 ! Target implicit capture with bounds
 !===============================================================================
 
-!CHECK-LABEL: func.func @_QPomp_target_implicit_bounds(%{{.*}}: !fir.ref<i32> {fir.bindc_name = "n"}) {
+
+!CHECK-LABEL: func.func @_QPomp_target_implicit_bounds(
+!CHECK: %[[VAL_0:.*]]: !fir.ref<i32> {fir.bindc_name = "n"}) {
 subroutine omp_target_implicit_bounds(n)
-   !CHECK: %[[VAL_1:.*]] = arith.select %{{.*}}, %{{.*}}, %{{.*}} : index
-   !CHECK: %[[VAL_2:.*]] = arith.select %{{.*}}, %{{.*}}, %{{.*}} : index
-   !CHECK: %[[VAL_3:.*]] = fir.alloca !fir.array<?x1024xi32>, %[[VAL_1]] {bindc_name = "a", uniq_name = "_QFomp_target_implicit_boundsEa"}
+   !CHECK: %[[VAL_1:.*]] = fir.load %[[VAL_0]] : !fir.ref<i32>
+   !CHECK: %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (i32) -> i64
+   !CHECK: %[[VAL_3:.*]] = fir.convert %[[VAL_2]] : (i64) -> index
+   !CHECK: %[[VAL_4:.*]] = arith.constant 0 : index
+   !CHECK: %[[VAL_5:.*]] = arith.cmpi sgt, %[[VAL_3]], %[[VAL_4]] : index
+   !CHECK: %[[VAL_6:.*]] = arith.select %[[VAL_5]], %[[VAL_3]], %[[VAL_4]] : index
+   !CHECK: %[[VAL_7:.*]] = arith.constant 1024 : i64
+   !CHECK: %[[VAL_8:.*]] = fir.convert %[[VAL_7]] : (i64) -> index
+   !CHECK: %[[VAL_9:.*]] = arith.constant 0 : index
+   !CHECK: %[[VAL_10:.*]] = arith.cmpi sgt, %[[VAL_8]], %[[VAL_9]] : index
+   !CHECK: %[[VAL_11:.*]] = arith.select %[[VAL_10]], %[[VAL_8]], %[[VAL_9]] : index
+   !CHECK: %[[VAL_12:.*]] = fir.alloca !fir.array<?x1024xi32>, %[[VAL_6]] {bindc_name = "a", uniq_name = "_QFomp_target_implicit_boundsEa"}
    integer :: n
    integer :: a(n, 1024)
-   !CHECK: %[[VAL_4:.*]] = omp.map_info var_ptr(%[[VAL_3]] : !fir.ref<!fir.array<?x1024xi32>>, !fir.array<?x1024xi32>)   map_clauses(implicit, tofrom) capture(ByRef) bounds(%{{.*}}) -> !fir.ref<!fir.array<?x1024xi32>> {name = "a"}
-   !CHECK: %[[VAL_5:.*]] = omp.map_info val(%[[VAL_1]] : index)  map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> index {name = ""}
-   !CHECK: %[[VAL_6:.*]] = omp.map_info val(%[[VAL_2]] : index)  map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> index {name = ""}
-   !CHECK: omp.target   map_entries(%[[VAL_4]] -> %[[ARG_1:.*]], %[[VAL_5]] -> %[[ARG_2:.*]], %[[VAL_6]] -> %[[ARG_3:.*]] : !fir.ref<!fir.array<?x1024xi32>>, index, index) {
-   !CHECK: ^bb0(%[[ARG_1]]: !fir.ref<!fir.array<?x1024xi32>>, %[[ARG_2]]: index, %[[ARG_3]]: index):
+   !CHECK: %[[VAL_13:.*]] = arith.constant 1 : index
+   !CHECK: %[[VAL_14:.*]] = arith.constant 0 : index
+   !CHECK: %[[VAL_15:.*]] = arith.subi %[[VAL_6]], %[[VAL_13]] : index
+   !CHECK: %[[VAL_16:.*]] = omp.bounds lower_bound(%[[VAL_14]] : index) upper_bound(%[[VAL_15]] : index) extent(%[[VAL_6]] : index) stride(%[[VAL_13]] : index) start_idx(%[[VAL_13]] : index)
+   !CHECK: %[[VAL_17:.*]] = arith.constant 0 : index
+   !CHECK: %[[VAL_18:.*]] = arith.subi %[[VAL_11]], %[[VAL_13]] : index
+   !CHECK: %[[VAL_19:.*]] = omp.bounds lower_bound(%[[VAL_17]] : index) upper_bound(%[[VAL_18]] : index) extent(%[[VAL_11]] : index) stride(%[[VAL_13]] : index) start_idx(%[[VAL_13]] : index)
+   !CHECK: %[[VAL_20:.*]] = omp.map_info var_ptr(%[[VAL_12]] : !fir.ref<!fir.array<?x1024xi32>>, !fir.array<?x1024xi32>) map_clauses(implicit, tofrom) capture(ByRef) bounds(%[[VAL_16]], %[[VAL_19]]) -> !fir.ref<!fir.array<?x1024xi32>> {name = "a"}
+   !CHECK: %[[VAL_21:.*]] = omp.map_info var_ptr(%[[VAL_0]] : !fir.ref<i32>, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref<i32> {name = ""}
+   !CHECK: omp.target map_entries(%[[VAL_20]] -> %[[VAL_22:.*]], %[[VAL_21]] -> %[[VAL_23:.*]] : !fir.ref<!fir.array<?x1024xi32>>, !fir.ref<i32>) {
+   !CHECK: ^bb0(%[[VAL_22]]: !fir.ref<!fir.array<?x1024xi32>>, %[[VAL_23]]: !fir.ref<i32>):
    !$omp target
-      !CHECK: %{{.*}} = fir.convert %[[ARG_1]] : (!fir.ref<!fir.array<?x1024xi32>>) -> !fir.ref<!fir.array<?xi32>>
-      !CHECK: %{{.*}} = arith.muli %{{.*}}, %[[ARG_2]] : index
-      a(11,22) = 33
+      !CHECK: %[[VAL_24:.*]] = fir.load %[[VAL_23]] : !fir.ref<i32>
+      !CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24]] : (i32) -> i64
+      !CHECK: %[[VAL_26:.*]] = arith.constant 0 : index
+      !CHECK: %[[VAL_27:.*]] = fir.convert %[[VAL_25]] : (i64) -> index
+      !CHECK: %[[VAL_28:.*]] = arith.cmpi sgt, %[[VAL_27]], %[[VAL_26]] : index
+      !CHECK: %[[VAL_29:.*]] = arith.select %[[VAL_28]], %[[VAL_27]], %[[VAL_26]] : index
+      !CHECK: %[[VAL_30:.*]] = arith.constant 33 : i32
+      !CHECK: %[[VAL_31:.*]] = fir.convert %[[VAL_22]] : (!fir.ref<!fir.array<?x1024xi32>>) -> !fir.ref<!fir.array<?xi32>>
+      !CHECK: %[[VAL_32:.*]] = arith.constant 1 : index
+      !CHECK: %[[VAL_33:.*]] = arith.constant 0 : index
+      !CHECK: %[[VAL_34:.*]] = arith.constant 11 : i64
+      !CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_34]] : (i64) -> index
+      !CHECK: %[[VAL_36:.*]] = arith.subi %[[VAL_35]], %[[VAL_32]] : index
+      !CHECK: %[[VAL_37:.*]] = arith.muli %[[VAL_32]], %[[VAL_36]] : index
+      !CHECK: %[[VAL_38:.*]] = arith.addi %[[VAL_37]], %[[VAL_33]] : index
+      !CHECK: %[[VAL_39:.*]] = arith.muli %[[VAL_32]], %[[VAL_29]] : index
+      !CHECK: %[[VAL_40:.*]] = arith.constant 22 : i64
+      !CHECK: %[[VAL_41:.*]] = fir.convert %[[VAL_40]] : (i64) -> index
+      !CHECK: %[[VAL_42:.*]] = arith.subi %[[VAL_41]], %[[VAL_32]] : index
+      !CHECK: %[[VAL_43:.*]] = arith.muli %[[VAL_39]], %[[VAL_42]] : index
+      !CHECK: %[[VAL_44:.*]] = arith.addi %[[VAL_43]], %[[VAL_38]] : index
+      !CHECK: %[[VAL_45:.*]] = fir.coordinate_of %[[VAL_31]], %[[VAL_44]] : (!fir.ref<!fir.array<?xi32>>, index) -> !fir.ref<i32>
+      !CHECK: fir.store %[[VAL_30]] to %[[VAL_45]] : !fir.ref<i32>
+      a(11, 22) = 33
       !CHECK: omp.terminator
    !$omp end target
 !CHECK: }
@@ ...
[truncated]

! print *, sp(1)
! print *, sp(5)
print *, sp(1)
print *, sp(5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops, thank you for catching this!

Would it be possible to add a variation of one of the other tests to use a do loop? Or just convert one of the existing ones to use a do loop, instead of the awkward while loop that was in place to work around lack of implicit capture previously!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the test openmp/libomptarget/test/offloading/fortran/basic-target-region-array.f90 and changed it to target-region-implicit-array.f90 instead.

mlir::Operation *clonedVal = val.getDefiningOp()->clone();
regionBlock->push_front(clonedVal);
val.replaceUsesWithIf(
clonedVal->getResult(0), [regionBlock](mlir::OpOperand &use) {
Copy link
Contributor

@agozillon agozillon Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for an operation tied to "trivial" to have more than one result? I'm wondering if we need to cover the scenario where we have something with multiple returns, but that can probably come in a future patch if we ever do run into it I imagine and isn't necessary for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that these values are explicitly coming from the BoxValue bounds, I'm not expecting them to have more than one result.

I'll add an assert for this perhaps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair enough, no need to add an assert! Was more curious than anything as I've seen some of the HLFIR/FIR operations can have multiple returns

@agozillon
Copy link
Contributor

This LGTM, just a few small comments. However, I defer judgement to the other more knowledgeable reviewers

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not look at the details of the lowering code, but from the patch description and looking at the tests, I am afraid that the new approach may be incorrect since it is re-evaluating the specification expressions inside the target region, but there is no guarantee that the specification expression will evaluate again to the same value once the execution part of a subprogram started.

For instance, in the following program, if you are reading n again at the point of the target directive to get a extent, the extent will not be correct.

subroutine omp_target_implicit_bounds(n)
  integer :: n
  integer :: a(n, 1024)
  n = n + 1
   !$omp target
         a(11, 22) = 33
   !$omp end target
end subroutine omp_target_implicit_bounds

So if you need to pass the extents/lengths by reference, you will need to make temporary from the SSA values and map that instead.

@TIFitis
Copy link
Member Author

TIFitis commented Nov 16, 2023

So if you need to pass the extents/lengths by reference, you will need to make temporary from the SSA values and map that instead.

Thanks for catching this, it slipped my mind. Non trivial values are now copied to a new temporary and then mapped and used inside the region.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should not special case the load case but rather stay away from any cloning unless maybe when you detect a constant. Load op is not the only operation with side effects that may be emitted. Specification expressions may for instance contain pure calls (in the Fortran sense, which means they can still observe global states that could change).

So I would advise just making a temp from the bounds value instead of doing non trivial cloning (you can maybe just special case constants with fir::getIntIfConstant helper without doing any cloning with the cloning infrastructure).

@TIFitis
Copy link
Member Author

TIFitis commented Nov 17, 2023

I think you should not special case the load case but rather stay away from any cloning unless maybe when you detect a constant. Load op is not the only operation with side effects that may be emitted. Specification expressions may for instance contain pure calls (in the Fortran sense, which means they can still observe global states that could change).

I have special cased the LoadOp because that is the most common case we are probably going to encounter. These ops are only coming from the BoxValue bounds, so I imagine that we wouldn't be seeing too many variations.

I've added the other cases as a TODO failure, so we can handle them later as we expand support or come up with a more generalised approach for ops that might have side effects.

So I would advise just making a temp from the bounds value instead of doing non trivial cloning (you can maybe just special case constants with fir::getIntIfConstant helper without doing any cloning with the cloning infrastructure).

I would like to add as little new temps inside the target region as possible. Using the getIntIfConstant we need to create the constantOps, shapeOps and potentially other ops such as select etc as well. Wouldn't it be cleaner to use the clone helper to create them rather than manually?

Is there any particular reason why we are avoiding the cloning infrastructure for trivial type values here?

Also the targetOp is protected with the IsolatedFromAbove trait, so we can be sure that we aren't unintentionally adding ops with side effects.

@jeanPerier
Copy link
Contributor

Wouldn't it be cleaner to use the clone helper to create them rather than manually?

If you use side effect interface to check that you are not cloning anything with side effects, it is OK with me. I think your current patch is still cloning things it should not (like calls, but you should not have a "reject list", you should either use MLIR side effect interface or have an "op-in" list for things you know you want to clone because this is the common case you want to optimize: e.g: constant and shape.... Using the MLIR side effect interface is likely the cleanest).

Here is an example of something I think the patch may not be dealing with correctly (cloning a function call while the function may evaluate to different value):

module m
integer :: n

contains
pure integer function get_n()
  get_n = n
end function

subroutine omp_target_implicit_bounds()
  integer :: a(get_n(), 1024)
  n = n + 1
  !$omp target
    a(11, 22) = 33
  !$omp end target
end subroutine omp_target_implicit_bounds
end module

@TIFitis
Copy link
Member Author

TIFitis commented Nov 22, 2023

If you use side effect interface to check that you are not cloning anything with side effects, it is OK with me. I think your current patch is still cloning things it should not (like calls, but you should not have a "reject list", you should either use MLIR side effect interface or have an "op-in" list for things you know you want to clone because this is the common case you want to optimize: e.g: constant and shape.... Using the MLIR side effect interface is likely the cleanest).

Thanks for this suggestion. I have updated the code to only clone ops if they are mlir::isMemoryEffectFree, else their value is stored into a new temporary to be loaded and used inside the region.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the cloning mechanism. That solves my comment.

This patch removes the `val` field from the MapInfoOp.

Previously when lowering TargetOp, the bounds information for the BoxValues were also being mapped. Instead these ops are now duplicated inside the target region to prevent mapping of non reference typed values.
@TIFitis
Copy link
Member Author

TIFitis commented Nov 23, 2023

Thanks for the review. I'll leave the patch up till tomorrow in case someone has any other suggestions.

@TIFitis TIFitis merged commit f1d7738 into llvm:main Nov 24, 2023
@TIFitis TIFitis deleted the fix-target-val branch November 30, 2023 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category mlir:llvm mlir:openmp mlir openmp:libomptarget OpenMP offload runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants