Skip to content
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 28 additions & 1 deletion lld/ELF/Arch/RISCV.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -651,12 +651,39 @@ static void relaxHi20Lo12(const InputSection &sec, size_t i, uint64_t loc,
if (!isInt<12>(r.sym->getVA(r.addend) - gp->getVA()))
return;

// The symbol may be accessed in multiple pieces with different addends.
// If we are relaxing the HI20 relocation, we need to ensure that we only
// relax (and delete the instruction) if all possible LO12 relocations
// that depend on it will be relaxable. The compiler will only access multiple
// pieces of an object with low relocations on the memory op if the alignment
// allows it. Therefore it should suffice to check that the smaller of the
// alignment and size can be reached from GP.
uint32_t alignAdjust =
r.sym->getOutputSection() ? r.sym->getOutputSection()->addralign : 0;
alignAdjust = std::min<uint32_t>(alignAdjust, r.sym->getSize());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignAdjust = std::min<uint32_t>(alignAdjust, r.sym->getSize()); is not tested

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I added a test now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If addend < st_size, it seems that this can be improved to min(addralign, st_size-addend)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I think that is indeed the case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is only true if we assume that the addend on the LO12 will be equal or greater than the addend on the HI20.
Ultimately, we need to ensure that the entire window of
min(st_size, st_align) bytes that contains the relocation is reachable from GP.

if (alignAdjust)
alignAdjust--;

switch (r.type) {
case R_RISCV_HI20:
case R_RISCV_HI20: {
uint64_t hiAddr = r.sym->getVA(r.addend);
// If the addend is zero, the LO12 relocations can only be accessing the
// range [base, base+alignAdjust] (where base == r.sym->getVA()).
if (r.addend == 0 && !isInt<12>(hiAddr + alignAdjust - gp->getVA()))
Copy link
Member

@MaskRay MaskRay Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R_RISCV_HI20 and R_RISCV_LO12_I/R_RISCV_LO12_S should have consistent decisions on whether to do relaxation. Placing the condition only at R_RISCV_HI20 could lead to inconsistent results. I think the condition should be moved closer to if (!isInt<12>(r.sym->getVA(r.addend) - gp->getVA())) above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here was that only this direction involves a functional problem. Relaxing the LO12 and not relaxing the HI20 leaves a redundant HI20, but that's about it. Whereas the other way is the crux of the problem. However, if you'd prefer that I implement the LO12 pessimistically as well, I am not against it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we retain lui/HI20 but relax addi/LO12_i, the output will be lui a0, ...; addi a0, gp, ... with a redundant lui. I think it's confusing to leave lui there and the addi change has no benefit anyway. Therefore, I prefer that we disable the relaxation completely, which aligns with GNU ld.

return;

// However, if the addend is non-zero, the LO12 relocations may be accessing
// the range [HI-alignAdjust-1, HI+alignAdjust].
if (r.addend != 0 &&
(!isInt<12>(hiAddr - alignAdjust - 1 - gp->getVA()) ||
!isInt<12>(hiAddr + alignAdjust - gp->getVA())))
return;

// Remove lui rd, %hi20(x).
sec.relaxAux->relocTypes[i] = R_RISCV_RELAX;
remove = 4;
break;
}
case R_RISCV_LO12_I:
sec.relaxAux->relocTypes[i] = INTERNAL_R_RISCV_GPREL_I;
break;
Expand Down
40 changes: 40 additions & 0 deletions lld/test/ELF/riscv-relax-gp-partial-align-min.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
# REQUIRES: riscv
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need lld/test/ELF/riscv-relax-gp-* files? Can they be folded into two files or even one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually wondering the same. I don't have much experience writing these. I'll give it a shot.

# RUN: rm -rf %t && split-file %s %t && cd %t

# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax a.s -o rv32.o
# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax a.s -o rv64.o

# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv32.o lds -o rv32
# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv64.o lds -o rv64
# RUN: llvm-objdump -td -M no-aliases --no-show-raw-insn rv32 | FileCheck %s
# RUN: llvm-objdump -td -M no-aliases --no-show-raw-insn rv64 | FileCheck %s

# CHECK: 000017e0 l .data {{0+}}80 Var1
# CHECK: 00000ffc g .sdata {{0+}}00 __global_pointer$

# CHECK: <_start>:
# CHECK-NEXT: lui a1, 1
# CHECK-NEXT: lw a0, 2020(gp)
# CHECK-NEXT: lw a1, 2044(a1)

#--- a.s
.global _start
_start:
lui a1, %hi(Var1)
lw a0, %lo(Var1)(a1) # First part is reachable from gp
lw a1, %lo(Var1+28)(a1) # The second part is not reachable

.section .sdata,"aw"
.section .data,"aw"
.p2align 5
Var1:
.quad 0
.zero 120
.size Var1, 128

#--- lds
SECTIONS {
.text : { }
.sdata 0x07fc : { }
.data 0x17E0 : { }
}
54 changes: 54 additions & 0 deletions lld/test/ELF/riscv-relax-gp-partial-hi20-addend.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
# REQUIRES: riscv
# RUN: rm -rf %t && split-file %s %t && cd %t

# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax a.s -o rv32.o
# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax a.s -o rv64.o

# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv32.o lds -o rv32
# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv64.o lds -o rv64
# RUN: llvm-objdump -td -M no-aliases --no-show-raw-insn rv32 | FileCheck %s
# RUN: llvm-objdump -td -M no-aliases --no-show-raw-insn rv64 | FileCheck %s

# CHECK: 00001000 l .data {{0+}}08 Var0
# CHECK: 00001f80 l .data1 {{0+}}80 Var1
# CHECK: 00001800 g .sdata {{0+}}00 __global_pointer$

# CHECK: <_start>:
# CHECK-NEXT: lui a1, 1
# CHECK-NEXT: lw a0, -2048(gp)
# CHECK-NEXT: lw a1, -2044(gp)
# CHECK-NEXT: lui a1, 2
# CHECK-NEXT: lw a0, 1920(gp)
# CHECK-NEXT: lw a1, 2044(gp)

#--- a.s
.global _start
_start:
lui a1, %hi(Var0+4) # Cannot prove that %lo relocs will be reachable
lw a0, %lo(Var0)(a1) # Reachable from GP
lw a1, %lo(Var0+4)(a1) # Reachable from GP
lui a1, %hi(Var1+124) # Cannot prove that %lo relocs will be reachable
lw a0, %lo(Var1)(a1) # Reachable from GP
lw a1, %lo(Var1+124)(a1) # Reachable from GP

.section .sdata,"aw"
.section .data,"aw"
.p2align 3
Var0:
.quad 0
.size Var0, 8

.section .data1,"aw"
.p2align 7
Var1:
.quad 0
.zero 120
.size Var1, 128

#--- lds
SECTIONS {
.text : { }
.sdata 0x1000 : { }
.data 0x1000 : { }
.data1 0x1f80 : { }
}
58 changes: 58 additions & 0 deletions lld/test/ELF/riscv-relax-gp-partial.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
# REQUIRES: riscv
# RUN: rm -rf %t && split-file %s %t && cd %t

# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax a.s -o rv32.o
# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax a.s -o rv64.o

# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv32.o lds -o rv32
# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv64.o lds -o rv64
# RUN: llvm-objdump -td -M no-aliases --no-show-raw-insn rv32 | FileCheck %s
# RUN: llvm-objdump -td -M no-aliases --no-show-raw-insn rv64 | FileCheck %s

# CHECK: 00000080 l .data {{0+}}08 Var0
# CHECK: 00001000 l .data {{0+}}80 Var1
# CHECK: 00000815 g .sdata {{0+}}00 __global_pointer$

# CHECK: <_start>:
# CHECK-NOT: lui
# CHECK-NEXT: lw a0, -1941(gp)
# CHECK-NEXT: lw a1, -1937(gp)
# CHECK-NEXT: lui a1, 1
# CHECK-NEXT: lw a0, 2027(gp)
# CHECK-NEXT: lw a1, 124(a1)

#--- a.s
.global _start
_start:
lui a1, %hi(Var0)
lw a0, %lo(Var0)(a1)
lw a1, %lo(Var0+4)(a1)
lui a1, %hi(Var1)
lw a0, %lo(Var1)(a1) # First part is reachable from gp
lw a1, %lo(Var1+124)(a1) # The second part is not reachable

.section .rodata
foo:
.space 1
.section .sdata,"aw"
.space 1
.section .data,"aw"
.p2align 3
Var0:
.quad 0
.size Var0, 8
.space 3960
.p2align 7
Var1:
.quad 0
.zero 120
.size Var1, 128

#--- lds
SECTIONS {
.text : { }
.rodata : { }
.sdata : { }
.sbss : { }
.data : { }
}