-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[PowerPC] Support local-dynamic TLS relocation on AIX #66316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-mc ChangesTo enable TLS local-dynamic on AIX, according to docs, this patch need to generate below sequence of code:.tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier Patch migrated from https://reviews.llvm.org/D157673Patch is 134.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66316.diff 16 Files Affected:
<pre>
diff --git a/llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp b/llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp
-/// This helper function creates the TlsGetAddr MCSymbol for AIX. We will
+// On AIX, TLS-local-dynamic requires that symbol for the module handle must
+}
@@ -964,7 +997,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
|
@bzEq @ecnelises If there is no major concern, can we land this first before the weekend? Thank you! |
@@ -32,7 +33,7 @@ entry: | |||
; RELOC-NEXT: Section (index: 1) .text { | |||
; RELOC-NEXT: Relocation { | |||
; RELOC-NEXT: Virtual Address: 0x16 | |||
; RELOC-NEXT: Symbol: .TIInit (17) | |||
; RELOC-NEXT: Symbol: TIInit (19) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is not this patch's scope. But it may make more sense to use a RE to handle the symbol index. We were trying to fix this, but I guess we didn't change them all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments may need follow up, so I posted them first. I will continue to review the code generation part.
We need a front end patch to eliminate the error report in clang/lib/Frontend/CompilerInvocation.cpp
and clang/lib/Sema/SemaDeclAttr.cpp
, right?
@@ -240,15 +241,35 @@ entry: | |||
; SYM-NEXT: } | |||
; SYM-NEXT: Symbol { | |||
; SYM-NEXT: Index: 3 | |||
; SYM-NEXT: Name: | |||
; SYM-NEXT: Name: .__tls_get_addr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment I know this is not this patch's scope. But it may make more sense to use a RE to handle the symbol index. We were trying to fix this, but I guess we didn't change them all.
is more appropriate here.
I tried to parallelize the review effort, but github didn't show the delta for me: #66972 |
; SMALL32-NEXT: stw 0, 40(1) | ||
; SMALL32-NEXT: bla .__tls_get_addr[PR] | ||
; SMALL32-NEXT: lwz 6, L..C4(2) # target-flags(ppc-tlsld) @TIInit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use r4
here? That should align with the assembler manual and save register use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__tls_get_mod() clobbers r4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated a version to address the issue, will continue improve this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New approach will invalidate d400de0, please ignore that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now for the small-model, we did reuse r4. However for the large-model, since machine-scheduler will decide to move the ADDIS before the .__tls_get_mod call, I did not try to move ADDIS in PPCTLSDynamicCall.cpp
// PPCTLSDynamicCall. | ||
SDValue VariableOffsetTGA = | ||
DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, PPCII::MO_TLSLD_FLAG); | ||
SDValue VariableOffset = getTOCEntry(DAG, dl, VariableOffsetTGA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it missing TOC entry for the module handle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. We can add another flag PPCII::MO_TLSLDM_FLAG
(Maybe there is not enough bit in PPCII
, we can use a distinct value, or extend the MO_ACCESS_MASK to free more bits.).
With this way, seems we don't need the "strange" function isSpecialAIXSymbolTLSML()
in the PPCAsmPrinter file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I was thinking we need the TOC entry for the module handle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a blocker which limits FLAG bit size to be 12:
llvm/include/llvm/CodeGen/MachineOperand.h
...
unsigned SubReg_TargetFlags : 12;
...
void setTargetFlags(unsigned F) {
assert(!isReg() && "Register operands can't have target flags");
SubReg_TargetFlags = F;
assert(SubReg_TargetFlags == F && "Target flags out of range");
}
Currently PPCII::TOF is already full. I'm afraid this will be a FIXME...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still didn't figure out how to remove the strange function.
The way I tried was getOrCreate ExternalLinkage GV named "_$TLSML", and call getTOCEntry on that GV. However there will be a symbol table entry for the symbol, but we don't want that.
For asm mode, I can add hack in MCAsmStreamer::emitXCOFFSymbolLinkageWithVisibility to do not emit the ".extern _Renamed..5f24__TLSML; rename _Renamed..5f24__TLSML ......".
For obj mode, things are more complex, and I'm not able to cleanly remove the symbol table entry without by using hack approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I did have another version which can essentially remove the assert check with some side effects:
(1) It did generated extra external symbol reference, which looks like no harm with simple HelloWorld. (Notice: xlC does not generate this symbol reference.)
.extern _Renamed..5f24__TLSML[UA]
.rename _Renamed..5f24__TLSML[UA],"_$TLSML"
(2) Inside XCOFFObjectWriter::recordRelocation()
logic I have to apply a hack to get the correct symbol, because now there are two symbols: one is _Renamed..5f24__TLSML[UA]
, the other is _Renamed..5f24__TLSML[TC]
. Without hack it will choose the [UA] one which will cause linker failure.
Attached the draft patch: ReplaceAssertHack.diff.txt
I think current version has the advantage that it does not generate any extra external symbol reference or any extra relocation entry, although it looks ugly with regard to those asserts.
Is it possible that we fix the hack implementation with some future version (I will open an issue and work on it afterwards)? How about let's move on with current approach?
@amy-kwan @stephenpeckham @bzEq @chenzheng1030 Appreciate your comments. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this a bit offline, and we should add a FIXME (thanks for adding it!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4b932d8 is committed to redesign the target flags of machine operand on PPC. Now we should be able to add new flags.
Let's include FE change in this PR. |
(2) Update comments.
The "_$TLSML" symbol did not lower through getTOCEntry().
The "_$TLSML" symbol did not lower through getTOCEntry(). The pseudo node "TLSLDAIX" now takes the _$TLSML GV node, and the ppc-tls-dynamic-call pass is updated to fine tune the relative order between the LoadOffset@toc node and the .__tls_get_mod node.
… getSectionForTOCEntry()
Since `getSectionForExternalReference` explicitly set XCOFF::XTY_SD for the `_$TLSML` symbol reference, and that only CSType `XCOFF::XTY_ER` symbol references are added into `UndefinedCsects`, the checks inside `executePostLayoutBinding` are redundant. Update test case to make sure external reference to the `_$TLSML` symbol is not observed.
7740a3f
to
1a95e93
Compare
Update test case changed by #66316
It looks like a few of the regression tests are failing on the reverse-iteration buildbot (https://lab.llvm.org/buildbot/#/builders/54/builds/9683) |
Thank you! Posted: #89714 |
Supports TLS local-dynamic on AIX, generates below sequence of code: