Skip to content

[lld][MachO] Fix symbol insertion in transplantSymbolsAtOffset #120737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 22, 2024

Conversation

carlocab
Copy link
Member

@carlocab carlocab commented Dec 20, 2024

The existing comparison does not insert symbols in the intended place.

Closes #120559.

@llvmbot
Copy link
Member

llvmbot commented Dec 20, 2024

@llvm/pr-subscribers-lld

Author: Carlo Cabrera (carlocab)

Changes

The existing comparison does not insert symbols in the intended place.

Unfortunately, it is not very clear how to test this. Suggestions
appreciated.

Closes #120559.


Full diff: https://github.com/llvm/llvm-project/pull/120737.diff

1 Files Affected:

  • (modified) lld/MachO/SymbolTable.cpp (+1-1)
diff --git a/lld/MachO/SymbolTable.cpp b/lld/MachO/SymbolTable.cpp
index f0a92da8777e13..36e421b52abcac 100644
--- a/lld/MachO/SymbolTable.cpp
+++ b/lld/MachO/SymbolTable.cpp
@@ -69,7 +69,7 @@ static void transplantSymbolsAtOffset(InputSection *fromIsec,
   // Ensure the symbols will still be in address order after our insertions.
   auto insertIt = llvm::upper_bound(toIsec->symbols, toOff,
                                     [](uint64_t off, const Symbol *s) {
-                                      return cast<Defined>(s)->value < off;
+                                      return cast<Defined>(s)->value > off;
                                     });
   llvm::erase_if(fromIsec->symbols, [&](Symbol *s) {
     auto *d = cast<Defined>(s);

@llvmbot
Copy link
Member

llvmbot commented Dec 20, 2024

@llvm/pr-subscribers-lld-macho

Author: Carlo Cabrera (carlocab)

Changes

The existing comparison does not insert symbols in the intended place.

Unfortunately, it is not very clear how to test this. Suggestions
appreciated.

Closes #120559.


Full diff: https://github.com/llvm/llvm-project/pull/120737.diff

1 Files Affected:

  • (modified) lld/MachO/SymbolTable.cpp (+1-1)
diff --git a/lld/MachO/SymbolTable.cpp b/lld/MachO/SymbolTable.cpp
index f0a92da8777e13..36e421b52abcac 100644
--- a/lld/MachO/SymbolTable.cpp
+++ b/lld/MachO/SymbolTable.cpp
@@ -69,7 +69,7 @@ static void transplantSymbolsAtOffset(InputSection *fromIsec,
   // Ensure the symbols will still be in address order after our insertions.
   auto insertIt = llvm::upper_bound(toIsec->symbols, toOff,
                                     [](uint64_t off, const Symbol *s) {
-                                      return cast<Defined>(s)->value < off;
+                                      return cast<Defined>(s)->value > off;
                                     });
   llvm::erase_if(fromIsec->symbols, [&](Symbol *s) {
     auto *d = cast<Defined>(s);

The existing comparison does not insert symbols in the intended place.

Unfortunately, it is not very clear how to test this. Suggestions
appreciated.

Closes llvm#120559.
@carlocab carlocab force-pushed the fix-transplantSymbolsAtOffset branch from 627f54b to 7353c9b Compare December 20, 2024 14:14
@carlocab carlocab changed the title [lld][MachO] Fix symbol inseration in transplantSymbolsAtOffset [lld][MachO] Fix symbol insertion in transplantSymbolsAtOffset Dec 20, 2024
@carlocab
Copy link
Member Author

These comments probably best describe the problem and why this is the correct fix: #120559 (comment), #120559 (comment)

@alx32
Copy link
Contributor

alx32 commented Dec 20, 2024

About adding a test for this - lld/test/MachO/alias-symbols.s already triggers the faulty insert. Currently all the offsets are 0's so the bug doesn't get reproduced.
I think we should be able to slightly tweak alias-symbols.s to insert a symbol with a non-zero offset and then add an assert(std::is_sorted) at the end of transplantSymbolsAtOffset and that will be our test. Have you looked into weather this is possible ?

@carlocab
Copy link
Member Author

I hadn't looked at alias-symbols.s since that wasn't one of the failing tests flagged at #120559, but the approach you suggest sounds promising.

I'll give it a try, but would appreciate as many hints as possible. Not really familiar with how symbols are laid out in x86 assembly.

@bjope
Copy link
Collaborator

bjope commented Dec 20, 2024

Nice idea with the assert. Adding
assert(llvm::is_sorted(toIsec->symbols) && "Symbols should still be sorted at exit.");
at end of transplantSymbolsAtOffset seem to result in assertion failure for the lld/test/MachO/symtab.s.

So that might be enough, considering that there already is a test case that would trigger such an assert.

@bjope
Copy link
Collaborator

bjope commented Dec 20, 2024

Nice idea with the assert. Adding assert(llvm::is_sorted(toIsec->symbols) && "Symbols should still be sorted at exit."); at end of transplantSymbolsAtOffset seem to result in assertion failure for the lld/test/MachO/symtab.s.

So that might be enough, considering that there already is a test case that would trigger such an assert.

Ooops, was a bit fast. That is_sorted call needs to supply the comparison function.
Guess the problem is to understand which comparison function to provide given that the fault is in the comparison function :-)

@carlocab
Copy link
Member Author

Guess the problem is to understand which comparison function to provide given that the fault is in the comparison function :-)

This comment notwithstanding, I added an assertion anyway since it seems like an improvement. Interestingly, the assertion I just added does not fire with Apple Clang/libc++ even if I drop the s/</>/ change from the first commit here.

@alx32
Copy link
Contributor

alx32 commented Dec 20, 2024

the assertion I just added does not fire with Apple Clang/libc++ even if I drop the s/</>/ change from the first commit here.

Does it fire with any other build config when run against the current test set ?
BTW the suggestion about modifying lld/test/MachO/alias-symbols.s was to have it trigger the assert with Clang/libc++ with the current test set.

@carlocab
Copy link
Member Author

Does it fire with any other build config when run against the current test set ?

It should fire with GCC, except GCC crashes before the assert is reached.

BTW the suggestion about modifying lld/test/MachO/alias-symbols.s was to have it trigger the assert with Clang/libc++ with the current test set.

Yes, understood, but I'm still looking into the right way to modify alias-symbols.s. As mentioned above, tips appreciated!

@alx32
Copy link
Contributor

alx32 commented Dec 20, 2024

It should fire with GCC, except GCC crashes before the assert is reached.

GCC doesn't crash without -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON - right ?
Does the assert fire with GCC without the -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON flag ?

Yes, understood, but I'm still looking into the right way to modify alias-symbols.s. As mentioned #120737 (comment), tips appreciated!

Currently I think _weak_2 symbol is the one being inserted. I would either:

  • Modify the _weak_2 symbol to have a value of 1.
  • Add another _weak_3 that has the value of 1.

Given the tests have .subsections_via_symbols this will be a bit tricky - would need to play around with multiple options to get this to happen. It's not obvious right away what would work here.

@carlocab
Copy link
Member Author

carlocab commented Dec 20, 2024

GCC doesn't crash without -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON - right ?

Right.

Does the assert fire with GCC without the -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON flag ?

No, weirdly.

However, it seems we can assert on std::is_partitioned instead. This fires with both Clang and GCC before fixing the comparison lambda.

Currently I think _weak_2 symbol is the one being inserted. I would either:

  • Modify the _weak_2 symbol to have a value of 1.
  • Add another _weak_3 that has the value of 1.

Given the tests have .subsections_via_symbols this will be a bit tricky - would need to play around with multiple options to get this to happen. It's not obvious right away what would work here.

Thanks; I'll give this a try after the weekend.

@carlocab carlocab force-pushed the fix-transplantSymbolsAtOffset branch 2 times, most recently from cbadcfb to b99bbc0 Compare December 20, 2024 19:55
Comment on lines 100 to 104
assert(llvm::is_sorted(toIsec->symbols,
[](const Defined *s, const Defined *t) {
return s->value < t->value;
}) &&
"Symbols should still be sorted at exit.");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll probably get rid of this one, since I can't get it to fire. My theory is that the failure of this assertion can only be a consequence of UB, so the compiler just optimises this away. (No real evidence to support this though, so could be wrong!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does is_partitioned trigger without expensive checks ? On which tests does it trigger ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've verified that is_partitioned triggers with expensive checks with Clang, and without expensive checks with GCC.

The failing tests (for both) are

Failed Tests (3):
  lld :: MachO/local-alias-to-weak.s
  lld :: MachO/symtab.s
  lld :: MachO/weak-definition-gc.s

I haven't yet verified the assertion failure for Clang without expensive checks. The build is running now; I'll report back soon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified with Clang without expensive checks and the comparison in symSucceedsOff reversed:

Failed Tests (3):
  lld :: MachO/local-alias-to-weak.s
  lld :: MachO/symtab.s
  lld :: MachO/weak-definition-gc.s

To be sure, all errors are of the form

Assertion failed: (std::is_partitioned(toIsec->symbols.begin(), toIsec->symbols.end(), [toOff, symSucceedsOff](const Symbol *s) { return !symSucceedsOff(toOff, s); }) && "Symbols in toIsec must be partitioned by toOff."), function transplantSymbolsAtOffset, file SymbolTable.cpp, line 78.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for checking this.

This should help catch problems with symbol insertion.
@carlocab carlocab force-pushed the fix-transplantSymbolsAtOffset branch from b99bbc0 to e77e289 Compare December 21, 2024 15:35
@carlocab
Copy link
Member Author

Tried this:

diff --git a/lld/MachO/SymbolTable.cpp b/lld/MachO/SymbolTable.cpp
index a61e60a944fb..4b6e462f6bdd 100644
--- a/lld/MachO/SymbolTable.cpp
+++ b/lld/MachO/SymbolTable.cpp
@@ -68,7 +68,7 @@ static void transplantSymbolsAtOffset(InputSection *fromIsec,
                                       uint64_t fromOff, uint64_t toOff) {
   // Ensure the symbols will still be in address order after our insertions.
   auto symSucceedsOff = [](uint64_t off, const Symbol *s) {
-    return cast<Defined>(s)->value > off;
+    return cast<Defined>(s)->value < off;
   };
   assert(std::is_partitioned(toIsec->symbols.begin(), toIsec->symbols.end(),
                              [symSucceedsOff, toOff](const Symbol *s) {
@@ -96,6 +96,11 @@ static void transplantSymbolsAtOffset(InputSection *fromIsec,
     }
     return true;
   });
+  assert(llvm::is_sorted(toIsec->symbols,
+                         [](const Defined *s, const Defined *t) {
+                           return s->value < t->value;
+                         }) &&
+         "Symbols should still be sorted at exit.");
 }
 
 Defined *SymbolTable::addDefined(StringRef name, InputFile *file,
diff --git a/lld/test/MachO/alias-symbols.s b/lld/test/MachO/alias-symbols.s
index 44a388c35722..231ada5aa4c8 100644
--- a/lld/test/MachO/alias-symbols.s
+++ b/lld/test/MachO/alias-symbols.s
@@ -124,7 +124,9 @@ _strong:
 _weak_1:
   .space 1
 _weak_2:
-  .space 1
+  .long 1
+_weak_3:
+  .long 1
 _dead:
   .space 1
 _pext:

Still no failure from lld/test/MachO/alias-symbols.s. Given that the is_partitioned assertion isn't even firing here, I wonder if trying to modify alias-symbols.s is going to be a lot more complicated than working with the other tests that are failing.

Copy link
Contributor

@alx32 alx32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@carlocab carlocab marked this pull request as draft December 22, 2024 13:25
@carlocab
Copy link
Member Author

carlocab commented Dec 22, 2024

To be sure that we're not in a "works on my machine" situation, I pushed 0db5264e5918221edec3dacd2e8c0c6328769517 to ensure that the is_partitioned assertion triggers in CI. Sure enough, it does:

https://buildkite.com/llvm-project/github-pull-requests/builds/131531#0193ee8b-b917-40e0-a963-6ac1f5e01685
https://buildkite.com/llvm-project/github-pull-requests/builds/131531#0193ee8b-b918-4754-9f43-fc24ea57463b

Going to revert that commit then merge.

@carlocab carlocab force-pushed the fix-transplantSymbolsAtOffset branch from 0db5264 to e77e289 Compare December 22, 2024 13:42
@carlocab carlocab marked this pull request as ready for review December 22, 2024 13:42
@carlocab carlocab merged commit a0f0a69 into llvm:main Dec 22, 2024
13 checks passed
@carlocab carlocab deleted the fix-transplantSymbolsAtOffset branch December 22, 2024 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[lld][MachO] llvm::upper_bound called on not correctly sorted input
4 participants