Skip to content

Conversation

cjappl
Copy link
Owner

@cjappl cjappl commented Dec 3, 2024

Ok - a lot going on here.

Top level goal

This is all to get the following error to be printed when a plug-in is loaded in a DAW without the interceptors being enabled:

==9891==ERROR: Interceptors are not working. This may be because RealtimeSanitizer is loaded too late (e.g. via dlopen). Please launch the executable with:
DYLD_INSERT_LIBRARIES=/usr/local/lib/libclang_rt.rtsan_osx_dynamic.dylib
"interceptors not installed" && 0
fish: Job 1, '/Applications/REAPER.app/Conten…' terminated by signal SIGABRT (Abort)

This will (ideally) lead people to track down this issue faster.

Problem description

A DAW loading a plug-in is just a host process loading a shared library with the dlopen call. So this pertains to ALL other shared libraries that get loaded in this method.

If you build a shared library instrumented with rtsan, and that library is opened with dlopen and the interceptors are not working - we would ideally like to have a nice error message and abort instead of just silently carrying on. This is what At Salot's issue was in the discord - just silent failure loading his rtsan-enabled plugin in a DAW with no indication of what is going wrong.

The other sanitizers (namely ASan and TSan) have this error and emit it under the same circumstance, you can see evidence of it here for ASan.

This entire fix is to put in InitializePlatformEarly() into __rtsan_init() and make sure it is called in every binary at least once.

WTF does this have to do with clang and LLVM?

OK - we have previously believed that on Mac, the first intercepted function called will always enable RTSan. This is absolutely correct, but what if the interceptors are not installed, as in the case of a shared library?

We need to install into our shared library the initializer in the global namespace. This means creating another pass, a Module pass that calls __rtsan_ensure_initialized() in the shared library.

When this shared library is instrumented, this call is inserted.
When the shared library is opened with dlopen - __rtsan_ensure_initialized() will be called
When that is called, it will call InitializePlatformEarly and abort the process if the interceptors are not working.

These steps were directly copped from TSan and Asan, if you look in the LLVM pass for threadsanitizer, you will see this code exactly. Almost all tests and code have been taken and translated from the TSan version of this same thing.

Other ways to solve this

  • Documentation and education - "if you don't see rtsan working when it is running from within another application do X"
  • ???

Other details

  • One can turn on/off this check with RTSAN_OPTIONS="verify_interceptors=0"
  • The prescribed solution in the error message works as intended - verified by At Salot in discord

@cjappl
Copy link
Owner Author

cjappl commented Dec 3, 2024

@davidtrevelyan I need your thoughts on this - if you agree that we should submit it I am going to break it into at least 2 reviews (possibly 3) and do upstream PRs

Copy link

@davidtrevelyan davidtrevelyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent! Thank you so much for putting this together @cjappl. Two very minor comments about test debug output, and one more interesting comment about test coverage for instrumented binaries.

Great work man!


// Launching a non-instrumented binary that dlopen's an instrumented library should fail.
// RUN: not %run %t %t.so 2>&1 | FileCheck %s --check-prefix=CHECK-FAIL
// Launching a non-instrumented binary with an explicit DYLD_INSERT_LIBRARIES should work.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it also be helpful to have a test for an instrumented binary as well?

Copy link
Owner Author

@cjappl cjappl Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to try to put a test for this in the final version - I tested it locally and it "just works" like normal. I assume just one of the copies of rtsan is chosen over the other one

#else // defined(SHARED_LIB)
int main(int argc, char *argv[]) {
void *handle = dlopen(argv[1], RTLD_NOW);
fprintf(stderr, "handle = %p\n", handle);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I see this is probably helpful for debugging, but it looks like the test doesn't need this print out. Just flagging in case you meant to remove it

void *handle = dlopen(argv[1], RTLD_NOW);
fprintf(stderr, "handle = %p\n", handle);
void (*foo)() = (void (*)())dlsym(handle, "foo");
fprintf(stderr, "foo = %p\n", foo);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Likewise, I think this isn't needed for the test logic - flagging in case you want to delete it

@cjappl cjappl closed this Dec 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants