Skip to content

A way to propagate link-time settings from libraries #14729

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
RReverser opened this issue Jul 22, 2021 · 27 comments
Open

A way to propagate link-time settings from libraries #14729

RReverser opened this issue Jul 22, 2021 · 27 comments

Comments

@RReverser
Copy link
Collaborator

RReverser commented Jul 22, 2021

Certain settings - for example, --bind, -s ASYNCIFY, -s ASYNCIFY_IMPORTS=[...] - are important for successful compilation of the final binary, that is, without those the application won't work.

When a library relies on any of those features, currently there is no way it can accept & store those settings as part of its own build. Instead, it must document the list of options the user has to pass at link-time when building their application for a successful build.

This seems like an unfortunate leak of implementation details, especially so for options like ASYNCIFY_IMPORTS that require listing names of internal helpers used for an async work.

We should find a way to instead accept those options at compile-time of each library, store them in a custom section of the object files, and then read them back and merge with other options in the linker.

cc @sbc100 @kripken

@kripken
Copy link
Member

kripken commented Jul 22, 2021

I think I remember @floooh asking for this, but that might have been a while ago and/or on Twitter... but maybe there is a previous discussion.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 22, 2021

Interesting idea regarding custom sections.

We could perhaps also do something with symbol properties like we do with EMSCRIPTEN_KEEPALIVE?

@RReverser
Copy link
Collaborator Author

for example, --bind, -s ASYNCIFY, -s ASYNCIFY_IMPORTS=[...]

Remembered another one where this would be useful - --js-library.

@RReverser
Copy link
Collaborator Author

We could perhaps also do something with symbol properties like we do with EMSCRIPTEN_KEEPALIVE?

Yeah that would work too. Although I think EMSCRIPTEN_KEEPALIVE is just a macro for standard visibility attribute, whereas anything else would still require custom sections.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 22, 2021

One downside this approach is that emscripten would have to inspect / parse all the object files passed to the linker. As of today we don't do that. We do run llvm-nm on them in some circumstances but I've been trying to stop doing that for a while now.

@RReverser
Copy link
Collaborator Author

One downside this approach is that emscripten would have to inspect / parse all the object files passed to the linker.

That's true, but parsing a single section out of Wasm should be very quick - it's designed for this kind of random access. And the benefits for the user outweigh the tooling costs IMO.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 22, 2021

How about adding a completely new .o files which is more of a linker script? e.g. emar cr libfoo.a code.o linkflags.o where code.o is built by the compiler and linkflags.o is some kind of text file?

@RReverser
Copy link
Collaborator Author

Idk, that kind of split seems unfortunate as it makes integration with existing build systems harder. Splitting it out also feels wrong because those flags are inherently tied to specific object files - if some object file ends up excluded by the linker as unnecessary, corresponding flags are not needed either.

In fact, we already have at least one place where we do this kind of merging & post-processing - EM_JS creates symbols in a custom linking section that are later discovered and postprocesser by Binaryen after linker took care of their merging.

Can't we do something similar for those settings? Those settings are slightly different because they need to be handled by Emscripten itself not Binaryen, but other than that the mechanism could be similar - add options to a section that linker understands, let wasm-ld take care of their collection, and read the results just once from the final object file in Emscripten rather than looking at each input.

@floooh
Copy link
Collaborator

floooh commented Jul 23, 2021

Apologies for drive-by comment, but this sounds more like a build system problem to me? For instance cmake's target_compile_options() propagates any non-PRIVATE compiler options up the dependency tree:

https://cmake.org/cmake/help/latest/command/target_compile_options.html

That's how I usually deal with supressing warnings in depedency header files for instance (versus warnings in implementation files which can be handled with the PRIVATE mode).

There's also a separate target_link_options for options that should only be added to the final linker step ("compile options" will be added both to compile and linker steps).

Linker dependencies behave the same (target_link_libraries()), if a library target requires linking with another library, this dependency is defined down on the library target, and it will propagate up the dependency tree to the executable.

@RReverser
Copy link
Collaborator Author

RReverser commented Jul 23, 2021

@floooh Sure, such build system options are nice if you control the entire stack and rebuild all the dependencies as part of the project. But often enough a library just builds its own .a that is meant to be shared and linked with independent executables, and in that scenario it's useful to have a way to communicate information to the link step.

Think of it like .o depending on undefined symbols - most of those options are in a similar vein.

@floooh
Copy link
Collaborator

floooh commented Jul 23, 2021

Yeah, I actually agree. This is why I like Emscripten features like EM_JS() so much, or MSVC's/clang's #pragma comment(lib, ...), because I can put those into my header-only libraries, and for the library users it "just works" by adding the header file to their build.

But I wonder if this specific feature could have unexpected side effects, especially for people who are used to build systems taking care of this. Also I guess those options wouldn't show up on the linker command line, which might make debugging build problems more difficult.

In any case, I don't have a "strong opinion" here :)

@RReverser
Copy link
Collaborator Author

But I wonder if this specific feature could have unexpected side effects, especially for people who are used to build systems taking care of this.

That's a good point. I don't think these particular options would have a problem with that - e.g. you can list same function twice in ASYNCIFY_IMPORTS, it's just a set, or you can pass --bind twice, it's just a flag - but it's true we might need to be careful.

In general, I feel that those being link-time option is more of a current limitation / implementation detail, which is why moving them to library compile-time seems more future-proof. For example, I can totally imagine Asyncify transform becoming smarter in the future and transforming functions within each object file in parallel, and then just doing final call graph transforms during the link phase.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 23, 2021

Assuming you are talking about embedded these options in the source code somehow (e.g #pragma) then yes I agree this sounds reasonable.

What I would find odd is actually passing linker flags to the compiler at compile time when running -c. This would be awkward for maybe build systems where specifying specific C flags for just one file is often quite tricky. Also, I would like to continue to have emcc ignore link flags when in compile-only mode.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 23, 2021

i.e. lets follow the EM_JS / #pragma comment(lib, ...) model and not try to make -lfoo into a compile-time flag.

@RReverser
Copy link
Collaborator Author

What I would find odd is actually passing linker flags to the compiler at compile time when running -c.

What I mean by the last comment is that these specific options don't really have to be link-time flags (or at least not just link-time flags), because the transforms they do can be actually done at compile-time in future as an optimisation.

So I still think it makes sense to allow them as compile-time flags as well, except for now all they would do is internally store information for the linker.

I'm not too opposed to the pragma idea, it just feels less natural for those flags.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 23, 2021

What I would find odd is actually passing linker flags to the compiler at compile time when running -c.

What I mean by the last comment is that these specific options don't really have to be link-time flags (or at least not just link-time flags), because the transforms they do can be actually done at compile-time in future as an optimisation.

So I still think it makes sense to allow them as compile-time flags as well, except for now all they would do is internally store information for the linker.

I'm not too opposed to the pragma idea, it just feels less natural for those flags.

Which flags are you referring too? Marking certain things as part of ASYNCIFY_IMPORTS certainly seems more natural to be done as source code annotations, no?

@RReverser
Copy link
Collaborator Author

Marking certain things as part of ASYNCIFY_IMPORTS certainly seems more natural to be done as source code annotations, no?

Oh yeah, absolutely. I'm talking about ASYNCIFY / --bind flags themselves.

@RReverser
Copy link
Collaborator Author

RReverser commented Jul 23, 2021

Although... Maybe we could just imply ASYNCIFY if we found a function annotated with ASYNCIFY_IMPORTS? That could work too.

And same for --bind via EMSCRIPTEN_BINDINGS.

@RReverser
Copy link
Collaborator Author

This would also help use-cases like #15746.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 21, 2022

So I think we can auto-enabled -sASYNCIFY automatically, after linking, based on imports.... and I'm working on that now. That will solve part of this issue.

For embind, its a little harder (because part of embind is a native library which needs to part of the link).

Do you think its reasonable in your use case to set LDFLAGS=-lembind when building? Unlike when ASYNCIFY is missing it should be very obvious if this library is not included because it will result in undefined symbols at link time.

This, I believe is no different to any other native library that has transitive dependencies. For example if -lfoo depends on zlib and its a normal .a archive (as opposed to a .so) then you will need to manually add -lz everywhere you add -lfoo. This is a problem in the native linking world of course. libtool as tool that tries to work around this by using .la files, which are kind of like .a files with transitive dependency info. But libtool, like autoconf, does not have many fans these days.

@RReverser
Copy link
Collaborator Author

While a bit unfortunate, I think that's a reasonable middle ground, I agree Embind is more of a library in that regard.

Perhaps we could have a separate conversation about providing a way to link Emscripten-provided --js-librarys statically outside of this issue, which would cover both Embind and other usecases.

Unlike when ASYNCIFY is missing it should be very obvious if this library is not included because it will result in undefined symbols at link time.

As for this - I haven't actually checked, does this apply only to imports that depend on Embind (emscripten::val class) or to C++ files that just export something via EMSCRIPTEN_BINDINGS too?

@sbc100
Copy link
Collaborator

sbc100 commented Jan 22, 2022

Unlike when ASYNCIFY is missing it should be very obvious if this library is not included because it will result in undefined symbols at link time.

As for this - I haven't actually checked, does this apply only to imports that depend on Embind (emscripten::val class) or to C++ files that just export something via EMSCRIPTEN_BINDINGS too?

Any code that depends on symbols in bind.cpp:

https://github.com/emscripten-core/emscripten/blob/main/system/lib/embind/bind.cpp

I imagine that includes both code that uses emscripten::val and code that uses EMSCRIPTEN_BINDINGS?

@RReverser
Copy link
Collaborator Author

I'm not actually sure what EMSCRIPTEN_BINDINGS itself does under the hood, but... I guess? That's why I said I didn't check :)

@sbc100
Copy link
Collaborator

sbc100 commented Jan 22, 2022

Hmm, turns out all that EMSCRIPTEN_BINDINGS does it declare a static constructor with the contents of the block!

Which means the block of code will run on startup.. if the object gets included.

@RReverser
Copy link
Collaborator Author

Ah, but that code should still be dependent on imports from -lembind so I guess your earlier argument about undefined symbols still stands.

@RReverser
Copy link
Collaborator Author

RReverser commented Jan 22, 2022

Although it would be good to check what the names of those imports are / what the error looks like and whether the user can understand why they're getting it & what to do next.

sbc100 added a commit that referenced this issue Sep 14, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 4, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 4, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 4, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 4, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 4, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 5, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 5, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 7, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 7, 2022
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
sbc100 added a commit that referenced this issue Oct 7, 2022
This change introduces a new `EM_JS_DEPS` macros that can be used to
specific the JS library dependencies of user code in `EM_JS` and/or `EM_ASM`
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
@mackron
Copy link

mackron commented Aug 5, 2023

So I think we can auto-enabled -sASYNCIFY automatically, after linking, based on imports.... and I'm working on that now.

Did this feature ever end up making it in? In my case I have a header-only library which depends on the program linking with -sASYNCIFY, but the thing is the library is designed to be added directly to the user's source tree, and it's rather annoying having build requirements be pushed onto each and every one of my users. But what's particularly annoying in my specific case is that it's an audio library which was previously using ScriptProcessorNode, and I now want to transition to AudioWorklets, but as soon as I make the switch all of my users will have broken builds due to new linking requirements (-sAUDIO_WORKLET=1 -sWASM_WORKERS=1 -sASYNCIFY). It'd be great if these new options could be isolated within my library as implementation details thereby allowing my users to upgrade without a broken build (and also because it just makes build options cleaner and simpler in general).

sbc100 added a commit that referenced this issue Jan 3, 2025
This change introduces a new EM_JS_DEPS macros that can be used to
specific the JS library dependencies of user code in EM_JS and/or EM_ASM
blocks.

This is especially important for library authors who don't want to have
their users maintain link-time list of required symbols.

See #14729
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants