Move dlopen file operations into native code. NFC #19310

Merged 1 commit on May 11, 2023
6 changes: 0 additions & 6 deletions emcc.py
@@ -2307,12 +2307,6 @@ def phase_linker_setup(options, state, newargs):
settings.SYSCALLS_REQUIRE_FILESYSTEM = 0
settings.JS_LIBRARIES.append((0, 'library_wasmfs.js'))
settings.REQUIRED_EXPORTS += ['_wasmfs_read_file']
if settings.MAIN_MODULE:
# Dynamic library support uses JS API internals, so include it all
# TODO: rewriting more of the dynamic linking support code into wasm could
# avoid this. also, after we remove the old FS, we could write a
# more specific API for wasmfs/dynamic linking integration perhaps
settings.FORCE_FILESYSTEM = 1
if settings.FORCE_FILESYSTEM:
# Add exports for the JS API. Like the old JS FS, WasmFS by default
# includes just what JS parts it actually needs, and FORCE_FILESYSTEM is
6 changes: 4 additions & 2 deletions src/generated_struct_info32.json
@@ -1321,12 +1321,14 @@
"d_type": 18
},
"dso": {
"__size__": 28,
"__size__": 36,
"file_data": 28,
"file_data_size": 32,
"flags": 4,
"mem_addr": 12,
"mem_allocated": 8,
"mem_size": 16,
"name": 28,
"name": 36,
"table_addr": 20,
"table_size": 24
},
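The offset changes above follow from inserting two pointer-sized members ahead of the flexible `name` array. The sketch below is a hypothetical mirror of the struct, not the real definition: the member order is taken from the offsets in the generated JSON, and the first member (`placeholder`) is an assumption standing in for whatever pointer-sized field actually sits at offset 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical mirror of `struct dso`, for illustration only.
struct dso_sketch {
  void* placeholder;      // assumption: one pointer-sized member at offset 0
  int flags;
  bool mem_allocated;
  void* mem_addr;
  size_t mem_size;
  void* table_addr;
  size_t table_size;
  uint8_t* file_data;     // added by this change
  size_t file_data_size;  // added by this change
  char name[];            // flexible array; must stay last
};

// Where `name` started before the two members were added: the slot that
// `file_data` occupies now.
static size_t old_name_offset(void) {
  return offsetof(struct dso_sketch, file_data);
}

// Where `name` starts after the change; also the new struct size.
static size_t new_name_offset(void) {
  return offsetof(struct dso_sketch, name);
}
```

On a typical ILP32 ABI such as wasm32 this layout gives 28 and 36, and with 8-byte pointers 48 and 64, which lines up with the `name` and `__size__` values in the two JSON files.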
6 changes: 4 additions & 2 deletions src/generated_struct_info64.json
@@ -1321,12 +1321,14 @@
"d_type": 18
},
"dso": {
"__size__": 48,
"__size__": 64,
"file_data": 48,
"file_data_size": 56,
"flags": 8,
"mem_addr": 16,
"mem_allocated": 12,
"mem_size": 24,
"name": 48,
"name": 64,
"table_addr": 32,
"table_size": 40
},
37 changes: 14 additions & 23 deletions src/library_dylink.js
@@ -905,10 +905,6 @@ var LibraryDylink = {
// - if flags.loadAsync=true, the loading is performed asynchronously and
// loadDynamicLibrary returns corresponding promise.
//
// - if flags.fs is provided, it is used as FS-like interface to load library data.
// By default, when flags.fs=undefined, native loading capabilities of the
// environment are used.
//
// If a library was already loaded, it is not loaded a second time. However
// flags.global and flags.nodelete are handled every time a load request is made.
// Once a library becomes "global" or "nodelete", it cannot be removed or unloaded.
@@ -960,12 +956,13 @@ var LibraryDylink = {
// libName -> libData
function loadLibData() {
// for wasm, we can use fetch for async, but for fs mode we can only imitate it
if (flags.fs && flags.fs.findObject(libName)) {
var libData = flags.fs.readFile(libName, {encoding: 'binary'});
Comment on lines -963 to -964
Collaborator

This is a regression for us because we've been using the fs argument to loadDynamicLibrary. Do you have a recommended alternative @sbc100?

cc @ryanking13

Collaborator Author

Can you call _dlopen(...) perhaps instead?

loadDynamicLibrary and the other symbols in library_dylink.js are not really part of the public API, or at least I would prefer them not to be considered public.

Collaborator

Not clear. What was the intended purpose of this hook if it's not supposed to be used downstream?

There's a few issues. One is asynchronous compilation: dlopen is synchronous I think, so it has to instantiate the module synchronously, which fails on the main thread for large modules.

A second problem is resolving dynamic library dependencies. If we dlopen a library a.so that depends on b.so, then by default the lookup logic does something we don't want.

Collaborator Author

> Not clear. What was the intended purpose of this hook if it's not supposed to be used downstream?
>
> There's a few issues. One is asynchronous compilation: dlopen is synchronous I think, so it has to instantiate the module synchronously, which fails on the main thread for large modules.

I think we can find a way to work around that. There are 3 different versions of dlopen in emscripten and 2 of them are async (emscripten_dlopen_promise and emscripten_dlopen). They are mostly designed to be called from native code though. Where is your loading code happening? Is it originating in native code?

> A second problem is resolving dynamic library dependencies. If we dlopen a library a.so that depends on b.so, then by default the lookup logic does something we don't want.

Can you say a little more about this? The logic for loading the dependencies of dynamic libraries is all embedded in loadWebAssemblyModule on the JS side, so I would expect them to be the same.

Collaborator Author

> Not clear. What was the intended purpose of this hook if it's not supposed to be used downstream?

No, sadly we don't have much clarity about what is supposed to be internal vs public JS APIs. I wish we did. In this case we used that hook internally, but after this change it was no longer needed. I'd rather not add it back because the new method of loading the data from the filesystem once on the main thread and sharing it via shared memory has a lot of advantages (the alternative is that each thread would need to do FS operations of its own, which themselves get proxied back to the main thread, which adds up to a lot of wasted work).

Collaborator

I think the issue here is that we need to adjust how dependencies are looked up. If we call FS.createPreloadedFile directly, it will do its normal dependency resolution process without any way for us to control it. So we either need a hook or we need a way to work out the dependencies ahead of time and resolve them manually. As far as I can tell getDylinkMetadata is the only way for us to work out dependencies ahead of time.

Collaborator Author

When you say "adjust how dependencies are looked up" do you mean controlling where .so files are found (like LD_LIBRARY_PATH)? Or something else?

Is that because you have DSO with the same name in different locations?

Collaborator

> do you mean controlling where .so files are found (like LD_LIBRARY_PATH)? Or something else?

This may be all we need, but I'm not 100% sure. Presumably there is some correct standard way to do this we could be using?

Contributor

Thanks for the ping. Let me make it a little clearer.

> When you say "adjust how dependencies are looked up" do you mean controlling where .so files are found (like LD_LIBRARY_PATH)? Or something else?

Yes, "adjust how dependencies are looked up" is why Pyodide uses fs.findObject and fs.readFile now.

If a.so depends on b.so, we need a way to locate b.so somewhere in the (virtual) file system. So we want a global search path (like LD_LIBRARY_PATH) that dependencies can be searched from.

Additionally, it would be great to have something like RPATH (pyodide/pyodide#3854). To make Python packages portable, we are trying to bundle shared libraries into Python wheels, so we want a way to search those "local" paths, which is available on Linux with RPATH.

> Is that because you have DSO with the same name in different locations?

It's similar, but a little different.

The reason we modify LDSO.loadedLibsByName is to handle the case where the same library is passed to loadDynamicLibrary once as an absolute path and once as a bare library name.

For example, suppose we have previously called loadDynamicLibrary(/usr/lib/b.so) to load b.so, and we subsequently load a.so, which depends on b.so. We want to know that the b.so that a.so depends on is /usr/lib/b.so, which has already been loaded, so that we don't get a duplicate load.

For now, a.so will call loadDynamicLibrary(b.so) in the process of loading the dependent library, so we need to tell it that this is the same file as /usr/lib/b.so.
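The search-path lookup requested in this comment could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not Emscripten's or Pyodide's implementation: `find_in_search_path` is a hypothetical helper, and it skips empty path elements rather than treating them as the current directory the way some loaders do.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

// Hypothetical helper: search a colon-separated directory list (in the style
// of LD_LIBRARY_PATH) for a regular file called `name`. On success the full
// path is written to `out` and 0 is returned; otherwise -1 is returned.
static int find_in_search_path(const char* search_path, const char* name,
                               char* out, size_t out_size) {
  const char* dir = search_path;
  while (dir && *dir) {
    const char* end = strchr(dir, ':');
    size_t len = end ? (size_t)(end - dir) : strlen(dir);
    if (len > 0) {  // assumption: empty elements are simply skipped
      int n = snprintf(out, out_size, "%.*s/%s", (int)len, dir, name);
      struct stat st;
      if (n > 0 && (size_t)n < out_size && stat(out, &st) == 0 &&
          S_ISREG(st.st_mode)) {
        return 0;
      }
    }
    dir = end ? end + 1 : NULL;
  }
  return -1;
}
```

Keying the table of loaded libraries on the resolved path returned here, rather than on the name the caller passed in, is one way to make `b.so` and `/usr/lib/b.so` count as the same library.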

Contributor (ryanking13, Jun 3, 2023)

> As far as I can tell getDylinkMetadata is the only way for us to work out dependencies ahead of time.

To be more specific, the reason why we are using getDylinkMetadata now in Pyodide is to handle the bug described in #18264.

If a library is loaded locally with the RTLD_LOCAL option, its dependent libraries will also be loaded locally. On Linux, dependent libraries can resolve symbols even if they are loaded locally, but I think there is currently a bug in Emscripten that prevents this.

As a workaround, Pyodide is changing the option to force libraries that are dependencies of other libraries to be loaded globally, using the metadata value returned by getDylinkMetadata.
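The flag adjustment described in this workaround can be expressed as a small pure function. This is only a sketch of the idea: `effective_dlopen_flags` is a hypothetical name, and deciding which libraries count as dependencies (which Pyodide derives from getDylinkMetadata) is left to the caller.

```c
#include <dlfcn.h>

// Hypothetical helper: a library that is loaded only because another library
// depends on it is forced to RTLD_GLOBAL, whatever the caller requested.
// Top-level loads keep their flags unchanged.
static int effective_dlopen_flags(int requested, int is_dependency) {
  if (!is_dependency) {
    return requested;
  }
  // Clear RTLD_LOCAL (which is 0 on some platforms) and set RTLD_GLOBAL,
  // leaving binding flags such as RTLD_NOW or RTLD_LAZY untouched.
  return (requested & ~RTLD_LOCAL) | RTLD_GLOBAL;
}
```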

if (!(libData instanceof Uint8Array)) {
libData = new Uint8Array(libData);
if (handle) {
var data = {{{ makeGetValue('handle', C_STRUCTS.dso.file_data, '*') }}};
var dataSize = {{{ makeGetValue('handle', C_STRUCTS.dso.file_data_size, '*') }}};
if (data && dataSize) {
var libData = HEAP8.slice(data, data + dataSize);
return flags.loadAsync ? Promise.resolve(libData) : libData;
}
return flags.loadAsync ? Promise.resolve(libData) : libData;
}

var libFile = locateFile(libName);
@@ -987,7 +984,7 @@ var LibraryDylink = {
// lookup preloaded cache first
if (preloadedWasm[libName]) {
#if DYLINK_DEBUG
err('using preloaded module for: ' + libName);
dbg('using preloaded module for: ' + libName);
#endif
var libModule = preloadedWasm[libName];
return flags.loadAsync ? Promise.resolve(libModule) : libModule;
@@ -1059,7 +1056,7 @@ var LibraryDylink = {
},

// void* dlopen(const char* filename, int flags);
$dlopenInternal__deps: ['$FS', '$ENV', '$dlSetError', '$PATH'],
$dlopenInternal__deps: ['$ENV', '$dlSetError', '$PATH'],
$dlopenInternal: function(handle, jsflags) {
// void *dlopen(const char *file, int mode);
// http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html
@@ -1079,7 +1076,6 @@ var LibraryDylink = {
global,
nodelete: Boolean(flags & {{{ cDefs.RTLD_NODELETE }}}),
loadAsync: jsflags.loadAsync,
fs: jsflags.fs,
}

if (jsflags.loadAsync) {
@@ -1104,18 +1100,12 @@ var LibraryDylink = {
_dlopen_js: function(handle) {
#if ASYNCIFY
return Asyncify.handleSleep((wakeUp) => {
var jsflags = {
loadAsync: true,
fs: FS, // load libraries from provided filesystem
}
var jsflags = { loadAsync: true }
var promise = dlopenInternal(handle, jsflags);
promise.then(wakeUp).catch(() => wakeUp(0));
});
#else
var jsflags = {
loadAsync: false,
fs: FS, // load libraries from provided filesystem
}
var jsflags = { loadAsync: false }
return dlopenInternal(handle, jsflags);
#endif
},
@@ -1125,8 +1115,8 @@ var LibraryDylink = {
_emscripten_dlopen_js: function(handle, onsuccess, onerror, user_data) {
/** @param {Object=} e */
function errorCallback(e) {
var filename = UTF8ToString({{{ makeGetValue('handle', C_STRUCTS.dso.name, '*') }}});
dlSetError('Could not load dynamic lib: ' + filename + '\n' + e);
var filename = UTF8ToString(handle + {{{ C_STRUCTS.dso.name }}});
dlSetError('Could not load dynamic libX: ' + filename + '\n' + e);
{{{ runtimeKeepalivePop() }}}
callUserCallback(() => {{{ makeDynCall('vpp', 'onerror') }}}(handle, user_data));
}
@@ -1136,7 +1126,8 @@ var LibraryDylink = {
}

{{{ runtimeKeepalivePush() }}}
var promise = dlopenInternal(handle, { loadAsync: true });
var jsflags = { loadAsync: true }
var promise = dlopenInternal(handle, jsflags);
if (promise) {
promise.then(successCallback, errorCallback);
} else {
2 changes: 2 additions & 0 deletions src/struct_info_internal.json
@@ -40,6 +40,8 @@
"mem_size",
"table_addr",
"table_size",
"file_data",
"file_data_size",
"name"
]
}
43 changes: 35 additions & 8 deletions system/lib/libc/dynlink.c
@@ -11,6 +11,7 @@
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <fcntl.h>
#include <pthread.h>
#include <threads.h>
#include <stdarg.h>
@@ -19,6 +20,7 @@
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#include <emscripten/console.h>
#include <emscripten/threading.h>
@@ -81,6 +83,8 @@ static thread_local struct dlevent* thread_local_tail = &main_event;
static pthread_mutex_t write_lock = PTHREAD_MUTEX_INITIALIZER;
static thread_local bool skip_dlsync = false;

static void dlsync();

static void do_write_lock() {
// Once we have the lock we want to avoid automatic code sync as that would
// result in a deadlock.
@@ -155,6 +159,14 @@ static void load_library_done(struct dso* p) {
p->table_addr,
p->table_size);
new_dlevent(p, -1);
#ifdef _REENTRANT
// Block until all other threads have loaded this module.
dlsync();
#endif
if (p->file_data) {
free(p->file_data);
p->file_data_size = 0;
}
}

static struct dso* load_library_start(const char* name, int flags) {
@@ -169,6 +181,29 @@ static struct dso* load_library_start(const char* name, int flags) {
p->flags = flags;
strcpy(p->name, name);

// If the file exists in the filesystem, load it here into linear memory which
// makes the data available to JS, and to other threads. This data gets
// free'd later once all threads have loaded the DSO.
struct stat statbuf;
if (stat(name, &statbuf) == 0 && S_ISREG(statbuf.st_mode)) {
int fd = open(name, O_RDONLY);
if (fd >= 0) {
off_t size = lseek(fd, 0, SEEK_END);
if (size != (off_t)-1) {
lseek(fd, 0, SEEK_SET);
p->file_data = malloc(size);
Member

Should we check for a failing malloc? (I don't think we can depend on p->file_data being null leading to an error in read)

if (p->file_data) {
if (read(fd, p->file_data, size) == size) {
p->file_data_size = size;
} else {
free(p->file_data);
}
}
}
close(fd);
}
}

return p;
}
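The malloc concern raised in the review comment above can be handled with a read helper that never leaves a stale pointer behind. The following is a standalone sketch rather than the function from the patch: `read_whole_file` is a hypothetical name, it sizes the buffer from `stat` instead of `lseek`, and it resets the pointer to NULL on every failure path so that a later `free` cannot run twice on the same allocation.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical standalone helper; not the function from the patch.
// Reads a regular file fully into a malloc'd buffer. Returns NULL (and leaves
// *out_size at 0) if the file is missing, is not a regular file, or any step
// fails, so the caller never sees a freed or partially filled buffer.
static uint8_t* read_whole_file(const char* path, size_t* out_size) {
  *out_size = 0;
  struct stat st;
  if (stat(path, &st) != 0 || !S_ISREG(st.st_mode)) {
    return NULL;
  }
  int fd = open(path, O_RDONLY);
  if (fd < 0) {
    return NULL;
  }
  size_t size = (size_t)st.st_size;
  // malloc(0) may legally return NULL, so always request at least one byte.
  uint8_t* data = malloc(size ? size : 1);
  if (data) {
    size_t done = 0;
    while (done < size) {
      ssize_t n = read(fd, data + done, size - done);
      if (n <= 0) {
        break;  // read error or unexpected end of file
      }
      done += (size_t)n;
    }
    if (done == size) {
      *out_size = size;
    } else {
      free(data);
      data = NULL;  // do not hand back a dangling pointer
    }
  }
  close(fd);
  return data;
}
```

A caller would store the result in `file_data` and the reported size in `file_data_size`, and release the buffer exactly once after every thread has finished loading.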

@@ -424,10 +459,6 @@ static void dlopen_onsuccess(struct dso* dso, void* user_data) {
dso->mem_addr,
dso->mem_size);
load_library_done(dso);
#ifdef _REENTRANT
// Block until all other threads have loaded this module.
dlsync();
#endif
do_write_unlock();
data->onsuccess(data->user_data, dso);
free(data);
@@ -526,10 +557,6 @@ static struct dso* _dlopen(const char* file, int flags) {
}
dbg("dlopen_js: success: %p", p);
load_library_done(p);
#ifdef _REENTRANT
// Block until all other threads have loaded this module.
dlsync();
#endif
end:
dbg("dlopen(%s): done: %p", file, p);
do_write_unlock();
6 changes: 6 additions & 0 deletions system/lib/libc/musl/src/internal/dynlink.h
@@ -29,6 +29,12 @@ struct dso {
void* table_addr;
size_t table_size;

// For DSO load events, where the DSO comes from a file on disk, this
// is a pointer to the file data read in by the loading thread and shared with
// others.
uint8_t* file_data;
size_t file_data_size;

// Flexible array; must be final element of struct
char name[];
};
2 changes: 1 addition & 1 deletion test/other/metadce/test_metadce_hello_dylink.jssize
@@ -1 +1 @@
28151
28007
41 changes: 28 additions & 13 deletions test/test_other.py
@@ -6371,7 +6371,13 @@ def test_RUNTIME_LINKED_LIBS(self):

self.assertBinaryEqual('main.wasm', 'main2.wasm')

def test_ld_library_path(self):
@parameterized({
'': ([],),
'pthread': (['-g', '-pthread', '-Wno-experimental', '-sPROXY_TO_PTHREAD', '-sEXIT_RUNTIME'],),
})
def test_ld_library_path(self, args):
if args:
self.setup_node_pthreads()
create_file('hello1.c', r'''
#include <stdio.h>

@@ -6456,17 +6462,17 @@ def test_ld_library_path(self):
return 0;
}
''')
self.run_process([EMCC, '-o', 'hello1.wasm', 'hello1.c', '-sSIDE_MODULE'])
self.run_process([EMCC, '-o', 'hello2.wasm', 'hello2.c', '-sSIDE_MODULE'])
self.run_process([EMCC, '-o', 'hello3.wasm', 'hello3.c', '-sSIDE_MODULE'])
self.run_process([EMCC, '-o', 'hello4.wasm', 'hello4.c', '-sSIDE_MODULE'])
self.run_process([EMCC, '-o', 'hello1.wasm', 'hello1.c', '-sSIDE_MODULE'] + args)
self.run_process([EMCC, '-o', 'hello2.wasm', 'hello2.c', '-sSIDE_MODULE'] + args)
self.run_process([EMCC, '-o', 'hello3.wasm', 'hello3.c', '-sSIDE_MODULE'] + args)
self.run_process([EMCC, '-o', 'hello4.wasm', 'hello4.c', '-sSIDE_MODULE'] + args)
self.run_process([EMCC, '--profiling-funcs', '-o', 'main.js', 'main.c', '-sMAIN_MODULE=2', '-sINITIAL_MEMORY=32Mb',
'--embed-file', 'hello1.wasm@/lib/libhello1.wasm',
'--embed-file', 'hello2.wasm@/usr/lib/libhello2.wasm',
'--embed-file', 'hello3.wasm@/libhello3.wasm',
'--embed-file', 'hello4.wasm@/usr/local/lib/libhello4.wasm',
'hello1.wasm', 'hello2.wasm', 'hello3.wasm', 'hello4.wasm', '-sNO_AUTOLOAD_DYLIBS',
'--pre-js', 'pre.js'])
'--pre-js', 'pre.js'] + args)
out = self.run_js('main.js')
self.assertContained('Hello1', out)
self.assertContained('Hello2', out)
@@ -13399,7 +13405,13 @@ def test_windows_batch_file_dp0_expansion_bug(self):
create_file('build_with_quotes.bat', f'@"emcc" {test_file("hello_world.c")}')
self.run_process(['build_with_quotes.bat'])

def test_preload_module(self):
@parameterized({
'': ([],),
'pthread': (['-g', '-pthread', '-Wno-experimental', '-sPROXY_TO_PTHREAD', '-sEXIT_RUNTIME'],),
})
def test_preload_module(self, args):
if args:
self.setup_node_pthreads()
# TODO(sbc): This test is copied from test_browser.py. Perhaps find a better way to
# share code between them.
create_file('library.c', r'''
@@ -13408,17 +13420,20 @@ def test_preload_module(self):
return 42;
}
''')
self.run_process([EMCC, 'library.c', '-sSIDE_MODULE', '-o', 'library.so'])
self.run_process([EMCC, 'library.c', '-sSIDE_MODULE', '-o', 'library.so'] + args)
create_file('main.c', r'''
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <emscripten.h>
#include <emscripten/threading.h>
int main() {
int found = EM_ASM_INT(
return preloadedWasm['/library.so'] !== undefined;
);
assert(found);
if (emscripten_is_main_runtime_thread()) {
int found = EM_ASM_INT(
return preloadedWasm['/library.so'] !== undefined;
);
assert(found);
}
void *lib_handle = dlopen("/library.so", RTLD_NOW);
assert(lib_handle);
typedef int (*voidfunc)();
@@ -13429,4 +13444,4 @@ def test_preload_module(self):
return 0;
}
''')
self.do_runf('main.c', 'done\n', emcc_args=['-sMAIN_MODULE=2', '--preload-file', '.@/', '--use-preload-plugins'])
self.do_runf('main.c', 'done\n', emcc_args=['-sMAIN_MODULE=2', '--preload-file', '.@/', '--use-preload-plugins'] + args)