WebAssembly and Exception Handling (throw) #23442
For running clang-repl in the browser, this is the workflow taken:
Pasting them down below just for reference i) LLVM IR (only relevant part)
ii) wasm module
I think this looks correct to me ! |
Now coming back to the dlopen step. The debugger in Chrome DevTools tells me that this is the last place where it ends up: emscripten/src/library_dylink.js, lines 856 to 859 in 4c14f1f
Which means it is trying to execute this block I'd guess
But it isn't able to. Now __wasm_call_ctors calls _GLOBAL__sub_I_incr_module_5, which simply calls __stmts__0. So I am guessing it's just not able to run __stmts__0, but I think even that is being framed correctly? |
Here's what I thought might be going wrong.
I think we have everything
If y'all are interested in the configuration, this is what I used.
Apart from adding the - |
Is there still an issue here? You are correct that you need to make sure that From your original backtrace it looks like the code is trying to load a DLL called "const char*", which is very odd. Can you stop in the debugger and see why that might be? Is the name of the file being loaded really "const char*"? (BTW, when you file bugs like this it would be very helpful if you could copy and paste the text rather than attaching screenshots. Using text makes it much easier for us to search / copy / etc. within the issue.) |
Yes it is.
So I think we tried building the whole stack with
Is it? So when we use clang-repl in the browser, every code block produces a file named I think it might just be the exception ptr type or something (not sure). The issue below looks relevant here. EDIT: Also, just questioning my breakdown here. The wasm module generated looks correct to me, and I think it is the init() call I referred to above that doesn't work! Maybe someone could confirm that for me? |
I think the fact that Can you break at that callsite and see the string ptr value being passed to Can you try building your side modules with |
Hey @sbc100, sorry it took some time to get back. This is the whole log when we try executing "throw 1;". It points to the addModule function, as expected, where dlopen is being called: https://github.com/llvm/llvm-project/blob/main/clang/lib/Interpreter/Wasm.cpp#L65 |
Also, I see absolutely no difference in how dlopen works for any cell that works vs the cell executing At this point there is so much we can already do (check the example notebook https://github.com/compiler-research/xeus-cpp/blob/main/notebooks/xeus-cpp-lite-demo.ipynb) that not being able to use |
I stepped through everything up to the final failure, which comes up here: emscripten/src/lib/libdylink.js, lines 857 to 860 in f9ca632
As soon as the debugger hits Nothing really seems fishy till the end. I see memory_base is 0 here (not sure if that shouldn't be the case, looks ok to me) |
What is memorySize? memoryBase should only be zero if memorySize is also zero. The memoryBase and tableBase are where the data segment and table segment for your DLL are stored. They will only be zero when your module has no memory segment or no table segment of its own. You can see how much data and table space your module needs by looking at the
Here you can see that the hello world program, when compiled to a DLL, requires 15 bytes of memory and zero table slots. |
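The rule above (memoryBase is zero only when memorySize is zero) can be sketched in C++ as follows. This is a hypothetical model for illustration, not the actual libdylink.js code: `DylinkMetadata`, `BumpAllocator`, and `chooseMemoryBase` are made-up names, the starting address 1024 is arbitrary, and storing the alignment as a power-of-two exponent is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the sizes a side module declares in its dylink metadata.
struct DylinkMetadata {
    std::uint32_t memorySize;   // bytes of static data the module needs
    std::uint32_t memoryAlign;  // alignment as a power-of-two exponent (assumed)
    std::uint32_t tableSize;    // number of function-table slots the module needs
};

// Toy bump allocator standing in for the loader's real memory allocation.
struct BumpAllocator {
    std::uint32_t next = 1024;  // arbitrary nonzero start, so 0 can mean "no segment"

    std::uint32_t allocate(std::uint32_t size, std::uint32_t align) {
        assert(align != 0 && (align & (align - 1)) == 0);  // must be a power of two
        std::uint32_t base = (next + align - 1) & ~(align - 1);  // round up to align
        next = base + size;
        return base;
    }
};

// A module with no data of its own gets base 0; otherwise it gets a real address.
std::uint32_t chooseMemoryBase(const DylinkMetadata& md, BumpAllocator& heap) {
    if (md.memorySize == 0) return 0;
    return heap.allocate(md.memorySize, 1u << md.memoryAlign);
}
```

With this model, a module that reports a nonzero memorySize but still sees a memoryBase of 0 would point at a problem in the metadata or the loader rather than in the module's code.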
This is what metadata has after
Binary shows Also I see you've mentioned about using So the max I can do is go through the incr_module_xx.wasm file which comes out of to be something like this
P.S: Also just for you to confirm for yourself that the error is coming out of dlopen itself (and also to maybe play around and debug any questions you might have) I think you can try running throw 1; through our static link and add breakpoints in the source files to see what's happening https://compiler-research.org/xeus-cpp/lab/index.html |
So BTW you can run Can you set a breakpoint on the "Error loading dynamic library" line and inspect the exception (e) that is being thrown? What does the stack trace for that exception look like? |
This is what I see for This seems to have a data segment at the end. So I don't think this is the case for every cell. It's just |
Okay so I set the debugger below and print
I see this
I think I have pointed this out before that eventually we end up at Well what does that mean ? I guess the logic is correct as we would have liked it to be ? But then ..... throw is called and hence the dlopen step errors out I am guessing ! Confused as to what needs to be done here. Does this mean we can't load a DLL/module having a call to __cxa_throw using dlopen ? For some context, |
So one of your static constructors is throwing an exception. Are you actually trying to execute a C++ throw in your notebook? If so, wouldn't you expect the DLL to fail to load? Or is the problem that you want to somehow catch that exception yourself instead of having dlopen fail? |
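To illustrate the point, here is a minimal sketch of a loader that runs a module's static constructors and reports a load failure if one of them throws. It only models the behaviour being discussed; `Module` and `loadModule` are hypothetical names, and this is not Emscripten's dlopen implementation.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a side module: a name plus its static constructors.
struct Module {
    std::string name;
    std::vector<std::function<void()>> ctors;
};

// Runs every constructor in order. If any of them throws, the load is reported
// as failed and `error` receives a description; otherwise `error` is untouched.
bool loadModule(const Module& m, std::string& error) {
    try {
        for (const auto& ctor : m.ctors) ctor();
    } catch (...) {
        error = "Error loading " + m.name + ": exception thrown during initialization";
        return false;
    }
    return true;
}
```

In this model, a cell whose top-level statement is `throw 1;` behaves like a constructor that throws, so the failure surfaces at load time rather than as a normal runtime error.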
Hey @sbc100 Sorry for missing out on this but yeah the point is I should be able to exactly replicate what clang-repl does locally (or what xeus-cpp is doing here based on clang-repl) So my point is that I won't expect an error out of dlopen or the module failing to load. Rather the module should be loaded on top of the main module and that should then give back any Error message or whatever we print through the console . Does this mean the wasm being generated is wrong ? Cause the wasm binary calls |
Are you saying that in this case the |
Okay @sbc100 I think this might be a new/separate problem at hand. But let's look into this Case 1: We have Case 2: But for a throw catch block
I still see the same happening i) Now even in this case we obviously first parse and come up with a LLVM IR. Looking at the LLVM IR generated ..... I do see
ii) After this step we end up generating the wasm module which is obviously wrong
|
My understanding related to a catch block is that we definitely should end up seeing a Now my point is how is llvm IR generated through clang and clang-repl turning out to be different ? For example if I put this in test.cpp and run the following
I see most of the important stuff being put to use
|
This is weird as running clang-repl in the browser or locally, the PTU generation step is the same, it's only the execution step that differs So technically we shouldn't be seeing a wrong LLVM IR leading to a wrong wasm module ! EDIT: I have a question.
Is this possibly dependent on how we build LLVM (maybe with some sort of exceptions enabled or disabled). This is what I use to build llvm for wasm currently.
All of this is present as a part of the recipe for llvm on emscripten-forge (https://github.com/emscripten-forge/recipes/blob/main/recipes/recipes_emscripten/llvm/build.sh) |
Hey @sbc100 Not sure you saw my ping above, hence tagging you to maybe help me out with the last 2-3 messages continuing our discussion after #23442 (comment) |
What are the build flags you are using when building the side module in clang-repl? They must be somehow different from those used in emscripten. I'm guessing you are missing You can add |
So these are the flags used for the side module (each cell gives us one that is loaded on top of the main module)
Wait, so for the latest build. I took care of this (basically just updated the cxx_flags to take care of And I still see this So the current CXX_FLAGS being put to use are these
So yeah we know the roadmap here (code -> PTU -> llvm IR -> incr_module_xx.so file -> incr_module_xx.wasm file -> loaded on top of main module using dlopen) My first concern is clang-repl and clang technically promise making use of the same llvm IR. I don't know why we don't see the correct LLVM IR (even before getting to the shared object in clang-repl) for this case. |
Those are the link flags. What are the compile-time flags used to build the object file being linked? |
Hey @sbc100, yes I think stuff boils down to that
But not sure how to get hold of them :\ But everything happens in the addModule code What I think happens is
i) We create a Target (extracted from the target triple from our module)
iv) But that being said, I know of an error like this
v) So although
I am not sure how but I see the 3rd parameter of createTargetMachine allows us to pass some features So we currently pass nothing here. So maybe we can update the code in AddModule to have this
Not sure it is this way you make use of these flags. I just know we can use Does it make sense to see it this way ? I am guessing only the |
Apart from this, I am not sure whether how we build llvm (or the cxx_flags we pass there) makes a difference. I have been using this to build
P.S: not sure passing Let me know if you think some changes need to be introduced here. Apart from this yeah, I need some help to look into the llvm IR going to the shared object step. |
The issue here (IIUC) is not how you build llvm, but how llvm is building the side module. |
I'm not sure how you are supposed to inject flags into the |
I think @aheejin can help us out here cause I see some of his commits relevant to work on Hey @aheejin we would appreciate some help here. The following is what we are trying to do.
If you see we have this check which we want to go across
Hence in this case, I want to use But the point is I don't see how we pass this flag or make use of it through our TargetMachine.
And technically we aren't sure as to how/where we inject the Possibly we would also like to pass |
I'm not sure which EH method are you trying to use. Your first code was apparently using Emscripten EH (which was using And regardless of the EH method you are trying to use, I think you are looking at the wrong part of the code. I'm not familiar with how clang-repl generates code, but the code you pasted looks largely from this function and callees from this function: https://github.com/llvm/llvm-project/blob/5f8da7e7738f043dbde447e48622e9b2afb5ba92/clang/lib/CodeGen/BackendUtil.cpp#L1250-L1274 But this backend pipeline invocation happens after the initial code, namely |
Hey @aheejin Sorry for not getting back earlier. Maybe let me give you some context and tell you where I stand right now so that you can help me out
Check this link out and maybe play around to see how it works https://compiler-research.org/xeus-cpp/lab/index.html
As can be seen we intend to use simple c++ exceptions through (
The following is the LLVM IR generated
So I hope that what is expected to happen when we use simple C++ exceptions is happening here, and that there is nothing wrong with the LLVM IR
Basically the "catch" part of the code is missing completely. So what @sbc100 and I think is that, although we are able to generate correct LLVM IR, we are going wrong somewhere in the step from LLVM IR to the shared object. This is where it happens: https://github.com/llvm/llvm-project/blob/main/clang/lib/Interpreter/Wasm.cpp#L68-L90 And I am not sure how we can update the |
So technically answering your questions
Yes, if we have a choice, I was looking to take the simplest route possible. So I guess we can stick to the simplest C++ exceptions (which I guess is the default, and we move to wasm EH if we use
Because I notice that .... if I pass -fwasm-exceptions in the clang-repl cc1 command and I also add the following in the code above in wasm.cpp (and build llvm)
I notice that the correct wasm is generated and has inspiration from So yeah, I was playing around with this. In this case, although the correct wasm is generated, the dlopen step fails when we try to load this on top of the main module, because I think for some reason the main module doesn't have Basically what I am trying to convey is that I would probably like to make use of the emscripten EH, but if I am struggling there maybe wasm EH might help me out? That being said, if we stick to the emscripten EH, I guess we need to make use of an ExceptionModel anyway. I guess |
The developments while trying wasm EH are mentioned here. I had to open an issue for the same. Check #23731 |
Naah, I tried this. I thought we might just be able to set the exception model and get past this, but that is not the case. Just adding something like So yeah, still at the starting point. Need to figure out how to correctly convert the LLVM IR (which is absolutely perfect) to an appropriate shared object! |
Hey @sbc100 @aheejin , I was able to get a working fix here :) But might need some help to make my solution more concrete. So what I realized is that
And now I see the correct wasm module being generated. So I have stuff related to EH working perfectly !! So technically I see that I need to enable this option How do I pass the same flag/enable this option here Where we go from the LLVM IR to the wasm module ? Once I know that, I can raise a PR to LLVM ! |
Ah Ok, so the And you don't need to turn off So to sum up the option you need to pass to the backend is Lines 79 to 80 in 1cebdf5
I'm not 100% sure where you can add these options, but I don't think that should be when creating a |
Hey @aheejin, thanks a lot for the detailed explanation. Let's discuss your approach below
I think most of it should be inspired by clang itself. I see the following cc1 command. You can also see it through the console on the link running clang-repl in the browser through jupyterlite
Hmmm, I don't think this worked.
code -> PTU -> llvm ir -> incr_module_xx.so (where xx stands for code block) -> incr_module_xx.wasm -> loaded on top of the main module using dlopen So here the llvm ir step is correct but the shared object generated might not be going as expected.
So the current workflow would be like this AddModule -> addPassesToEmitFile -> addPassesToGenerateCode -> addISelPasses & addIRPasses I am guessing it is somewhere between this toolchain we need to switch on |
Playing around with cl options is like 1 way to get this (not sure how concrete) Introducing this diff
I see the correct wasm being generated for an example I tried So yeah the llvm IR is correct, its while we generate the wasm we need these flags I suppose |
These are the flags given to So
No those
Yes that looks like the culprit, and I'm saying you need |
Yeah this basically does what I said, which is, enabling Also the code very much looks like hardcoding and sets the EH options always true, and I don't think that diff is something we should commit to the main llvm repo. |
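One way to avoid hardcoding the EH options to always-on is to derive them from what the user actually requested. The sketch below is stdlib-only and hypothetical: `EHConfig` and `backendOptionsFor` are made-up names, and the only option strings used are the two that appear in the cc1 command quoted in this thread. How the resulting list reaches the backend is deliberately left out.

```cpp
#include <string>
#include <vector>

// Hypothetical description of which Emscripten EH features were requested.
struct EHConfig {
    bool emscriptenExceptions = false;  // Emscripten-style C++ exceptions
    bool emscriptenSjLj = false;        // Emscripten-style setjmp/longjmp
};

// Returns only the backend options that match the request, so nothing is
// enabled unconditionally.
std::vector<std::string> backendOptionsFor(const EHConfig& cfg) {
    std::vector<std::string> opts;
    if (cfg.emscriptenExceptions) opts.push_back("-enable-emscripten-cxx-exceptions");
    if (cfg.emscriptenSjLj) opts.push_back("-enable-emscripten-sjlj");
    return opts;
}
```

A default-constructed `EHConfig` yields an empty list, which matches the behaviour of a build that did not ask for exception support.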
Yes this is exactly what happens. Through clang-repl too we need to get to the LLVM-IR and there is no difference in how we get there through clang-repl or clang. This is what the docs say
|
IMP: That being said for running clang-repl in the browser, we don't use the LLVM Jit approach. So yeah if you see this code, everything remains same till the execute step ( called through ParseAndExecute)
What we do here is this code -> PTU -> llvm ir -> incr_module_xx.so (where xx stands for cell/code-block number) -> incr_module_xx.wasm -> loaded on top of the main module using dlopen |
Those would be absolutely the same flags + these one when using emscripten https://github.com/llvm/llvm-project/blob/main/clang/lib/Interpreter/Interpreter.cpp#L200-L204
Yeah I thought the above would do the job (actually my first thought was having
-cc1 -triple wasm32-unknown-emscripten -emit-obj -disable-free -clear-ast-before-backend -main-file-name "<<< inputs >>>" -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -debugger-tuning=gdb -fdebug-compilation-dir=/ -v -fcoverage-compilation-dir=/ -resource-dir /lib/clang/19 -internal-isystem /include/wasm32-emscripten/c++/v1 -internal-isystem /include/c++/v1 -internal-isystem /lib/clang/19/include -internal-isystem /include/wasm32-emscripten -internal-isystem /include -std=c++20 -fdeprecated-macro -ferror-limit 19 -fvisibility=default -fgnuc-version=4.2.1 -fno-implicit-modules -fskip-odr-check-in-gmf -fcxx-exceptions -fexceptions -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -fincremental-extensions -o "<<< inputs >>>.o" -x c++ "<<< inputs >>>"
But that didn't work. The wasm generated wasn't correct |
Hey @aheejin, find a video below where I try running clang-repl in the browser (passing those flags to clang cc1) through a toy project of my own (Screen.Recording.2025-02-27.at.7.32.49.AM.mp4). This fails because the wasm isn't correct here
|
I'm not familiar with clang-repl or https://github.com/llvm/llvm-project/blob/main/clang/lib/Interpreter/Wasm.cpp, which looks like something you and a few other people added in recent months: https://github.com/llvm/llvm-project/commits/main/clang/lib/Interpreter/Wasm.cpp
I really don't understand what this means or the clang-repl workflow diagram, and I'm not sure whether this is relevant to the current problem. I also visited the Jupyter Notebook page you linked but I'm not sure how to proceed after seeing this image: I can only repeat what I said already: In general
(gdb) bt
#0 clang::driver::toolchains::WebAssembly::addClangTargetOptions (this=0x55555562f730,
DriverArgs=..., CC1Args=llvm::SmallVector of Size 14, Capacity 16 = {...})
at /usr/local/google/home/aheejin/llvm-git/clang/lib/Driver/ToolChains/WebAssembly.cpp:290
#1 0x00007ffff12d77b2 in clang::driver::tools::Clang::ConstructJob (this=0x555555630f30,
C=..., JA=..., Output=..., Inputs=llvm::SmallVector of Size 1, Capacity 4 = {...},
Args=..., LinkingOutput=0x0)
at /usr/local/google/home/aheejin/llvm-git/clang/lib/Driver/ToolChains/Clang.cpp:6163
#2 0x00007ffff11cdb28 in clang::driver::Driver::BuildJobsForActionNoCache (
this=0x7fffffff76e0, C=..., A=0x555555631170, TC=0x55555562f730, BoundArch="",
AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0,
CachedResults=std::map with 1 element = {...},
TargetDeviceOffloadKind=clang::driver::Action::OFK_None)
at /usr/local/google/home/aheejin/llvm-git/clang/lib/Driver/Driver.cpp:6049
#3 0x00007ffff11cbfe9 in clang::driver::Driver::BuildJobsForAction (this=0x7fffffff76e0,
C=..., A=0x555555631170, TC=0x55555562f730, BoundArch="", AtTopLevel=true,
MultipleArchs=false, LinkingOutput=0x0, CachedResults=std::map with 1 element = {...},
TargetDeviceOffloadKind=clang::driver::Action::OFK_None)
at /usr/local/google/home/aheejin/llvm-git/clang/lib/Driver/Driver.cpp:5736
#4 0x00007ffff11c25a0 in clang::driver::Driver::BuildJobs (this=0x7fffffff76e0, C=...)
at /usr/local/google/home/aheejin/llvm-git/clang/lib/Driver/Driver.cpp:5262
#5 0x00007ffff11bcc2a in clang::driver::Driver::BuildCompilation (this=0x7fffffff76e0,
ArgList=llvm::ArrayRef of length 22 = {...})
at /usr/local/google/home/aheejin/llvm-git/clang/lib/Driver/Driver.cpp:1838
#6 0x000055555558ef95 in clang_main (Argc=22, Argv=0x7fffffffc548, ToolContext=...)
at /usr/local/google/home/aheejin/llvm-git/clang/tools/driver/driver.cpp:372
#7 0x00005555555c1445 in main (argc=22, argv=0x7fffffffc548)
at /usr/local/google/home/aheejin/llvm-git/build.debug/tools/clang/tools/driver/clang-driver.cpp:17
What I am asking are:
The reason I mentioned https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/Targets/WebAssembly.cpp and pasted the backtrace is in case you need to mimic the process of converting/forwarding the arguments that's done in |
Click on the kernel you want to use. That is c++20 in this case ! |
Hey @aheejin, sorry for taking so long to get back on this. But I think this is what is happening. So clang-repl would follow the exact stack trace that you shared above. This is what I see
i) in the cc1_main command we end up calling ExecuteCompilerInvocation
i) As you said above everything starts with BuildCompilation which does the addClangTargetOptions step ..... followed by creating the CompilerInstance
iv) Now, just like clang normally would do, we need a call to That looks like the missing key here. So adding this diff, taken from what clang does, is enough to do the job for me
And now I can execute any try-catch block through clang-repl in the browser. |
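For context, a command-line style option parser usually expects an argv-shaped array: a program name first, then the options, then a null terminator. The sketch below shows how a list of `-mllvm` values could be repacked into that shape. It is a stdlib-only illustration: `buildBackendArgv` and the placeholder program name are hypothetical, and the call that consumes the array is left out. My understanding is that clang's cc1 path hands such an array to `llvm::cl::ParseCommandLineOptions`, but treat that as an assumption about the call elided above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Packs option strings into an argv-style array: a placeholder program name,
// then each option, then a terminating nullptr.
// The returned pointers refer into `args`, so `args` must outlive the result.
std::vector<const char*> buildBackendArgv(const std::vector<std::string>& args) {
    std::vector<const char*> argv;
    argv.reserve(args.size() + 2);
    argv.push_back("clang-repl (LLVM option parsing)");  // hypothetical name
    for (const std::string& a : args) argv.push_back(a.c_str());
    argv.push_back(nullptr);
    assert(argv.size() == args.size() + 2);  // name + options + terminator
    return argv;
}
```

The argument count passed to a parser would then be `argv.size() - 1`, since the trailing nullptr is not counted.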
Actually now that I say this. I would like to know your views on this.
Clang-repl uses
whereas
So I guess we could have a |
I'm not very familiar with the implementation of the interpreter part, so I'm not sure if I'm qualified to review the PR, but it looks that is the right direction. |
Hey all,
After getting clang-repl running in the browser, I worked on integrating it with jupyterlite. Xeus-cpp, a C++ Jupyter kernel, provides a way to integrate it. Here is a static link that can be used to try C++ completely in the browser (https://compiler-research.org/xeus-cpp/lab/index.html). An example notebook, xeus-cpp-lite-demo.ipynb, has been provided to show what can be achieved.
Coming back to the issue: I see we can't run throw (or a try-catch block involving throw) while running clang-repl in the browser. The debug logs tell me that this comes from dlopen
All this can be tried through the static link above.