Skip to content

wasm2c: Remove implicit void* conversions in tail calls #2590

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shravanrn
Copy link
Collaborator

Tail call wrappers rely on implicit conversion of void* to the target module's instance types. Modifying this to explicitly cast to the target type in order to play nicely with the -Wc++-compat flag in gcc.

@shravanrn shravanrn requested review from keithw and sbc100 April 24, 2025 04:31
@shravanrn shravanrn force-pushed the no-implicit-casts branch 2 times, most recently from 3206159 to 2d906d8 Compare April 24, 2025 04:34
Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comment

@@ -916,6 +917,7 @@ struct Func {
Index GetLocalIndex(const Var&) const;

std::string name;
std::string ext_module_name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it should be part of the ImportMixin rather then the function itself. Is that not possible?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that the FuncImport information (and thus any info brought by the ImportMixin that is inherited) seems to be lost by the time wasm2c is outputting code in c-writer.cc. So, i've made this change to capture whether functions are from external modules directly on the Func object. Happy to explore any alternatives you have in mind though. Let me know your thoughts/if the above approach is fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would hope the import information is not lost.. it seems kind of important to have that in the IR. But its been a while since I looked at this code.

Copy link
Collaborator Author

@shravanrn shravanrn Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, it does seem like c-writer does not receive this information at all today. From what I can tell, the binary-reader and binary-reader-ir parse wasm files and create instances of objects such as FuncImport (which has both the name of the function and the module in which it exists). It then creates unique names for all function calls (of the form _) and only these unique names are passed into c-writer as objects of type Func. Sincec-writer' only receives Func`, it does not actually receive information about the external module. This was the smallest change I could think of to get things work. Let me know if you'd prefer if I do things differently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since c-writer has access the whole Module IR object I would hope that it can use the imports member to figure out the name. If we are going to do it we should explain what why we are duplicating information that already exists elsewhere in the IR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The c-writer already is using the module_name of imported functions.

See 1779:

    if (import->kind() == ExternalKind::Func) {                                       
      import_func_module_set_.insert(import->module_name);                            
    }     

If this data is not available in the right time and place perhaps we could build an additional data structure here in the C-writer to cache the module names of functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by "only these unique names are passed into c-writer". The c-writer operates on the complete IR Module object, doesn't it?

Copy link
Collaborator Author

@shravanrn shravanrn Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not somehow do dynamic check here and downcast to a FuncImport? I guess we don't have the RTTI info in the IR classes to do a downcast? It looks like we have IsImport which can check if the function is imported.. and then do a cast mabye?

Oh gotcha. You're saying that the func pointer we are using here is a pointer to the member of a FuncImport due to the mixin system used? So i have to figure out the correct way to do the casting equivalent of FuncImport* import = (char*)func - offsetof(FuncImport, func); (the IsImport function will tell me when I need this cast). I am trying to figure out if this is safe to assume in which case I can try this. @sbc100 Let me know if you have thoughts/info here.

I'm not sure what you mean by "only these unique names are passed into c-writer". The c-writer operates on the complete IR Module object, doesn't it?

Oh I was just saying that all we have are datastructres like import_func_module_set_ or funcs_ which have only the function names, not the name of the IR Module. If the above casting approach doesn't work, I can try your suggestion of an additional data structure here in the C-writer to cache the module names of function.

Copy link
Collaborator Author

@shravanrn shravanrn Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ignore my prior comments. You were right that the information is around and all I needed to do was preserve this in a separate data structure. I've made the change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbc100 Let me know if this looks ok. If yes, I'll go ahead and land it

@keithw
Copy link
Member

keithw commented Apr 26, 2025

lgtm but would be good to have a look from @SoniEx2 since I think that's who contributed the original fix here.

@keithw
Copy link
Member

keithw commented Apr 26, 2025

BTW should we add this warning flag to the testing to catch this kind of thing going forward?

@shravanrn
Copy link
Collaborator Author

shravanrn commented Apr 27, 2025

BTW should we add this warning flag to the testing to catch this kind of thing going forward?

@keithw I'd say we can hold off enabling the flag for now. The use case is to run some custom tools that identify better optimization opportunities and potential UB/vulnerabilities (the tools need C code to be compile-able with that flag). Since I am still only in the testing phase, this spot fix should be fine for now.

@SoniEx2 Please have a look when you have a chance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants