-
Notifications
You must be signed in to change notification settings - Fork 951
wasm: do not export malloc, calloc, realloc, free #3142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
I think this will break a significant amount of the TinyGo wasm community, so this should sit at least a few days to deal with timezones etc. |
Can we re-export them with a compiler flag but turn them off by default or other code patch?I think most people shouldn't need them. |
What's one project this would break? Maybe we could add one to our test corpus. How about adding a flag to enable/disable malloc export first, and flip the default to disable after next release? |
Yeah, we can keep them in the form of a |
Do we want to add a file with those build tags in this PR or another one? |
First, I would like to decouple this from me. I don't actually build end wasm, just I respond to many people who are using tinygo to various levels of success. Also, the "malloc" "free" functions have been in use before wasm was a glint in my eye, and they are commonly used in emscripten and other wasm toolchains. Main point is that regardless of my response, these functions are absolutely in use, so how to communicate is very important. I would suggest the following:
my 2p! |
Thanks @codefromthecrypt - in my experience, functions to allocate memory within the GC are essential for being able to pass data in from the host to wasm, so losing them will make code harder to write. I think a short term of adding flags to export the functions to preserve existing code, and a longer term of a library that can be imported to provide implementations reusable among TinyGo programs, could be a good way to make this easier for users. |
#3148 looks like it would remove the safety issues in the built-in "malloc", a primary reason to discourage its use. effectively it changes it from unsafe to safe. So, I suggest we wait on #3148 before making a firm outcome here. If "malloc" becomes safe, I believe and correct me if wrong, the rationale here boils down to not wanting to export built-in functions (even if they are safe). |
ps we created a lib which if used can decouple things. since this is integration tested it should be usable regardless of decision here https://github.com/tetratelabs/tinymem/pull/2 |
Correct, I'd like to remove them (by default at least) even if they are made safe. The reason is the reduction in wasm binary size, which many people will appreciate. |
I have made a small change to this PR, so that it is still possible to export malloc. Example: // #cgo LDFLAGS: --export=malloc --export=free
import "C" This exports the malloc and free functions, for people who need it. Would that be a usable workaround? |
@aykevl Thanks for progressing. So if I understand correctly, users won't be able to export via flags to // #cgo LDFLAGS: --export=malloc --export=free
import "C" Instead of a blank import of something that exports 3rd party version of similar, like import (
_ "github.com/tetratelabs/tinymem"
) The idea being that while this requires a source change, it avoids a 3rd party lib. Avoiding a 3rd party dep will allow code that currently has no dependencies to stay that way without copy/pasta. Do I get it? |
Yes, that sounds correct. (But of course the |
As far as I'm concerned the workaround is helpful, if it is planned to stick around. If you think it will be removed later, than maybe still helpful but should mention that. In any case a "what to do" part in the release notes like written above will help folks. And especially mention the rationale about the 2k savings. I'll do one last stab to put into context the savings. It might make you change your mind about defaulting to have these unavailable vs making it possible to exclude them instead. Let's take this cat program which carefully avoids big packages like fmt. package main
import (
"os"
)
// main runs cat: concatenate and print files.
//
// Note: main becomes WASI's "_start" function.
func main() {
// Start at arg[1] because args[0] is the program name.
for i := 1; i < len(os.Args); i++ {
bytes, err := os.ReadFile(os.Args[i])
if err != nil {
os.Exit(1)
}
// Use write to avoid needing to worry about Windows newlines.
os.Stdout.Write(bytes)
}
} 2KB is less than 1pct for a default build $ tinygo build -o cat.wasm -target=wasi cat.go
$ du -k cat.wasm
260 cat.wasm 2KB is less than 4pct for a trimmed build $ tinygo build -o cat.wasm -scheduler=none --no-debug -target=wasi cat.go
$ du -k cat.wasm
60 cat.wasm |
It's not just that. It was never intended this way, like a private API that accidentally got exported. I want to make the absolute minimum available in an API so that the maintenance burden is smaller.
4% is enormous for a compiler developer. Even a 1% saving is really big to a compiler developer. Most changes to reduce code size are less than that. The thing is that there are many of these little changes and they add up. |
These functions were exported by accident, because the compiler had no way of saying these functions shouldn't be exported. This can be a big code size reduction for small programs. Before: $ tinygo build -o test.wasm -target=wasi -no-debug -scheduler=none ./testdata/alias.go && ls -l test.wasm -rwxrwxr-x 1 ayke ayke 2947 8 sep 13:47 test.wasm After: $ tinygo build -o test.wasm -target=wasi -no-debug -scheduler=none ./testdata/alias.go && ls -l test.wasm -rwxrwxr-x 1 ayke ayke 968 8 sep 13:47 test.wasm This is all because the GC isn't needed anymore. This commit also adds support for using //go:wasm-module to set the module name of an exported function (the default remains env).
This allows people to export some functions, such as malloc. Example: // #cgo LDFLAGS: --export=malloc import "C" This exports the function malloc. Note that this is somewhat unsafe right now, but it is used regardless. By using this workaround, people have some time to transition away from using malloc/free directly or until malloc is made safe to be used in this way.
b651a46
to
28a0836
Compare
I can understand where you are coming from. keep in mind that tinygo is for end users, not its developers. Also, as I mentioned, a normal developer would probably use "fmt" instead of dancing around things, and the default wasm size would be 404KB. I know you are going to remove this now, I'm mainly trying to provide you a perspective you can consider for the release notes which are for end users, some of whom will break, and some of whom may have megabytes or more wasm. |
Using a slice requires a lot less in code size than a map. This is visible when compiling a very small "hello world" style program. Before tracking memory in malloc/free: 2873 bytes With tracking using a map: 6551 bytes With a slice instead of a map: 3532 bytes Of course, most of this code size increase won't be visible with #3142, but it's still a saving of around 3kB in this minimal example.
Absolutely. But end users wouldn't be using TinyGo if it didn't produce small binaries.
I guess there are many different kinds of developers and use cases. I try to make TinyGo useful to as many as possible (which also means making Anyway, I agree this will need some special mention in the release notes so that wasm people will know the next release might break their code. |
Last call for comment for interested WASM/WASI people! I personally agree with the reasons for this PR. The |
@deadprogram and I suppose modifying code is in any case necessary meanwhile, as we have no other way to pass cgo LDFLAGS another way (such as |
@k33g (wasm-university etc) @juntao (wasmedge examples) @knqyf263 (trivy, go-plugin) FYI the tinygo built-in exports for "malloc" and "free" will be removed by default and need a code change to put them back, starting next release. You may need to change docs or code generators when this happens. Here's a synopsis of two ways to do it. Personally, I think the cgo LDFLAG thing is a bit scary for newcomers to look at, so will change wazero's docs to not suggest doing that. Up to you with your respective audiences! |
From what I can tell from this thread and the associated code, the fix for any third party packages that are currently using
The first option would be a "quick fix" if package maintainers need it. The second would be preferable, due to being safer, and also backward compatible. For end-user applications that are using @aykevl can you please confirm if my summary is correct? |
Let's wait a bit longer with this PR since it's a bit more contentious than I expected. @deadprogram yes this sounds correct. |
Using a slice requires a lot less in code size than a map. This is visible when compiling a very small "hello world" style program. Before tracking memory in malloc/free: 2873 bytes With tracking using a map: 6551 bytes With a slice instead of a map: 3532 bytes Of course, most of this code size increase won't be visible with #3142, but it's still a saving of around 3kB in this minimal example.
Is the plan to merge this after the next release? |
I don't have a plan, but yeah we could perhaps do that. |
With 0.27 right around the corner, it seems like this will get cleaned up and merged shortly thereafter. |
@dgryski yes let's do it then. If it gives any issues, there is plenty of time to look for a solution. |
These functions were exported by accident, because the compiler had no way of saying these functions shouldn't be exported. Unfortunately, they can easily be misused and bloat WebAssembly programs.
Not exporting them can be a big code size reduction for small programs. Before:
After:
That's a 2kB reduction in binary size, because the GC isn't needed anymore.
This change also adds support for using
//go:wasm-module
to set the module name of an exported function (the default remains env).I expect this can be a controversial change, because some people have been relying on the exported
malloc
function. While this function does work, it is not intended to be used like that (it's only to be used for C linked using CGo). Instead, memory should be allocated by exporting a specific function for it. For example:This is a safe way to export memory.