Module idea: WASI logging #402
Could you describe your goals here in more detail? Are you looking to explore whether WASI will be compatible with your goals in the long term, are you building a program or tool and looking to optimize how something runs on the Web in the short term, both, or something else?
I am primarily interested in replacing the custom set of non-standard imports AssemblyScript has, like |
Logging in general seems straightforward to consider, and separating out functionality like this into modules is something we're already working on.
I have concerns about WTF-16 string support though. As I mentioned elsewhere, interface types currently look like the most likely answer to how to interchange strings in WebAssembly, so that's what we're preparing for in WASI. Using UTF-8 for now aligns with interface-types' canonical representation, so it's the closest approximation to interface types that we can get for now. And avoiding WTF encodings means that we won't need to worry about pieces of the ecosystem coming to depend on interchanging ill-formed data, causing compatibility problems when we start migrating to interface-types strings.
Would it work for your use case if we defined a logging API that only accepted UTF-8 strings for now? I recognize it'd have some overhead for your use case, but we'd plan to address that by migrating the API to interface types as soon as they become available.
Concerning the GC requirement, for the case of passing a string literal to a logging function, would it be feasible for the compiler to recognize this case, and convert the literal into UTF-8 at compile time? Alternatively, is there a way in AS to do an "unsafe delete"? A logging API could guarantee to not let the pointer you pass it escape, so you could create a string, pass it to the log API, and then "unsafe delete" it afterwards, so it wouldn't need a full GC.
I guess there is more I can do, yeah, like resorting to malloc and free essentially for intermediate UTF-8 garbage, but that'll still trigger inclusion of the dynamic memory manager, which is one large dependency of GC. Doesn't really matter much anymore once the MM is included, I think.
Hmm, not sure. As far as I can tell, imposing UTF-8 on languages using a different native encoding is causing most of the problem. What do you think of adding both a let's say |
Imagine being a C# developer and not understanding that QWASI supports their string encoding type. 😂
For the purposes of a logging interface, is the cost of reading a UTF-8 string from an ArrayBuffer in JS really that different from reading a WTF-16 string from an ArrayBuffer? Either way the JS string has to be created dynamically, right? Is the additional UTF-8 to WTF-16 translation while reading from the array really that slow? (Honest question, I have not measured it.) Either way, logging interfaces should probably assume they could be writing to a filesystem (which they likely will be in many cases), which is generally a slow operation, and which I would have thought would dominate the UTF-8 decode phase. No? Regardless, the discussion about encoding seems separable from the specific question of whether we should add a logging API.
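For anyone who wants to measure the question above, here is a minimal sketch of the two host-side read paths. The function names `readUtf8` and `readWtf16` are hypothetical and not part of any WASI or AssemblyScript API; `TextDecoder` and `DataView` are the standard JS APIs. It assumes the guest passes a byte offset plus a length (in bytes for UTF-8, in 16-bit code units for WTF-16).

```typescript
// Hypothetical host-side helpers for reading a guest string out of linear memory.

// UTF-8 path: the bytes must be decoded (transcoded) into a JS string.
function readUtf8(memory: ArrayBuffer, ptr: number, byteLength: number): string {
  return new TextDecoder("utf-8").decode(new Uint8Array(memory, ptr, byteLength));
}

// WTF-16 path: code units are copied one by one, so lone surrogates survive as-is.
// DataView is used so that ptr does not need to be 2-byte aligned.
function readWtf16(memory: ArrayBuffer, ptr: number, codeUnits: number): string {
  const view = new DataView(memory);
  let out = "";
  for (let i = 0; i < codeUnits; i++) {
    // Wasm linear memory is little-endian, hence `true`.
    out += String.fromCharCode(view.getUint16(ptr + 2 * i, true));
  }
  return out;
}
```

Both paths allocate a new JS string, so timing them on representative log messages would show how much the extra decode step actually costs.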
I agree that this is separable, yeah. Regarding your questions, this isn't entirely about performance. I guess the best one could do in their non-UTF-8 language is something like the following:
const staticBuf = memory.data(256);
export namespace console {
  export function log(msg: string): void {
    let size = computeUTF8Len(msg);
    if (size < 256) {
      encodeUTF8(msg, staticBuf);
      callWasi(staticBuf, size);
    } else {
      let dynBuf = heap.alloc(size);
      encodeUTF8(msg, dynBuf);
      callWasi(dynBuf, size);
      heap.free(dynBuf);
    }
  }
}
which eliminates the need for dynamic allocation of strings considered small. So, if the string is small, one would get
which some may say is fine, while others may still be a bit unhappy, depends. Note that this already pulled in some code that is only necessary due to UTF-8 everywhere, and in general is not as efficient as it could be. The pain point, however, is not that, but that there is an
A typical compiler may not be able to apply sophisticated optimization in an attempt to DCE the dynamic memory manager post-compilation, in turn leading to every single module doing a Now, even if one would attempt to DCE the MM, there is still the looming problem of what will happen with a polyfill in the browser, which is:
Note that the latter will even be the case with the current state of Interface Types, but it has been mentioned that a "stream of As I said, it still amazes me, and I am not mad or something, just trying to raise awareness towards the implications of UTF-8 everywhere that may perhaps not be on everyone's radar yet 🙂 P.S.S. I'd be happy with a |
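To make the extra work in the snippet earlier in this comment concrete, here is a rough plain-TypeScript stand-in for the length computation and the small-buffer path. It is a sketch only: `utf8Length` and `encodeForLog` are hypothetical names, a `Uint8Array` stands in for `memory.data(256)`, and lone surrogates are assumed to be replaced by U+FFFD (3 bytes), which is what the standard `TextEncoder` does.

```typescript
// Count the UTF-8 bytes needed for a UTF-16/WTF-16 string without encoding it.
function utf8Length(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) {
      bytes += 1;
    } else if (c < 0x800) {
      bytes += 2;
    } else if (c >= 0xd800 && c <= 0xdbff && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // valid surrogate pair: one 4-byte code point
        i++;
      } else {
        bytes += 3; // lone high surrogate, encoded as U+FFFD
      }
    } else {
      bytes += 3; // BMP code point, or a lone surrogate encoded as U+FFFD
    }
  }
  return bytes;
}

// Stand-in for the static buffer; 256 mirrors the size used in the snippet.
const staticBuf = new Uint8Array(256);

// Encode into the static buffer when the message fits, else allocate.
function encodeForLog(msg: string): Uint8Array {
  const size = utf8Length(msg);
  const dest = size <= staticBuf.length ? staticBuf : new Uint8Array(size);
  new TextEncoder().encodeInto(msg, dest);
  return dest.subarray(0, size);
}
```

Even in this simplified form, every log call walks the string twice (once to measure, once to encode), and the fallback branch still needs an allocator.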
I don't think anybody here is suggesting you are mad. Regarding the first part of your example (that part about the cost of including malloc) wouldn't it make more sense to always allocate such strings on the stack using |
Oh, sorry, didn't want to imply that someone suggested that. It's all fine, appreciate your input 🙂 And yeah, AS does not have a C-like stack (well, technically it has some sort of managed shadow stack now for incremental GC, but it can't be used for this, it's all pointers). Instead, it exclusively relies on the Wasm execution stack in an attempt to avoid unnecessary stacks, but the Wasm execution stack is a bit limited and cannot be used for this either.
Would it make sense to add a region of heap like llvm does for stack data? The convention that llvm uses is a wasm global called |
(Doing so would also avoid stuff like |
AS uses |
Many of the comments here seem to be talking not just about logging, but about WASI APIs in general. To be clear about one thing: strings are not WASI's problem. They're WebAssembly's problem. And what's more, WebAssembly is already working on a solution. If anyone doesn't like it, WASI isn't the place to change it. I don't think Please keep this issue focused on logging, and please be open to suggestions specific to logging APIs.
If we want to take inspiration from existing APIs, it might be worth looking at what Linux chose to do: https://man7.org/linux/man-pages/man3/syslog.3.html. We might also want to consider whether we are designing a system for debugging (which is what the web's console.log/error is generally for) or for event logging in things like servers and daemons, which tends to have a little more structure and is used in production builds. If it's the former, we might want to include the word "debug" somewhere in the name.
The syslog interface looks good to me. Can map well to JS's |
Is there a fundamental difference between logging for debugging and event logging in production, besides the log level and the consumers of the log messages? I agree that initially these seem different, but I haven't yet been able to think of a way that they're different from an application perspective. If not, I think it makes sense to focus on figuring out what levels to have, and keep the API simple and general.
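As a starting point for the level discussion, here is one possible way a JS host could map the eight syslog severities onto `console` methods. This is only a sketch: the numeric values follow syslog(3), but the enum name `Severity`, the functions `consoleMethodFor` and `hostLog`, and the particular grouping are assumptions for illustration, not a proposed WASI API.

```typescript
// Severity values as defined by syslog(3), LOG_EMERG (0) through LOG_DEBUG (7).
enum Severity {
  Emergency = 0,
  Alert = 1,
  Critical = 2,
  Error = 3,
  Warning = 4,
  Notice = 5,
  Informational = 6,
  Debug = 7,
}

type ConsoleMethod = "error" | "warn" | "info" | "debug";

// Hypothetical grouping: everything at Error or more severe goes to console.error.
function consoleMethodFor(level: Severity): ConsoleMethod {
  if (level <= Severity.Error) return "error";
  if (level === Severity.Warning) return "warn";
  if (level <= Severity.Informational) return "info";
  return "debug";
}

// Host-side sink: the same call works for debugging output and production events;
// only the level and the consumer differ.
function hostLog(level: Severity, message: string): void {
  console[consoleMethodFor(level)](message);
}
```

A host that writes to a file or forwards over the network could keep the numeric level as-is and skip the console mapping entirely.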
I have a use case where I'd love to get rid of a custom ABI in order to switch to WASI for portability purposes, but writing to file descriptors in UTF-8 encoding exclusively doesn't map very well to my use case. So I was wondering if WASI could spec out a logging module, independently of whether the console is a terminal, a browser console or in the future perhaps sends log data over a network if someone wants to.
I am asking because console usage is an unfortunate pain point in my use case currently (bundling encoders, frequent re-encoding into dynamic allocations, potentially GCed, double re-encoding on the web and such), while everything else (like abort, random, time, etc.) would map quite well to WASI already. Just having something that isn't UTF-8 FDs would help a ton to switch to WASI while having a good feeling about it.
I'd naively imagine something like:
I am aware that "logging" can be much more complex than what I outlined here of course. Perhaps "console" would be a better name, but "logging" could become more general. Also, Interface Types may eventually help here to reduce the number of arguments.
What do you think? Is this something worth exploring? (In general I'd probably have not much to complain for a while if only logging was a bit more Web-friendly. 🙂)