[spec] Allow impls to limit code point range #488
Fine with me, @sunfishcode approved in a prior discussion (and @lukewagner agreed later), but as I mentioned here I want to make sure this is discussed independently.
Specifically, I'd like to get feedback from @annevk, @domenic, and @tabatkins.
As long as it's a subset (such as ASCII) and not a different charset entirely (like Shift-JIS), yeah, no problem here.
I don't really understand the spec text for this restriction, but it seems like other people do, so maybe it's fine. Reading the other threads, it seems like the actual interpretation is that implementations are free to reject incoming source bytes if certain bits in those bytes are set to certain values? E.g. an implementation is free to reject incoming source bytes if the most-significant-bit is set, effectively only allowing ASCII names? It would be a lot clearer to me if things were stated that way, but as I said, it seems like others aren't having this comprehension problem, so maybe it's fine.
@tabatkins @domenic The terms "code point" and "common subsets" are clear enough to me that we're still talking about Unicode values, not binary bytes/bits. It might be helpful to be more explicit about this distinction, though.
How are we talking about Unicode values? Isn't this spec discussing implementation-specific limitations on the inputs, which are definitely bytes?
@domenic I have to look at the whole file; not shown in the GitHub "Files changed" feature are the section headings, which add more context to the changes.
Right, I guess I don't understand what the first change applies to, i.e. the "Syntactic Limits" heading.
@domenic, it's described in terms of the abstract syntax, which defines names as sequences of Unicode code points. That makes it independent of the concrete input format (e.g. binary or text format).
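To illustrate the distinction drawn above, here is a rough Python sketch of a limit applied to code points rather than to input bytes. It is not taken from the spec or from any engine; `ASCII_MAX`, `decode_name`, and `name_within_limit` are hypothetical names, and the ASCII cutoff is just one example of a subset an implementation might pick.

```python
# Hypothetical implementation limit: accept only ASCII code points.
ASCII_MAX = 0x7F


def decode_name(raw):
    """Binary-format path (assumed UTF-8): bytes -> list of code points."""
    return [ord(ch) for ch in raw.decode("utf-8")]


def name_within_limit(code_points, max_code_point=ASCII_MAX):
    """Check the abstract name (code points) against the chosen limit."""
    return all(cp <= max_code_point for cp in code_points)


# The same name reached from two concrete formats gives the same code points,
# so the check does not depend on how the name was encoded on input.
from_binary = decode_name("café".encode("utf-8"))
from_text = [ord(ch) for ch in "café"]
assert from_binary == from_text

print(name_within_limit(decode_name(b"memory")))  # True
print(name_within_limit(from_binary))             # False: U+00E9 exceeds 0x7F
```

For the ASCII case this happens to agree with the "most-significant bit set" reading of the UTF-8 bytes, but a limit stated on code points also covers subsets that do not map to a simple bit test on the input.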
Seems like there is approval and no objections, so I'll merge.
Includes [spec] Allow impls to limit code point range (#488).
[test] Unify the error message of `"null structure reference"`.
As discussed on WebAssembly/design#1016, make it legal for implementations in environments that do not understand (all of) Unicode to only support smaller character subsets.