[spec] Allow impls to limit code point range #488

Merged
merged 1 commit into from
Jun 6, 2017

Conversation

rossberg
Member

As discussed on WebAssembly/design#1016, make it legal for implementations in environments that do not understand (all of) Unicode to only support smaller character subsets.

Member

@jfbastien jfbastien left a comment

Fine with me. @sunfishcode approved in a prior discussion (and @lukewagner agreed later), but as I mentioned here, I want to make sure this is discussed independently.

Specifically, I'd like to get feedback from @annevk, @domenic, and @tabatkins.

@tabatkins

As long as it's a subset (such as ASCII) and not a different charset entirely (like Shift-JIS), yeah, no problem here.
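To illustrate the subset-versus-charset distinction, here is a minimal sketch. It assumes names are carried as UTF-8 bytes; the helper `is_valid_utf8` and the sample names are hypothetical and not part of the spec change. An ASCII-only name is still valid UTF-8, whereas the same kind of text encoded as Shift-JIS generally is not.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as UTF-8 (illustrative helper, not from the spec)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ASCII is a subset of Unicode, so ASCII bytes are already valid UTF-8.
ascii_name = "memory".encode("ascii")

# Shift-JIS is a different charset: its byte sequences are not UTF-8.
sjis_name = "日本".encode("shift_jis")

print(is_valid_utf8(ascii_name))
print(is_valid_utf8(sjis_name))
```

An implementation restricted to the ASCII subset therefore rejects some valid names but never reinterprets bytes under a different encoding.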

@domenic
Member

domenic commented May 30, 2017

I don't really understand the spec text for this restriction, but it seems like other people do, so maybe it's fine.

Reading the other threads, it seems like the actual interpretation is that implementations are free to reject incoming source bytes if certain bits in those bytes are set to certain values? E.g. an implementation is free to reject incoming source bytes if the most-significant-bit is set, effectively only allowing ASCII names?

It would be a lot clearer to me if things were stated that way, but as I said, it seems like others aren't having this comprehension problem, so maybe it's fine.
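The byte-level reading described in this comment could be sketched as follows. This is only an illustration of that interpretation, not the spec's wording; the function name `accepts_name_bytes` is hypothetical.

```python
def accepts_name_bytes(raw: bytes) -> bool:
    """Hypothetical byte-level check: reject any byte whose most significant bit is set.

    In UTF-8, every byte of a non-ASCII character has its high bit set,
    so this check effectively admits only ASCII names.
    """
    return all(b & 0x80 == 0 for b in raw)


print(accepts_name_bytes(b"add"))
print(accepts_name_bytes("é".encode("utf-8")))
```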

@RyanLamansky

@tabatkins @domenic The terms "code point" and "common subsets" are clear enough to me that we're still talking about Unicode values, not binary bytes/bits. It might be helpful to be more explicit about this distinction, though.

@domenic
Member

domenic commented May 30, 2017

How are we talking about Unicode values? Isn't this spec discussing implementation-specific limitations on the inputs, which are definitely bytes?

@RyanLamansky

@domenic I have to look at the whole file; not shown in the GitHub "Files changed" feature are the section headings, which add more context to the changes.

@domenic
Member

domenic commented May 30, 2017

Right, I guess I don't understand what the first change applies to, i.e. the "Syntactic Limits" heading.

@rossberg
Member Author

@domenic, it's described in terms of the abstract syntax, which defines names as sequences of Unicode code points. That makes it independent of the concrete input format (e.g. binary or text format).
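A rough sketch of a limit stated at the abstract-syntax level: the check operates on code points, after whatever concrete format was used has been decoded. The names `name_within_limit`, `decode_binary_name`, and the ASCII default are assumptions chosen for illustration; the spec does not prescribe them.

```python
ASCII_MAX = 0x7F  # assumed limit for this illustration only


def name_within_limit(name: str, max_code_point: int = ASCII_MAX) -> bool:
    """Check a name, viewed as a sequence of Unicode code points, against a limit."""
    return all(ord(ch) <= max_code_point for ch in name)


def decode_binary_name(raw: bytes) -> str:
    """Decode a UTF-8 encoded name into code points (assumed binary-format path)."""
    return raw.decode("utf-8")


# The same check applies no matter where the code points came from.
print(name_within_limit(decode_binary_name(b"add")))  # name from binary bytes
print(name_within_limit("naïve"))                      # name from a text source
print(name_within_limit("naïve", max_code_point=0xFF))
```

Because the limit is expressed over code points, a single rule covers both the binary and the text format without mentioning bytes or bits.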

@rossberg
Member Author

rossberg commented Jun 6, 2017

Seems like there is approval and no objections, so I'll merge.

@rossberg rossberg merged commit 56ac21e into spec.limits Jun 6, 2017
@rossberg rossberg deleted the spec.limits.jf branch June 6, 2017 11:05
rossberg added a commit that referenced this pull request Jun 6, 2017
Includes [spec] Allow impls to limit code point range (#488).
dhil pushed a commit to dhil/webassembly-spec that referenced this pull request Jan 25, 2024
[test] Unify the error message of `"null structure reference"`.
5 participants