Add pattern keyword tests for non-BMP Unicode code points #264

Closed
awwright opened this issue May 23, 2019 · 1 comment · Fixed by #339
Labels
missing test A request to add a test to the suite that is currently not covered elsewhere.

Comments

@awwright
Member

awwright commented May 23, 2019

I got a question about support for a "pattern" expression that matches only printable Unicode characters. It appears that many regular expression implementations, ECMAScript's in particular, operate on UTF-16 code units rather than on Unicode code points. This means that non-BMP characters (which must be represented with surrogate pairs in UTF-16) don't work as expected:

> new RegExp('^🐲*$').test('')
false
> new RegExp('^🐲*$').test('🐲')
true
> new RegExp('^🐲*$').test('🐲🐲')
false

In this regular expression, the * quantifier applies only to the second code unit (the low surrogate) of the surrogate pair. Even though JSON Schema suggests limiting patterns to ECMAScript-compatible regular expressions, this is not intended to constrain the encoding of the string; JSON only decodes to a string of Unicode code points. ECMAScript implementations that want to match Unicode characters correctly will need to group the two code units of the surrogate pair, like so:

> new RegExp('^(🐲)*$').test('🐲🐲')
true

Or use the u flag, available in ECMAScript 2015 (ES6) and later:

> new RegExp('^🐲*$', 'u').test('🐲🐲')
true
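To make the cause of the results above explicit, here is a minimal sketch that breaks the character into its UTF-16 code units and spells out the pattern each mode effectively compiles. The variable names are illustrative only and are not part of any implementation.

```javascript
// U+1F432 (the dragon) is one code point but two UTF-16 code units.
const dragon = '🐲';
const codeUnits = [];
for (let i = 0; i < dragon.length; i++) {
  codeUnits.push(dragon.charCodeAt(i).toString(16));
}
// codeUnits is ['d83d', 'dc32']: a high surrogate, then a low surrogate.

// Without the u flag, '^🐲*$' behaves like this pattern:
// the * applies to \uDC32 alone, and \uD83D is required exactly once.
const perCodeUnit = new RegExp('^\\uD83D\\uDC32*$');

// Grouping makes the quantifier cover both code units.
const grouped = new RegExp('^(🐲)*$');

// With the u flag, the pattern is interpreted per code point.
const perCodePoint = new RegExp('^🐲*$', 'u');

for (const input of ['', '🐲', '🐲🐲']) {
  console.log(JSON.stringify(input), {
    perCodeUnit: perCodeUnit.test(input),
    grouped: grouped.test(input),
    perCodePoint: perCodePoint.test(input),
  });
}
```

The grouped and u-flag patterns agree on all three inputs; the per-code-unit pattern accepts only the single dragon.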

The "pattern" and "patternProperties" tests should test for this behavior.
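As a rough idea of what such a test could look like, here is a hypothetical test group written as a JavaScript object in the description/schema/tests layout the suite uses. The descriptions and the extra '🐲a' case are my own assumptions, not the tests that were eventually merged, and matchesPattern is an illustrative helper rather than any real validator's API.

```javascript
// Hypothetical test group for the "pattern" keyword.
const group = {
  description: 'pattern with a non-BMP code point',
  schema: { pattern: '^🐲*$' },
  tests: [
    { description: 'empty string matches', data: '', valid: true },
    { description: 'one dragon matches', data: '🐲', valid: true },
    { description: 'two dragons match', data: '🐲🐲', valid: true },
    { description: 'dragon followed by another character does not match', data: '🐲a', valid: false },
  ],
};

// Illustrative helper: evaluate "pattern" per code point via the u flag.
function matchesPattern(pattern, data) {
  return new RegExp(pattern, 'u').test(data);
}

// Tests that a code-point-aware implementation gets wrong (expected: none).
const unicodeFailures = group.tests
  .filter((t) => matchesPattern(group.schema.pattern, t.data) !== t.valid)
  .map((t) => t.description);

// Tests that a per-code-unit implementation (no u flag) gets wrong.
const codeUnitFailures = group.tests
  .filter((t) => new RegExp(group.schema.pattern).test(t.data) !== t.valid)
  .map((t) => t.description);

console.log({ unicodeFailures, codeUnitFailures });
```

The empty-string and two-dragon cases are the ones that distinguish the two behaviors; the single-dragon case passes either way.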

@Julian
Member

Julian commented May 23, 2019 via email
