Sequential scalars in a regex literal combined into a single character #82535

Open
@natecook1000

Description

In a comment on [#82424], @stefanspringer1 highlights a different issue. In the second expression below, the two scalar literals \x{33C}\x{347} are meant to represent the end of one range and the start of the next, but the parser combines them into a single character (grapheme cluster), which is then rejected as an invalid bound for a character class range.

...It is not only about an isolated range, but:

 _ = "abc".contains(#/[\x{31C}-\x{333}]/#) // OK
 _ = "abc".contains(#/[\x{339}-\x{33C}\x{347}-\x{349}]/#) // error: '̼͇' is an invalid bound for character class range
 _ = "abc".contains(#/[\x{347}-\x{349}]/#) // OK

Note that in the middle the ranges from the first and the last expression are both used sequentially.
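
For context, here is a minimal sketch of why the two scalars end up as one bound. It uses an ordinary string literal rather than a regex literal, and the variable names are only illustrative. Both U+033C and U+0347 are combining marks, so Swift's grapheme breaking keeps them in a single Character:

```swift
// The two combining marks from the failing range, written as a string literal
// (not a regex literal) to show how Swift groups them.
let bounds = "\u{33C}\u{347}"

// Counted as Unicode scalars, there are two separate values...
let scalarCount = bounds.unicodeScalars.count

// ...but counted as Characters (extended grapheme clusters) there is only one,
// because no grapheme break falls before a combining mark.
let characterCount = bounds.count

print(scalarCount, characterCount) // prints "2 1"
```

This only illustrates the grapheme clustering; it makes no claim about where in the regex parser the combining happens.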

Metadata

Labels

bug, regex literals
