@@ -13,45 +13,61 @@ double-colon `::`).
 
 ## Source Text
 
-SourceCharacter :: "Any Unicode character"
+SourceCharacter :: /[\u0009\u000A\u000D\u0020-\uFFFF]/
 
 GraphQL documents are expressed as a sequence of
 [Unicode](http://unicode.org/standard/standard.html) characters. However, with
-few exceptions, most of GraphQL is expressed only in the original ASCII range
-so as to be as widely compatible with as many existing tools, languages, and
-serialization formats as possible. Other than within comments, Non-ASCII Unicode
-characters are only found within {StringValue}.
+few exceptions, most of GraphQL is expressed only in the original non-control
+ASCII range so as to be as widely compatible with as many existing tools,
+languages, and serialization formats as possible and avoid display issues in
+text editors and source control.
+
+
+### Unicode
+
+UnicodeBOM :: "Byte Order Mark (U+FEFF)"
+
+Non-ASCII Unicode characters may freely appear within {StringValue} and
+{Comment} portions of GraphQL.
+
+The "Byte Order Mark" is a special Unicode character which
+may appear at the beginning of a file containing Unicode which programs may use
+to determine the fact that the text stream is Unicode, what endianness the text
+stream is in, and which of several Unicode encodings to interpret.
 
 
 ### White Space
 
 WhiteSpace ::
 - "Horizontal Tab (U+0009)"
-- "Vertical Tab (U+000B)"
-- "Form Feed (U+000C)"
 - "Space (U+0020)"
-- "No-break Space (U+00A0)"
 
 White space is used to improve legibility of source text and act as separation
 between tokens, and any amount of white space may appear before or after any
 token. White space between tokens is not significant to the semantic meaning of
 a GraphQL query document, however white space characters may appear within a
 {String} or {Comment} token.
 
+Note: GraphQL intentionally does not consider Unicode "Zs" category characters
+as white-space, avoiding misinterpretation by text editors and source
+control tools.
 
 ### Line Terminators
 
 LineTerminator ::
 - "New Line (U+000A)"
-- "Carriage Return (U+000D)"
-- "Line Separator (U+2028)"
-- "Paragraph Separator (U+2029)"
+- "Carriage Return (U+000D)" [ lookahead ! "New Line (U+000A)" ]
+- "Carriage Return (U+000D)" "New Line (U+000A)"
 
 Like white space, line terminators are used to improve the legibility of source
 text, any amount may appear before or after any other token and have no
 significance to the semantic meaning of a GraphQL query document. Line
 terminators are not found within any other token.
 
+Note: Any error reporting which provide the line number in the source of the
+offending syntax should use the preceding amount of {LineTerminator} to produce
+the line number.
+
 
 ### Comments
 
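Review note: the new {LineTerminator} rules and the line-number note in the hunk above can be read as the following counting rule. This is only a sketch of one possible reading, not code from the change itself. The names `LINE_TERMINATOR` and `line_number` are hypothetical, and the 1-based numbering is an assumption.

```python
import re

# LineTerminator as on the "+" side of the hunk: New Line, Carriage Return
# not followed by New Line, or Carriage Return followed by New Line.
# Trying "\r\n" first makes a CR LF pair count as a single terminator.
LINE_TERMINATOR = re.compile(r"\r\n|\r|\n")


def line_number(source: str, offset: int) -> int:
    """Return the line number (1-based, an assumption) of source[offset].

    The result is one plus the number of LineTerminator matches that end
    before `offset`, following the note about error reporting.
    """
    preceding = LINE_TERMINATOR.finditer(source, 0, offset)
    return 1 + sum(1 for _ in preceding)
```

With this reading, U+2028 and U+2029 no longer advance the line count, which matches their removal from the grammar.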
@@ -101,9 +117,11 @@ defined here in a lexical grammar by patterns of source Unicode characters.
 Tokens are later used as terminal symbols in a GraphQL query document syntactic
 grammars.
 
+
 ### Ignored Tokens
 
 Ignored ::
+- UnicodeBOM
 - WhiteSpace
 - LineTerminator
 - Comment
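Review note: a minimal sketch of how a lexer might skip the {Ignored} alternatives after this change, assuming the character sets shown on the "+" side of the first hunk. The function name `skip_ignored` is hypothetical. {Comment} is left out on purpose because its grammar is not part of this diff.

```python
# Character sets taken from the "+" side of the diff.
UNICODE_BOM = "\ufeff"     # UnicodeBOM
WHITE_SPACE = "\t "        # Horizontal Tab, Space
LINE_TERMINATORS = "\n\r"  # New Line, Carriage Return (alone or before New Line)

IGNORED_CHARS = UNICODE_BOM + WHITE_SPACE + LINE_TERMINATORS


def skip_ignored(source: str, pos: int = 0) -> int:
    """Return the index of the first character at or after `pos` that is
    not a BOM, white space, or line terminator character."""
    while pos < len(source) and source[pos] in IGNORED_CHARS:
        pos += 1
    return pos
```

Under this sketch a No-break Space (U+00A0) is no longer skipped, so it would reach the tokenizer and be reported there rather than silently ignored.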
@@ -639,17 +657,46 @@ StringValue ::
 
 StringCharacter ::
 - SourceCharacter but not `"` or \ or LineTerminator
-- \ EscapedUnicode
+- \u EscapedUnicode
 - \ EscapedCharacter
 
-EscapedUnicode :: u /[0-9A-Fa-f]{4}/
+EscapedUnicode :: /[0-9A-Fa-f]{4}/
 
 EscapedCharacter :: one of `"` \ `/` b f n r t
 
-Strings are lists of characters wrapped in double-quotes `"`. (ex.
+Strings are sequences of characters wrapped in double-quotes (`"`). (ex.
 `"Hello World"`). White space and other otherwise-ignored characters are
 significant within a string value.
 
+Note: Unicode characters are allowed within String value literals, however
+GraphQL source must not contain some ASCII control characters so escape
+sequences must be used to represent these characters.
+
+**Semantics**
+
+StringValue :: `""`
+
+* Return an empty Unicode character sequence.
+
+StringValue :: `"` StringCharacter+ `"`
+
+* Return the Unicode character sequence of all {StringCharacter}
+Unicode character values.
+
+StringCharacter :: SourceCharacter but not `"` or \ or LineTerminator
+
+* Return the character value of {SourceCharacter}.
+
+StringCharacter :: \u EscapedUnicode
+
+* Return the character value represented by the UTF16 hexidecimal
+identifier {EscapedUnicode}.
+
+StringCharacter :: \ EscapedCharacter
+
+* Return the character value of {EscapedCharacter}.
+
+
 #### Enum Value
 
 EnumValue : Name but not `true`, `false` or `null`
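Review note: the **Semantics** added for {StringValue} can be sketched as a small decoder. This is an illustration under assumptions, not the reference implementation. The names `string_value` and `ESCAPED_CHARACTER` are hypothetical, and the mapping of `b f n r t` to control characters follows the usual JSON-style convention, which this hunk does not spell out.

```python
import re

# Assumed meaning of each EscapedCharacter (JSON-style convention).
ESCAPED_CHARACTER = {
    '"': '"', "\\": "\\", "/": "/",
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
}

# \u EscapedUnicode (exactly four hex digits) or \ EscapedCharacter.
_ESCAPE = re.compile(r'\\u([0-9A-Fa-f]{4})|\\(["\\/bfnrt])')


def string_value(literal: str) -> str:
    """Evaluate a double-quoted literal to its character sequence."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("expected a double-quoted string literal")
    body, out, pos = literal[1:-1], [], 0
    while pos < len(body):
        ch = body[pos]
        if ch == "\\":
            match = _ESCAPE.match(body, pos)
            if match is None:
                raise ValueError(f"invalid escape sequence at offset {pos + 1}")
            if match.group(1) is not None:
                # StringCharacter :: \u EscapedUnicode
                out.append(chr(int(match.group(1), 16)))
            else:
                # StringCharacter :: \ EscapedCharacter
                out.append(ESCAPED_CHARACTER[match.group(2)])
            pos = match.end()
        elif ch in '"\n\r':
            # An unescaped quote or LineTerminator is not a StringCharacter.
            raise ValueError(f"unexpected character at offset {pos + 1}")
        else:
            # StringCharacter :: SourceCharacter (returned unchanged)
            out.append(ch)
            pos += 1
    return "".join(out)
```

The sketch does not check each character against the new {SourceCharacter} range; a fuller lexer would reject disallowed control characters before reaching this step.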