 <h1>Specification</h1>

-<p>This page is the authority on what Jsonnet programs should do. It defines Jsonnet syntax and
-parsing. It describes which programs should be rejected statically (i.e. before execution).
+<p>This page is the authority on what Jsonnet programs should do. It defines Jsonnet lexing and
+syntax. It describes which programs should be rejected statically (i.e. before execution).
 Finally, it specifies the manner in which the program is executed, i.e. the JSON that is output, or
-the dynamic error if there is one.</p>
+the runtime error if there is one.</p>

 <p>The specification is intended to be terse but precise. The intention is to illuminate various
 subtleties and edge cases in order to allow fully-compatible reimplementations of the language, as
@@ -81,6 +81,65 @@ <h1>Specification</h1>
 semantics</a>. If that's not your cup of tea, then see the more discursive description of Jsonnet
 behavior in <a href="/docs/tutorial.html">tutorial</a>.</p>

+<h2>Lexing</h2>
+
+<p>A Jsonnet program is a UTF-8 encoded text file or string. The file is a sequence of tokens,
+separated by optional whitespace and comments. Whitespace consists of space, tab, newline and
+carriage return. Tokens are lexed greedily. Comments are either single line comments, beginning
+with a <code>#</code> or a <code>//</code>, or block comments beginning with <code>/*</code> and
+terminating at the first <code>*/</code> encountered within the comment.</p>
+
+<ul>
+
+<li><i>id</i>: Matched by <tt>[_a-zA-Z][_a-zA-Z0-9]*</tt>
+<p>
+Some identifiers are reserved as keywords, thus are not in the set <i>id</i>:
+<code>assert</code> <code>else</code> <code>error</code> <code>false</code> <code>for</code>
+<code>function</code> <code>if</code> <code>import</code> <code>importstr</code> <code>in</code>
+<code>local</code> <code>null</code> <code>tailstrict</code> <code>then</code> <code>self</code>
+<code>super</code> <code>true</code>
+</p>
+</li>
+
+<li><i>number</i>: As defined by <a href="http://json.org/">JSON</a> but without the leading minus.</li>
+
+<li><i>string</i>: Which can have 3 forms:
+<ul>
+<li>Double-quoted, beginning with <code>"</code> and ending with the first subsequent non-quoted <code>"</code></li>
+<li>Single-quoted, beginning with <code>'</code> and ending with the first subsequent non-quoted <code>'</code></li>
+<li>Text block, beginning with <code>|||</code>, followed by optional whitespace and a new-line.
+The next line must be prefixed with some non-zero length whitespace <i>W</i>. The block ends at the
+first subsequent line that does not begin with <i>W</i>, and it is an error if this line does not
+contain some optional whitespace followed by <code>|||</code>. The content of the string is the
+concatenation of all the lines that began with <i>W</i> but with that prefix stripped. The line
+ending style in the file is preserved in the string.</li>
+</ul>
+</li>
+<p>Double- and single-quoted strings are allowed to span multiple lines, in which case whatever
+DOS/Unix end-of-line sequence is used in the file appears in the string. They both understand the
+following escape characters: <code>"'\bfnrt0</code> which have their standard meanings, as well as
+<code>\uXXXX</code> for hexadecimal Unicode escapes.</p>
+
+<li><i>symbol</i>:
+<ul>
+<li>The following single-character symbols:
+<p><code>{}[],.();</code></p>
+</li>
+<li>Sequences of at least one of the following symbols:
+<code>!$:~+-&|^=<>*/%</code>
+<p>With the following caveats, which will cause the sequence to stop:</p>
+<ul>
+<li>The sequence <code>//</code> is not allowed in an operator</li>
+<li>The sequence <code>/*</code> is not allowed in an operator</li>
+<li>The sequence <code>|||</code> is not allowed in an operator</li>
+<li>If the sequence has more than one symbol, it is not allowed to end in any of <code>+-~!</code></li>
+</ul>
+
+</li>
+</ul>
+
+
+
 <h2>Abstract Syntax</h2>

 <p>In this notation, <i>x</i>★ defines a comma-separated possibly zero-length list of <i>x</i>
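
Note on the text-block rule added in the Lexing hunk: the following is a minimal Python sketch of one literal reading of that paragraph, not the reference implementation. The names `split_keep_newlines` and `parse_text_block` are hypothetical, and the sketch assumes the input starts exactly at the opening `|||` and that only spaces and tabs count as the prefix whitespace W.

```python
def split_keep_newlines(text):
    """Split on "\n" only, keeping each line's terminator (so "\r\n" survives)."""
    parts = text.split("\n")
    lines = [part + "\n" for part in parts[:-1]]
    if parts[-1]:
        lines.append(parts[-1])
    return lines


def parse_text_block(source):
    """Return the string content of a text block that begins at source[0].

    Assumption: this follows the wording of the added spec paragraph
    literally; it is an illustration, not Jsonnet's actual lexer.
    """
    lines = split_keep_newlines(source)
    if len(lines) < 2:
        raise ValueError("text block needs an opening line and a body")

    # Opening line: "|||", optional whitespace, then a new-line.
    header = lines[0]
    if not (header.startswith("|||") and header.endswith("\n")
            and header[3:].strip(" \t\r\n") == ""):
        raise ValueError("expected '|||' followed by optional whitespace and a new-line")

    # The next line fixes the non-empty whitespace prefix W.
    first = lines[1]
    prefix = first[:len(first) - len(first.lstrip(" \t"))]
    if not prefix:
        raise ValueError("first line of a text block must start with whitespace")

    content = []
    for line in lines[1:]:
        if line.startswith(prefix):
            # Strip W; the line ending is kept as found in the file.
            content.append(line[len(prefix):])
            continue
        # First line without W: it must be optional whitespace then "|||".
        if not line.lstrip(" \t").startswith("|||"):
            raise ValueError("text block must be terminated by '|||'")
        return "".join(content)
    raise ValueError("unterminated text block")
```

Under these assumptions, `parse_text_block("|||\n  foo\n  bar\n|||")` yields `"foo\nbar\n"`, and a block written with `\r\n` line endings keeps them in the result.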
@@ -282,10 +341,6 @@ <h2>Abstract Syntax</h2>
 </td></tr>
 </table>

-<p>Additionally, <i>id</i> is defined by regular expression: <tt>[a-zA-Z_][a-zA-Z0-9_]*</tt>. The
-definition of <i>string</i> is equivalent to the JSON string, including escape characters. Finally,
-<i>number</i> is equivalent to the JSON number, but without the leading <code>-</code>.</p>
-
 <h2>Associativity and Operator Precedence</h2>

 <p>The parsing of the concrete syntax into abstract syntax can be controlled by adding parentheses
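
Note on the multi-character symbol rule added in the Lexing hunk: the sketch below shows one way the "greedy sequence with caveats" wording could be read. It is an illustration under stated assumptions, not the project's lexer. The function name `lex_operator` and the constant names are hypothetical. The sketch assumes that the `//`, `/*` and `|||` caveats stop the scan at the position where such a sequence would begin, and that the trailing-character caveat is applied by trimming the end of the scanned sequence.

```python
OPERATOR_CHARS = set("!$:~+-&|^=<>*/%")
STOP_SEQUENCES = ("//", "/*", "|||")   # comment openers and the text-block marker
FORBIDDEN_TAIL = set("+-~!")           # may not end a multi-character operator


def lex_operator(text, start=0):
    """Return the operator token beginning at text[start], or "" if there is none."""
    end = start
    while end < len(text) and text[end] in OPERATOR_CHARS:
        # Stop before any sequence that is not allowed inside an operator.
        if any(text.startswith(seq, end) for seq in STOP_SEQUENCES):
            break
        end += 1
    # A sequence longer than one symbol may not end in + - ~ or !,
    # so give those trailing characters back to the input.
    while end - start > 1 and text[end - 1] in FORBIDDEN_TAIL:
        end -= 1
    return text[start:end]
```

With this reading, the input `<=-1` produces the token `<=` and leaves `-` to be lexed as a separate unary operator, while a lone `-` or `!` is still accepted as a one-character operator.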