.separated thead tr th { border: 1px solid black; padding: .2em; }
.separated tbody tr td { border: 1px solid black; text-align: center; }
.separated tbody tr td .r { text-align: right; padding: .5em; }
- .grammar td { font-family: monospace; }
+ .grammar td { font-family: monospace; vertical-align: top; }
.grammar-literal { color: gray; }
+ .grammar_comment { color: #A52A2A; font-style: italic; }
code { color: #ff4500; } /* Old W3C Style */
</style>
</head>
@@ -241,17 +242,22 @@ <h3>RDF Blank Nodes</h3>
<h2>A Canonical form of N-Quads</h2>

<p>This section defines a canonical form of N-Quads which has
- less variability in layout.
- The grammar for the language is the same.</p>
+ a completely specified layout.
+ The grammar for the language is unchanged.</p>

- <p class="note">A canonical form of N-Quads can be used to ensure
- that variations in the syntactic representation of terms
- within that quad are determined; each code point
+ <p>Canonical N-Quads extends
+ <a data-cite="RDF12-N-TRIPLES#canonical-ntriples">Canonical N-Triples</a> in [[RDF12-N-TRIPLES]]
+ to include <code><a href="#grammar-production-graphLabel">graphLabel</a></code>.</p>
+
+ <p>While the N-Quads syntax allows choices for the representation and layout of RDF data,
+ the canonical form of N-Quads provides a unique syntactic representation of any quad.
+ Each code point
can be represented by only one of
<code><a href="#grammar-production-UCHAR">UCHAR</a></code>,
<code><a href="#grammar-production-ECHAR">ECHAR</a></code>,
or unencoded character,
- where the relevant production allows for a choice in representation.</p>
+ where the relevant production allows for a choice in representation.
+ Each quad is represented entirely on a single line with specified white space.</p>

<p>Canonical N-Quads has the following additional constraints on layout:</p>
<ul>
@@ -266,35 +272,25 @@ <h2>A Canonical form of N-Quads</h2>
MUST NOT use the datatype IRI part of the <a href="#grammar-production-literal">literal</a>,
and are represented using only <a href="#grammar-production-STRING_LITERAL_QUOTE">STRING_LITERAL_QUOTE</a>.
</li>
- <!--li><code><a href="#grammar-production-HEX">HEX</a></code> MUST use only uppercase letters (<code>[A-F]</code>).</li-->
- <li>Characters MUST NOT be represented by <code><a href="#grammar-production-UCHAR">UCHAR</a></code>.</li>
+ <li><code><a href="#grammar-production-HEX">HEX</a></code> MUST use only uppercase letters (<code>[A-F]</code>).</li>
<li>Within <a href="#grammar-production-STRING_LITERAL_QUOTE">STRING_LITERAL_QUOTE</a>,
- the characters
- <code>U+0022</code>, <code>U+005C</code>, <code>U+000A</code>, <code>U+000D</code>
- MUST be encoded using <code><a href="#grammar-production-ECHAR">ECHAR</a></code>.
- <code><a href="#grammar-production-ECHAR">ECHAR</a></code> MUST NOT be used for characters that are
- allowed directly in
- <code><a href="#grammar-production-STRING_LITERAL_QUOTE">STRING_LITERAL_QUOTE</a></code>.</li>
- <li>The token <code><a href="#grammar-production-EOL">EOL</a></code> MUST be a single <code>U+000A</code>.</li>
- <li>The final <code><a href="#grammar-production-EOL">EOL</a></code> MUST be provided.</li>
+ the characters
+ <code>U+0008</code> (<code title="BACKSPACE"><sub>BS</sub></code>),
+ <code>U+0009</code> (<code title="HORIZONTAL TAB"><sub>HT</sub></code>),
+ <code>U+000A</code> (<code title="LINE FEED"><sub>LF</sub></code>),
+ <code>U+000C</code> (<code title="FORM FEED"><sub>FF</sub></code>),
+ <code>U+000D</code> (<code title="CARRIAGE RETURN"><sub>CR</sub></code>),
+ <code>U+0022</code> (<code title="DOUBLE QUOTE">"</code>), and
+ <code>U+005C</code> (<code title="BACKSLASH">\</code>)
+ MUST be encoded using <code><a href="#grammar-production-ECHAR">ECHAR</a></code>.
+ Characters in the range from <code>U+0000</code> to <code>U+001F</code>
+ and <code>U+007F</code> (<code title="delete"><sub>DEL</sub></code>)
+ that are not represented using <code><a href="#grammar-production-ECHAR">ECHAR</a></code>
+ MUST be represented by <code><a href="#grammar-production-UCHAR">UCHAR</a></code>.
+ All other characters MUST be represented by their native [[UNICODE]] representation.</li>
+ <li>The token <code><a href="#grammar-production-EOL">EOL</a></code> MUST be a single <code>U+000A</code>.</li>
+ <li>The final <code><a href="#grammar-production-EOL">EOL</a></code> MUST be provided.</li>
</ul>
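The escaping rules added to the list above can be illustrated with a minimal Python sketch. It is not part of the specification text or of this change. The function name `canonical_string_literal` is hypothetical, and the choice of the four-digit `\uXXXX` form of `UCHAR` is an assumption, since the added text only requires `UCHAR` with uppercase `HEX` digits.

```python
# ECHAR escapes that the added list item requires inside STRING_LITERAL_QUOTE.
ECHAR = {
    "\x08": "\\b",   # U+0008 BACKSPACE
    "\x09": "\\t",   # U+0009 HORIZONTAL TAB
    "\x0A": "\\n",   # U+000A LINE FEED
    "\x0C": "\\f",   # U+000C FORM FEED
    "\x0D": "\\r",   # U+000D CARRIAGE RETURN
    "\x22": '\\"',   # U+0022 DOUBLE QUOTE
    "\x5C": "\\\\",  # U+005C BACKSLASH
}


def canonical_string_literal(lexical: str) -> str:
    """Return the lexical form wrapped in double quotes and escaped
    following the rules in the list above (hypothetical helper)."""
    parts = []
    for ch in lexical:
        cp = ord(ch)
        if ch in ECHAR:
            # Characters with a required ECHAR form.
            parts.append(ECHAR[ch])
        elif cp <= 0x1F or cp == 0x7F:
            # Remaining control characters and DEL use UCHAR.
            # Assumption: the short \uXXXX form; HEX digits are uppercase.
            parts.append(f"\\u{cp:04X}")
        else:
            # Everything else stays in its native Unicode representation.
            parts.append(ch)
    return '"' + "".join(parts) + '"'
```

For instance, a lexical form containing a double quote and a line feed is emitted with `\"` and `\n`, while U+007F is emitted as `\u007F` and a character such as `é` is left unescaped.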
-
- <div class="issue" data-number="16">
- <p>Re-consider the use of `UCHAR` and `ECHAR` escapes in N-Triples/N-Quads canonicalization.
- The 1.1-based recommendation prohibits the use of `UCHAR` (`U+XXXX`)
- and allows `ECHAR` only for `U+0022` (quote `\"`),
- `U+005C` (backslash `\\`),
- `U+000A` (<code title="LINE FEED"><sub>LF</sub></code> `\n`),
- and `U+000D` (<code title="CARRIAGE RETURN"><sub>CR</sub></code> `\r`).
- However, the use of control characters can obfuscate text when presented,
- creating a potential security concern.</p>
-
- <p>A future version may consider requiring all characters between
- `U+0000` and `U+001F` (other than `U+000A` (<code title="LINE FEED"><sub>LF</sub></code>)
- and `U+000D` (<code title="CARRIAGE RETURN"><sub>CR</sub></code>))
- along with `U+007F` (<code title="delete"><sub>DEL</sub></code>)
- to be represented using `UCHAR`.</p>
- </div>
</section>

<section id="conformance">