Commit 8e75f7e

Merge pull request #767 from handrews/content-vocab
Make the "content*" keywords annotations only
2 parents c50a8ce + a40ef07 commit 8e75f7e

1 file changed

+42
-32
lines changed

jsonschema-validation.xml

+42-32
@@ -776,18 +776,19 @@
776776

777777
<section title="Foreword">
778778
<t>
779-
Properties defined in this section indicate that an instance contains
779+
Annotations defined in this section indicate that an instance contains
780780
non-JSON data encoded in a JSON string.
781-
They describe the type of content and how it is encoded.
782781
</t>
783782
<t>
784783
These properties provide additional information required to interpret JSON data
785-
as rich multimedia documents.
784+
as rich multimedia documents. They describe the type of content, how it is encoded,
785+
and/or how it may be validated. They do not function as validation assertions;
786+
a malformed string-encoded document MUST NOT cause the containing instance
787+
to be considered invalid.
786788
</t>
787789
<t>
788790
Meta-schemas that do not use "$vocabulary" SHOULD be considered to
789-
require this vocabulary as if its URI were present with a value of true,
790-
although see the Implementation Requirements below for details.
791+
require this vocabulary as if its URI were present with a value of true.
791792
</t>
792793
<t>
793794
The current URI for this vocabulary, known as the Content vocabulary, is:
@@ -801,16 +802,35 @@
801802

802803
<section title="Implementation Requirements">
803804
<t>
804-
The content keywords function as both annotations and as assertions.
805-
While no special effort is required to implement them as annotations conveying
806-
how applications can interpret the data in the string, implementing
807-
validation of conformance to the media type and encoding is non-trivial.
805+
Due to security and performance concerns, as well as the open-ended nature of
806+
possible content types, implementations MUST NOT automatically decode, parse,
807+
and/or validate the string contents by default. This additionally supports
808+
the use case of embedded documents intended for processing by a different
809+
consumer than that which processed the containing document.
810+
</t>
811+
<t>
812+
All keywords in this section apply only to strings, and have no
813+
effect on other data types.
808814
</t>
809815
<t>
810-
Implementations MAY support the "contentMediaType" and "contentEncoding"
811-
keywords as validation assertions.
812-
Should they choose to do so, they SHOULD offer an option to disable validation
813-
for these keywords.
816+
Implementations MAY offer the ability to decode, parse, and/or validate
817+
the string contents automatically. However, it MUST NOT perform these
818+
operations by default, and MUST provide the validation result of each
819+
string-encoded document separately from the enclosing document. This
820+
process SHOULD be equivalent to fully evaluating the instance against
821+
the original schema, followed by using the annotations to decode, parse,
822+
and/or validate each string-encoded document.
823+
<cref>
824+
For now, the exact mechanism of performing and returning parsed
825+
data and/or validation results from such an automatic decoding, parsing,
826+
and validating feature is left unspecified. Should such a feature
827+
prove popular, it may be specified more thoroughly in a future draft.
828+
</cref>
829+
</t>
830+
<t>
831+
See also the <xref target="security">Security Considerations</xref>
832+
sections for possible vulnerabilities introduced by automatically
833+
processing the instance string according to these keywords.
814834
</t>
815835
</section>
816836

@@ -841,29 +861,18 @@
841861
<t>
842862
The value of this property MUST be a string.
843863
</t>
844-
845-
<t>
846-
The value of this property SHOULD be ignored if the instance described is not a
847-
string.
848-
</t>
849-
850864
</section>
851865

852866
<section title="contentMediaType">
853867
<t>
854-
If the instance is a string, this property defines the media type
868+
If the instance is a string, this property indicates the media type
855869
of the contents of the string. If "contentEncoding" is present,
856870
this property describes the decoded string.
857871
</t>
858872
<t>
859873
The value of this property MUST be a string, which MUST be a media type,
860874
as defined by <xref target="RFC2046">RFC 2046</xref>.
861875
</t>
862-
863-
<t>
864-
The value of this property SHOULD be ignored if the instance described is not a
865-
string.
866-
</t>
867876
</section>
868877

869878
<section title="contentSchema">
@@ -876,8 +885,7 @@
876885
JSON Schema's data model.
877886
</t>
878887
<t>
879-
The value of this property SHOULD be ignored if the instance described is not a
880-
string, or if "contentMediaType" is not present.
888+
The value of this property SHOULD be ignored if "contentMediaType" is not present.
881889
</t>
882890
</section>
883891

@@ -897,8 +905,8 @@
897905
]]>
898906
</artwork>
899907
<postamble>
900-
Instances described by this schema should be strings, and their values
901-
should be interpretable as base64-encoded PNG images.
908+
Instances described by this schema are expected to be strings,
909+
and their values should be interpretable as base64-encoded PNG images.
902910
</postamble>
903911
</figure>
904912

@@ -915,8 +923,9 @@
915923
]]>
916924
</artwork>
917925
<postamble>
918-
Instances described by this schema should be strings containing HTML, using
919-
whatever character set the JSON string was decoded into. Per section 8.1 of
926+
Instances described by this schema are expected to be strings containing HTML,
927+
using whatever character set the JSON string was decoded into.
928+
Per section 8.1 of
920929
<xref target="RFC8259">RFC 8259</xref>, outside of an entirely closed
921930
system, this MUST be UTF-8.
922931
</postamble>
@@ -1100,7 +1109,7 @@
11001109
</section>
11011110
</section>
11021111

1103-
<section title="Security Considerations">
1112+
<section title="Security Considerations" anchor="security">
11041113
<t>
11051114
JSON Schema validation defines a vocabulary for JSON Schema core and concerns all
11061115
the security considerations listed there.
@@ -1276,6 +1285,7 @@
12761285
<t>Moved "definitions" to the core spec as "$defs"</t>
12771286
<t>Moved applicator keywords to the core spec</t>
12781287
<t>Renamed the array form of "dependencies" to "dependentRequired", moved the schema form to the core spec</t>
1288+
<t>Specified all "content*" keywords as annotations, not assertions</t>
12791289
<t>Added "contentSchema" to allow applying a schema to a string-encoded document</t>
12801290
<t>Also allow RFC 4648 encodings in "contentEncoding"</t>
12811291
<t>Added "minContains" and "maxContains"</t>
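
The behaviour this change describes (content keywords collected as annotations, with decoding and parsing available only as an opt-in step whose result is reported separately) can be sketched roughly as follows. This is a minimal illustration and not the specification's or any library's API: the names `ContentResult`, `collect_content_annotations`, `process_content` and `validate`, the `process` flag, and the restriction to the "base64" encoding and the "application/json" media type are all assumptions made for the example. The toy `validate` only checks `"type": "string"`.

```python
import base64
import binascii
import json
from dataclasses import dataclass
from typing import Any, Optional

CONTENT_KEYWORDS = ("contentEncoding", "contentMediaType", "contentSchema")


@dataclass
class ContentResult:
    """Hypothetical container for the outcome of opt-in content processing."""
    decoded: Optional[bytes] = None
    parsed: Any = None
    error: Optional[str] = None


def collect_content_annotations(schema: dict, instance: Any) -> dict:
    """Collect content* annotations; they have no effect on non-strings."""
    if not isinstance(instance, str):
        return {}
    return {k: schema[k] for k in CONTENT_KEYWORDS if k in schema}


def process_content(annotations: dict, instance: str) -> ContentResult:
    """Opt-in step: decode and parse the string using the annotations."""
    data = instance.encode("utf-8")
    if annotations.get("contentEncoding") == "base64":
        try:
            data = base64.b64decode(instance, validate=True)
        except ValueError as exc:  # binascii.Error is a ValueError subclass
            return ContentResult(error=f"decoding failed: {exc}")
    result = ContentResult(decoded=data)
    if annotations.get("contentMediaType") == "application/json":
        try:
            result.parsed = json.loads(data)
        except ValueError as exc:  # covers JSON and UTF-8 decoding errors
            result.error = f"parsing failed: {exc}"
    return result


def validate(schema: dict, instance: Any, process: bool = False):
    """Toy validator: returns (valid, annotations, content_result).

    The content keywords never change `valid`. Content processing is
    off by default and its outcome is returned as a separate value.
    """
    valid = not (schema.get("type") == "string" and not isinstance(instance, str))
    annotations = collect_content_annotations(schema, instance)
    content = None
    if process and annotations:
        content = process_content(annotations, instance)
    return valid, annotations, content
```

With a schema such as `{"type": "string", "contentEncoding": "base64", "contentMediaType": "application/json"}`, a string that is not valid base64 still yields `valid == True`; the failure appears only in the separate `ContentResult`, and only when the caller passes `process=True`.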
