Commit 39dd6ae

Make "format" behavior more predictable
By default, a false vocabulary prevents "format" from being validated. By default, a true vocabulary requires "format" to be validated, although the degree of validation required remains somewhat vague, at least for this draft. In both the true and false cases, validation can be toggled on or off when passing schemas and instances to the implementation (although in the false case, there is no guarantee at all that turning on validation will produce any validation behavior; this matches the previous draft's "format" specification).
1 parent 1ef7e00 commit 39dd6ae

1 file changed

+138
-53
lines changed

jsonschema-validation.xml

+138-53
@@ -517,7 +517,7 @@
 
 <t>
 Implementations MAY treat "format" as an assertion in addition to an annotation,
-and attempt to validate the values conformance to the specified semantics.
+and attempt to validate the value's conformance to the specified semantics.
 See the Implementation Requirements below for details.
 </t>
 
@@ -532,7 +532,10 @@
 <cref>
 Note that the "type" keyword in this specification defines an "integer" type
 which is not part of the data model. Therefore a format attribute can be
-limited to numbers, but not specifically to integers.
+limited to numbers, but not specifically to integers. However, a numeric
+format can be used alongside the "type" keyword with a value of "integer",
+or could be explicitly defined to always pass if the number is not an integer,
+which produces essentially the same behavior as only applying to integers.
 </cref>
 </t>
 
@@ -555,60 +558,140 @@
 <section title="Implementation Requirements">
 <t>
 The "format" keyword functions as an annotation, and optionally as an assertion.
-Declaring the format vocabulary in "$vocabulary" with a value of true indicates
-that an implementation MUST treat the keyword as an assertion.
-This means that:
-<list>
-<t>they SHOULD implement validation for attributes defined below;</t>
-<t>they SHOULD offer an option to disable validation for this keyword.</t>
-</list>
+<cref>This is due to the keyword's history, and is not in line with current
+keyword design principles.</cref> In order to manage this ambiguity, the
+"format" keyword is defined in its own separate vocabulary, as noted above.
+The true or false value of the vocabulary declaration governs the implementation
+requirements necessary to process a schema that uses "format", and the
+behaviors on which schema authors can rely.
 </t>
 
-<t>
-Due to the complexity involved in fully validating some format attributes
-defined in this specification, implementations MAY provide only limited
-validation support for some format attributes. Implementations SHOULD
-document any such intentional limitations.
-</t>
+<section title="As an annotation">
+<t>
+The value of format MUST be collected as an annotation, if the implementation
+supports annotation collection. This enables application-level validation when
+schema validation is unavailable or inadequate.
+</t>
+<t>
+This requirement is not affected by the boolean value of the vocabulary
+declaration, nor by the configuration of "format"'s assertion
+behavior described in the next section.
+<cref>
+Requiring annotation collection even when the vocabulary is declared with
+a value of false is atypical, but necessary to ensure that the best
+practice of performing application-level validation is possible even when
+assertion evaluation is not implemented. Since "format" has always been
+a part of this specification, requiring implementations to be aware of it
+even with a false vocabulary declaration is deemed to not be a burden.
+</cref>
+</t>
+</section>
 
-<t>
-The <xref target="meta-schema">standard core and validation meta-schema</xref>
-includes this vocabulary in its "$vocabulary" keyword with a value of false,
-since by default implementations are not required to support this keyword
-as an assertion.
-</t>
+<section title="As an assertion">
+<t>
+Regardless of the boolean value of the vocabulary declaration,
+an implementation that can evaluate "format" as an assertion MUST provide
+options to enable and disable such evaluation. The assertion evaluation
+behavior when the option is not explicitly specified depends on
+the vocabulary declaration's boolean value.
+</t>
 
-<t>
-When the format vocabulary is declared with a value of false, an implementation
-SHOULD treat "format" as an annotation keyword, to facilitate applications which
-wish to do their own semantic validation.
-<cref>
-This is not the normal behavior for a vocabulary declared with false, which is
-why the requirement is a SHOULD. Implementations MAY ignore "format" entirely
-as is allowed by false vocabulary declarations. However, due to the long history
-of this keyword, treating it as something of a special case seems reasonable.
-This may be revised in future drafts based on feedback.
-</cref>
-</t>
-<t>
-Implementations MAY treat "format" as an assertion when the vocabulary is declared
-with false, as false vocabularies are optional rather than forbidden. However,
-as noted above, implementations SHOULD provide a way to disable such validation.
-</t>
-<t>
-Implementations MAY support custom format attributes. Save for agreement between
-parties, schema authors SHALL NOT expect a peer implementation to support such
-custom format attributes. An implementation MUST NOT fail
-validation or cease processing due to an unknown format attribute.
-If treating "format" as an annotation, implementations SHOULD collect both
-known and unknown format attribute values.
-</t>
-<t>
-Vocabularies do not support specifically declaring different value sets for keywords.
-Due to this limitation, and the historically uneven implementation of this keyword,
-it is RECOMMENDED to define additional keywords in a vocabulary rather than
-additional format attributes if interoperability is desired.
-</t>
+<t>
+When implementing this entire specification, this vocabulary MUST
+be supported with a value of false (but see details below),
+and MAY be supported with a value of true.
+</t>
+
+<t>
+When the vocabulary is declared with a value of false, an implementation:
+<list>
+<t>
+MUST NOT evaluate "format" as an assertion unless it is explicitly
+configured to do so;
+</t>
+<t>
+SHOULD provide an implementation-specific best effort validation
+for each format attribute defined below;
+</t>
+<t>
+MAY choose to implement validation of any or all format attributes
+as a no-op by always producing a validation result of true;
+</t>
+<t>
+SHOULD document its level of support for validation.
+</t>
+</list>
+<cref>
+This matches the current reality of implementations, which provide
+widely varying levels of validation, including no validation at all,
+for some or all format attributes. It is also designed to encourage
+relying only on the annotation behavior and performing semantic
+validation in the application, which is the recommended best practice.
+</cref>
+</t>
+
+<t>
+When the vocabulary is declared with a value of true, an implementation
+that supports this form of the vocabulary:
+<list>
+<t>
+MUST evaluate "format" as an assertion unless it is explicitly
+configured not to do so;
+</t>
+<t>
+MUST implement syntactic validation for all format attributes defined
+in this specification, and for any additional format attributes that
+it recognizes, such that there exist possible instance values
+of the correct type that will fail validation.
+</t>
+</list>
+The requirement for minimal validation of format attributes is intentionally
+vague and permissive, due to the complexity involved in many of the attributes.
+Note in particular that the requirement is limited to syntactic checking; it is
+not to be expected that an implementation would send an email, attempt to connect
+to a URL, or otherwise check the existence of an entity identified by a format
+instance.
+<cref>
+The expectation is that for simple formats such as date-time, syntactic
+validation will be thorough. For a complex format such as email addresses,
+which are the amalgamation of various standards and numerous adjustments
+over time, with obscure and/or obsolete rules that may or may not be
+restricted by other applications making use of the value, a minimal validation
+is sufficient. For example, an instance string that does not contain
+an "@" is clearly not a valid email address, and an "email" or "hostname"
+containing characters outside of 7-bit ASCII is likewise clearly invalid.
+</cref>
+</t>
+<t>
+It is RECOMMENDED that implementations use a common parsing library for each format,
+or a well-known regular expression. Implementations SHOULD clearly document
+how and to what degree each format attribute is validated.
+</t>
+<t>
+The <xref target="meta-schema">standard core and validation meta-schema</xref>
+includes this vocabulary in its "$vocabulary" keyword with a value of false,
+since by default implementations are not required to support this keyword
+as an assertion. Supporting the format vocabulary with a value of true is
+understood to greatly increase code size and in some cases execution time,
+and will not be appropriate for all implementations.
+</t>
+</section>
+<section title="Custom format attributes">
+<t>
+Implementations MAY support custom format attributes. Save for agreement between
+parties, schema authors SHALL NOT expect a peer implementation to support such
+custom format attributes. An implementation MUST NOT fail
+validation or cease processing due to an unknown format attribute.
+When treating "format" as an annotation, implementations SHOULD collect both
+known and unknown format attribute values.
+</t>
+<t>
+Vocabularies do not support specifically declaring different value sets for keywords.
+Due to this limitation, and the historically uneven implementation of this keyword,
+it is RECOMMENDED to define additional keywords in a custom vocabulary rather than
+additional format attributes if interoperability is desired.
+</t>
+</section>
 </section>
 
 <section title="Defined Formats">
@@ -1322,8 +1405,10 @@
 <list style="hanging">
 <t hangText="draft-handrews-json-schema-validation-02">
 <list style="symbols">
-<t>Update "format" implementation requirements in terms of vocabularies</t>
 <t>Grouped keywords into formal vocabuarlies</t>
+<t>Update "format" implementation requirements in terms of vocabularies</t>
+<t>By default, "format" MUST NOT be validated, although validation can be enabled</t>
+<t>A vocabulary declaration can be used to require "format" validation</t>
 <t>Moved "definitions" to the core spec as "$defs"</t>
 <t>Moved applicator keywords to the core spec</t>
 <t>Renamed the array form of "dependencies" to "dependentRequired", moved the schema form to the core spec</t>
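
Illustration (not part of the commit): the defaulting rules this change introduces can be read as a small decision function. The Python below is a minimal sketch under stated assumptions, not any implementation's actual API. The names `format_assertion_enabled`, `check_email_minimal`, `CHECKERS`, and `evaluate_format` are hypothetical. The email check applies only the two examples given in the new text (an "@" must be present, and only 7-bit ASCII is allowed), and the sketch assumes "email" applies to strings only, so other instance types pass.

```python
from typing import Callable, Dict, Optional, Tuple


def format_assertion_enabled(vocab_value: bool,
                             explicit: Optional[bool] = None) -> bool:
    """Decide whether "format" is evaluated as an assertion.

    vocab_value: boolean from the format vocabulary's "$vocabulary" entry.
    explicit:    hypothetical user option; None means "not specified".
    """
    if explicit is not None:
        # An explicit option overrides the default in both cases.
        return explicit
    # Otherwise the vocabulary's boolean value is the default.
    return vocab_value


def check_email_minimal(instance: object) -> bool:
    """Minimal syntactic check based on the two examples in the new text."""
    if not isinstance(instance, str):
        return True  # assumption: "email" only constrains strings
    return "@" in instance and instance.isascii()


# Hypothetical registry of known format attributes.
CHECKERS: Dict[str, Callable[[object], bool]] = {"email": check_email_minimal}


def evaluate_format(instance: object, attribute: str, vocab_value: bool,
                    explicit: Optional[bool] = None) -> Tuple[bool, str]:
    """Return (validation result, collected annotation value)."""
    valid = True
    if format_assertion_enabled(vocab_value, explicit):
        checker = CHECKERS.get(attribute)
        # Unknown attributes must not cause a validation failure.
        if checker is not None:
            valid = checker(instance)
    # The annotation is collected regardless of assertion behavior.
    return valid, attribute
```

With the vocabulary declared false and no explicit option, `evaluate_format("not-an-email", "email", False)` passes and still yields the annotation. Passing `explicit=True` turns the check on. Note that for the false case the new text also allows an implementation to treat any check as a no-op, so the best-effort behavior shown here is only one permitted choice.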
