Description
[this is a bit stream-of-consciousness, but I wanted to get it filed because I keep forgetting. We'll clean up the ideas here on the way to PRs]
`format` confuses pretty much everyone. I have noticed people filing issues against various implementations complaining of imperfect enforcement (I believe @Julian has received complaints about "email", and @johandorland about "hostname", and I suspect they are not alone).
`format`, `contentMediaType`, and `contentEncoding` are essentially best-effort validation keywords in practice. Many if not most implementations make at least some effort to validate `format`. I'm not sure if anything attempts that for `content*`, as they are new (at least as part of the validation spec), and they would essentially require parsing the string encoding and media type, which is potentially very expensive.
Complicating the matter for `format` is the fact that many of the relatively fundamental internet-related formats such as "email" and "hostname" are very old, and conformance to specifications is rather complicated.
For "hostname", RFC 1034 forbids leading digits, but RFC 1123 later relaxed that restriction and it is often ignored in practice, leading to ambiguous overlap with "ipv4" as a format. In practice, most programs that accept hostnames will also accept IPv4 addresses and just recognize that no DNS resolution is required, so this is rarely a concern.
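To make the overlap concrete, here is a minimal sketch. The function names and the two label patterns are illustrative assumptions, not taken from any particular implementation: the "strict" pattern requires a leading letter, the "lenient" one also allows a leading digit.

```python
import ipaddress
import re

# Hypothetical label patterns for illustration only.
# Strict: each label must start with a letter (the RFC 1034 reading).
STRICT_LABEL = re.compile(r"[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")
# Lenient: a label may also start with a digit.
LENIENT_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_hostname(value: str, strict: bool = False) -> bool:
    """Best-effort syntactic check; does not perform DNS resolution."""
    if not value or len(value) > 253:
        return False
    pattern = STRICT_LABEL if strict else LENIENT_LABEL
    return all(pattern.fullmatch(label) for label in value.split("."))


def is_ipv4(value: str) -> bool:
    """Dotted-quad check using the standard library parser."""
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True


candidate = "192.0.2.1"
# Under the lenient rule the same string satisfies both formats.
print(is_hostname(candidate), is_ipv4(candidate))
# Under the strict rule it is only an IPv4 address.
print(is_hostname(candidate, strict=True), is_ipv4(candidate))
```

Which rule an implementation picks decides whether a schema using "hostname" also admits IPv4 literals, which is exactly the kind of difference that shows up as an "imperfect enforcement" complaint.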
The difficulty of validating email addresses, even on the syntactical level, is well-documented (try finding a regular expression that will do it, for instance, and if you find an actual iron-clad one, let me know).
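As a small illustration of why regex-based checks fall short, the sketch below uses a commonly seen simplified pattern (an assumption chosen for the example, not what any specific implementation uses). It rejects an address with a quoted local part, which RFC 5322 allows, and accepts one with consecutive dots in an unquoted local part, which RFC 5322 does not.

```python
import re

# A deliberately simplified pattern of the kind often used for "email".
NAIVE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def naive_is_email(value: str) -> bool:
    """Return True if the whole string matches the simplified pattern."""
    return NAIVE_EMAIL.fullmatch(value) is not None


# Syntactically valid per RFC 5322 (quoted local part), but rejected here.
false_negative = naive_is_email('"john doe"@example.com')

# Not valid per RFC 5322 (consecutive dots in an unquoted local part),
# but accepted here.
false_positive = naive_is_email("a..b@example.com")

print(false_negative, false_positive)
```

Either outcome can be reported as a bug by a user, depending on which addresses they care about.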
Leveraging our relatively recent keyword classification work, I think it is best to classify these primarily as annotations rather than treating them as some sort of hybrid annotation+assertion. Annotations can specify any intent, including semantic validation or parsing instructions. The specification should provide guidance on how an implementation might directly offer handlers for such intents, and how to indicate the available level of support.
Applications can, as with any annotation, then perform additional processing if the implementation either does not offer any validation, or offers only incomplete validation. The spec already says that implementations SHOULD offer an ability to turn semantic validation off, so we can extend that guidance (probably at the MAY level) to cover situations like allowing hooks for application-defined processing in addition to or in place of implementation-supplied validation.
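One possible shape for this, sketched under assumptions: the class name `FormatProcessor`, the `assert_formats` switch, and the `register` hook are all hypothetical and do not come from the spec or any existing library. The idea is that `format` always produces an annotation, while assertion behavior is optional and can be supplied or replaced by the application.

```python
from typing import Callable, Dict, List, Tuple

# A handler takes the string instance and returns True if it is acceptable.
FormatHandler = Callable[[str], bool]


class FormatProcessor:
    """Hypothetical sketch: `format` as an annotation with optional checks."""

    def __init__(self, assert_formats: bool = True) -> None:
        # When False, handlers are never consulted (semantic validation off).
        self.assert_formats = assert_formats
        self.handlers: Dict[str, FormatHandler] = {}

    def register(self, name: str, handler: FormatHandler) -> None:
        """Add or replace the handler for a format name."""
        self.handlers[name] = handler

    def process(self, instance: object, schema: dict) -> Tuple[List[dict], List[str]]:
        """Return (annotations, errors) for the `format` keyword only."""
        annotations: List[dict] = []
        errors: List[str] = []
        fmt = schema.get("format")
        # `format` only applies to string instances; ignore everything else.
        if fmt is None or not isinstance(instance, str):
            return annotations, errors
        # The annotation is always collected, whether or not a handler exists.
        annotations.append({"keyword": "format", "value": fmt})
        handler = self.handlers.get(fmt)
        if self.assert_formats and handler is not None and not handler(instance):
            errors.append(f"{instance!r} is not a valid {fmt!r}")
        return annotations, errors


processor = FormatProcessor()
# Application-supplied check, deliberately crude for the example.
processor.register("email", lambda s: "@" in s)
print(processor.process("nobody", {"format": "email"}))
print(processor.process("nobody", {"format": "some-unknown-format"}))
```

An unknown format still yields its annotation, so the application can apply its own processing afterwards; turning `assert_formats` off leaves annotations intact while suppressing errors.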
And of course, all of this is dependent on an implementation supporting annotations. As with the `additionalProperties` and `additionalItems` keywords (now in the core spec and defined in terms of annotation collection), the spec should allow existing implementations that do not implement general annotation collection to remain valid and in conformance.