Basic vocabulary support #561

Closed
@handrews

Proposal

  • Vocabularies

    • Group keyword definitions and their semantics (the proposal does not add any way to formally define semantics in JSON; they're defined in the specs just as they are now)
    • Are identified by URIs, which (like meta-schema URIs) should not automatically be retrieved
    • The URI SHOULD point to a meta-schema describing only that vocabulary's keywords
    • SHOULD NOT overlap with or redefine keywords from other vocabularies
      • For example, the hyper-schema meta-schema includes validation, but the hyper-schema vocabulary is only base and links, plus the keywords in the LDOs
    • Vocabularies are file-scoped; keywords MUST have the same semantics throughout a single schema document
  • Meta-Schemas

    • In a meta-schema, $vocabularies takes a list of URIs identifying the vocabularies described by the meta-schema
    • Like $schema in a schema, $vocabularies must be in the root object of the meta-schema
    • Meta-schemas SHOULD validate the combination of vocabularies they declare
    • Combining vocabularies is facilitated by $recurse (Recursive schema composition #558)
    • Meta-schemas MAY further constrain that vocabulary combination
    • Meta-schemas MAY describe keywords that are not in any declared vocabulary
    • Meta-schemas that do not declare a vocabulary, or that declare additional keywords, create an anonymous vocabulary (this just fits existing meta-schemas into the vocabulary concept)
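
A minimal sketch of how an implementation might read $vocabularies from the root object of a meta-schema and decide whether it recognizes everything that is declared. The function names and the `SUPPORTED` set are hypothetical; the proposal does not define any API. URIs are compared as plain strings and never retrieved, in line with the "should not automatically be retrieved" point above.

```python
def declared_vocabularies(meta_schema):
    """Return the vocabulary URIs declared in the root object of a meta-schema."""
    if not isinstance(meta_schema, dict):
        return []  # boolean meta-schemas declare nothing
    # Only the root object is consulted, mirroring the rule for $schema.
    return list(meta_schema.get("$vocabularies", []))


def unsupported_vocabularies(meta_schema, supported):
    """List declared vocabulary URIs that this (hypothetical) implementation does not know."""
    return [uri for uri in declared_vocabularies(meta_schema) if uri not in supported]


BASE = "http://json-schema.org/draft-08/vocabularies/"

# Assumption: an implementation that knows the two placeholder vocabularies only.
SUPPORTED = {BASE + "core-applicators", BASE + "validation-assertions"}

HYPER_META_SCHEMA = {
    "$id": "http://json-schema.org/draft-08/hyper-schema",
    "$vocabularies": [
        BASE + "core-applicators",
        BASE + "validation-assertions",
        BASE + "hyper-schema",
    ],
}

print(unsupported_vocabularies(HYPER_META_SCHEMA, SUPPORTED))
```

What an implementation does with an unrecognized vocabulary is left open here; the sketch only reports it.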

Examples:

NOTE: "core-applicators" (stuff moved by #513) and "validation-assertions" (stuff left behind by #513) are not final names or vocabulary boundaries, I literally made them up while typing, please do not complain about whether they are "correct".

The applicators (per #513) as a vocabulary

This is where $recurse (#558) would primarily be used. This assumes that dependencies has been split per #528, and the applicator version is still called dependencies, while the string form is renamed and left in the validation vocabulary.

{
    "$id": "http://json-schema.org/draft-08/vocabularies/core-applicators",
    "$schema": "http://json-schema.org/draft-08",
    "type": ["object", "boolean"],
    "properties": {
        "allOf": {"type": "array", "items": {"$recurse": true}},
        "anyOf": {"type": "array", "items": {"$recurse": true}},
        "oneOf": {"type": "array", "items": {"$recurse": true}},
        "not": {"$recurse": true},
        "if": {"$recurse": true},
        "then": {"$recurse": true},
        "else": {"$recurse": true},
        "items": {
            "oneOf": [
                {"type": "array", "items": {"$recurse": true}},
                {"$recurse": true}
            ]
        },
        "additionalItems": {"$recurse": true},
        "contains": {"$recurse": true},
        "properties": {"type": "object", "additionalProperties": {"$recurse": true}},
        "patternProperties": {"type": "object", "additionalProperties": {"$recurse": true}},
        "additionalProperties": {"$recurse": true},
        "propertyNames": {"$recurse": true},
        "dependencies": {"type": "object", "additionalProperties": {"$recurse": true}}
    }
}
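
To illustrate why the vocabulary schema above uses "$recurse": true, here is a toy Python sketch. It assumes that $recurse (#558) re-enters the root of the meta-schema where processing began, which is my reading and not something this issue specifies. The validator handles only boolean schemas, $ref, allOf, properties and $recurse, and the `urn:example:` identifier is hypothetical.

```python
def validate(instance, schema, registry, entry_root):
    """Toy validator for a handful of keywords; not a real JSON Schema implementation."""
    if isinstance(schema, bool):
        return schema  # true accepts everything, false rejects everything
    checks = []
    if schema.get("$recurse") is True:
        # Assumed #558 behaviour: jump back to the meta-schema where processing started.
        checks.append((instance, entry_root))
    if "$ref" in schema:
        checks.append((instance, registry[schema["$ref"]]))
    checks.extend((instance, sub) for sub in schema.get("allOf", []))
    if isinstance(instance, dict):
        for name, sub in schema.get("properties", {}).items():
            if name in instance:
                checks.append((instance[name], sub))
    return all(validate(value, sub, registry, entry_root) for value, sub in checks)


VOCAB = "urn:example:core-applicators"  # hypothetical identifier

# Two variants of a one-keyword applicator vocabulary.
with_recurse = {VOCAB: {"properties": {"not": {"$recurse": True}}}}
with_plain_ref = {VOCAB: {"properties": {"not": {"$ref": VOCAB}}}}

# A meta-schema that combines the vocabulary and forbids patternProperties.
meta = {"allOf": [{"$ref": VOCAB}], "properties": {"patternProperties": False}}

# A schema that hides patternProperties inside "not".
nested = {"not": {"patternProperties": {"^a": True}}}

print(validate(nested, meta, with_recurse, meta))    # False: the restriction follows the recursion
print(validate(nested, meta, with_plain_ref, meta))  # True: the nested schema escapes the restriction
```

With a self-reference, the nested subschema is only checked against the vocabulary schema, so the combining meta-schema would have to redefine every recursive keyword to keep its restrictions in force.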

Hyper-Schema as a vocabulary

This only shows some of the LDO fields, and ignores that we actually distribute the links schema as a separate file.

{
    "$id": "http://json-schema.org/draft-08/vocabularies/hyper-schema",
    "$schema": "http://json-schema.org/draft-08/hyper-schema#",
    "type": ["object", "boolean"],
    "properties": {
        "base": {"type": "string", "format": "uri-template"},
        "links": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["rel", "href"],
                "properties": {
                    "rel": {"type": "string"},
                    "href": {"type": "string", "format": "uri-template"},
                    "hrefSchema": {"$recurse": true},
                    "targetSchema": {"$recurse": true},
                    "targetMediaType": {"type": "string"},
                    "targetHints": {"type": "object", "additionalProperties": true},
                    "headerSchema": {"$recurse": true},
                    "submissionSchema": {"$recurse": true},
                    "submissionMediaType": {"type": "string"},
                    ...
                }
            }
        }
    },
    "links": [
        {
            "rel": "self",
            "href": "{%24id}",
            "templateRequired": ["$id"]
        }
    ]
}

Hyper-Schema meta-schema with vocabularies

This assumes a "validation-assertions" vocabulary for the validation spec, and assumes the core keywords do not need to be declared as a vocabulary (although maybe they should be, I'm not sure). Also, I'm waving my hands when it comes to where the basic annotations (title, default, etc.) live. Just pretend that's settled somehow please, as sorting that out is not the point of this issue.

This also assumes that the regular draft-08 meta-schema properly assembles everything except for the hyper-schema vocabulary. So while we declare all of the vocabularies explicitly, to get the meta-schema behavior we just combine the regular meta-schema and the hyper-schema-vocabulary-only meta-schema (shown above).

{
    "$id": "http://json-schema.org/draft-08/hyper-schema",
    "$schema": "http://json-schema.org/draft-08/hyper-schema#",
    "$vocabularies": [
        "http://json-schema.org/draft-08/vocabularies/core-applicators",
        "http://json-schema.org/draft-08/vocabularies/validation-assertions",
        "http://json-schema.org/draft-08/vocabularies/hyper-schema"
    ],
    "allOf": [
        {"$ref": "http://json-schema.org/draft-08/schema"},
        {"$ref": "http://json-schema.org/draft-08/vocabularies/hyper-schema"}
    ]
}

OpenAPI 3.0's superset/subset problem

Using a meta-schema to constrain or add lightweight extensions helps discourage creating many similar vocabularies. For example, consider a meta-schema for OpenAPI's schema object, which does not allow the "null" type and instead has a boolean "nullable" keyword, and also does not allow patternProperties. @philsturgeon has referred to this mismatch as a "superset/subset".

OpenAPI also requires extension keywords to begin with "x-" and forbids other keywords that are not defined in the spec. Note the use of unevaluatedProperties (#556) for this.

This example explicitly allOfs the vocabulary schemas. A variation on the proposal is for $vocabularies to also do that implicitly. Needs a bit more thought on whether you'd ever not allOf them, and why. See #558 for why just allOf works without redefining recursive keywords (the core-applicators vocabulary would be written with "$recurse": true instead of "$ref": "#").

{
    "$id": "https://www.openapis.org/schema-object-metaschema",
    "$schema": "http://json-schema.org/draft-08",
    "$vocabularies": [
        "http://json-schema.org/draft-08/vocabularies/core-applicators",
        "http://json-schema.org/draft-08/vocabularies/validation-assertions"
    ],
    "allOf": [
        {"$ref": "http://json-schema.org/draft-08/vocabularies/core-applicators"},
        {"$ref": "http://json-schema.org/draft-08/vocabularies/validation-assertions"}
    ],
    "properties": {
        "type": {
            "type": "string",
            "not": {"const": "null"}
        },
        "nullable": {
            "type": "boolean",
            "default": false
        },
        "patternProperties": false
    },
    "patternProperties": {
        "^x-": true
    },
    "unevaluatedProperties": false
}

What's going on here is:

  • Anywhere an implementation sees a keyword from the core-applicators or validation-assertions vocabularies, it knows that the semantics are as defined by those vocabularies
  • It knows this even if it has never previously encountered this OpenAPI meta-schema, because $vocabularies makes that clear, while just having an allOf is ambiguous.
  • While all OpenAPI schema objects are valid against the given vocabularies, the reverse is not necessarily true:
    • The validation-assertions vocabulary defines semantics for a "null" value for type, but the meta-schema prevents that value from appearing
    • It also prevents the array form of type – in the normal meta-schema, the type of type is "type": ["string", "array"]
    • The core-applicators vocabulary defines semantics for patternProperties, but the meta-schema prevents that keyword from being used
  • The meta-schema defines additional keywords, which form a small anonymous vocabulary of sorts:
    • nullable is an extension keyword
    • ^x- is an extension keyword pattern
    • These are defined just as they would be without $vocabularies
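
The constraints listed above can be sketched as a hand-written, shallow check in Python. This is an illustration only: `KNOWN_KEYWORDS` is a hypothetical, deliberately incomplete stand-in for the keywords of the two declared vocabularies, the function name is made up, and the check looks at the top level of a single schema object without validating the values of ordinary keywords or descending into subschemas.

```python
# Assumption: a small sample of keywords from the two declared vocabularies.
KNOWN_KEYWORDS = {
    "type", "properties", "items", "allOf", "anyOf", "oneOf", "not",
    "patternProperties", "required", "enum",
}


def rejected_keywords(schema_object):
    """Return the top-level keywords the OpenAPI-style meta-schema would reject."""
    rejected = []
    for keyword, value in schema_object.items():
        if keyword == "type":
            # String form only, and never "null".
            ok = isinstance(value, str) and value != "null"
        elif keyword == "nullable":
            ok = isinstance(value, bool)
        elif keyword == "patternProperties":
            ok = False  # defined by the vocabulary, forbidden by the meta-schema
        else:
            # "^x-" extensions are allowed; anything else must be a known keyword,
            # which is the effect of "unevaluatedProperties": false.
            ok = keyword.startswith("x-") or keyword in KNOWN_KEYWORDS
        if not ok:
            rejected.append(keyword)
    return rejected


print(rejected_keywords({"type": "string", "nullable": True, "x-note": "ok"}))
print(rejected_keywords({"type": ["string", "null"], "patternProperties": {}, "frobnicate": 1}))
```

The first call prints an empty list; the second reports type, patternProperties and the unknown keyword.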

There's more to work out but I think this is enough to start the conversation and find out which parts are particularly confusing.

Activity

added this to the draft-08 milestone on Mar 8, 2018
handrews (Contributor, Author) commented on Mar 8, 2018

A key point of this proposal is that since meta-schemas still work the same way as always (although hopefully with #558's $recurse), an implementation is not required to do anything to support vocabularies. $vocabularies allows a more generic, flexible implementation of JSON Schema to be more intelligent about what it can process.

But a validator that either just looks at $schema for the standard meta-schema, or behaves based on caller input or configuration rather than paying attention to meta-schemas, is still just as compliant as always.

Unless we decide that $vocabularies implicitly means that the vocabularies are combined with allOf, there's no mandatory implementation for $vocabularies.

philsturgeon (Collaborator) commented on Mar 8, 2018

So to confirm, by default it's assumed "all vocabularies" are used? What is "all" and where does that come from?

handrews (Contributor, Author) commented on Mar 9, 2018

@philsturgeon I'll break the "by default" down a bit:

  • If there is no $schema in the schema, then (as one presumably does now) the implementation either makes a guess, applies its own default, or requires other input telling it how to proceed
  • If there is a $schema but no $vocabularies in the meta-schema, then just as now, an implementation SHOULD behave in accordance with $schema (but I've noticed many do whatever their default is unless you specifically tell them otherwise; honestly, I think we should tighten up this requirement, but I'll file that separately)

In the second case, with a $schema but no $vocabularies, the meta-schema is considered to be defining its own anonymous vocabulary. However, that concept has no practical impact right now; it just allows us to always say which vocabularies are involved: an anonymous one, a set of identified ones, or a set of identified ones with an additional anonymous vocabulary. In the OpenAPI example, nullable and ^x- are the terms in the anonymous vocabulary; type is not, as it is defined by one of the $vocabularies and the meta-schema is just adding a syntactical constraint without changing the semantics.
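
As a rough Python sketch of that classification: given a meta-schema and a table saying which keywords each identified vocabulary defines, whatever the meta-schema describes beyond those keywords forms its anonymous vocabulary. The `VOCABULARY_KEYWORDS` table and the function name are hypothetical, and the table lists only the keywords needed for the OpenAPI example.

```python
BASE = "http://json-schema.org/draft-08/vocabularies/"

# Assumption: a hand-maintained table of which vocabulary defines which keyword.
VOCABULARY_KEYWORDS = {
    BASE + "core-applicators": {"patternProperties", "properties", "allOf"},
    BASE + "validation-assertions": {"type", "required", "enum"},
}


def anonymous_vocabulary(meta_schema, vocabulary_keywords):
    """Return (keyword names, keyword patterns) not covered by any declared vocabulary."""
    covered = set()
    for uri in meta_schema.get("$vocabularies", []):
        covered |= vocabulary_keywords.get(uri, set())
    names = [k for k in meta_schema.get("properties", {}) if k not in covered]
    patterns = list(meta_schema.get("patternProperties", {}))
    return names, patterns


OPENAPI_META_SCHEMA = {
    "$vocabularies": [BASE + "core-applicators", BASE + "validation-assertions"],
    "properties": {
        "type": {"type": "string", "not": {"const": "null"}},
        "nullable": {"type": "boolean", "default": False},
        "patternProperties": False,
    },
    "patternProperties": {"^x-": True},
    "unevaluatedProperties": False,
}

print(anonymous_vocabulary(OPENAPI_META_SCHEMA, VOCABULARY_KEYWORDS))
```

Here type and patternProperties are only constrained, not redefined, so they stay with their identified vocabularies; nullable and the ^x- pattern are what remains. A meta-schema with no $vocabularies at all ends up with every described keyword in the anonymous set.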

It is still the case that by default, all unrecognized keywords are ignored. Therefore, there is no concept of "all" vocabularies.

If you had an actual blank meta-schema, it would allow everything, but would not indicate any semantics. So I wouldn't consider it a vocabulary. You don't have a vocabulary until you constrain that open set of everything into specific syntax constraints (expressed as meta-schemas) and semantics (defined in prose specifications; I have some thoughts on formalizing this, but almost certainly not in draft-08, as I'd like to get some feedback and understand use cases with the basic concept first).

Participants: @philsturgeon, @xferra, @handrews, @gregsdennis, @jgonzalezdr

Basic vocabulary support · Issue #561 · json-schema-org/json-schema-spec