Vocabulary keyword meta-data, particularly for in-place applicators #602

Since `unevaluatedProperties` (#556) and `unevaluatedItems` (#557) depend on the results of other keywords, not just in the immediate schema object but in subschemas, we need to decide how extension keywords can or cannot impact that behavior.  There are two cases:

1.  New [object child applicators](https://github.com/json-schema-org/json-schema-spec/blob/3a517363ebe16441f4c0ba74a0737b742aa76132/jsonschema-core.xml#L1217) or [array child applicators](https://github.com/json-schema-org/json-schema-spec/blob/3a517363ebe16441f4c0ba74a0737b742aa76132/jsonschema-core.xml#L1159)

2.  New [in-place applicators](https://github.com/json-schema-org/json-schema-spec/blob/3a517363ebe16441f4c0ba74a0737b742aa76132/jsonschema-core.xml#L1008)

For child applicators, as noted in https://github.com/json-schema-org/json-schema-spec/issues/530#issuecomment-392608099, we should **not** allow them to change the behavior of `unevaluated*`.  This follows the precedent of `additionalProperties` and `additionalItems`, whose behavior does not change as a result of new keywords: they are defined solely in terms of `properties`/`patternProperties` or `items`, respectively.
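As a minimal sketch of that precedent (the function name `additional_names` is made up for illustration, and Python's `re` stands in for the ECMA 262 regular expressions that JSON Schema actually specifies), the set of properties that `additionalProperties` applies to can be computed from the adjacent `properties` and `patternProperties` alone:

```python
import re

def additional_names(schema, instance):
    """Property names that `additionalProperties` applies to: those matched by
    neither the adjacent `properties` nor the adjacent `patternProperties`."""
    named = set(schema.get("properties", {}))
    patterns = list(schema.get("patternProperties", {}))
    return {
        name for name in instance
        if name not in named and not any(re.search(p, name) for p in patterns)
    }

schema = {
    "properties": {"foo": {}},
    # `properties` inside a subschema is invisible to the adjacent-only rule.
    "allOf": [{"properties": {"bar": {}}}],
    "additionalProperties": False,
}
print(additional_names(schema, {"foo": 1, "bar": 2}))  # {'bar'}
```

Nothing in this computation looks at `allOf` or at any other keyword, which is why a new child applicator cannot change the outcome.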

For new in-place applicators, which could *contain* `*properties` or `*items` keywords, the situation is more complex.

### _TL;DR:_

* we need to allow `*properties` and `*items` to affect `unevaluated*` even when they are in subschemas of an extension in-place applicator.
* we can solve this generally by declaring some keyword meta-data in vocabularies
* implementations that just want to implement standard vocabularies can hardwire stuff and not worry about handling meta-data at runtime, so it won't be a huge burden on implementations

### Example

In our brave new world of multi-vocabulary schemas, let's pretend someone decides to create an extension keyword `patternSchemaDependencies` which is a cross between `patternProperties` and `schemaDependencies` (the old schema form of `dependencies`).  So, if the instance is an object, and at least one property matches a pattern in `patternSchemaDependencies`, then that pattern's subschema is applied to the *current* instance location, making it an *in-place applicator*.

Consider this schema using that keyword.  Assume that `patternSchemaDependencies` is properly declared in the meta-schema referenced by `$schema` and in whatever vocabulary stuff we come up with, and that the implementation will only process the schema if it understands the extension vocabulary (see #561 for details).

```JSON
{
    "patternSchemaDependencies": {
        "^foo": {
            "properties": {
                "bar": {"type": "string"}
            }
        }
    },
    "patternProperties": {
        "^foo": true
    },
    "unevaluatedProperties": false
}
```

Should `{"foooo": 1, "bar": "hello"}` be valid or invalid?

My intuition says that it should be _valid_.  `unevaluatedProperties` applies to properties that have never had a subschema from `properties`, `patternProperties`, `additionalProperties`, or another `unevaluatedProperties` applied to them.

In this example, because of `patternSchemaDependencies`, the "bar" property is covered by the schema at `#/patternSchemaDependencies/^foo/properties/bar`.

### The problem

The reason this might not work is that we (presumably) did not know about `patternSchemaDependencies` when we wrote the spec for `unevaluatedProperties`.  So the implementation might not know that it could affect the behavior of `unevaluatedProperties`.

If it happens to check `patternSchemaDependencies` first, this won't matter: as explained in https://github.com/json-schema-org/json-schema-spec/issues/530#issuecomment-392608099, the `properties` keyword in its subschema would put the property name "bar" in the "properties" annotation, and `unevaluatedProperties` would notice it and exclude it from its applicable set.

_However_, if the implementation happens to check `unevaluatedProperties` **before** it checks `patternSchemaDependencies`, then the annotation results for "properties" at that point will **not** include "bar" (or anything else, in this example).  So `unevaluatedProperties` will apply its `false` subschema to "bar", which will fail validation, and `patternSchemaDependencies` will never even be checked.

So not only would it seem counter-intuitive (to me, at least) for this to fail, it's actually non-deterministic.  It depends entirely on the keyword evaluation order, which is not constrained by the spec.
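The order dependence can be made concrete with a toy evaluator.  This is a minimal sketch, not how any real implementation works: it tracks only which property names have been covered, treats the hypothetical `patternSchemaDependencies` as an in-place applicator, and takes the keyword order as an explicit argument.  The function name `is_valid` and the single shared `covered` set are assumptions for illustration.

```python
import re

SCHEMA = {
    "patternSchemaDependencies": {
        "^foo": {"properties": {"bar": {"type": "string"}}}
    },
    "patternProperties": {"^foo": True},
    "unevaluatedProperties": False,
}

def is_valid(schema, instance, order):
    """Process keywords in `order`, collecting covered property names."""
    covered = set()  # stand-in for the collected annotations
    for keyword in order:
        value = schema.get(keyword)
        if value is None:
            continue
        if keyword == "patternProperties":
            covered |= {n for n in instance for p in value if re.search(p, n)}
        elif keyword == "patternSchemaDependencies":
            # In-place: the subschema applies to the same instance location.
            for pattern, sub in value.items():
                if any(re.search(pattern, n) for n in instance):
                    covered |= set(sub.get("properties", {})) & set(instance)
        elif keyword == "unevaluatedProperties":
            # A `false` subschema fails on any property not yet covered.
            if value is False and set(instance) - covered:
                return False
    return True

instance = {"foooo": 1, "bar": "hello"}
extension_first = ["patternProperties", "patternSchemaDependencies", "unevaluatedProperties"]
unevaluated_first = ["patternProperties", "unevaluatedProperties", "patternSchemaDependencies"]

print(is_valid(SCHEMA, instance, extension_first))    # True
print(is_valid(SCHEMA, instance, unevaluated_first))  # False
```

The same schema and instance produce opposite results, and the only thing that changed is the order in which the keywords were processed.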

### Implementation burden

How could an implementation possibly know that it needs to check `patternSchemaDependencies` before `unevaluatedProperties`?  Of course, if the implementation only supports a fixed set of known vocabularies, the implementation author could hardwire `patternSchemaDependencies` and any other known in-place applicators as being checked before `unevaluatedProperties`.

That is totally acceptable for fixed-vocabulary implementations, _and I expect many will go this route_.

However, it breaks down if someone wants to make a generically extensible implementation where 3rd-parties can register handlers for new vocabularies and keywords at runtime.  This is not a hypothetical situation; [Ajv's custom keyword support](https://github.com/epoberezkin/ajv#defining-custom-keywords) does exactly this already.

Of course, an extensible implementation's interface could provide a way to pass in such information when registering the keyword.  However, leaving this interface to individual implementations to design will lead to varying quality and ease of use, raising the barrier to adoption of extensions.

For that matter, needing to figure out the registration design is a significant task that probably discourages making implementations extensible in the first place.


### A solution

Fortunately there's nothing magical about `patternSchemaDependencies`, specifically.  All in-place applicators will have this effect, whether they are keywords like `allOf` that we know about now, or keywords of this sort added in the future by 3rd parties.

Generally a keyword should either affect things based on its classification (in this example, all present and future in-place applicators, regardless of specific behavior, are involved), or based on the specific keyword itself (in which case, as with `additionalProperties` depending on `properties` and `patternProperties`, the relevant keywords are enumerated in the specification).

With #561 vocabulary support, we now have a way to indicate that we are defining schema keywords.  We can tag these keyword definitions with various properties in the meta-schema.  The structure of these tags would provide a standard interface for writing extensible implementations.

Presumably, most implementations would be passed the relevant meta-schemas as part of their extension loading sequence, and would retrieve them by recognizing the vocabulary URI at runtime (similar to how most implementations pre-package the standard meta-schemas rather than dynamically resolving them from somewhere).

We could add a keyword description object (KDO), and a keyword called `keyword` or `$keyword` that takes that object as a value.  I'm suggesting an object, similar to `links` with its array of LDOs, as the information in the KDO will probably be processed very differently from other keywords.  I could also see using the prefixed compound word form, but this does feel distinct enough for an object.


### Solution example

It could look something like this (off the top of my head without much thought to the syntax, so while we can discuss syntax as part of the overall solution, complaints about minor details will be ignored for now; syntax is always solvable).

This example shows the declaration of an in-place applicator (`allOf`), plus a child applicator that depends only on specific keywords (`additionalProperties`) and one that depends on both specific keywords and on a whole class of keywords (`unevaluatedProperties`).

The specific keyword dependencies are notated in terms of the annotations produced by that keyword, which is how such dependencies are now described in the specification.  Annotation values are read either from adjacent keywords only, or from subschemas _in addition to_ adjacent keywords.

Note that when `relevantTypes` is absent, the keyword applies to all possible instance types.

```JSON
{
    "type": "object",
    "properties": {
        "allOf": {
            "type": "array",
            "items": {"$recurse": true},
            "$keyword": {
                "applicator": {
                    "instanceLocation": "in-place",
                    "schemaLocation": "local"
                }
            }
        },
        "additionalProperties": {
            "type": "object",
            "$recurse": true,
            "$keyword": {
                "relevantTypes": ["object"],
                "applicator": {
                    "instanceLocation": "child",
                    "schemaLocation": "local"
                },
                "annotation": true,
                "dependsOn": {
                    "annotations": {
                        "properties": "adjacentOnly",
                        "patternProperties": "adjacentOnly"
                    }
                }
            }
        },
        "unevaluatedProperties": {
            "type": "object",
            "additionalProperties": {"$recurse": true},
            "$keyword": {
                "relevantTypes": ["object"],
                "applicator": {
                    "instanceLocation": "child",
                    "schemaLocation": "local"
                },
                "annotation": true,
                "dependsOn": {
                    "annotations": {
                        "properties": "subschemas",
                        "patternProperties": "subschemas",
                        "additionalProperties": "subschemas",
                        "unevaluatedProperties": "subschemas"
                    },
                    "classifications": {
                        "applicator": {
                            "instanceLocation": "in-place"
                        }
                    }
                }
            }
        }
    }
}
```
 
To explain the `schemaLocation` part, note that `$ref` (which would not be in the same vocabulary) would have `{"instanceLocation": "in-place", "schemaLocation": "remote"}`.  Since the classification dependency for `unevaluatedProperties` only mentions `instanceLocation`, that means that it matches regardless of the value of `schemaLocation`.  This is very hand-wavy and I have not thought through all implications.  I am sure that there will be a way to work it out.
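To make the matching rule a bit less hand-wavy, here is a rough sketch of how an extensible implementation might turn KDOs into an evaluation order.  Everything here is an assumption for illustration: the KDOs are trimmed versions of the example above written in the `instanceLocation`/`schemaLocation` form, the KDO for `properties` is invented, and the helper names `matches`, `prerequisites` and `evaluation_order` are hypothetical.  A classification matches when every field it names agrees with the keyword's KDO; fields it omits match anything.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical KDOs keyed by keyword name.
KDOS = {
    "allOf": {"applicator": {"instanceLocation": "in-place", "schemaLocation": "local"}},
    "$ref": {"applicator": {"instanceLocation": "in-place", "schemaLocation": "remote"}},
    "properties": {"applicator": {"instanceLocation": "child", "schemaLocation": "local"}},
    "unevaluatedProperties": {
        "applicator": {"instanceLocation": "child", "schemaLocation": "local"},
        "dependsOn": {
            "annotations": {"properties": "subschemas"},
            "classifications": {"applicator": {"instanceLocation": "in-place"}},
        },
    },
}

def matches(classification, kdo):
    """True if every field named in `classification` equals the KDO's value."""
    for kind, wanted in classification.items():
        actual = kdo.get(kind)
        if not isinstance(actual, dict):
            return False
        if any(actual.get(field) != value for field, value in wanted.items()):
            return False
    return True

def prerequisites(name, kdos):
    """Keywords that must be evaluated before `name`."""
    depends = kdos[name].get("dependsOn", {})
    # Specific keywords, named by the annotations they produce.
    before = {k for k in depends.get("annotations", {}) if k in kdos and k != name}
    # Whole classes of keywords, matched by classification.
    classes = depends.get("classifications")
    if classes:
        before |= {k for k, kdo in kdos.items() if k != name and matches(classes, kdo)}
    return before

def evaluation_order(kdos):
    """An order in which every keyword follows its prerequisites."""
    graph = {name: prerequisites(name, kdos) for name in kdos}
    return list(TopologicalSorter(graph).static_order())

order = evaluation_order(KDOS)
print(order)
```

With these KDOs, both `allOf` and `$ref` land before `unevaluatedProperties` because the classification dependency says nothing about `schemaLocation`.  A newly registered in-place applicator such as `patternSchemaDependencies` would be picked up the same way, without `unevaluatedProperties` having to name it.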

There's obviously a lot more that could be done in this area, and we need to figure out what is so essential that it needs to be in draft-08, and what can be deferred.  But I think that this mechanism is a key part of enabling schema designers to write their own vocabulary, _and have a viable chance of that vocabulary becoming interoperable across multiple implementations_.

### Should `[$]keyword` be part of core?

Should the keyword description object be part of core or its own vocabulary?  I'd say that this will be determined by whether we consider extensible implementations a fundamental part of the JSON Schema system.  If they are, then we need `$keyword` to bootstrap the system.  If they are not, then we can make this a separate thing that only extensible implementations need to support.

