Recursive schema composition

## TL;DR:

Re-using recursive schemas is a challenge.  

* `$recurse` is a specialized version of `$ref` with a context-dependent target
* The target is the root schema of the document where schema processing began
* Processing can be either static schema walking or dynamic evaluation with an instance
* The value of `$recurse` is always `true` (discussed in the "alternatives" section)
* This is based on a keyword we have long used in [Doca](https://github.com/cloudflare/doca)

## Example

_APPARENTLY MANDATORY DISCLAIMER: This is a minimal contrived example, please do not point out all of the ways in which it is unrealistic or fails to be a convincing use case because you can refactor it.  It's just showing the mechanism._

foo-schema:
```JSON
{
    "$id": "http://example.com/foo-schema",
    "type": "object",
    "properties": {
        "foo": {"$recurse": true}
    }
}
```
bar-schema:
```JSON
{
    "$id": "http://example.com/bar-schema",
    "allOf": [{"$ref": "http://example.com/foo-schema"}],
    "required": ["bar"],
    "properties": {"bar": {"type": "boolean"}}
}
```

The instance:
```JSON
{
    "bar": true,
    "foo": {
        "bar": false,
        "foo": {
            "foo": {}
        }
    }
}
```
is valid against the first schema, but not the second.

It is valid against foo-schema because the `"$recurse": true` is in foo-schema, which is the same document that we started processing.  Therefore it behaves exactly like `"$ref": "#"`.  The recursive "foo" works as you'd expect with `"$ref": "#"`, and foo-schema doesn't care about "bar" being there (additional properties are not forbidden).

However, it is not valid against bar-schema because in that case, the `"$recurse": true` in foo-schema behaves like `"$ref": "http://example.com/bar-schema"`, as bar-schema is the document that we started processing.  Taking this step by step from the top down:

* Processing the root of the instance, we have the "bar" property required by bar-schema; we got this directly from the root schema of bar-schema, without `$recurse` being involved
* Still at the root of the instance, processing follows the `allOf` and `$ref` into foo-schema.  The top-level instance is an object, so we pass the `type` constraint
* Still processing foo-schema, for the contents of the "foo" property, we have `"$recurse": true`.  Since we started processing with bar-schema, this is the equivalent of `"$ref": "bar-schema"`
* So now we apply bar-schema to the contents of "foo".  This works fine: there is a boolean "bar", and we follow `allOf` and `$ref` back to foo-schema, and pass the `"type": "object"` constraint
* Now, once again, we look at `"$recurse": true` to go into the next level "foo", and once again this is treated as `"$ref": "bar-schema"`
* Now validation fails, because this second-level "foo" does not have the required "bar" property (nor does the innermost one).
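The steps above can be sketched as a tiny evaluator in Python. This is only an illustration of the mechanism, not a real validator: it handles just `type`, `required`, `properties`, `allOf`, `$ref` and `$recurse`, and the names `SCHEMAS`, `validate` and `validate_from` are made up for this sketch. It assumes `type` sits at the top level of foo-schema and that bar-schema's `$ref` targets foo-schema's `$id`.

```python
# Hypothetical in-memory registry: "$id" -> schema document.
SCHEMAS = {
    "http://example.com/foo-schema": {
        "$id": "http://example.com/foo-schema",
        "type": "object",
        "properties": {"foo": {"$recurse": True}},
    },
    "http://example.com/bar-schema": {
        "$id": "http://example.com/bar-schema",
        "allOf": [{"$ref": "http://example.com/foo-schema"}],
        "required": ["bar"],
        "properties": {"bar": {"type": "boolean"}},
    },
}


def validate(schema, instance, start):
    """Return True if `instance` satisfies `schema`.

    `start` is the root schema of the document where processing began;
    it is the only extra state that `$recurse` needs.
    """
    if schema.get("$recurse") is True:
        # Context-dependent target: jump to wherever processing started.
        return validate(start, instance, start)
    if "$ref" in schema:
        # Ordinary reference: the target is fixed by the URI.
        return validate(SCHEMAS[schema["$ref"]], instance, start)
    if schema.get("type") == "object" and not isinstance(instance, dict):
        return False
    if schema.get("type") == "boolean" and not isinstance(instance, bool):
        return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
        for name, sub in schema.get("properties", {}).items():
            if name in instance and not validate(sub, instance[name], start):
                return False
    return all(validate(sub, instance, start) for sub in schema.get("allOf", []))


def validate_from(uri, instance):
    """Start processing at the root of the document identified by `uri`."""
    root = SCHEMAS[uri]
    return validate(root, instance, root)


INSTANCE = {"bar": True, "foo": {"bar": False, "foo": {"foo": {}}}}

print(validate_from("http://example.com/foo-schema", INSTANCE))  # True
print(validate_from("http://example.com/bar-schema", INSTANCE))  # False
```

The only difference between the two calls is the starting document, which is the point of the proposal: foo-schema itself never mentions bar-schema.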

## Use cases

The primary use case for this is meta-schemas.  For example, the [hyper-schema meta-schema](https://github.com/json-schema-org/json-schema-spec/blob/master/hyper-schema.json) has to re-define all of the applicator keywords from the [core and validation meta-schema](https://github.com/json-schema-org/json-schema-spec/blob/master/schema.json).  And if someone wanted to extend hyper-schema, not only would they have to re-declare all of the core applicators a ***third*** time, but they would also have to re-declare all of the LDO keywords that use `"$ref": "#"`.

As we make more vocabularies and encourage more extensions, this rapidly becomes untenable.
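To make the re-declaration problem concrete, here is a toy sketch in Python. Everything in it is a stand-in invented for illustration: `mini-core` plays the role of the core meta-schema with a single applicator (`items`), `mini-hyper` extends it with a `links` keyword that must be an array, and the evaluator supports only `type`, `properties`, `allOf`, `$ref` and `$recurse`. It compares what happens when the core's applicator uses `"$ref": "#"` versus `"$recurse": true`.

```python
def make_core(subschema_ref):
    """Toy core meta-schema: a schema is an object whose `items` is a schema."""
    return {
        "$id": "http://example.com/mini-core",
        "type": "object",
        "properties": {"items": subschema_ref},
    }


# Toy extension meta-schema: adds `links` without re-declaring `items`.
MINI_HYPER = {
    "$id": "http://example.com/mini-hyper",
    "allOf": [{"$ref": "http://example.com/mini-core"}],
    "properties": {"links": {"type": "array"}},
}


def validate(schema, instance, doc, start, registry):
    """`doc` is the document containing `schema`; `start` is where processing began."""
    if schema.get("$recurse") is True:
        return validate(start, instance, start, start, registry)
    if "$ref" in schema:
        # "#" means the root of the *current* document, whatever started processing.
        target = doc if schema["$ref"] == "#" else registry[schema["$ref"]]
        return validate(target, instance, target, start, registry)
    expected = {"object": dict, "array": list}.get(schema.get("type"))
    if expected is not None and not isinstance(instance, expected):
        return False
    if isinstance(instance, dict):
        for name, sub in schema.get("properties", {}).items():
            if name in instance and not validate(sub, instance[name], doc, start, registry):
                return False
    return all(
        validate(sub, instance, doc, start, registry) for sub in schema.get("allOf", [])
    )


def check(core, schema_under_test):
    """Validate `schema_under_test` against MINI_HYPER, using the given core."""
    registry = {core["$id"]: core, MINI_HYPER["$id"]: MINI_HYPER}
    return validate(MINI_HYPER, schema_under_test, MINI_HYPER, MINI_HYPER, registry)


# A schema whose nested subschema misuses the extension keyword.
BAD = {"links": [], "items": {"links": "not-an-array"}}

print(check(make_core({"$ref": "#"}), BAD))       # True: nested `links` is never checked
print(check(make_core({"$recurse": True}), BAD))  # False: recursion returns to mini-hyper
```

With `"$ref": "#"`, the extension would have to re-declare `items` itself to catch the nested error; with `$recurse`, the core's single declaration is enough.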

I will show what the hyper-schema meta-schema would look like with `$recurse` in a subsequent comment.

There are some other use cases in hypermedia with common response formats, but they are all simpler than the meta-schema use case.

## Alternatives

### Doca's `cfRecurse`

This is a simplified version of an extension keyword, `cfRecurse`, used with [Doca](https://github.com/cloudflare/doca).  That keyword takes a JSON Pointer (not a URI fragment) that is evaluated with respect to the post-`$ref`-resolution in-memory data structure.  [EDIT: Although don't try it right now, it's broken, long story that is totally irrelevant to the proposal.]

If that has you scratching your head, that's part of why I'm not proposing `cfRecurse`'s exact behavior.

In fact, Doca only supports `""` (the root JSON Pointer) as a `cfRecurse` value, and no one has ever asked for any other path.  The use case really just comes up for us with pure recursion.

Specifying any other pointer requires knowing the structure of the in-memory document.  And when the whole point is that you don't know what your original root schema (where processing began) will be until runtime, you cannot know that structure.

One could treat the JSON Pointer as an interface constraint ("this schema may only be used with an initial document that has a `/definitions/foo` schema"), but that is a lot of complexity for something that has never come up in practice.
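A small Python sketch of why only the root pointer is safe. `resolve_pointer` is a hand-written helper following RFC 6901 (it is not Doca's code), and `root_a` / `root_b` are hypothetical starting documents:

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against an in-memory document."""
    if pointer == "":
        return document  # the root pointer works for any document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    node = document
    for token in pointer[1:].split("/"):
        # Unescape "~1" before "~0", as RFC 6901 requires.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


# Two hypothetical starting documents; only the first has /definitions/foo.
root_a = {"definitions": {"foo": {"type": "string"}}}
root_b = {"properties": {"foo": {"type": "string"}}}

print(resolve_pointer(root_a, "") is root_a)        # True
print(resolve_pointer(root_b, "") is root_b)        # True
print(resolve_pointer(root_a, "/definitions/foo"))  # {'type': 'string'}
try:
    resolve_pointer(root_b, "/definitions/foo")
except KeyError:
    print("root_b has no /definitions/foo")
```

The empty pointer resolves against every possible starting document, while any other pointer silently imposes a structural requirement on a root that is not known until runtime.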

For this reason, `$recurse` does not take a meaningful value.  I chose `true` because `false` or `null` would be counter-intuitive (you'd expect those values to *not* do things), and a number, string, array, or object would be much more subject to error or misinterpretation.

### Parametrized schemas

#322 proposes a general schema parametrization feature, which could possibly be used to implement this feature.  It would look something like:

Parametrized schema for `oneOf`:

```JSON
{
    "$id": "http://example.com/oneof",
    "properties": {
        "oneOf": {
            "items": {"$ref": {"$param": "rootPointer"}}
        }
    }
}
```
Using the parametrized schema:
```JSON
{
    "$id": "http://example.com/caller",
    "allOf": [
        {
            "$ref": "http://example.com/oneof",
            "$params": {
                "rootPointer": "http://example.com/caller"
            }
        }
    ],
    ...
}
```
See #322 for an explanation of how this works.

I'd rather not open the schema parametrization can of worms right now.  `$recurse` is a much simpler and easier-to-implement proposal, and it meets the core need for meta-schema extensibility.  It does not preclude implementing schema parametrization, either in a later draft or as an extension vocabulary of some sort (it makes an interesting test case for vocabulary support, actually).

## Summary

* extending recursive schemas is a fundamental use case of JSON Schema as seen in meta-schemas, which happens to require knowledge of where runtime processing started
* referring to something inside a schema document determined at runtime adds a lot of complexity and has no apparent use case (neither from Doca nor from any issue I've ever seen here), so let's not do it

Runtime resolution (whether `$recurse` or parametrized schemas) is sufficiently new and powerful that I feel we should lock it down to the simplest case with a clear need.  We can always extend it later, but it's hard to pull these things back.



Recursive schema composition #558


handrews commented on Mar 6, 2018
