Skip to content

test that sibling $ids to $refs are ignored #493

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 26, 2021

Conversation

karenetheridge
Copy link
Member

This tests that $ref resolves to something different in draft3-7 than one would expect in modern versions, because the $id is not parsed when it is adjacent to a $ref. This is not the same as a $ref not being resolvable at all.

@karenetheridge karenetheridge requested a review from a team June 26, 2021 04:33
@karenetheridge karenetheridge merged commit 1326f36 into master Jun 26, 2021
@karenetheridge karenetheridge deleted the ether/id-sibling-to-ref branch June 26, 2021 04:37
@jdesrosiers
Copy link
Member

This is more controversial that you might think. See #458. In the end we decided to consider this undefined behavior and it should not have tests. #492 is good because it changes the test to avoid the ambiguous case. This, however, is not going to go over well with some people and is contrary to the precedent set in #458.

@karenetheridge
Copy link
Member Author

karenetheridge commented Jun 26, 2021

I agree with the changes in #458 - that a $ref into a definitions adjacent to a $ref should not be resolveable as that part of the schema "doesn't exist". I even have a test for this in my draft7 implementation that fails (and marked TODO, but in reality I never intend to fix this because of the extra complexity involved - either we mutate the schema and physically remove the unaccessable bits, or we keep track of these unaccessable locations in a separate list and check it whenever navigating a $ref to see if that path falls in the verboten area. option 1 violates a no-mutation promise, and 2 just seems ugh since this is only for old versions).

However, I don't think the changes in this PR (#493) are the same thing at all. It's much more straightforward when processing any keyword (or when starting out processing a subschema, and considering what keywords to look at) to go "there's a $ref here, so we won't process any other keywords adjacent to it".

What am I missing? what about the changes here in #493 are controversial/ambiguous?

edit: fix horrific abuse of parentheses

@jdesrosiers
Copy link
Member

I think this change represents how it should be interpreted. @Julian was the one who disagreed with my interpretation in #458, so maybe he will have something to say about this.

I don't think the changes in this PR (#493) are the same thing at all.

I don't know what to say to this. It's exactly the same case as #458. The only difference you've identified is that some keywords are easier to ignore in your implementation that others. In my implementation, ignoring keywords inside a reference is trivial. It's just a natural consequence of the architecture.

If anything, $id and $schema are the ones that it makes sense to make an exception for. Otherwise it's extremely awkward to figure out how to interpret a reference. A dialect needs to be declared somehow in order to know what version of $ref it is. Without allowing a $schema, the only way to know is if the implementation allows you to specify a default dialect. $id is similar. The only way to set a base URI is if the implementation allows you to set it externally. Without a base URI, any relative non-fragment-only URI would be invalid and cause an error.

@karenetheridge
Copy link
Member Author

karenetheridge commented Jun 28, 2021

It's exactly the same case as #458

No, it's not, because it's a different keyword, that's evaluated differently. That's exactly why we have different tests for different combinations of keywords, because we don't know how implementation decisions might affect how different combinations of keywords interact. Some combinations seem similar enough that we don't bother testing all variants, but as various implementations pop up where it's apparent that the differences are relevant, we add tests for them. Dismissing them because they're "exactly the same" to you and the architecture decisions you made is not constructive.

Evaluating an $id (or not) when it is adjacent to a $ref can have a very different implementation compared to processing a $ref from a totally different location to a position that is not adjacent to but rather a nephew to another $ref. The target subschema is not the current subject of the evaluation - not the uncle "$ref", not even the parent "definition" that contains the resolved location.

I can think of a third implementation architecture (different from both mine and ours) that might get tripped up by the test I added here in this PR, because of the way it pre-scans a schema for $ids before it starts the recursive evaluation phase, where it might not properly detect that there is a sibling $ref and therefore know to ignore the keyword. Another case that would be similarly caught (by that particular implementation) is covered by another test case I added a while ago in #484 -- which is actually covered by a totally different part of the spec but because of particular architecture decisions might fail due to the same code implementation.

@jdesrosiers
Copy link
Member

It's exactly the same case as #458

No, it's not, because it's a different keyword, that's evaluated differently.

In #458 we didn't agree on an interpretation. We compromised by agreeing to consider the case undefined and should not be tested either way. If the behavior of definitions in a $ref is considered undefined, why would it be any different for $id? Again, you and I agree on the correct way to handle this. I only bring this up to be respectful of others who have expressed disagreement on this interpretation in the past and to stick to the agreement we made. But maybe I'm misunderstanding the opposing interpretation. I'll shut up now and let others speak for themselves.

Dismissing them because they're "exactly the same" to you and the architecture decisions you made is not constructive.

This is a misunderstanding. You seemed to be making an argument based on properties of your implementation. I mentioned how the properties of my implementation differ in order to show that any argument based on the properties of any implementation is not helpful. I was quite surprised to see the same argument directed back at me.

@karenetheridge
Copy link
Member Author

Okay. If we're all satisfied that the existing tests are okay the way they are, we don't need to belabour this point.

@Julian
Copy link
Member

Julian commented Jun 29, 2021

To me yes this is likely equally undefined, but as I said somewhere, I'm likely tired of arguing about this topic :).

davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request Oct 21, 2021
I guess `$ref` isn't supposed to take sibling `$id` values into account.
This switches to modifying the current `parent_uri` after `$ref` is
processed. I still needed to update the root `parent_uri` to support
`test_it_handles_nested_refs`, when the schema has an `$id` based on
where it was fetched.

Addresses failures from: json-schema-org/JSON-Schema-Test-Suite#493
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request Apr 22, 2022
I guess `$ref` isn't supposed to take sibling `$id` values into account.
This switches to modifying the current `parent_uri` after `$ref` is
processed. I still needed to update the root `parent_uri` to support
`test_it_handles_nested_refs`, when the schema has an `$id` based on
where it was fetched.

Addresses failures from: json-schema-org/JSON-Schema-Test-Suite#493
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 15, 2023
I guess `$ref` isn't supposed to take sibling `$id` values into account.
This switches to modifying the current `parent_uri` after `$ref` is
processed. I still needed to update the root `parent_uri` to support
`test_it_handles_nested_refs`, when the schema has an `$id` based on
where it was fetched.

Addresses failures from: json-schema-org/JSON-Schema-Test-Suite#493
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 16, 2023
I guess `$ref` isn't supposed to take sibling `$id` values into account.
This switches to modifying the current `parent_uri` after `$ref` is
processed. I still needed to update the root `parent_uri` to support
`test_it_handles_nested_refs`, when the schema has an `$id` based on
where it was fetched.

Addresses failures from: json-schema-org/JSON-Schema-Test-Suite#493
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 16, 2023
I guess `$ref` isn't supposed to take sibling `$id` values into account.
This switches to modifying the current `parent_uri` after `$ref` is
processed. I still needed to update the root `parent_uri` to support
`test_it_handles_nested_refs`, when the schema has an `$id` based on
where it was fetched.

Addresses failures from: json-schema-org/JSON-Schema-Test-Suite#493
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 22, 2023
This is an attempt to simplify dereferencing refs and address issues
exposed by new json-schema-test-suite tests. The main thing here is
giving schema objects a `base_uri` that's used when resolving ids (and
as the initial base URI for instances). Child schemas use the `ref_uri`
that was used to resolve them as their `base_uri` in order to get nested
refs and ids to resolve properly.

More details about specific issues:

- `$id` is ignored when `$ref` is present (`instance.base_uri` is
  updated after `validate_ref`), because `$ref` isn't supposed to take
  sibling `$id` values into account. `@base_uri` handles nested refs.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#493

- JSON pointers are evaluated relative to the ref URI. Previously,
  pointers were always evaluated using the root schema. Now they're
  evaluated relative to the schema with a matching `$id` (usually
  nearest parent with an `$id`; or specific id (see below); default is
  root).
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#457

- JSON pointers are evaluated for id refs. This allows a ref to look up
  a schema by `$id` and then apply a JSON pointer to use a subschema.
  This uses the same logic as above. The important part is removing the
  fragment from `ref_uri` if it's a JSON pointer so that the lookup in
  `ids` works properly. The fragment is kept if it's not a JSON pointer
  to support location-independent ids.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#578

- JSON pointer refs are always joined with the base URI. I [started
  handling them][0] separately because of an [issue][1] with invalid
  URIs. But now I think that was incorrect and that fragment pointers
  need to be encoded properly for URIs. The [specification says][2]:

  > In all cases, dereferencing a "$ref" reference involves first
  > resolving its value as a URI reference against the current base URI.

- Empty fragments are removed in `join_uri` to have consistent URIs to
  lookup in `ids`. Meta schemas, for example, have empty fragments in
  their top-level ids (eg, `http://json-schema.org/draft-07/schema#`)
  and removing the JSON pointer fragments causes them not to be found.

[0]: b91115e
[1]: #54
[2]: https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3.2
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 26, 2023
This is an attempt to simplify dereferencing refs and address issues
exposed by new json-schema-test-suite tests. The main thing here is
giving schema objects a `base_uri` that's used when resolving ids (and
as the initial base URI for instances). Child schemas use the `ref_uri`
that was used to resolve them as their `base_uri` in order to get nested
refs and ids to resolve properly.

More details about specific issues:

- `$id` is ignored when `$ref` is present (`instance.base_uri` is
  updated after `validate_ref`), because `$ref` isn't supposed to take
  sibling `$id` values into account. `@base_uri` handles nested refs.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#493

- JSON pointers are evaluated relative to the ref URI. Previously,
  pointers were always evaluated using the root schema. Now they're
  evaluated relative to the schema with a matching `$id` (usually
  nearest parent with an `$id`; or specific id (see below); default is
  root).
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#457

- JSON pointers are evaluated for id refs. This allows a ref to look up
  a schema by `$id` and then apply a JSON pointer to use a subschema.
  This uses the same logic as above. The important part is removing the
  fragment from `ref_uri` if it's a JSON pointer so that the lookup in
  `ids` works properly. The fragment is kept if it's not a JSON pointer
  to support location-independent ids.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#578

- JSON pointer refs are always joined with the base URI. I [started
  handling them][0] separately because of an [issue][1] with invalid
  URIs. But now I think that was incorrect and that fragment pointers
  need to be encoded properly for URIs. The [specification says][2]:

  > In all cases, dereferencing a "$ref" reference involves first
  > resolving its value as a URI reference against the current base URI.

- Empty fragments are removed in `join_uri` to have consistent URIs to
  lookup in `ids`. Meta schemas, for example, have empty fragments in
  their top-level ids (eg, `http://json-schema.org/draft-07/schema#`)
  and removing the JSON pointer fragments causes them not to be found.

[0]: b91115e
[1]: #54
[2]: https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3.2
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 26, 2023
This is an attempt to simplify dereferencing refs and address issues
exposed by new json-schema-test-suite tests. The main thing here is
giving schema objects a `base_uri` that's used when resolving ids (and
as the initial base URI for instances). Child schemas use the `ref_uri`
that was used to resolve them as their `base_uri` in order to get nested
refs and ids to resolve properly.

More details about specific issues:

- `$id` is ignored when `$ref` is present (`instance.base_uri` is
  updated after `validate_ref`), because `$ref` isn't supposed to take
  sibling `$id` values into account. `@base_uri` handles nested refs.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#493

- JSON pointers are evaluated relative to the ref URI. Previously,
  pointers were always evaluated using the root schema. Now they're
  evaluated relative to the schema with a matching `$id` (usually
  nearest parent with an `$id`; or specific id (see below); default is
  root).
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#457

- JSON pointers are evaluated for id refs. This allows a ref to look up
  a schema by `$id` and then apply a JSON pointer to use a subschema.
  This uses the same logic as above. The important part is removing the
  fragment from `ref_uri` if it's a JSON pointer so that the lookup in
  `ids` works properly. The fragment is kept if it's not a JSON pointer
  to support location-independent ids.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#578

- JSON pointer refs are always joined with the base URI. I [started
  handling them][0] separately because of an [issue][1] with invalid
  URIs. But now I think that was incorrect and that fragment pointers
  need to be encoded properly for URIs. The [specification says][2]:

  > In all cases, dereferencing a "$ref" reference involves first
  > resolving its value as a URI reference against the current base URI.

- Empty fragments are removed in `join_uri` to have consistent URIs to
  lookup in `ids`. Meta schemas, for example, have empty fragments in
  their top-level ids (eg, `http://json-schema.org/draft-07/schema#`)
  and removing the JSON pointer fragments causes them not to be found.

[0]: b91115e
[1]: #54
[2]: https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3.2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants