recursive relative references don't seem to work correctly #274
Comments
I'm not sure when I'll have a second to look at this (I of course appreciate you filing the ticket and especially providing an example!) -- my guess would be that this is possibly related to caching behavior, but as I've said in the past (not to you, obviously :P), relative URIs don't really make much sense in schemas AFAICT, let alone URIs that reference |
I've run into this as well. I have a group of schemas I'm working on which have nested dependencies like this. The error is below. It looks to me like the scoping URL is getting a "/" added to it in the scope process. Is there a better mental image for how to refer to groups of schema files than with $ref? I couldn't find another way to think about this for schema files I'm working with locally. Is the mental image to simply wrap up all the schemas into one larger file and then reference into that as needed? Traceback (most recent call last): |
I'm using I created example to demonstrate this. There are three
CASE 1: The
CASE 2: The
There is no problem with CASE 1 - it loads the relative schema file perfectly.
|
Relative references to files should work when nested at any depth within the filesystem tree.
Workaround
As suggested by @traut here, providing a resolver (see original example) makes all relative references work:
import os
from jsonschema import validate, RefResolver
# ...
json_schema = read_json(json_schema_path)
json_schema_dir = os.path.dirname(os.path.realpath(json_schema_path))
# python-jsonschema needs base_uri (with a trailing /)
# to be able to resolve local file references
resolver = RefResolver(referrer=json_schema, base_uri='file://' + json_schema_dir + '/')
json_data = read_json(json_data_path)
# pass the loaded schema (not its directory) as the second argument
validate(json_data, json_schema, resolver=resolver) |
I still would like to give this a look at some point, apologies for it taking so long, but just to be clear again -- no, in my opinion, relative references should never work by default. They're unportable and out-of-spec-ish. Having to provide an additional configured object to get them to work is I think a reasonable amount of work to have to do. |
@Julian, I support the stance on portability and sticking to the specs. What is deceptive is the fact that it works when additional file is referenced from level 1 to level 2, but it unexpectedly fails from level 2 to level 3 (and 3-to-4, 4-to-5, ... (N-1)-to-N, I guess). This feature could be treated as unportable but it can also be complete at the same time. From the general public perspective (away from specs), I also guess that relative references in JSON schema are deemed similar to how relative links in HTML work (they make subdirectory with HTML files in filesystem tree "mountable" at any URL prefix served by HTTP server without breaking links within that sub-tree). Such functionality might be useful to become part of the specs eventually. |
The problem is that for the first file the references are resolved against the base_uri, but for reffed files the references are resolved against those files' paths. The fix is to use the file's full URL as base_uri instead of having a path to the file. So, if you have file '/foo/bar/example.json', then your base_uri should be '/foo/bar/example.json'. I guess the problem is that it's too easy to think of the base_uri as the base to resolve all references against, when in fact it's the URI to resolve the first file's references against. A small documentation fix might be in order here. |
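To make the point above concrete, here is a minimal sketch using only the standard library's `urllib.parse.urljoin`, which applies RFC 3986 reference resolution. The directory and file names are made up for illustration, and this is not jsonschema's internal code; it only shows how each reference resolves against the URI of the document that contains it.

```python
from urllib.parse import urljoin

# Hypothetical layout: all three schemas live in /foo/bar/schemas/.
root_uri = "file:///foo/bar/schemas/project.json"  # full URL of the first file

# A "$ref" in project.json resolves against project.json's own URL.
container_uri = urljoin(root_uri, "container.json")

# A "$ref" in container.json resolves against container.json's URL,
# not against whatever base_uri was supplied for the first file.
generic_uri = urljoin(container_uri, "generic.json")

print(container_uri)
print(generic_uri)
```

With the first file's full URL as the base, every level of nesting resolves to a sibling file in the same directory.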
Another solution might be to just have some validator method for which you give the full path of the schema as parameter, nothing else. This way the validator would immediately know the base_uri for the first file and could always resolve relative to the file being currently under processing. |
@Julian I'm not sure I understand the argument that relative filenames are out-of-spec. It looks to me that the spec refers to JSON-Reference as the way of scoping URLs. The JSON-Ref spec has rules about combining URIs which look to me like they will support relative paths and actually talk explicitly about how to join paths and use base uri scheme in such cases (https://tools.ietf.org/html/rfc3986#section-5.2.3 and nearby). So does the $ref resolving logic use the right scheme inheritance rules in RFC 3986, so that relative uris in schemas work right? That way there'd be the RefResolver/base_uri indicator needed, but schemas could use just "$ref": "relative/path/schema.json". I'm suspicious from my own experimentation that this does not work. Instead, the resolving logic seems to want to see "$ref": "file:schema.json" and to have a bug that is not RFC-3986-compliant in the path merges as well as non-compliance to the scheme logic in RFC-3986 for relative uris. But I'm not a JSON-Ref expert so it could be this interpretation is not right. |
In many cases it would make a lot of sense to build the schema validator with a file reference directly. Something along the lines of:
The
This way you'd get a usable validator in a single line, and in addition it would be impossible to get the base_uri wrong. |
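One possible shape for such a helper, as a sketch only: `load_schema_with_base_uri` is a hypothetical name, not part of jsonschema. It loads the schema and derives the base URI from the file's own path, so the two cannot drift apart. The commented usage assumes jsonschema's `RefResolver(base_uri, referrer)` as used elsewhere in this thread.

```python
import json
from pathlib import Path


def load_schema_with_base_uri(schema_path):
    """Hypothetical helper: load a schema and derive its base URI from its path."""
    path = Path(schema_path).resolve()  # as_uri() needs an absolute path
    with path.open(encoding="utf-8") as f:
        schema = json.load(f)
    # The base URI is the full file URL of the schema itself, so relative
    # "$ref" values resolve against the directory that contains it.
    return schema, path.as_uri()


# Assumed usage with jsonschema (not executed here):
#   schema, base_uri = load_schema_with_base_uri("schemas/project.json")
#   resolver = RefResolver(base_uri=base_uri, referrer=schema)
#   Draft4Validator(schema, resolver=resolver).validate(instance)
```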
According to section 7 of the current JSON Schema core spec:
Unless I am really misunderstanding something here, that means that relative references are part of the specification. Relative references are useful when you have an API suite that is present on many hosts (for instance, for appliance configuration), and those hosts do not necessarily have connectivity to the wider internet. |
@Julian Would it perhaps make sense somewhere around here https://github.com/Julian/jsonschema/blob/ee2de6d0e0b6fed6f580cf2d744fb2790fe06a54/jsonschema/validators.py#L459-L461 to check if |
Apologies for being behind here guys -- the most helpful thing I think for me at this point would be a suggested new test case. @ashb I have to think about that a bit more carefully, but it's possible And to clarify -- what I was referring to with "out of spec" is relative, scheme-unqualified file references without declaring an ID, but again I'm not too familiar with the URI parts of the spec unfortunately so it's possible I'm wrong even there. |
@Julian we understand. Being an open source maintainer is a hard, often thankless task. I appreciate this module! |
@Julian Late to the party, here -- I second @ashb's suggestion as a possible way forward. In
Happy to raise a PR with a test case if that would still be helpful. Thanks for a great package! |
@erickpeirson a PR would be great! And thanks :) |
Having the same issue here myself. I have a staff.schema file with a I get the error |
How about always using paths relative to the root folder of the project?
grandparent_json has a reference to parent_json ("$ref": "parent_json").
parent_json has a reference to child_json ("$ref": "child_json").
All 3 JSON files can be located anywhere in the project. |
The new jsonschema draft seems to make it clear that $refs should be resolved per https://tools.ietf.org/html/rfc3986#section-4.2. That is, in the absence of a scheme or base uri, these should be taken from the referring document. That implies knowing the base uri of the referring document (or passing it explicitly). The use case I'm particularly interested in is having a path (in openapi) like |
(hopefully not too noisy) I made this recursive ref resolver here: https://gist.github.com/kratsg/96cec81df8c0d78ebdf14bf7b800e938 The idea was that (on some systems), one wants to recursively resolve the entire schema at once into a single local-file. So this was an implementation that relies on the referrer document to be used as the base for resolving refs located in that document. |
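For readers who want the general idea without opening the gist, here is a minimal sketch of that approach. It is not the gist's code. `inline_refs` and the `load` callable are hypothetical names, the in-memory `docs` store stands in for reading files, and JSON pointer fragments such as "#/definitions/x" are deliberately left unsupported.

```python
from urllib.parse import urldefrag, urljoin


def inline_refs(node, base_uri, load, _stack=()):
    """Return a copy of `node` with every "$ref" replaced by its target.

    `load` is an assumed callable mapping an absolute URI to a parsed document.
    """
    if isinstance(node, list):
        return [inline_refs(item, base_uri, load, _stack) for item in node]
    if not isinstance(node, dict):
        return node
    if "$ref" in node:
        # Resolve against the URI of the document that contains the ref.
        target, fragment = urldefrag(urljoin(base_uri, node["$ref"]))
        if fragment:
            raise NotImplementedError("fragments are not handled in this sketch")
        if target in _stack:
            raise ValueError(f"circular reference to {target}")
        return inline_refs(load(target), target, load, _stack + (target,))
    return {key: inline_refs(value, base_uri, load, _stack)
            for key, value in node.items()}


# Made-up documents keyed by absolute URI, standing in for files on disk.
docs = {
    "file:///schemas/project.json": {"allOf": [{"$ref": "container.json"}]},
    "file:///schemas/container.json": {"allOf": [{"$ref": "generic.json"}]},
    "file:///schemas/generic.json": {"type": "object"},
}
root = "file:///schemas/project.json"
flat = inline_refs(docs[root], root, docs.__getitem__)
print(flat)
```

Each loaded document becomes the base for the references it contains, which is what lets the nesting go to any depth.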
Sorry for taking so long to get to some of these, but I'm going through a bunch of Specifically, the base URI being provided in the linked repo has no trailing slash, so relative URIs resolved against it indeed are supposed to strip the last component. If I change the code in that repo to:
diff --git a/schemas/container.json b/schemas/container.json
index e9b1598..43b8516 100644
--- a/schemas/container.json
+++ b/schemas/container.json
@@ -3,7 +3,7 @@
"title": "container",
"description": "JSON schema for container",
"type": "object",
- "allOf": [ { "$ref": "file:schema/generic.json" } ],
+ "allOf": [ { "$ref": "file:generic.json" } ],
"properties": {
"type": { "type": "string" },
"children": {
diff --git a/schemas/project.json b/schemas/project.json
index e5c8d37..a33e4dd 100644
--- a/schemas/project.json
+++ b/schemas/project.json
@@ -3,7 +3,7 @@
"title": "project",
"description": "JSON schema for project",
"type": "object",
- "allOf": [ { "$ref": "file:schemas/container.json" } ],
+ "allOf": [ { "$ref": "file:container.json" } ],
"properties": {
"type": { "type": "string", "pattern": "^project$" },
"children": {
diff --git a/validate.py b/validate.py
index bca15c8..892b98f 100644
--- a/validate.py
+++ b/validate.py
@@ -11,11 +11,11 @@ type_hierarchy_dict = None
# puts them in a dict, indexed by title
for filename in [each for each in os.listdir('schemas') if each.endswith('.json')]:
path = os.path.join('schemas', filename)
- print "reading schema:", path
+ print("reading schema:", path)
with open(path) as json_data:
tmp = json.load(json_data)
- if (tmp['properties'].has_key('type')):
+ if 'type' in tmp['properties']:
schemas_dict[tmp['title']] = tmp
schema_types = schemas_dict.keys()
@@ -24,8 +24,8 @@ schema_types = schemas_dict.keys()
def validate(data):
try:
- uri = 'file://'+os.path.join(os.getcwd(), 'schemas')
- print "uri:", uri
+ uri = 'file://'+os.path.join(os.getcwd(), 'schemas') + '/'
+ print("uri:", uri)
r = jsonschema.RefResolver(uri, None)
v = jsonschema.Draft4Validator(schemas_dict[data['type']], resolver=r)
v.validate(data)
everything works as it should. Closing this, though I'll be adding a FAQ entry since this seems to be a common mistake, but if I've missed anything (either in this comment or in others here) please feel free to open a new issue. |
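The effect of the trailing slash can be checked without jsonschema at all. The sketch below uses the standard library's `urljoin`; the `/work` directory is a made-up stand-in for the repo's working directory, and the references are written without the `file:` prefix to keep the example simple.

```python
from urllib.parse import urljoin

# No trailing slash: "schemas" is treated as the last path component
# and is replaced by the reference.
without_slash = urljoin("file:///work/schemas", "generic.json")

# Trailing slash: the reference is resolved inside the directory.
with_slash = urljoin("file:///work/schemas/", "generic.json")

# Without the slash, the directory name has to be repeated in the
# reference to land in the right place.
repeated_dir = urljoin("file:///work/schemas", "schemas/container.json")

print(without_slash)  # file:///work/generic.json
print(with_slash)     # file:///work/schemas/generic.json
print(repeated_dir)   # file:///work/schemas/container.json
```

The third case matches the original report: the top-level schema needed the directory repeated in its reference, while references in the schemas it loaded did not, because those resolve against the loaded file's own URI.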
For some reason I'm still trapped in this one; is this already fixed in 4.14.0?
My setup
JSON files and schemas are all in the same folder.
Code
import jsonschema as jsch
schema = load_json('/path/to/root.schema.json')
jsch.validate(instance=ins, schema=schema)
The first layer: root.schema.json
...
"external": {
"anyOf": [
{
"$ref": "file:mid1.schema.json"
},
{
"$ref": "file:mid2.schema.json"
}
]
}
...
The second layer: mid1.schema.json
...
"external": {
"anyOf": [
{
"$ref": "file:target.schema.json"
},
{
"$ref": "file:else.schema.json"
}
]
}
...
Expected
There should be no missing references.
Got
I've also tried to run the CLI program This error looks like |
Maybe I'm misunderstanding JSON Schema, but the resolver seems to be adding directories onto its resolver path as it goes down a relative path. If I have a 'project' schema that references a 'container' schema, which references a 'generic' schema, the references have to look like this to work, even with all files in the same directory:
project: "allOf": [ { "$ref": "file:schemas/container.json" } ],
container: "allOf": [ { "$ref": "file:generic.json" } ],
I've got a github project here because the example has to be a little involved: https://github.com/fleur/jsonschema-bug