-
Notifications
You must be signed in to change notification settings - Fork 203
Implement just-in-time context resolution. #342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add optional `tag` feature processing to returned RemoteDocuments. A `tag` will be understood to mean that the same context document needn't be processed twice. A special `tag` value of `static` is interpreted to mean that a context does not even need to be retrieved more than once. - Enables greater reuse of already processed contexts and quicker discovery (via `static` tag) of already processed contexts for a given context URL (instead of requiring the context content itself to be seached for and found in a cache, only its URL needs to be found). - Addresses #339.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, LGTM, aside from some bits I don't quite understand without walking through it with an example.
Presumably, base is handled properly per recent tests.
const resolved = await options.contextResolver.resolve({ | ||
context: localCtx, | ||
documentLoader: options.documentLoader, | ||
base: options.base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the actual document location, or potential value of @base
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's ever @base
(could be wrong). It is either a document location (if the API was invoked with a URL) or the base passed via the API options. The tests pass.
if(!resolved) { | ||
// not resolved yet, resolve | ||
resolved = await this._resolveRemoteContext( | ||
{url: ctx, documentLoader, base, cycles}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have the affect of serializing all remote context fetches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_resolveRemoteContext
will fetch all remote contexts (recursively) and return an array of resolved contexts. Note that this does not include any scoped contexts, i.e., it doesn't deeply inspect the resolved contexts for those, but it will resolve any relative URLs encountered in scoped contexts to ensure the base URL used is proper. Any scoped contexts will be resolved later, JIT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If by this
you meant "the await
" -- then, yes, an array of context URLs will be loaded serially now. A future PR can add more complexity here to parallelize that.
let remoteDoc; | ||
|
||
try { | ||
remoteDoc = await documentLoader(url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we're going to need to pass in API arguments to the documentLoader for options such as extractAllScripts
, but this can be handled in a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I didn't want to do anything with that in this PR but it will need to be addressed in a subsequent PR.
The tests pass -- including the |
@@ -90,6 +91,11 @@ const wrapper = function(jsonld) { | |||
/** Registered RDF dataset parsers hashed by content-type. */ | |||
const _rdfParsers = {}; | |||
|
|||
// resolved context cache | |||
// TODO: consider basing max on context size rather than number | |||
const RESOLVED_CONTEXT_CACHE_MAX_SIZE = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this adjustable? Should there be some API so it can be adjustable or a custom cache could be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not adjustable in the previous version, so there's no change here. We can make it so in a future PR.
// resolve, cache, and return context | ||
const resolved = await this.resolve( | ||
{context, documentLoader, base, cycles}); | ||
this._cacheResolvedContext({key: url, resolved, tag: remoteDoc.tag}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need some docs on the tag feature. Is 'static' special?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static
is a special value, yes. We can document in a separate PR.
This PR removes the preprocessing step for resolving contexts and replaces its functionality with JIT context resolution. This should fix problems with mutating JSON literal content. This feature also allows document loaders to return an additional
tag
property that jsonld.js can use to more efficiently cache and reuse already processed contexts.tag
feature processing to returned RemoteDocuments.A
tag
will be understood to mean that the same context documentneedn't be processed twice. A special
tag
value ofstatic
isinterpreted to mean that a context does not even need to be retrieved
more than once.
discovery (via
static
tag) of already processed contexts fora given context URL (instead of requiring the context content
itself to be seached for and found in a cache, only its URL
needs to be found).
@context
in JSON literals #339.