Detail how IRI conflicts are resolved when compacting/expanding #74
I think we always have this issue as we allow an author to use an IRI directly in the key position. That means expansion and compaction always have to deal with the scenarios described above. Example:
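Something along these lines (a hypothetical document; the FOAF name IRI is used purely for illustration), where a term and the full IRI both appear as keys for the same property:

```json
{
  "@context": {
    "name": "http://xmlns.com/foaf/0.1/name"
  },
  "name": "A. Person",
  "http://xmlns.com/foaf/0.1/name": "Another value"
}
```

Both keys resolve to http://xmlns.com/foaf/0.1/name, so the processor has to decide how to merge them on expansion and which key to use on compaction.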
So I think we should try to handle this complexity and eliminate all ambiguity instead of restricting the syntax.
The intent for how IRI conflicts are resolved when compacting/expanding: any conflicts between terms that use the same IRI will be resolved by using the most specific term definition. When expanding, multiple keys that resolve to the same IRI will have all of their values merged into a single JSON array (the order of the values in the resulting JSON array is undefined).
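A sketch of that expansion behavior (hypothetical data; the expanded form is shown in a simplified shape):

```json
{
  "@context": { "name": "http://xmlns.com/foaf/0.1/name" },
  "name": "A",
  "http://xmlns.com/foaf/0.1/name": "B"
}
```

would expand to roughly:

```json
{
  "http://xmlns.com/foaf/0.1/name": [ "A", "B" ]
}
```

with no guarantee about whether "A" or "B" comes first.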
Added an example which shows how compact IRIs can be used in a context. This addresses #74. I'll leave the issue open as long as the API spec hasn't been updated.
More on this discussion here: http://json-ld.org/minutes/2012-03-06/#topic-3
I would be ok with Markus' suggestion of choosing the most specific context definition (the one with the most @type/@container attributes) and, if there isn't one, the lexicographically least term. Although we might want to pick the shortest term before checking for the lexicographically least one. I believe this makes the most sense in a "compaction" algorithm.
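As an illustration of that selection order (hypothetical terms and IRIs), consider a context with two terms for the same IRI, one of them carrying a @type coercion:

```json
{
  "@context": {
    "date": "http://purl.org/dc/terms/created",
    "created": {
      "@id": "http://purl.org/dc/terms/created",
      "@type": "http://www.w3.org/2001/XMLSchema#dateTime"
    }
  }
}
```

Under the rule described above, a value typed as xsd:dateTime would compact to "created" (the more specific definition) and a plain string value to "date"; if two candidate terms were equally specific, the shorter one would win, with lexicographic order breaking any remaining tie.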
I think the algorithm for compacting a property IRI should be modified to be something like this:
The processors I work on do keyword aliasing at the same time (via the same compaction method) -- and an alias is picked using the same sorting operation of shortest string and then lexicographically least string (and there is obviously no need to check @type, @language, or @container). I believe my context processing handling is a little different from the algorithm in the spec, but if this isn't already there, we should add mappings of keywords to arrays of aliases (and sort them) during the context processing step.
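A small sketch of what that alias selection implies (hypothetical aliases):

```json
{
  "@context": {
    "id": "@id",
    "identifier": "@id"
  }
}
```

With both aliases mapped to @id, compaction would emit "id", the shortest of the candidates (and lexicographic order would break ties between aliases of equal length).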
I think this mostly works, but we also need to check for variations on @language and @container.
Also, note that an entry might not have a @language, but the context has @language and a term definition exists where @language: null; we would want to use that term in this case. We'd need to fold that interpretation in as well.
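For example (a hypothetical context), with a default @language and a term that explicitly clears it:

```json
{
  "@context": {
    "@language": "en",
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
    "plainLabel": {
      "@id": "http://www.w3.org/2000/01/rdf-schema#label",
      "@language": null
    }
  }
}
```

A value without a language tag should compact using "plainLabel"; compacting it to "label" would incorrectly pick up the default "en" when round-tripping.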
Yeah, I was just coming back in to comment on that. We also need to check on the existence of just @container.
Actually, there are a few more cases that need to be covered ... and I don't think the algorithm was working properly for a few reasons. Anyway, here's another attempt at it that is no longer recursive and maybe covers all the cases? I don't recall what we do with plain string literals ... but we might need to tweak something for them. I think we also might have to start passing the parent container (or whether or not it's a @list) when we recurse in the compaction algorithm. We were thinking of doing this anyway to ensure we throw exceptions for lists of @lists.
We will also need to permit matching of @container type @set to a null container to get the most expected behavior, IMO. Also, the "Otherwise" lines within the for loop should be nested. We can clean it up if people agree that this is the right way to go. We will also need to discuss in greater detail how we want to handle @lists and the ambiguities that arise when values that aren't in lists match the same term that a list does. (Do we just concatenate, throw an exception, etc.?)
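A sketch of the @set point (hypothetical vocabulary): a term defined with "@container": "@set" should still match a value that carries no container at all:

```json
{
  "@context": {
    "tag": {
      "@id": "http://example.com/vocab#tag",
      "@container": "@set"
    }
  }
}
```

A single value for http://example.com/vocab#tag should be allowed to compact to "tag" even though the value itself has no container, since @set only affects how the values are wrapped, not what they mean.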
PROPOSAL: In IRI compaction, for each term mapped to the input IRI a term rank is calculated depending on the …
The algorithm described and implemented does a bit more than this proposal suggests. There are more sophisticated selection criteria based on whether it's a list or not and on how to consider compact IRIs. Also, the selection among terms of equal rank looks for the shortest match before the lexicographically first.
Any suggestion how we could formulate that in a short proposal? I thought the one I put up is already specific enough to get consensus.
If necessary, we could enter the body of the current algorithm as the proposed resolution, or just resolve that to accept the current spec text.
I would like to first upgrade my processor to the current spec before +1'ing on the exact algorithm. I think the proposal describes the current spec well enough to accept the current algorithm.
I'd rather we vote on the spec text than the proposal Markus put in here - namely because we've gotten tripped up on this multiple times, it's not very easy to see all of the "moving parts", and because having the text in front of you is easier than spending time discussing whether or not we've captured everything on the call. @lanthaler - what's missing from the current spec text that your proposal addresses?
The algo is much more complex and I haven't implemented it yet, so I don't understand its full consequences. I think it's easier to agree on what we are trying to achieve; the exact algorithm is a consequence thereof (and depends on many more details).
I'm ok with voting on the spec text since it's the very specific solution to the problem. Perhaps we can come up with some proposal text that broadly outlines the goal but then says that we think the spec text accomplishes the specifics, if this would be helpful/a good compromise.
I started updating my processor to the current spec but I'm not done yet. I think the current spec is not complete so I would really not like to vote on it (yet).
I'm closing this issue as I created a new one (#113) to agree on how compaction is supposed to work in detail. |
From Gregg:
So, one significant complication of looking up term coercions vs. IRI coercions comes when compacting IRIs. IRI compaction is now more complicated, as there may be multiple terms associated with an absolute IRI. We must now take into consideration the datatypes of values associated with a specific key to choose between multiple possible term mappings.
Are we making this too complex? Before, the algorithms specified coercion based on expanded IRIs, not unexpanded terms, CURIEs, or IRIs.
For example, consider the following:
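A context along these lines, with two terms both mapped to the same IRI but coerced differently (the specific terms and IRIs here are illustrative):

```json
{
  "@context": {
    "created": "http://purl.org/dc/terms/created",
    "createdAt": {
      "@id": "http://purl.org/dc/terms/created",
      "@type": "http://www.w3.org/2001/XMLSchema#dateTime"
    }
  }
}
```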
I've now declared the following RDF:
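For instance, a document using both keys (illustrative values, interpreted against the context above):

```json
{
  "@id": "http://example.com/doc",
  "created": "2012-03-06",
  "createdAt": "2012-03-06T12:00:00Z"
}
```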
When expanded, I get the following:
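Roughly, naively expanding each key independently would produce something like this, with the same IRI appearing twice as a key (a simplified sketch of the expanded form):

```json
{
  "@id": "http://example.com/doc",
  "http://purl.org/dc/terms/created": "2012-03-06",
  "http://purl.org/dc/terms/created": {
    "@value": "2012-03-06T12:00:00Z",
    "@type": "http://www.w3.org/2001/XMLSchema#dateTime"
  }
}
```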
Of course, this is illegal, so expansion rules need to consider that multiple keys may need to be resolved:
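That is, the values would end up merged under a single key, roughly:

```json
{
  "@id": "http://example.com/doc",
  "http://purl.org/dc/terms/created": [
    "2012-03-06",
    {
      "@value": "2012-03-06T12:00:00Z",
      "@type": "http://www.w3.org/2001/XMLSchema#dateTime"
    }
  ]
}
```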
Now when we go to compact, it becomes even more difficult. Do we use two different keys and re-split this? Looking up the appropriate term becomes much more complicated.
This problem would be simplified if we restricted a context to at most one mapping from a term/CURIE/IRI to an IRI, allowing a reverse map. As the context algorithm is specified now, this is pretty much what happens:
This describes that the mapping is from IRIs, not terms. I suggest we keep this algorithm and restrict the context to have only a single mapping from term to IRI and no mapping from CURIEs or IRIs to an IRI, depending on the prefix being a term.
Still, even as it is now, a user could specify two terms mapping to the same IRI, in which case the term used for compaction becomes undefined, as does any coercion rule applied.