Define exactly how (IRI) compaction is supposed to work #113

Closed
@lanthaler

Description

In the context of issue #74 we were already discussing how compaction should work and came to the conclusion that:

The intent for how IRI conflicts are resolved when compacting/expanding: any conflicts between terms that use the same IRI will use the most specific solution when compacting (for example, when compacting "foo": "5" and having to pick between a term that specifies "xsd:integer" as the type and one that doesn't, the one that specifies "xsd:integer" is selected).
If there is no solution that is more specific than the other, then a lexicographical comparison is made between the terms in the @context, and the lexicographically least term and its associated @type and other information are used to expand the data.

We didn't describe it any more specifically than that. This led to the situation that Gregg and Dave implemented the algorithm completely differently than I did. My strategy is to choose the term that eliminates the most expanded object forms (in lists, for arrays, every value is compacted separately now). If several terms achieve the same rank according to that criterion, the lexicographically least term is chosen.

Edit: I updated my algorithms based on the feedback; please read the algorithms in the comments below instead!

My IRI compaction algorithm looks like this:

  - calculate rank of full IRI
  - calculate rank for every term/prefix
  - choose term/compact IRI/IRI with highest rank, falling back to lexicographically least
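
As a rough illustration, the selection step could be sketched in Python as follows. This is not the code from Processor.php: `choose_candidate` and `rank_of` are hypothetical names, the ranking function is assumed to be supplied by the caller, and Python's code-point string ordering stands in for "lexicographically least".

```python
from typing import Callable, Iterable, Union

# A rank is an integer, or False for "this candidate must never be chosen".
Rank = Union[int, bool]


def choose_candidate(
    full_iri: str,
    candidates: Iterable[str],
    rank_of: Callable[[str], Rank],
) -> str:
    """Pick the candidate with the highest rank; ties go to the least string."""
    ranked = []
    for candidate in [full_iri, *candidates]:
        rank = rank_of(candidate)
        if rank is False:  # excluded candidate (note: 0 is a valid rank)
            continue
        # Negate the rank so that min() prefers the highest rank first and,
        # among equal ranks, the lexicographically least candidate.
        ranked.append((-rank, candidate))
    if not ranked:
        return full_iri  # assumption: fall back to the full IRI
    return min(ranked)[1]
```

Sorting on the pair (negated rank, candidate) keeps the tie-break in one place, so no separate second pass over equally ranked candidates is needed.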

The term rank calculation for a term (could also be a (compact) IRI) and a value works as follows:

  - set rank to 0
  - if it's a term that is defined in the context rank++ (we prefer terms)
  - if value is a `@list` object
    - if term has a list-container, rank++
    - check for every item in the list if the result of compaction would be a scalar, 
      if so rank++ else rank--
  - otherwise
    - if term has a list-container, return rank = false (will never be chosen)
    - if term has a set-container, rank++
    - check if the result of compacting value would be a scalar, if so rank++ else rank--
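
The rank calculation above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the Processor.php implementation: the context is assumed to be a plain dict mapping terms to definition dicts, `term_rank` and `compacts_to_scalar` are hypothetical names, and the scalar check is a deliberately simplified stand-in (it only compares `@type` and treats any `@language` as non-scalar).

```python
def compacts_to_scalar(value, definition):
    """Simplified guess at whether compacting `value` would yield a scalar."""
    term_type = (definition or {}).get("@type")
    if not isinstance(value, dict):
        return True  # already a scalar
    if set(value) == {"@id"}:
        return term_type == "@id"  # node reference needs @id coercion
    if "@value" in value:
        # The value's type must equal the term's type (both may be absent).
        return value.get("@type") == term_type and "@language" not in value
    return False


def term_rank(candidate, value, context, scalar_test=compacts_to_scalar):
    """Rank a term, compact IRI or full IRI for a value; False = never choose."""
    definition = context.get(candidate)  # None unless defined as a term
    container = (definition or {}).get("@container")
    rank = 0
    if definition is not None:
        rank += 1  # we prefer terms
    if isinstance(value, dict) and "@list" in value:
        if container == "@list":
            rank += 1
        for item in value["@list"]:  # every list item is ranked separately
            rank += 1 if scalar_test(item, definition) else -1
    else:
        if container == "@list":
            return False  # a list term can never hold a non-list value
        if container == "@set":
            rank += 1
        rank += 1 if scalar_test(value, definition) else -1
    return rank
```

Under these assumptions, a list-container term scores 4 for a `@list` object holding two plain values (1 for being a term, 1 for the container, 1 per item), while the same term returns `False` for any non-list value.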

You can find the full implementation in Processor.php of JsonLD.

I think this algorithm is much easier to implement and much easier to understand than the current algorithm in the spec, so I propose changing the spec to use it. The only open issue is whether a term with a mismatching type should ever match.

Consider for example the following:

{ 
    "http://example.org/term1": { 
        "@type": "http://example.org/different-datatype", 
        "@value": "v1" 
    }
}

Should this be compacted to

{
    "@context": {
        "term1": {
            "@id": "http://example.org/term1",
            "@type": "http://example.org/datatype"
        }
    },
    "term1": {
        "@type": "http://example.org/different-datatype",
        "@value": "v1"
    }
}

or not?

Another question is whether we compact to empty suffixes, so the example above could also be compacted to "term1:" (note that the suffix is empty which would drop the type coercion).
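
To make the empty-suffix case concrete, here is a small Python sketch. It assumes the term is also usable as a prefix; `compact_with_prefix` and its `allow_empty_suffix` flag are hypothetical and only illustrate the two possible answers to the question.

```python
def compact_with_prefix(iri, prefix, prefix_iri, allow_empty_suffix):
    """Return 'prefix:suffix' for iri, or None if the prefix does not apply."""
    if not iri.startswith(prefix_iri):
        return None
    suffix = iri[len(prefix_iri):]
    if suffix == "" and not allow_empty_suffix:
        return None  # empty suffixes are rejected in this mode
    return f"{prefix}:{suffix}"


iri = "http://example.org/term1"
# With empty suffixes allowed, the property key becomes "term1:", a compact
# IRI rather than the term itself, so the term's @type coercion is not applied.
with_empty = compact_with_prefix(iri, "term1", iri, allow_empty_suffix=True)
# With empty suffixes disallowed, this prefix yields no candidate at all.
without_empty = compact_with_prefix(iri, "term1", iri, allow_empty_suffix=False)
```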
