Should each member in a list contribute to term rank? #172

lanthaler · 2012-10-22T15:58:46Z

There have been some discussions on the mailing list about what the outcome of the following compaction should be.

Input

{
  "@id": "http://example.com/id1",
  "http://example.com/term": [
    {
      "@list": [
        { "@value": "v1.1", "@language": "de" },
        { "@value": "v1.2", "@language": "de" },
        { "@value": "v1.3", "@language": "de" },
        4,
        { "@value": "v1.5", "@language": "en" },
        { "@value": "v1.6", "@language": "en" }
      ]
    },
    {
      "@list": [
        { "@value": "v2.1", "@language": "en" },
        { "@value": "v2.2", "@language": "en" },
        { "@value": "v2.3", "@language": "en" },
        4,
        { "@value": "v2.5", "@language": "de" },
        { "@value": "v2.6", "@language": "de" }
      ]
    }
  ]
}

Context

{
  "@context": {
    "@language": "de",
    "term1": {
      "@id": "http://example.com/term", "@container": "@list" },
    "term2": {
      "@id": "http://example.com/term", "@container": "@list", "@language": "en" }
  }
}

Please note that term1 uses the context's default language, i.e., de, whereas term2 uses en; otherwise they have exactly the same definition.

The question is whether both lists should be compacted to term1 (and thus trigger an error) or whether term2 should be chosen instead for list 2 as there are more matches (three en compared to two de).
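
To make the counts concrete, here is a rough Python sketch that tallies the language-tagged strings in each list against the language each term would apply. The names `language_counts` and `matches_per_term` are made up for illustration. The tally is not the term ranking algorithm from the spec: it only counts tagged strings and ignores the plain number 4.

```python
from collections import Counter

# The two lists from the input above.
LIST_1 = [
    {"@value": "v1.1", "@language": "de"},
    {"@value": "v1.2", "@language": "de"},
    {"@value": "v1.3", "@language": "de"},
    4,
    {"@value": "v1.5", "@language": "en"},
    {"@value": "v1.6", "@language": "en"},
]
LIST_2 = [
    {"@value": "v2.1", "@language": "en"},
    {"@value": "v2.2", "@language": "en"},
    {"@value": "v2.3", "@language": "en"},
    4,
    {"@value": "v2.5", "@language": "de"},
    {"@value": "v2.6", "@language": "de"},
]

# term1 has no @language of its own, so it is assumed to inherit the default "de".
TERM_LANGUAGES = {"term1": "de", "term2": "en"}


def language_counts(items):
    """Count language-tagged strings per language; untagged items are skipped."""
    return Counter(
        item["@language"]
        for item in items
        if isinstance(item, dict) and "@language" in item
    )


def matches_per_term(items):
    """Number of tagged strings whose language equals each term's language."""
    counts = language_counts(items)
    return {term: counts[lang] for term, lang in TERM_LANGUAGES.items()}


print(matches_per_term(LIST_1))
print(matches_per_term(LIST_2))
```

For list 1 this gives three matches for term1 and two for term2; for list 2 it is the other way around.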

msporny · 2012-11-20T01:00:37Z

OPTION 1: Both lists should be compacted to term1.
OPTION 2: term2 should be chosen for list 2.
PROPOSAL 1: Lists-of-lists should only throw an error in JSON-LD when converting the document to RDF.

msporny · 2012-11-20T01:05:48Z

I don't want to complicate the term ranking algorithms any more than what they are right now. It seems as if this may be a corner-case. Since term2 doesn't match every item in the list, I think term1 should be picked instead for both lists. I don't know if this should throw an error... converting toRDF should throw an error, but probably not compaction? What do other folks feel about whether lists-of-lists should be allowed in regular JSON-LD, but when converting to RDF, they should throw an error?

OPTION 1: +1
OPTION 2: -1
PROPOSAL 1: +1

tidoust · 2012-11-20T09:50:26Z

I thought lists-of-lists were not allowed in this version of JSON-LD specifically because of the added complexity with regards to algorithms?

Provided the term ranking algorithm is clear enough for implementers, I don't think there's a quick-and-easy way to improve it without introducing further complications.

OPTION 1: +1 (i.e. follow the current algorithm)
OPTION 2: -1 (no change to the algorithm to support this corner case)
PROPOSAL 1: -1. If we don't want lists-of-lists, it would be good not to fail silently when one is found.

lanthaler · 2012-11-20T10:33:57Z

OPTION 1: -1, it is quite obvious what the best match would be and this is not a corner case IMHO
OPTION 2: +1, because term2 is able to compact 4 elements instead of just 2 as term1 does
PROPOSAL 1: -1 as most algorithms would probably have to be changed

Honestly, I'm a bit surprised by your choices. Would your opinion change if the context looked like this:

{
  "@context": {
    "term1": { "@id": "http://example.com/term", "@container": "@list", "@language": "de" },
    "term2": { "@id": "http://example.com/term", "@container": "@list", "@language": "en" }
  }
}

Actually, I don't think this adds complexity; if anything, it would remove some.

lanthaler · 2012-11-20T10:39:15Z

As I just found out, the playground fails completely in this example as it collapses the two lists into one: http://bit.ly/UQG2C7

gkellogg · 2012-11-20T13:14:02Z

The problem with this example is that it subtly depends on narrow specifics of the term ranking algorithm. We might just consider this non-conforming and leave it up to the processor to select one; there's no obviously right answer. Alternatively, just make sure that the choice (and the examples) are appropriate for the specified algorithm, which this is not.

PROPOSAL 1: -1, lists of lists are not well formed, so Postel's rule applies.

lanthaler · 2012-11-20T13:31:20Z

Why should such a list be non-conforming? Why is this an "inappropriate choice for the specified algorithm"? This is not a list of lists; there are two separate lists in this example.

gkellogg · 2012-11-20T14:45:24Z

What could be non-conforming is the selection of lists having multiple languages along with language maps. If not conforming, it's certainly a pathological corner-case.

Regarding the example, I noted this in d1b3ad3 and in this email.

The ranks in this test add up as follows (using the spec'ed algorithm):

first list, term1/term2: 13/7
second list, term1/term2: 11/10

This is why the example isn't appropriate. If you definitely want to have term2 selected for the second list, it should be less ambiguous as to how it turns out. If the point of the example is to show that your alternate algorithm is better, fine, but if we have tests, they should be consistent with the algorithm specified.

In any case, this is like rearranging armchairs on the Titanic. It's a corner case, and there's no absolutely right answer.

lanthaler · 2012-11-20T16:56:59Z

We are not discussing language maps in this example.

I know what the spec'ed algorithm sums up to, and in my opinion it shouldn't sum up to these numbers. If you use the following context it would in fact separate the lists.

{
  "@context": {
    "term1": { "@id": "http://example.com/term", "@container": "@list", "@language": "de" },
    "term2": { "@id": "http://example.com/term", "@container": "@list", "@language": "en" }
  }
}

I can't see any compelling argument why the context above should yield a different result than the context below:

{
  "@context": {
    "@language": "de",
    "term1": { "@id": "http://example.com/term", "@container": "@list" },
    "term2": { "@id": "http://example.com/term", "@container": "@list", "@language": "en" }
  }
}
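
A small Python sketch of that point, under the assumption that a term without its own @language simply inherits the context's default. The helper `effective_languages` is hypothetical and only looks at the language, nothing else in the term definition.

```python
def effective_languages(context):
    """Map each term to the language it applies, falling back to the default.

    Assumption: a term definition that omits "@language" inherits the
    context-level "@language"; an explicit null (None) stays None.
    """
    ctx = context["@context"]
    default = ctx.get("@language")
    result = {}
    for name, definition in ctx.items():
        if name.startswith("@"):
            continue  # skip keywords such as "@language"
        if isinstance(definition, dict):
            result[name] = definition.get("@language", default)
        else:
            result[name] = default  # plain IRI mapping, no language of its own
    return result


EXPLICIT = {
    "@context": {
        "term1": {"@id": "http://example.com/term", "@container": "@list", "@language": "de"},
        "term2": {"@id": "http://example.com/term", "@container": "@list", "@language": "en"},
    }
}
DEFAULTED = {
    "@context": {
        "@language": "de",
        "term1": {"@id": "http://example.com/term", "@container": "@list"},
        "term2": {"@id": "http://example.com/term", "@container": "@list", "@language": "en"},
    }
}

print(effective_languages(EXPLICIT) == effective_languages(DEFAULTED))
```

Both contexts resolve to term1 = de and term2 = en, which is why I would expect the same compaction result from either.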

> If the point of the example is to show that your alternate algorithm is better, fine, but if we have tests, they should be consistent with the algorithm specified.
> In any case, this is like rearranging armchairs on the Titanic. It's a corner case, and there's no absolutely right answer.

Well, in my opinion the most important tests are the ones that test corner cases. I tried several times to have a discussion about the algorithms but it seems to be impossible because "there are already two implementations" and that's "how it is currently specified" so I'll just give up on this.

gkellogg · 2012-11-20T17:23:02Z

If Term Rank needs to be rewritten anyway to take into consideration language maps and other resolutions, then I think it's fair game to play with and improve. If it doesn't need to be updated, I'd just say leave well enough alone.

lanthaler · 2012-11-27T16:06:04Z

RESOLVED: When compacting lists, the most specific term that matches all of the elements in the list, taking into account the default language, must be selected.
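
As a rough illustration of the resolution (not the algorithm text that ends up in the spec), the Python sketch below picks a term only if its language matches every element of the list. Everything here is an assumption made for the example: the helper names, treating an untagged item such as the number 4 as having no language, and breaking ties by shortest and then alphabetically first term.

```python
def item_language(item):
    """Language of a list element; None for anything without @language."""
    if isinstance(item, dict) and "@language" in item:
        return item["@language"]
    return None


def select_term(items, term_languages):
    """Return a term whose language matches ALL items, or None if there is none.

    term_languages maps term name -> effective language (default already applied).
    Assumed tie-break: shortest term first, then alphabetical order.
    """
    candidates = [
        term
        for term, lang in term_languages.items()
        if all(item_language(item) == lang for item in items)
    ]
    if not candidates:
        return None  # caller would fall back to the full IRI
    return min(candidates, key=lambda term: (len(term), term))


TERMS = {"term1": "de", "term2": "en"}

ALL_GERMAN = [{"@value": "a", "@language": "de"}, {"@value": "b", "@language": "de"}]
ALL_ENGLISH = [{"@value": "a", "@language": "en"}]
MIXED = [
    {"@value": "v2.1", "@language": "en"},
    4,
    {"@value": "v2.5", "@language": "de"},
]

print(select_term(ALL_GERMAN, TERMS))
print(select_term(ALL_ENGLISH, TERMS))
print(select_term(MIXED, TERMS))
```

Under these assumptions, neither list from the original example is matched by term1 or term2, because each list mixes de, en, and an untagged number.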

This addresses #172.

This addresses #113, #160, #172, and #202.

lanthaler · 2012-12-20T20:57:18Z

I've updated all algorithms. Unless I hear objections, I will close this issue in 24 hours.

lanthaler mentioned this issue Nov 19, 2012

Define exactly how (IRI) compaction is supposed to work #113

Closed

gkellogg mentioned this issue Nov 20, 2012

Add '@language' container type #133

Closed

lanthaler added a commit that referenced this issue Dec 8, 2012

Update compact-0018 according resolution of issue #172

f677925

lanthaler added a commit that referenced this issue Dec 8, 2012

Add compaction test for mixed lists

3abfe61

This addresses #172.

lanthaler added a commit that referenced this issue Dec 8, 2012

Include mixed list in compact-0018 to test fallback

dc37b13

This addresses #172.

lanthaler added a commit that referenced this issue Dec 20, 2012

Update the Compaction and IRI Compaction algorithms

c748a99

This addresses #113, #160, #172, and #202.

lanthaler closed this as completed Dec 21, 2012
