Multiple graphs syntax #68

Closed
davidlehn opened this issue Jan 23, 2012 · 18 comments

Comments

@davidlehn
Member

The current specs do not explicitly specify how multiple graphs are represented and processed. I think this was in a former version of the spec but perhaps it got lost? There are currently two methods, a top level array and an @id array:

[
   { ... },
   { ... },
   ...
]

{
   "@context": { ... },
   "@id": [
      { ... },
      { ... },
      ...
   ]
}

The specs should spell out these two ways of doing things and list the advantages of each. For instance, the @id array method allows you to have a common top-level @context that will be preserved when compacting data. Syntax and processing tests should also be written.
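
To make the difference between the two forms concrete, here is a minimal Python sketch. It assumes the draft syntax described in this comment (the "@id" array form is not part of JSON-LD 1.0), and `top_level_nodes` is a hypothetical helper, not part of any JSON-LD library. As a simplification, a node-level @context replaces the shared one here instead of being merged with it.

```python
from typing import Any, Optional


def top_level_nodes(doc: Any) -> list[tuple[Optional[dict], dict]]:
    """Return (context, node) pairs for either draft form of a multi-node document.

    Hypothetical helper for illustration only; it does no real JSON-LD processing.
    """
    if isinstance(doc, list):
        # Form 1: top-level array. Each node carries its own @context, if any.
        return [(node.get("@context"), node) for node in doc]
    if isinstance(doc, dict) and isinstance(doc.get("@id"), list):
        # Form 2: "@id" array. The shared top-level @context applies to every
        # node unless the node declares its own (simplification: no merging).
        shared = doc.get("@context")
        return [(node.get("@context", shared), node) for node in doc["@id"]]
    # Anything else is treated as a single node.
    return [(doc.get("@context"), doc)]


ctx = {"name": "http://xmlns.com/foaf/0.1/name"}
array_form = [{"@context": ctx, "name": "A"}, {"@context": ctx, "name": "B"}]
id_form = {"@context": ctx, "@id": [{"name": "A"}, {"name": "B"}]}

# Both forms describe the same two nodes under the same context, but only the
# second form states the context once.
pairs_array = top_level_nodes(array_form)
pairs_id = top_level_nodes(id_form)
```

The sketch shows why the "@id" array form is attractive for compaction: the context is stated once and stays attached to the document as a whole.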

Also see:

@lanthaler
Member

This is still in the spec but well hidden (in the RDFa example):

http://json-ld.org/spec/latest/json-ld-syntax/#rdfa

@lanthaler
Member

I think the easiest solution for this issue is to just remove the @id: @id "optimization".

So, requiring an array at the top level would be the cleanest solution IMO. Since the graphs are disjoint, it would even be fair to assume that they don't share the same context. If they really are the same, then, well, one has to define them twice. Is that really such a bad thing?

@lanthaler
Member

Ivan proposed a @data keyword that could be used as the "root" element. I think that's overhead for an optimization that we don't really need. Anyone who had to serialize a large number of disjoint graphs could simply put the context in an external file.

@msporny
Member

msporny commented Feb 5, 2012

Markus said:

Since the graphs are disjoint, it would even be fair to assume that they don't share the same context.

This is not true for RDFa - you have plenty of disjoint graphs that share the same set of prefixes. In fact, this is the common case, not an exception.

Defining a context twice used to be very bad, but now that we can define external contexts, it may not be so bad. I agree that the "@id": [{}, {}, {}] idiom is not very intuitive to a beginner and that processing such graphs in a programming language could be a bit annoying. In this case, having an array of objects (which is what normalization does) may be the better approach.

That is, we may want to only have one way of expressing disjoint graphs. Maybe we should consider getting rid of the "@id": [{}, {}, {}] syntax now that we can specify external contexts?

The other alternative is to treat the first array item in a special way if it just contains a @context. Although, that seems almost as strange as the "@id": [{}, {}, {}] syntax.

The other reason for the "@id" syntax is to support multiple graphs in the future without having to introduce a new idiom. So, this is how you could make statements about a graph in the future:

{
  "@context": { ... },
  "@graph": "http://example.com/graphs/ABCD",
  "@id": [{}, {}, {}],
  "prov:iri": "http://example.com/foo/bar/1234",
  "prov:retrievalDate": "20120205T143245"
}

Note that no changes to the current syntax are required other than adding a simple "@graph" keyword. We thought about this problem quite a bit in the beginning and that's where the "@id": [{}, {}, {}] syntax comes from. We could get rid of it, but know that we also get rid of this potential future named graph solution in the process.

@lanthaler
Member

OK, I see the use case, but I still don't like the @id idiom. @data looks nice but has a number of corner cases to work around. So I'm not really sure yet what to think.

@niklasl
Member

niklasl commented Feb 6, 2012

I think that @data is more clear than using a list value for @id, so I'd be in favour of that.

I've also thought a bit about getting the context from the first item of a top-level array, as Manu mentioned. While that is a bit strange, I do think that it has merit. For one, it's a top-level array, which is a special case anyway.

Furthermore, it would allow for stream-based parsing of gigantic graphs in JSON-LD, since a stream parser could pop the first item, parse its context, and then apply it to each subsequently parsed item in turn. Not the common case for sure, but it has come up in the past. (Notice that this would be a safer design than relying on specific implementations preserving the order of keys, which would be necessary if the top level were an object with one @context key followed by the data items, because, formally, that key might only become available after the huge stream of items.)

And even if it is considered funny to apply the first @context found in a list of objects to the other objects, I can see the logic in it. I'm sure we can explain it properly. Is there any situation where it might cause real harm?
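
The streaming idea can be sketched in a few lines of Python. This is only an illustration of the convention proposed in this comment, which is not JSON-LD behaviour; `with_leading_context` is a hypothetical name, and the sketch assumes the items arrive one at a time as already-parsed dictionaries.

```python
from typing import Any, Iterable, Iterator


def with_leading_context(items: Iterable[dict]) -> Iterator[dict]:
    """Apply the @context of a context-only first item to every later item.

    Illustrative only. Items are consumed one at a time, so the input can be
    a lazy stream of arbitrary length.
    """
    iterator = iter(items)
    first = next(iterator, None)
    if first is None:
        return
    if set(first) == {"@context"}:
        shared: Any = first["@context"]
    else:
        # The first item is ordinary data: no shared context, pass it through.
        shared = None
        yield first
    for item in iterator:
        if shared is not None and "@context" not in item:
            # Build a new dict so the caller's object is not mutated.
            item = {"@context": shared, **item}
        yield item


stream = iter([
    {"@context": {"name": "http://xmlns.com/foaf/0.1/name"}},
    {"@id": "http://example.com/a", "name": "A"},
    {"@id": "http://example.com/b", "name": "B"},
])
result = list(with_leading_context(stream))
```

Because the context is the first array element, it is available before any data item is read, which is the ordering guarantee an object with a @context key cannot give.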

@niklasl
Member

niklasl commented Feb 16, 2012

There is a rather appealing alternative to @data as a keyword: we can use @set for this.

Given that we're already considering "@container": "@set" to mean that a term must always have its values in JSON arrays (see #44 and #60), we can gain a lot of design uniformity here. The form would be:

{
  "@context": { ... },
  "@set": [
      {"@id": ...},
      {"@id": ...},
      ...
  ]
}

This would also pave the way for graphs, if we want to go there in the future. The form of one identified graph would be:

{
  "@context": { ... },
  "@id": "http://example.org/dataset#graph",
  "@set": [ ... ]
}

(Given that it's rather obvious what a set of graphs would look like: the objects in the set would carry sets.)

Also, for symmetry, we may consider a similar form to represent top-level (identified) rdf:Lists (see #75). Just replace @set above with @list (in turn making it symmetrical with the non-coerced literal list notation):

{
  "@context": { ... },
  "@id": "http://example.org/a_list",
  "@list": [ ... ]
}

Although (as I've commented on that) I'm not sure about the value of that usage, it is a fact that Turtle allows a similar construct, so it may be worth considering.

@gkellogg
Member

Being able to define a top-level item as being a @set or @list has numerous advantages, potentially including Markus' thought about one document referencing a list defined in another. However, the use of @id doesn't really work.

For the multiple-graph case, TriG semantics would have each graph optionally have its own IRI (still under discussion in RDF WG). Using @id would seem to provide an IRI for every graph in the set. I think we'd need something different:

{
  "@context": { ... },
  "@graphs": [
    {
      "@id": "context",
      "@graph": []         // Could be either an array or an object
    },
    {
      "@graph": { ... }
    }
  ]
}

This would define one named graph, with the array listing a set of objects, and one default graph, with a rooted tree-based graph.

The use of @set would be more useful to reference an array of objects that are at the top level, and not the value of some predicate. Same for @list, if we decide to go there. I would say that the use of @set or @graphs might be limited to the top-level object.

An alternative for graphs might be to use @graph as a key in an object definition, indicating that the object defines a new graph:

{
  "@context": { ... },
  "@graph": "graph-iri",
  "@id": ...
}

This would define this node as rooting a newly named graph, or if @set is used, a set of rooted objects that are all in that graph.

@lanthaler
Member

I like the conciseness of Niklas' proposal and think that such a syntax makes it really clear what's meant. Gregg, could you elaborate why "the use of @id doesn't really work"? I'm not sure if I understand what you are saying.

@gkellogg
Member

Having researched this, I could be wrong. My understanding was that the domain of rdf:first/rdf:rest needed to be a BNode, but I think this is really a restriction on the syntactic forms of Turtle/N3 and RDF/XML. RDFa supports using @inlist, and I believe it will not support list items having an IRI subject using this notation.

This implies that JSON-LD could support lists whose elements (including the first) have IRI subjects; I just don't think it's a good idea.

Consider two documents, the first references a second document intended to contain a list:

Doc 1:

{
  "@context": { ... },
  "@id": "http://example.com/music-album/",
  "@tracks": "http://example.com/music-album/list-of-tracks/"
}

Doc 2:

{
  "@context": { ... },
  "@type": "rdf:List",
  "@list": [
    "http://example.com/music-album/track-1",
    "http://example.com/music-album/track-2"
  ]
}

This would result in the following (2) Turtle documents:

Doc1.ttl

@prefix ma: <http://example.com/music-album/> .
ma: <tracks> ma:list-of-tracks .

Doc2.ttl

( ma:track-1 ma:track-2 ) a rdf:List .

Giving the list an @id would create a non-conformant rdf:List. Of course, the other fact is that the first document implicitly gives the list an IRI, if you consider that the URL of Doc2 could be considered the subject of that document, though not by RDF rules. Linked Data basically says that the object SHOULD reference a resource returning a description. From our requirements doc:

An IRI that is a label in a linked data graph should be dereferencable to a Linked Data document describing the labeled subject, object or property.

I think the same reasoning applies to @set.

@lanthaler
Member

RESOLVED: The @graph keyword is used to express a collection of one or more JSON objects (to express a graph which may or may not be fully connected)

@lanthaler
Member

Clarification: This does not mean that we have a named graph solution, but we do believe that it is forward-compatible with such a solution.
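
For readers skimming the thread, the resolved shape can be sketched as follows in Python. The helper `graph_nodes` is hypothetical, and treating a lone object under "@graph" as a one-element collection is an assumption drawn from the wording "one or more JSON objects" in the resolution.

```python
from typing import Any


def graph_nodes(doc: dict) -> list[dict]:
    """Return the node objects listed under "@graph" as a list.

    Hypothetical helper for illustration; a lone object is wrapped in a list.
    """
    value: Any = doc.get("@graph", [])
    return value if isinstance(value, list) else [value]


doc = {
    # One shared context for every node in the collection.
    "@context": {"name": "http://xmlns.com/foaf/0.1/name"},
    "@graph": [
        {"@id": "http://example.com/a", "name": "A"},
        # This node is not linked to the first one: the graph need not be connected.
        {"@id": "http://example.com/b", "name": "B"},
    ],
}
nodes = graph_nodes(doc)
single = graph_nodes({"@graph": {"@id": "http://example.com/c"}})
```

This keeps the single shared @context that motivated the "@id" array form, without overloading @id.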

@lanthaler
Member

We still have to define what happens in cases when a JSON-LD document contains something like:

{
   "@context" : { ... },
   "@id" : "http://blabla",
   "other": "statements",
   ....
   "@graph": [ { ... }, { ... }, ... ]
}

@niklasl
Member

niklasl commented Feb 28, 2012

For the record, I agreed with Gregg et al. that @set cannot be overloaded to mean graphs without some tricky conflation happening.

(E.g. if an expanded form of a value is {"@set": ...}, it would be unclear if it means a quoted graph (like in N3), or an unordered set of values for the given subject and property. So we keep these separate, use @graph here, and deal with @set in #44.)

@gkellogg
Member

gkellogg commented Apr 9, 2012

See 9885ffe, where I started a description of @graph and named graphs.

@paulwilton

this seems to be a workable way of representing multiple named graphs in a single doc:

{
    "@context": {
        "Person": "http://xmlns.com/foaf/0.1/Person",
        "name": "http://xmlns.com/foaf/0.1/name",
        "knows": "http://xmlns.com/foaf/0.1/knows"
    },
    "@set": [
        {
            "@id": "urn:graphs:1",
            "@graph": {
                "@id": "http://person.A",
                "@type": "Person",
                "name": "A",
                "knows": "http://person.B"
            }
        },
        {
            "@id": "urn:graphs:2",
            "@graph": {
                "@id": "http://person.B",
                "@type": "Person",
                "name": "B",
                "knows": "http://person.A"
            }
        }
    ]
}

When converted to N-Quads, this gives:

<http://person.A> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> <urn:graphs:1> .
<http://person.A> <http://xmlns.com/foaf/0.1/knows> "http://person.B" <urn:graphs:1> .
<http://person.A> <http://xmlns.com/foaf/0.1/name> "A" <urn:graphs:1> .

<http://person.B> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> <urn:graphs:2> .
<http://person.B> <http://xmlns.com/foaf/0.1/knows> "http://person.A" <urn:graphs:2> .
<http://person.B> <http://xmlns.com/foaf/0.1/name> "B" <urn:graphs:2> .

@dlongley
Member

dlongley commented Apr 3, 2016

@paulwilton,

You can already express multiple named graphs in a single doc with JSON-LD 1.0 (or rather, this is the "correct" way to do it):

{
  "@context": {
    "Person": "http://xmlns.com/foaf/0.1/Person",
    "name": "http://xmlns.com/foaf/0.1/name",
    "knows": "http://xmlns.com/foaf/0.1/knows"
  },
  "@graph": [
    {
      "@id": "urn:graphs:1",
      "@graph": [
        {
          "@id": "http://person.A",
          "@type": "Person",
          "knows": "http://person.B",
          "name": "A"
        }
      ]
    },
    {
      "@id": "urn:graphs:2",
      "@graph": [
        {
          "@id": "http://person.B",
          "@type": "Person",
          "knows": "http://person.A",
          "name": "B"
        }
      ]
    }
  ]
}

Yields the same quads as above.

See: http://json-ld.org/playground/#/gist/6fb5526e2d567fa25760a3009203106c
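
As a rough cross-check of how the nested @graph form maps to the quads listed earlier in the thread, here is a minimal Python sketch. It is not a JSON-LD processor: `to_quads` is a hypothetical helper that assumes a flat term-to-IRI context, exactly two levels of @graph, @type values that are terms, and plain string values everywhere else. Because the context does not coerce "knows" to an IRI, its values come out as string literals, as they do in the quads above.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_quads(doc: dict) -> list[str]:
    """Flatten a two-level @graph document into N-Quads-like lines.

    Simplified for illustration; see the assumptions in the lead-in.
    """
    context = doc.get("@context", {})
    quads = []
    for graph in doc.get("@graph", []):
        graph_name = graph["@id"]  # the outer "@id" names the graph
        for node in graph.get("@graph", []):
            subject = node["@id"]
            for key, value in node.items():
                if key == "@id":
                    continue
                if key == "@type":
                    predicate = RDF_TYPE
                    obj = f"<{context.get(value, value)}>"
                else:
                    predicate = context[key]
                    obj = f'"{value}"'  # no type coercion: plain literal
                quads.append(f"<{subject}> <{predicate}> {obj} <{graph_name}> .")
    return sorted(quads)


doc = {
    "@context": {
        "Person": "http://xmlns.com/foaf/0.1/Person",
        "name": "http://xmlns.com/foaf/0.1/name",
        "knows": "http://xmlns.com/foaf/0.1/knows",
    },
    "@graph": [
        {"@id": "urn:graphs:1", "@graph": [
            {"@id": "http://person.A", "@type": "Person",
             "knows": "http://person.B", "name": "A"},
        ]},
        {"@id": "urn:graphs:2", "@graph": [
            {"@id": "http://person.B", "@type": "Person",
             "knows": "http://person.A", "name": "B"},
        ]},
    ],
}
quads = to_quads(doc)
```

For real data, a conformant JSON-LD library or the playground linked above should be used instead of this sketch.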

@paulwilton

great thanks @dlongley
a @graph within a @graph seems a little odd though.. @set seems a better construct for this? i.e. it is a set of named graphs.. not a graph of named graphs?
