
Support array position to property binding #146


Closed
msporny opened this issue Jul 18, 2012 · 8 comments


@msporny
Member

msporny commented Jul 18, 2012

Reto has asked us to consider binding properties to array positions, so that something like this:

"position": ["18.324235", "-36.4387934"],

would generate this RDF:

<> geo:position [ geo:latitude "18.324235"; geo:longitude "-36.4387934" ] .

One potential solution would have us create a context entry that looks like this:

"position": {"@id": "geo:position", 
             "@container": [{"@id": "geo:latitude"}, {"@id": "geo:longitude"}]}
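The mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a JSON-LD processor: the array-valued "@container" is the hypothetical syntax from this proposal, and the function name expand_positional is made up for the example.

```python
from typing import Any

# Hypothetical context entry in the shape proposed above. An array of
# property templates as the value of "@container" is NOT part of JSON-LD.
CONTEXT = {
    "position": {
        "@id": "geo:position",
        "@container": [{"@id": "geo:latitude"}, {"@id": "geo:longitude"}],
    }
}


def expand_positional(term: str, values: list[Any], context: dict) -> dict:
    """Turn a positional array into a node object keyed by property."""
    definition = context[term]
    template = definition["@container"]
    # Position carries the meaning, so a length mismatch is an error.
    if len(values) != len(template):
        raise ValueError(
            f"{term!r} expects {len(template)} values, got {len(values)}"
        )
    node = {slot["@id"]: value for slot, value in zip(template, values)}
    return {definition["@id"]: node}


expanded = expand_positional("position", ["18.324235", "-36.4387934"], CONTEXT)
print(expanded)
```

The resulting nested object corresponds to the blank node in the RDF above: one geo:position value carrying geo:latitude and geo:longitude.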
@dlongley
Member

Keep in mind that this potential solution might conflict with the idea of expanding a term to multiple IRIs (another possible feature under discussion: #142).

@retog

retog commented Jul 18, 2012

maybe: "position": {"@id": "geo:position", "@type": [{"@id": "geo:latitude"}, {"@id": "geo:longitude"}]}

@msporny
Member Author

msporny commented Jul 18, 2012

The chat that led to this issue:

[Wed 13:10] <reto>  hello, I was wondering if there's a way to do typed enumeration in json-ld something like "author" : ["Max", "Stirner", "Bayalues reuth"] with the context defining for which properties of the author-resource the 3 array elements are values for
[Wed 13:46] <manu-db>   reto, JSON-LD doesn't do data validation... you'd want to use something like JSON Schema for that.
[Wed 13:47] <manu-db>   or OWL
[Wed 13:47] <reto>  hi manu-db
[Wed 13:47] <reto>  this isn't about validation but about having a shortcut
[Wed 13:48] <manu-db>   reto, yes, you can do that
[Wed 13:48] <manu-db>   you can create terms for each of the items... but that's not really an enumeration.
[Wed 13:48] <manu-db>   so you could add this in your context:
[Wed 13:48] <manu-db>   "Stirner": "http://max.stirner.com/about#me"
[Wed 13:48] <manu-db>   is that what you mean?
[Wed 13:49] <reto>  I understand depending on the context a js-array is an rdf:list or repetition of the property
[Wed 13:49] <reto>  I think it would be nice to have a third variant
[Wed 13:50] <reto>  where you specify in the context that an array appearing as value of "author" is foaf:name, foaf:last and foaf:placeOfBirth in that order
[Wed 13:50] <manu-db>   reto, so the third variant that you'd want to see would be an enumeration type? ... oh, I see what you're saying.
[Wed 13:51] <dlongley-db>   reto: you don't just mean this, do you?
[Wed 13:51] <dlongley-db>   http://tinyurl.com/c6a5u5t
[Wed 13:51] <manu-db>   reto, seems like a pretty specialized solution... why can't you just solve it by doing "authorName": "Max Stirner", "authorLast": "Stirner", "placeOfBirth": "Bayalues reuth" ?
[Wed 13:52] <reto>  instead of having to write "author": {"firstName": "Max", "lastName": "Stirner", "placeOfBirth": "Bayalues reuth"} you have just "author": ["Max", "Stirner", "Bayalues reuth"]
[Wed 13:53] <reto>  I'm working on a service delivering auto-tags
[Wed 13:53] <reto>  and we were looking for a compact syntax
[Wed 13:53] <manu-db>   reto, my concern is that requires the developer to know quite a bit about the position of the item in the array having a special meaning. Seems like it would cause more confusion than it would help
[Wed 13:53] <manu-db>   the compact syntax detracts from the developer being able to easily look at the information and understand what it means...
[Wed 13:53] <manu-db>   why do you need the syntax to be that compact?
[Wed 13:54] <manu-db>   wouldn't author.firstName be better than author[0], when accessing the data?
[Wed 13:54] <reto>  well, very short namespaces also don't tell the reader what it is
[Wed 13:54] <reto>  in the concrete example it was offset and range
[Wed 13:54] <reto>  which appear as a tuple very often
[Wed 13:55] <reto>  and {"o": 145, "r": "4"} isn't very readable either
[Wed 13:55] <manu-db>   I see...
[Wed 13:55] <manu-db>   yes, but {"offset": 154, "range": 4} is very readable, no?
[Wed 13:56] <manu-db>   why does it need to be that compact?
[Wed 13:56] <reto>  well, if we would just define the json we would certainly choose the compact variant
[Wed 13:56] <manu-db>   also, keep in mind that you're losing understanding by over-shortening the keys, or removing them altogether (and putting them as values in an array)
[Wed 13:56] <reto>  similar to scala introducing tuples
[Wed 13:57] <manu-db>   right, but then, why aren't you just using JSON?
[Wed 13:57] <reto>  because we want the mapping to rdf !
[Wed 13:57] <manu-db>   is it a show-stopper to not be able to do "author": [a, b, c] ?
[Wed 13:57] -->|    gkellogg__ ([email protected]) has joined #json-ld
[Wed 13:57] |<--    gkellogg__ has left freenode (Client Quit)
[Wed 13:58] <reto>  let me talk to my colleague a sec
[Wed 13:58] <manu-db>   I'm trying to understand why "data": [offset, range] is so much more compelling than {"offset": 154, "range": 4}
[Wed 13:58] -->|    gkellogg__ ([email protected]) has joined #json-ld
[Wed 14:00] <reto>  we have documents where a mention occurs 1000 times so it makes a big difference
[Wed 14:00] <manu-db>   two reasons could be: 1) you're very concerned about the amount of data being transmitted, or 2) the compact notation is easier for your developers.
[Wed 14:00] <reto>  I think another example might be coordinates
[Wed 14:00] <reto>  (x,y,z)
[Wed 14:00] <manu-db>   yes, but how much extra data is that?
[Wed 14:01] <reto>  our main concern is the amount of data
[Wed 14:01] <manu-db>   I'm trying to understand the argument for making the spec and processors more complicated than they are right now...
[Wed 14:01] |<--    gkellogg_ has left freenode (Ping timeout: 246 seconds)
[Wed 14:01] <manu-db>   so, if I'm going to make an argument for this - I need to be able to say: "Well, this amounts to multiple gigabytes of data saved" or some similar argument.
[Wed 14:01] <reto>  but for the coordinates it also makes things more readable
[Wed 14:02] <reto>  and the work around is polluting the ontology
[Wed 14:02] <manu-db>   why can't you just do a JSON.stringify(data, null, 2) to make it readable?
[Wed 14:02] <manu-db>   what's the work around?
[Wed 14:02] <manu-db>   and which ontology?
[Wed 14:03] <manu-db>   (and how is the data polluting the ontology?)
[Wed 14:03] <manu-db>   do you have a pastebin of some sample data that we could look at?
[Wed 14:03] <reto>  say we have an ontology with a property "topRightCorner" which takes as argument a Position, a Position in turn having x and y property
[Wed 14:04] <reto>  now not to have to write the verbose {"x":....} one would change the ontology to have List being the domain of topRightCorner
[Wed 14:05] <reto>  this way the json-ld looks nicer and more compact
[Wed 14:05] <reto>  but the graph looks much worse
[Wed 14:05] <reto>  and only the prose of the ontology says that the list must be of length 2
[Wed 14:05] <reto>  and that the first element is x and the second is y
[Wed 14:06] <reto>  (my colleague computing overhead )
[Wed 14:10] <reto>  my colleague says for document indexing data having the explicit properties as required now is way too large
[Wed 14:11] <reto>  at adobe we would very much be using json-ld
[Wed 14:12] <reto>  and this makes a big difference to us
[Wed 14:12] <dlongley-db>   is there a need to build indexes on the x and y properties?
[Wed 14:12] <dlongley-db>   i don't understand what the indexing issue is
[Wed 14:12] <reto>  I don't understand
[Wed 14:12] <dlongley-db>   if, in one case, you're only indexing on topRightCorner ... why change that in the other case ... but if in both cases you're indexing on topRightCorner, x, and y, why is there a size difference?
[Wed 14:13] <dlongley-db>   what is the specific document indexing size issue and how does this change in format resolve it?
[Wed 14:13] <manu-db>   reto - Does this summarize what you're asking for? https://github.com/json-ld/json-ld.org/issues/146
[Wed 14:13] <dlongley-db>   also, are you trying to solve an in-memory issue or a db-storage issue ... and if the latter, can you just use compression if you're dealing with 1000s of duplicated properties?
[Wed 14:16] <reto>  tissue-146: to be clear it should say rather than "generate the following rdf" "generate the following node as object of the position property"
[Wed 14:16] <reto>  dlongley-db: it's about bandwidth
[Wed 14:17] <dlongley-db>   not trying to be snarky, but why doesn't gzip solve this?
[Wed 14:17] <dlongley-db>   seems like a great case for high compression over the wire -- without losing ease/understandability of data access
[Wed 14:17] <dlongley-db>   topRightCorner.x and topRightCorner.y are better than .0 .1 (IMO)
[Wed 14:18] <dlongley-db>   or [0] [1] rather.
[Wed 14:21] <reto>  c'mon we are talking to have an elegant and compact syntax, of course we are using the best available compression algorithm for the network we're using to communicate with mars ;)
[Wed 14:21] <dlongley-db>   if the only reason for doing this is for data transmission -- then i would recommend just using gzip (which, internally, does its own mapping that is far more efficient)
[Wed 14:22] <dlongley-db>   my point is that -- if you're remapping just to achieve compaction (not for using the data/working for it), then gzip will do a better job.
[Wed 14:22] <dlongley-db>   if, however, there are other reasons for compacting it, then i can see doing this manual mapping.
[Wed 14:22] <reto>  really I think the arguments are the same as for supporting tuples in a language with scala, having object with properties is more explicit but that's a verbosity preventing people from using the language in some situations
[Wed 14:23] <dlongley-db>   i'm just not sure how useful/often people would want a feature like this.
[Wed 14:23] <dlongley-db>   (is it worth the understandability/processor complexity trade off?)
[Wed 14:24] <taaz>  perhaps this is yet another case for a specialized container type. in this case a xy or xyz sort of mapping.
[Wed 14:25] <reto>  I think for (x,y) it actually makes things more rather than less readable, in other situations you want to use the feature to have large table style data expressed in a more compact fashion
[Wed 14:25] <dlongley-db>   well, there may be a case for a more generalized container format for this
[Wed 14:27] <dlongley-db>   but .. the more containers we add, the more the syntax requires you to go through more steps before you can start to understand the data
[Wed 14:28] <dlongley-db>   ie: read the property, look it up in the context, look at the container type, know how the container type is formatted, then go back and look at the data.
[Wed 14:28] <dlongley-db>   more cognitive load; just a question of whether or not that's worth it.
[Wed 14:28] <reto>  we are asking for a way to factor out highly redundant schema information which occurs for each array element in arrays with complex element types that are all the same exact set of properties
[Wed 14:29] <dlongley-db>   (this could also be a target for framing ... not sure where that's headed at the moment)
[Wed 14:29] <dlongley-db>   reto: right, understood.
[Wed 14:29] <manu-db>   okay, reto - updated https://github.com/json-ld/json-ld.org/issues/146
[Wed 14:29] <manu-db>   we'll discuss on the call.
[Wed 14:29] <dlongley-db>   very "tabular" data.
[Wed 14:30] <reto>  great
[Wed 14:32] <reto>  I think its two use-cases: 1. tabular data, 2. conventional tuples (like x,y)
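The bandwidth argument in the chat (explicit keys versus positional arrays, with and without gzip) is easy to measure. The sketch below uses synthetic data, 1000 made-up (offset, range) pairs, so the numbers it prints say nothing about the actual payloads under discussion; it only shows how such a comparison could be run.

```python
import gzip
import json

# Synthetic stand-in for "a mention occurs 1000 times": the offsets and
# ranges are invented for illustration, not taken from real data.
mentions = [(i * 7, 4) for i in range(1000)]

# Same data in the two shapes debated above.
verbose = json.dumps([{"offset": o, "range": r} for o, r in mentions]).encode("utf-8")
compact = json.dumps([[o, r] for o, r in mentions]).encode("utf-8")

sizes = {
    "verbose": len(verbose),
    "compact": len(compact),
    "verbose_gzip": len(gzip.compress(verbose)),
    "compact_gzip": len(gzip.compress(compact)),
}

for label, size in sizes.items():
    print(f"{label:>13}: {size} bytes")
```

How much of the gap gzip closes depends on the data, so a run against a representative document would be needed before drawing conclusions either way.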

@niklasl
Member

niklasl commented Jul 18, 2012

I think this can be a valuable feature. With this, it may even become possible to use a JSON-LD context for mapping GeoJSON to RDF (similar to the initial example here).

I also think we should use @container somehow if we do this. While we have mainly used explicit keys for meaning in context definitions, the suggested use of an array value may be workable. But for GeoJSON like:

{
  "type": "LineString",
  "coordinates": [
    [102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]
  ]
}

we may need to combine an explicit @list or @set container declaration with this kind of expression somehow. Perhaps like:

"coordinates": {"@container": {"@list": [{"@id": "geo:latitude"}, {"@id": "geo:longitude"}]}}

Or if the use of just an array is too cryptic (and given that this is quite a feature to add), it might warrant the addition of a keyword, e.g. @tuple (still using an array for the "offset template"):

"coordinates": {"@container": "@list", "@tuple": [{"@id": "geo:latitude"}, {"@id": "geo:longitude"}]}

We must also make sure that the positional values can be coerced. The suggested form does that though, since one can add a complementary @type along with the @id of the target property.
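A rough Python sketch of how the @tuple variant might behave for a list of coordinate pairs, including coercion. Everything here is an assumption for illustration: @tuple is only a keyword floated in this comment, the xsd:double types were added to show coercion, and the function names are invented. The template order follows the example above; GeoJSON itself orders positions longitude first, so a real mapping would likely swap the two entries.

```python
from typing import Any

# Hypothetical term definition: "@tuple" is not a JSON-LD keyword, and the
# "@type" entries are added here only to demonstrate coercion.
COORDINATES = {
    "@container": "@list",
    "@tuple": [
        {"@id": "geo:latitude", "@type": "xsd:double"},
        {"@id": "geo:longitude", "@type": "xsd:double"},
    ],
}


def expand_tuple(values: list[Any], template: list[dict]) -> dict:
    """Map one positional array onto the properties named by the template."""
    if len(values) != len(template):
        raise ValueError(f"expected {len(template)} values, got {len(values)}")
    node = {}
    for slot, value in zip(template, values):
        if "@type" in slot:
            # Coerce by wrapping the value in a typed value object.
            node[slot["@id"]] = {"@value": value, "@type": slot["@type"]}
        else:
            node[slot["@id"]] = value
    return node


def expand_term(value: list[list[Any]], definition: dict) -> Any:
    """Expand an array of tuples, keeping order when the container is @list."""
    nodes = [expand_tuple(item, definition["@tuple"]) for item in value]
    if definition.get("@container") == "@list":
        return {"@list": nodes}
    return nodes


line = [[102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]]
result = expand_term(line, COORDINATES)
print(result["@list"][0])
```

Keeping the container declaration separate from the tuple template lets the outer array stay an ordered list while each inner array becomes its own node.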

(If we use @container as suggested, I don't think that'd conflict with a potential solution for #142, since the latter suggestion is to use an array as the value of @id in a term declaration.)

@lanthaler
Member

I don't know if we should really include this in JSON-LD 1.0, but if we do, I think we should do something along the lines of what Niklas proposed (@tuple).

@lanthaler
Member

RESOLVED: Do not support a declarative mechanism that is capable of mapping array position to an RDF property.

@ceefour
Contributor

ceefour commented Nov 19, 2014

Too bad (ironically) the chat is unreadable. The chat should mention who is saying what.

@davidlehn
Member

@ceefour: The chat comment above had the correct data but the names were being eaten by markdown as html tags. I edited the comment to put the irc log in a block so it's readable now (though no word wrapping, sorry). Also of interest may be the telecon scribe text for that issue: http://json-ld.org/minutes/2012-08-07/#topic-3.


7 participants