Ability to use default PREFIX values #70
It may also be useful to expose all of these preconfigured prefixes in the SPARQL service description. |
I agree on the idea, but we have to be careful with the implementation. Prefix.cc has a very polluted namespace; apparently some bots added crap there for a while. I also do not see any way to get rid of those entries, as there is zero reporting from what I could see. I once defined my own prefixes for exactly that reason; they are based on the RDFa initial contexts and extend them via a kind of semantic-versioning approach, see the GitHub repo for it. It is a 1:1 mapping, which is not the initial idea of prefixes. |
@ktk The way to deal with spam at prefix.cc is to downvote it—stuff with a bad vote ratio gets removed automatically. There never was a bot attack—the spam is hand-inserted and it's just one link a week or so on average. To the best of my knowledge, no one ever managed to hijack a popular prefix. Everything has |
I think the idea of a default context for things like |
Here we trade convenience for self-contained queries. Many SPARQL editors can auto-complete namespace prefixes, alleviating the usability pains. I think this is more of a tooling issue than a SPARQL specification issue. |
@jindrichmynarz Agreed this might not require spec changes. It seems clear that some deployments (e.g. Wikidata) have an interest in doing this server-side, though. Given that, I think we should at least be looking at best practices for how to communicate to a client which prefix definitions a server is using. That might take the form of an agreed upon vocabulary to use in the service description (and corresponding implementation in multiple endpoints). Even if we just wanted this to be a client-side tooling issue (without pre-defined namespaces), it might be valuable for endpoints to provide auto-complete information for namespace definitions to the clients for domain-specific namespaces that might not appear in a repository such as prefix.cc. As an implementor of both endpoints and a SPARQL query editor, I think both sides of this approach (prefixes in SD and client-side support for auto-completion of prefix declarations) would be really useful! |
@jindrichmynarz More tool support would be great, but for a tool to help with prefix management, it first needs to know about the prefixes. So how is a tool supposed to learn about the prefixes? Look them up in prefix.cc? Sure, but that doesn't contain private and project-specific prefixes, which outnumber the prefixes of well-known vocabularies in most queries (at least in industry). Some tools allow configuration of custom prefix mappings for autocomplete, but in what format? A JSON-LD context? An RDF file using the VANN vocabulary? A file with SPARQL/Turtle-style The SPARQL server usually already knows the prefixes, so why can't it communicate them to the SPARQL client? What happens in practice is that the SPARQL server operator sticks the prefixes manually into prefix.cc, and the client fetches them from there. So now SPARQL protocol server and SPARQL protocol client trade prefix information not through their shared protocol, but through an out-of-protocol single point of failure with a proprietary API. And I am paying for the bandwidth! |
I'm all for defining a way to include pre-defined namespace prefixes in the SPARQL 1.1 Service Description. |
I think that server-supplied namespace definitions ought to be a required part of the query output (something like "you used I have related concerns about client-tool-supplied pre-defined namespace definitions that are not clearly and blatantly visible to the user. The reason is simple: "default" or "generally preferred" expansions of given prefixes have, and will, change over time, and there is no way to assure that the expansion used when I ran a query yesterday is still in effect for today's execution of the "same" query (which is not the same, once the namespace associated with that prefix changes). Similarly, there is no way to know whether someone (let's presume a new user) intends their Just as Turtle documents are properly considered malformed if they omit declarations of any prefixes used therein, SPARQL queries should only be considered well-formed and self-contained if they include declarations for all prefixes used therein. Reliance on any external item -- whether server-supplied definitions, client-app-supplied definitions, a file full of declarations retrieved by dereferencing an in-query URI -- renders that query no longer self-contained. (Client tools that handle this for the user, upon user opt-in, with proper inclusion of the declaration in the final, visible SPARQL query, are entirely permissible. This would include browser-based query submission forms and/or server pre-processors that flag undeclared prefixes and interact with the user to confirm their intended meaning.) |
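The "flag undeclared prefixes" step that such a client tool or pre-processor would perform can be sketched as follows. This is a rough, regex-based illustration rather than a SPARQL parser: the function name `undeclared_prefixes` is hypothetical, the empty prefix (`:name`) is not handled, and the stripping of IRIs, strings and comments is only approximate.

```python
import re

# Approximate token patterns; a real tool would use a SPARQL parser.
IRI = re.compile(r"<[^<>\s]*>")
STRING = re.compile(r'"(?:[^"\\]|\\.)*"|\'(?:[^\'\\]|\\.)*\'')
COMMENT = re.compile(r"#[^\n]*")
# "PREFIX up:" declarations (the IRI has already been stripped at this point).
PREFIX_DECL = re.compile(r"\bPREFIX\s+([A-Za-z][\w-]*):", re.IGNORECASE)
# Prefixed names such as "up:Protein": a prefix, a colon, then a local name.
PREFIXED_NAME = re.compile(r"(?<![\w?$])([A-Za-z][\w-]*):(?=[A-Za-z0-9_])")


def undeclared_prefixes(query: str) -> set:
    """Return the prefixes that are used in the query but never declared."""
    # Strip IRIs first so that '#' inside an IRI is not taken for a comment.
    text = IRI.sub(" ", query)
    text = STRING.sub(" ", text)
    text = COMMENT.sub(" ", text)
    declared = {m.group(1) for m in PREFIX_DECL.finditer(text)}
    used = {m.group(1) for m in PREFIXED_NAME.finditer(text)}
    return used - declared


query = """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?protein WHERE { ?protein a up:Protein ; rdfs:label ?label }"""
missing = undeclared_prefixes(query)
print(missing)
```

A tool could then ask the user to confirm the intended namespace for each reported prefix and write the resulting declaration into the visible query, which keeps the final query self-contained.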
Reading prefixes from a file would be nice, because that is under the control of the developer doing the queries. But I think it would be unwise to leave the defaults to the SPARQL server or prefix.cc, because those could change if a different SPARQL server is swapped in, or if prefix popularities change, as @TallTed pointed out. |
I typically create cached versions of popular JSON-LD contexts, which helps performance quite a bit and helps to mollify the people running servers. For example, schema.org hosts a context, and it's quite a burden if the context is downloaded anew every time a JSON-LD file is processed. Of course, there's HTTP caching, but this isn't always honored. We've also considered alternative URL schemes (such as hash links) that allow data to be either accessed from multiple locations or provided out-of-band. |
Among the many standard RDF syntaxes, Turtle (and its subsets and derivatives) is the only one that requires documents to be self-contained. RDF/XML has external entities. RDFa has the initial context, which processors are permitted to obtain by resolving its URL. JSON-LD has Obviously, whenever one refers to something via URL, there is a risk that what is being returned by the URL changes in undesired ways. I believe that problem recently had its 30th birthday. URL publishers commit to a certain level of stability. Consumers evaluate the trustworthiness of such commitments. If it's not good enough, they don't need to use the URL. The use of |
@cygri - I believe all of your examples have the dependent (partial) document declaring the external (partial) document whose content is part of the "complete" document made by cobbling the partials together. The processor of a given file is not left to its own devices to just fill in whatever it likes. The current situation, which amounts to "handling of undeclared namespaces is undefined", is problematic, not least because there's no standard way to learn how those undeclared namespaces were expanded by the query servicer. I'm OK (albeit not thrilled) IFF such external input to SPARQL is to be done via URI ... perhaps with a new (This reminds me of an issue which I had thought was addressed via SPARQL errata, but upon checking just now, apparently not. 19.5 IRI References says "A prefix declared with the PREFIX keyword may not be re-declared in the same query." This is not mentioned in the basic Grammar, neither in notes, nor in the Prologue EBNF area. If enforced, it means that a SPARQL processor cannot simply prepend its predefined
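One way a processor could stay within the no-redeclaration rule quoted above is to prepend only those defaults that the query does not already declare. The following is a minimal sketch under that assumption; `prepend_missing_prefixes` and the `defaults` mapping are hypothetical names, and the regex would also match the word PREFIX inside comments or string literals, which a real implementation would have to rule out.

```python
import re

# "PREFIX name: <iri>" declarations; the prefix name may be empty.
PREFIX_DECL = re.compile(
    r"\bPREFIX\s+([A-Za-z][\w-]*)?:\s*<[^<>\s]*>", re.IGNORECASE
)


def prepend_missing_prefixes(query: str, defaults: dict) -> str:
    """Prepend default declarations, skipping prefixes the query declares."""
    declared = {m.group(1) or "" for m in PREFIX_DECL.finditer(query)}
    extra = [
        f"PREFIX {prefix}: <{iri}>"
        for prefix, iri in defaults.items()
        if prefix not in declared  # the query's own declaration wins
    ]
    return "\n".join(extra + [query])


# Hypothetical server-side defaults, for illustration only.
defaults = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "up": "http://purl.uniprot.org/core/",
}
query = (
    "PREFIX up: <http://example.org/my-up#>\n"
    "SELECT ?p WHERE { ?p a up:Protein ; rdfs:label ?l }"
)
expanded = prepend_missing_prefixes(query, defaults)
print(expanded)
```

Here the query's own `up:` declaration is kept and only `rdfs:` is added, so no prefix ends up declared twice.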
@TallTed Yes, that's how Regarding the other case, where the query does not specify the prefix mapping at all, but relies on the server to implicitly fill in the prefix mapping. You proposed that the server should include its prefix mapping in the query response, so that the client can check whether its expected mappings were used. Can you explain what advantage this has over making the prefix mapping discoverable via the SPARQL Protocol (for example by describing them in the Service Description document as @rubensworks, @kasei and @jindrichmynarz proposed)? That way, clients can not only check whether the server's prefix mapping matches expectations, but can also retrieve the mapping ahead of time to help with autocomplete etc. |
@TallTed "handling of undeclared namespaces is undefined" -- What text are you taking that meaning from? It's an error in 4.1.1.1 because the prefixed name can't be processed. |
For the record, the JSON-LD WG has considered adding metadata to context references (w3c/json-ld-syntax#108), but due to lack of time, this won't land in JSON-LD 1.1. This kind of mechanism would help, at least, to detect that something has changed, and possibly (with protocols such as Memento, RFC 7089, for example) to ensure that the original data is retrieved. It could be useful to plan ahead for the corresponding feature in SPARQL, e.g. allow for
|
if one considers a |
We should also look at the SHACL prefix logic as an inspiration. Perhaps like this:

    PREFIXES <https://sparql.uniprot.org/sparql>
    SELECT ?protein
    WHERE {
      ?protein a up:Protein
    }

This would execute

    SELECT ?prefix ?namespace
    WHERE {
      [] <http://www.w3.org/ns/shacl#prefix> ?prefix ;
         <http://www.w3.org/ns/shacl#namespace> ?namespace .
    }

at that endpoint and use the definitions inside the query. There could also be an option for PREFIXES LOCAL to make it even easier. |
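To make the proposal concrete, here is a hedged sketch of the client-side half: turning the result of the `?prefix ?namespace` query into ordinary PREFIX declarations. It assumes the endpoint answers in the SPARQL 1.1 Query Results JSON Format. The sample response is illustrative data written for this example, not output fetched from the UniProt endpoint, and both function names are hypothetical.

```python
import json

# Illustrative response (assumed data, not an actual endpoint reply).
SAMPLE_RESULTS = """
{
  "head": {"vars": ["prefix", "namespace"]},
  "results": {"bindings": [
    {"prefix": {"type": "literal", "value": "up"},
     "namespace": {"type": "literal", "value": "http://purl.uniprot.org/core/"}},
    {"prefix": {"type": "literal", "value": "rdfs"},
     "namespace": {"type": "literal", "value": "http://www.w3.org/2000/01/rdf-schema#"}}
  ]}
}
"""


def prefix_map_from_results(results_json: str) -> dict:
    """Map each ?prefix binding to its ?namespace binding."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return {b["prefix"]["value"]: b["namespace"]["value"] for b in bindings}


def as_prefix_declarations(mapping: dict) -> str:
    """Render the mapping as SPARQL PREFIX declarations, sorted by prefix."""
    return "\n".join(
        f"PREFIX {prefix}: <{ns}>" for prefix, ns in sorted(mapping.items())
    )


prefixes = prefix_map_from_results(SAMPLE_RESULTS)
declarations = as_prefix_declarations(prefixes)
print(declarations)
```

A processor implementing the proposed PREFIXES clause could substitute these declarations for the clause before evaluating the rest of the query.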
does
intend that the processor reiterate a probe to that endpoint for each query it processes? |
@lisp my first thought is that it would follow http cache headers for re-probing. |
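As a rough illustration of what following cache headers could mean for re-probing, the sketch below decides whether a cached prefix mapping is still fresh from a `Cache-Control` header value alone. It is deliberately simplified: it ignores `Expires`, `Age`, `s-maxage` and heuristic freshness, and it treats `no-cache` as "always re-probe" rather than "revalidate". The function names are hypothetical.

```python
def freshness_lifetime(cache_control: str) -> int:
    """Return the max-age in seconds, or 0 if the response is not cacheable."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        if directive.startswith("max-age="):
            value = directive[len("max-age="):]
            if value.isdigit():
                return int(value)
    return 0  # no usable directive: be conservative and re-probe


def needs_reprobe(fetched_at: float, cache_control: str, now: float) -> bool:
    """True if the cached prefix mapping is older than its freshness lifetime."""
    return now - fetched_at >= freshness_lifetime(cache_control)


# Timestamps are plain seconds, e.g. as returned by time.time().
print(needs_reprobe(1000.0, "public, max-age=3600", 2000.0))
print(needs_reprobe(1000.0, "public, max-age=3600", 5000.0))
```

With this approach the processor probes the endpoint once and reuses the mapping for subsequent queries until the lifetime announced by the server has passed.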
I think a SPARQL database (backend) should not use implicit prefixes, but SPARQL query editors should auto-insert prefixes:
Note that many databases (including GraphDB) automatically add newly encountered prefixes to their namespaces. |
Jena 5.1.0 adds an optional "prefixes" endpoint for datasets in two forms: "read", to look up a prefix or URI or to retrieve all prefixes, and "write", which allows the set of prefixes to be modified. The use case is graph browser or editor support. https://jena.apache.org/documentation/fuseki2/prefixes-service |
Why?
I am tired of writing / copy-pasting every prefix each time I need to write a query.
The prefix mechanism makes the SPARQL learning curve steeper.
Previous work
Proposed solution
Considerations for backward compatibility
Queries without prefixes would now be valid, while they are currently not. E.g.