Use schema.org as the base schema? #194

Closed
msporny opened this issue Jul 8, 2018 · 12 comments

@msporny
Member

msporny commented Jul 8, 2018

We could use: https://w3id.org/credentials/v1 as the base schema, or we could use https://schema.org/ as the base schema.

Doing the former enables us to strictly control the base vocabulary.

Doing the latter enables people to express just about everything in schema.org AND digitally sign it (which could turn into a pretty big use case). The downside of the latter approach is that things might be introduced to schema.org that stomp on our vocabulary.

There is a hybrid approach where we still do https://w3id.org/credentials/v1 and put those terms into schema.org in due course. The problem there being that pure JSON implementations might have to check for both https://w3id.org/credentials/v1 and https://schema.org/ and wouldn't be able to reference a static version of schema.org. Another issue may be the overhead when processing the thousands of terms in the schema.org context.

Thoughts?

@David-Chadwick
Contributor

Schema.org is perhaps attempting to boil the ocean. The vast majority of its definitions will probably never be used in VCs, whereas the definitions in https://w3id.org/credentials/v1 will be used in all VCs. I envisage that different communities of users will define their own properties in their own namespaces, but may reference particular elements in schema.org, such as Person or Person properties like address. As long as VCs can point to the URIs defining the types or properties that they contain, we do not need to reference schema.org at the top level.

I am presuming that a VC type could be https://schema.org/Person which is only importing the properties of person, and not all of schema.org. If https://schema.org/Person is itself too broad, since it contains a lot of properties, then I am presuming that a VC could simply reference https://schema.org/address if only that property is needed. Is this correct?

@msporny
Member Author

msporny commented Jul 9, 2018

> Schema.org is perhaps attempting to boil the ocean.

See schemaorg/schemaorg#1654

/cc @danbri @rvguha @philbarker

> I am presuming that a VC could simply reference https://schema.org/address if only that property is needed. Is this correct?

Yes, it could, but that's not what I'm suggesting. I'm suggesting that we could do a number of things (in increasing order of uneasiness):

  1. Use schema.org as the "vocabulary" for Verifiable Credentials. So, VerifiableCredential is a schema.org term (i.e. http://schema.org/VerifiableCredential instead of https://w3id.org/credentials#VerifiableCredential).
  2. If we do #1 above, we could then re-use those terms in: https://w3id.org/credentials/v1
  3. Or, we could forego https://w3id.org/credentials/v1 entirely and make http://schema.org the base context for the entire specification.

The benefit being that schema.org would get the ability to digitally sign all things that schema.org can express once the Verifiable Credentials spec goes to REC. So, not only can people/orgs make statements about Person and Organization, but who/what said it and when they said it would be cryptographically secured and verifiable.

The difficult parts are:

  1. Getting the schema.org community to agree to add all the necessary terms required to express Verifiable Credentials and digital signatures.
  2. Doing a security review wrt. http://schema.org/ being the context used... it enables attacks against the digital signatures when things aren't loaded over HTTPS.
  3. Ensuring that terms do not change in schema.org, as doing so would break the digital signatures over the long term.
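
The second difficulty above (contexts not loaded over HTTPS) can be illustrated with a small pre-check. This is only a sketch under assumptions: the helper name `insecure_contexts` is hypothetical, and a real verifier would need far more than a scheme check (for example, pinning or caching the context document).

```python
from urllib.parse import urlparse


def insecure_contexts(credential):
    """Return the @context URLs in a credential that do not use HTTPS.

    Hypothetical helper for illustration; inline (object) contexts are ignored.
    """
    contexts = credential.get("@context", [])
    if isinstance(contexts, str):
        contexts = [contexts]
    return [
        c for c in contexts
        if isinstance(c, str) and urlparse(c).scheme != "https"
    ]


credential = {
    "@context": ["https://w3id.org/credentials/v1", "http://schema.org/"],
}
print(insecure_contexts(credential))  # ['http://schema.org/']
```

A verifier following this approach might refuse to process any credential for which the returned list is non-empty.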

So the question isn't "Can we use schema.org terms in the Verifiable Credentials JSON-LD Context?"... because that's easy to do. It's "Should we not have a Verifiable Credentials JSON-LD Context at all, and just use schema.org instead?"

@David-Chadwick
Contributor

Unless I have misunderstood what you want to do, I think it is a very bad idea to have a context of http://schema.org, as this will mandate that every VC verifier must know and understand every term in schema.org. It will be virtually impossible to control which VCs to legitimately accept in this case, if they can contain anything from schema.org.
My eco-system viewpoint is that verifiers will trust certain issuers to issue specific types of VCs. So the VC types that will be accepted (implying contexts due to the URI abbreviation issue) will be strictly limited, as will their properties. A context of schema.org blows a gaping big hole in this.

@stonematt
Contributor

@David-Chadwick This comment raises a bit of a red flag for me.

While it's true that verifiers will have to exercise judgement as to which VCs they'll recognize and the issuers they'll trust, why wouldn't we leverage existing and accepted name spaces and ontologies to describe content in claims? Schema.org seems like a resource that we should leverage and rely on extensively.

Maybe I'm reading too much into your comment. Please explain.

@BigBlueHat
Member

> It will be virtually impossible to control which VCs to legitimately accept in this case, if they can contain anything from schema.org.

@David-Chadwick the @context sent with a VC isn't the tool a VC verifier should use for validation. Its job is to shape the underlying identifiers and values in the output graph that is encoded in the JSON-LD. Once you've got that graph, you then care about what was (or was not) stated in the claims (et al).

> My eco-system viewpoint is that verifiers will trust certain issuers to issue specific types of VCs. So the VC types that will be accepted (implying contexts due to the URI abbreviation issue) will be strictly limited, as will their properties. A context of schema.org blows a gaping big hole in this.

If your use case requires only specific terms to be used, then you should check for just those terms--regardless of the @context value you were given. Just because the issuer said that all the terms in the document are from schema.org doesn't mean you can't check for just the ones you care about.
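
As a rough illustration of checking only the terms you care about, the sketch below filters a claim by property IRI after expansion. It assumes the claim has already been expanded so that its keys are full IRIs, and it simplifies the real JSON-LD expanded form to a flat dictionary. The allowlist `ACCEPTED_PROPERTIES` and the function name are hypothetical.

```python
# Hypothetical allowlist: the only property this verifier is willing to accept.
ACCEPTED_PROPERTIES = {"http://schema.org/birthDate"}


def unexpected_properties(expanded_claim, accepted=ACCEPTED_PROPERTIES):
    """Return the property IRIs in an expanded claim that are not accepted.

    Keys starting with "@" (such as "@id") are JSON-LD keywords, not properties.
    """
    return sorted(
        key for key in expanded_claim
        if not key.startswith("@") and key not in accepted
    )


claim = {
    "@id": "did:example:abcdef1234567",
    "http://schema.org/birthDate": "1990-01-01",
    "https://example.com/vocab#favoriteFood": "Papaya",
}
print(unexpected_properties(claim))  # ['https://example.com/vocab#favoriteFood']
```

The check is independent of which @context the issuer supplied: the verifier decides on the expanded identifiers, not on the context URL.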

It seems you're wanting more from the @context than it's meant to provide.

@David-Chadwick
Contributor

Maybe an example of @context using schema.org when the VC is extremely limited in what it will contain (e.g. age property only) would help to clarify the difference between defining terms and restricting which terms can appear in a VC.

Looking at example 8 on extensibility, I have two questions about this example:

  1. Why is the extended type not defined in the type property? How is the verifier to know that the VC has been extended when the type has not been?
  2. Why is an indirection necessary to https://example.com/contexts/mycontext.jsonld in order to find the new extensions? Why can't the extended context be inserted directly into the VC (along with the next extended type) as follows:
{
  "@context": [
    "https://w3id.org/credentials/v1",
    {
      "referenceNumber": "https://example.com/vocab#referenceNumber",
      "favoriteFood": "https://example.com/vocab#favoriteFood"
    }
  ],
  "id": "http://example.com/credentials/4643",
  "type": ["VerifiableCredential", "myCorporateExtensions"],
  "issuer": "https://example.com/issuers/14",
  "issuanceDate": "2018-02-24T05:28:04Z",
  "referenceNumber": 83294847,
  "claim": {
    "id": "did:example:abcdef1234567",
    "name": "Jane Doe",
    "favoriteFood": "Papaya"
  },
  "proof": { ... }
}
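
To make the difference between a referenced and an inline context concrete, here is a minimal sketch. It assumes that inline extensions appear as a JSON object inside the @context array and that each term maps directly to an IRI string; real JSON-LD term definitions can be richer. The function names are hypothetical.

```python
def inline_terms(contexts):
    """Collect term -> IRI mappings from inline (object) entries of @context."""
    terms = {}
    for entry in contexts:
        if isinstance(entry, dict):
            terms.update(entry)
    return terms


def contexts_to_fetch(contexts):
    """Return the entries that are URLs and must be resolved some other way."""
    return [entry for entry in contexts if isinstance(entry, str)]


context = [
    "https://w3id.org/credentials/v1",
    {
        "referenceNumber": "https://example.com/vocab#referenceNumber",
        "favoriteFood": "https://example.com/vocab#favoriteFood",
    },
]
print(inline_terms(context)["favoriteFood"])  # https://example.com/vocab#favoriteFood
print(contexts_to_fetch(context))  # ['https://w3id.org/credentials/v1']
```

With the inline form, the extension terms are readable from the credential itself; only the base context still has to be looked up.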

@danbri

danbri commented Jul 9, 2018

Perhaps you might consider capturing a checksum of the state of the context (or even vocabulary, although that is complicated for large vocabularies) at the point at which the claim is made or recorded or examined.
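
One way to picture the checksum idea is the sketch below. It assumes the context document is available as parsed JSON and uses sorted-key serialization as a stand-in for a proper canonicalization scheme, which the thread does not specify. The function name and the sample context contents are hypothetical.

```python
import hashlib
import json


def context_checksum(context_doc):
    """Return a SHA-256 hex digest of a context document.

    Keys are sorted and whitespace is removed so that key order and
    formatting do not affect the digest. This is an illustrative stand-in,
    not a standardized canonicalization.
    """
    canonical = json.dumps(context_doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical context snapshot taken when the claim was made.
snapshot = {"@context": {"name": "http://schema.org/name"}}
recorded = context_checksum(snapshot)

# Later, the term is redefined; the digest no longer matches.
later = {"@context": {"name": "http://schema.org/alternateName"}}
print(context_checksum(later) == recorded)  # False
```

A verifier could compare the recorded digest against the context it retrieves and treat a mismatch as a signal that term definitions have changed since issuance.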

@msporny
Member Author

msporny commented Sep 4, 2018

ACTION: @msporny to respond to @danbri and @David-Chadwick, no op, and suggested path forward, then close issue.

@msporny
Member Author

msporny commented Sep 10, 2018

Hey @danbri, the VCWG is starting to wrap up its work before CR and this issue hasn't moved forward in a long time. Pulling in a huge JSON-LD Context file also raises a variety of security concerns that the group is not going to have time to resolve before going into CR. That means that Verifiable Credentials will have their own JSON-LD context and it may or may not include terms in schema.org.

That said, there is nothing limiting the use of schema.org for the Verifiable Credentials themselves (and we expect a number of schema.org terms to be re-used - e.g., address, name, title, etc).

I'm going to close this issue, but please re-open it if you object or see a way that the group could make progress at this point.

@msporny
Member Author

msporny commented Sep 10, 2018

@David-Chadwick wrote:

> Why is the extended type not defined in the type property? How is the verifier to know that the VC has been extended when the type has not been?

The general answer is that a verifier can know by 1) seeing if there is a property that it doesn't understand as a part of the credential, or 2) seeing that there is a property that it understands as being an extension of the base set of attributes it expects, or 3) based on what you argued for in the group, it can check to see that the type used is an extended type (or a subtype, etc.)

> Why is an indirection necessary to https://example.com/contexts/mycontext.jsonld in order to find the new extensions? Why can't the extended context be inserted directly into the VC (along with the next extended type)?

The general answer is that the indirection isn't necessary, you can always include extensions inline.

The more specific answer is that if we keep the list of @contexts to a well-known, rigidly ordered list (enforced by JSON Schema), and the properties expressed in a well-known, rigid tree structure (enforced by JSON Schema), then receiving parties of the data may not need to do any JSON-LD processing at all, even in the case of extensions. This part is a bit experimental and we're trying to understand if there are any corner cases... we haven't found any yet. You would lose this ability by inlining extensions. Many people who are already processing as JSON-LD won't care about that, but the people who want to avoid JSON-LD processing do.
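
The rigid-ordering idea can be sketched as follows. The comment above has JSON Schema doing the enforcement; the plain Python comparison here is a simplified stand-in for that, and the expected list and function name are hypothetical.

```python
# Hypothetical: the exact, ordered context list this receiver has agreed to.
EXPECTED_CONTEXTS = [
    "https://w3id.org/credentials/v1",
    "https://example.com/contexts/mycontext.jsonld",
]


def can_process_as_plain_json(credential, expected=EXPECTED_CONTEXTS):
    """True if @context is exactly the expected ordered list of URLs.

    When it matches, every term has a meaning fixed in advance, so the
    receiver can read properties by name without JSON-LD processing.
    """
    return credential.get("@context") == expected


by_reference = {"@context": list(EXPECTED_CONTEXTS), "referenceNumber": 83294847}
inlined = {
    "@context": [
        "https://w3id.org/credentials/v1",
        {"referenceNumber": "https://example.com/vocab#referenceNumber"},
    ],
    "referenceNumber": 83294847,
}
print(can_process_as_plain_json(by_reference))  # True
print(can_process_as_plain_json(inlined))       # False
```

The inlined credential fails the check even though it defines the same term, which is the trade-off described above: inline extensions require the receiver to interpret the context itself.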

@msporny
Member Author

msporny commented Sep 10, 2018

The discussion of the specific URL used by the VC context is happening in issue #206.

@msporny
Member Author

msporny commented Sep 10, 2018

Closing the issue. Commenters should feel free to reopen it if they believe we can make progress resulting in a concrete proposal and PR pulled into the spec by W3C TPAC 2018.

7 participants