Skip to content

Add graph node identifiers for registered claims #1149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 23, 2023

Conversation

OR13
Copy link
Contributor

@OR13 OR13 commented Jun 11, 2023

Since the v2 context contains terms specific to data integrity proofs.

And I have closed the PR attempting to remove those terms: #1091

It stands to reason we will need to add terms related to all proof formats to the v2 context, in the spirit of making things easier for developers.

@OR13
Copy link
Contributor Author

OR13 commented Jun 11, 2023

See related PR here:

@OR13
Copy link
Contributor Author

OR13 commented Jun 11, 2023

Note the data integrity spec has the same problem as did core currently, and it repeats normative definitions from did core:

https://w3c.github.io/vc-data-integrity/#verification-material

Screen Shot 2023-06-11 at 12 14 52 PM

@OR13
Copy link
Contributor Author

OR13 commented Jun 11, 2023

related PR updating the vc data model w3c/vc-data-integrity#99

@msporny
Copy link
Member

msporny commented Jun 11, 2023

It stands to reason we will need to add terms related to all proof formats to the v2 context, in the spirit of making things easier for developers.

I'm not following the logic here. When are JWT terms used inside of VCs? VC-JWT secures VCs by using an external proof. The terms you're adding to the context are never used in a JWT payload (the VC). That is, the specific values you're adding are never used /inside/ of a VC, they're always used /outside/ of a VC (in the securing format), therefore, there is no reason for them to exist inside of the VC core data model.

What am I missing? How does this make developer's lives easier?

@dlongley
Copy link
Contributor

dlongley commented Jun 12, 2023

Adding multiple @id aliases to a variety of different terms not only doesn't make sense, but I think it definitely makes things harder for developers. I also had the same questions as Manu above: I don't understand the conflation going on here even if the intent was not to define @id aliases but terms in some vocabulary. The JSON keys expressed are not expected to appear or be processed in the VC JSON-LD payload, but rather are to appear in and be processed as header information in the securing format.

In short, if the security format is external, I see no reason to see anything related to it in the context, which is for internal securing formats that travel / live within the data model itself. VC-JWT is an external proof mechanism, not an internal one.

@TallTed
Copy link
Member

TallTed commented Jun 12, 2023

I'm with @msporny and @dlongley. This doesn't map to my mental model.

@iherman
Copy link
Member

iherman commented Jun 14, 2023

The issue was discussed in a meeting on 2023-06-14

  • no resolutions were taken
View the transcript

1.4. Add graph node identifiers for registered claims (pr vc-data-model#1149)

See github pull request vc-data-model#1149.

Manu Sporny: pull 1149 has questions.
… vc data integrity.

Copy link
Member

@msporny msporny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR accidentally introduces an invalid JSON-LD Context that would corrupt any JWT properties used in a VC. It is also not clear why a VC would need to include these properties since a JWT is an external proof format (thus the properties never appear in a VC).

Requesting clarification on what this PR is attempting to do.

Copy link
Member

@msporny msporny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use case for defining these terms in the core context? Do you have an example? We might want to put that example in the core specification. I'm asking for the same sorts of demonstrations of specification and usage that were requested for renderMethod and confidenceMethod.

The thing that is still confusing me about this PR is that these properties don't show up inside VCs (AFAIK). They're properties used in an external proof. An example or two from a use case would be helpful.

I'll also note that the fragment identifiers do not resolve. When are the fragment identifiers going to be fixed in the IANA registry so that the fragments resolve?

For the avoidance of doubt, if we have a legitimate use case that the VCWG agrees to that uses these properties, and as long as we figure out what core spec text is going to exist to legitimize the inclusion of these properties, I won't object to the inclusion of these properties.

@OR13
Copy link
Contributor Author

OR13 commented Jun 23, 2023

@msporny same use case as defining proof and DataIntegrityProof in the normative v2 context.

@brentzundel
Copy link
Member

Orie is asserting that including these properties will make life easier for developers of VC-JWT in the same way that the inclusion of DataIntegrityProof makes life easier for Data Integrity developers. I am trusting his assertion about VC-JWT just as I am trusting others' assertions about DataIntegrityProof.

Those opposed to including DataIntegrityProof expressed a lack of understanding about precisely how a developer's life is made easier by including it.
This lack of understanding was not sufficient to remove DataIntegrityProof, why should similar lack of understanding around VC-JWT terms be sufficient to prevent them from being added?

I also want to say that the bar for inclusion in the @context hasn't been "core spec text . . . to legitimize the inclusion of these properties" (unless there is core spec text about Data Integrity that I'm unaware of), so it seems inappropriate to make that a requirement here.

Is there a harm that merging this PR would cause that I'm not seeing? Would merging this PR prevent you from doing what you need with the @context?

tl;dr I don't have deep opinions about the @context file, but I don't understand the pushback here, it seems inconsistent.

@msporny
Copy link
Member

msporny commented Jun 24, 2023

@OR13 wrote:

same use case as defining proof and DataIntegrityProof in the normative v2 context.

Then that would mean that these properties are going to be used inside of VCs. Can we see an example of what that would look like?

tl;dr I don't understand the pushback here, it seems inconsistent.

For the proof and DataIntegrityProof entries in the core VCDM context, we had to demonstrate that developers would benefit. The argument to date has been "We want to minimize the JSON-LD Contexts used to ease the developer experience." That is what led to the default vocabulary decision. Given that we have demonstration of over 20 implementers utilizing proof, and given that all the Data Integrity cryptosuites being moved through the VCWG use the DataIntegrityProof type now, we could reduce the number of required contexts when implementers used proof down from two contexts to one context. IOW, the main way you use data integrity is this:

// This is a VC that is secured using a data integrity proof
{
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://www.w3.org/ns/credentials/examples/v2"
  ],
  "id": "http://example.gov/credentials/3732",
  "type": ["VerifiableCredential", "ExampleDegreeCredential"],
  "issuer": "https://university.example",
  "validFrom": "2010-01-01T19:23:24Z",
  "credentialSubject": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "degree": {
      "type": "ExampleBachelorDegree",
      "name": "Bachelor of Science and Arts"
    }
  },
  "proof": {
    "type": "DataIntegrityProof",
    "cryptosuite": "ecdsa-2019",
    "created": "2022-11-13T18:19:39Z",
    "verificationMethod": "https://university.example/issuers/14#key-1",
    "proofPurpose": "assertionMethod",
    "proofValue": "z58DAdFfa9SkqZMV...fFPP2oumHKtz"
  }
}

If we didn't have proof and DataIntegrityProof in the core context, you'd have to add another JSON-LD Context to make it work, which people had complained about in the early days of the VCWG, which is what led to the inclusion in the VCDM core context.

This PR adds things like iss, sub, jku, and kid to the core context, but those values are not used in VCs. For example, I don't know of any use case that looks like this at present:

{
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://www.w3.org/ns/credentials/examples/v2"
  ],
  "kid": "https://university.example/issuers/14#key-1", // JOSE property
  "iss": "https://university.example", // JOSE property
  "sub": "did:example:ebfeb1f712ebc6f1c276e12ec21", // JOSE property
  "jku": "https://university.example/issuers/14.json", // JOSE property
  // rest of VC is below
  "id": "http://example.gov/credentials/3732",
  "type": ["VerifiableCredential", "ExampleDegreeCredential"],
  "validFrom": "2010-01-01T19:23:24Z",
  "credentialSubject": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "degree": {
      "type": "ExampleBachelorDegree",
      "name": "Bachelor of Science and Arts"
    }
  }
}

That doesn't mean a use case might not exist, which is why there is a request to understand what use case this PR addresses where those values would be expressed in embedded form. We had to provide examples for proof and DataIntegrityProof. Examples have not been provided for iss/sub/jku/kid as embedded values yet.

@dlongley
Copy link
Contributor

@brentzundel,

I don't have any objections to this based on the grounds that others used to argue a while back against adding various things to the context. I said those grounds didn't make sense back then -- and I continue to think that. And, as I said back then, adding things to the context that make things easier for developers is generally a good idea, and if this PR were doing that as well, I'd be fine with it.

My concern here is that this is making things more confusing for developers. We had successfully established a clean separation between JWT claims expressed in the external securing mechanism (VC-JWT) and its payload (application/vc+ld+json). This seems to co-mingle those things again.

It also seems to potentially bring back the confusion from JWT usage with VCDM 1.1: e.g., are we expecting people to use iss "instead of" or "in addition to" issuer? Clearly they must include issuer to be VCDM 2.0 compliant, so only "in addition to" would be valid. Did we want to bring that back? And will we see compliance in the wild with that approach or will some people use "instead of" again? Will that create additional standardization pressures? Won't using both fields encourage more confusion? What if the values don't agree in the duplicated fields? Wasn't it a victory to have eliminated all these possibilities?

This seems to reintroduce problems that were railed against in the 1.1 work. In fact, it adds another dimension to the possible (erroneous / non-compliant) ways to secure VCs using JWTs, with implementers potentially being confused over whether the payload will, for example, have only iss, have iss and issuer, whether those fields will agree, whether the payload will have vc (VCDM 1.1 JWT style) or whether iss will be co-mingled directly with the payload vs. using vc -- and so on.

All of that being said, I'm not going to formally object to adding these as scoped properties to a VC via the JSON-LD context -- I just think it's a really bad idea that's asking for trouble. This is not an academic or ideological purity argument in the slightest, but a practical one. I've said similar things in this WG when trying to help people who want to use JWTs to secure the data model in a number of places. Sometimes my advice is used, other times it is dismissed. IMO, it's too infrequently responded to in collaborative conversation.

I get the awful impression that a number of PR discussions are mentally processed like this: "You said some stuff about my PR, but are you going to block it or not?". I suspect that's not the best way to design technology standards. Neither is going around blocking people's efforts, of course, but speaking for myself, I don't go around "just trying to block things", but rather I seek to improve our work here generally. Anyway, this PR, to me, sounds like a bad idea and that it just reintroduces a number of things that were fixed during the 2.0 work. I'd love to hear how I'm wrong about that and why. I'm happy to be wrong if it means things get better.

So, to be clear, the arguments against this PR are not the same arguments that were used elsewhere -- and this PR doesn't do the same things that those other PRs did either. This is a wholly different situation in both reasoning for / against and effect. Regardless, if my advice is ignored that this is a bad idea and no arguments are put forth to help convince me why it's not a bad idea, I continue to think VC-JWT users will suffer from it.

My only other concern would be that people will make more mistakes with this in the wild after 2.0 is released -- and then a future 3.0 WG will have to repeat the same long discussions we had during 2.0 and a lot of progress in other areas will be sacrificed again because of it. I'd rather see us embrace the clean approach and avoid repeating the 1.1 mistakes.

@OR13
Copy link
Contributor Author

OR13 commented Jun 26, 2023

@msporny @dlongley since the VC Data Model is open world, if we don't reserve these words like this, we will be creating harmful security issues... nothing is more confusing than using reserved words, but given them different meaning.

@dlongley
Copy link
Contributor

dlongley commented Jun 26, 2023

@OR13,

@msporny @dlongley since the VC Data Model is open world, if we don't reserve these words like this, we will be creating harmful security issues... nothing is more confusing than using reserved words, but given them different meaning.

Is that the reason for this PR -- reserving terms to avoid confusion? If this PR is about reserving terms, can you make it say so in the title or in your remarks in the opening post? I had no idea.

This PR also only makes changes to the context -- I expect we'll want to say more in the spec. We may want to tell developers specifically NOT to ever use these terms in VCs at all and point them at VC-JWT for how to cleanly separate things. This "reservation" is a bit different from how we've been doing the other term reservations because we intend people to use and experiment with the others, but here we want to recommend they never be used, IMO. Please clarify if that's the goal here, because I suspect it changes how people will think about this PR in significant ways.

EDIT: Maybe we should map these terms to a general "do not use" vocabulary URL.

@iherman
Copy link
Member

iherman commented Jul 15, 2023

As you said above, our process is sorta guiding our architecture here... But I don't see how to fix that, without W3C mirroring IANA...

I think that "mirroring IANA" is a little bit of a strong terminology here, @OR13. IANA is oblivious to the usage of these ID-s in a Linked Data setting, and what we would do is to define an RDF related ID for those terms. We would not define the semantics, we would make a bridge between IANA and Linked Data.

This is what GS1 did for schema.org, see https://www.gs1.org/voc/Organization which points to https://schema.org/Organization

@OR13, I do not know why GS1 did that, but the comparison is wrong. https://schema.org/Organization is a perfectly valid URL that does dereference to a page of its own. I.e., a processing engine (human or digital) can "follow its nose", i.e., dereference a URL to find out more of the term. This isn't the case if the fragid is wrong.

Let's not buy trouble from the future. Let's just wait for them to answer the question and go from there. There are a variety of ways through this maze... let's try to do the right thing first, and then when it looks like that's not going to work out, formulate a plan B (based on what we learn by asking the question of JOSE / IANA in the first place).

+1 there. Let us ask IANA asap, and see if we are, in fact, chasing ghosts here because they would add those fragid in a blink of an eye.

IOW, I'd be surprised if they refuse to add fragment identifiers to that IANA registry.

Hopefully you are right.


(Just for the fun of it, a crazy and probably nonsensical idea would be to make use of the fact that the target page is XHTML: maybe an XPath wizard could come up with an xpointer URL of the form https://www.iana.org/assignments/jose#xpath(…). Alas! Xpointer and all the other XML tools have been buried and forgotten, so this would not have any practical sense... But it may be a nice thought experiment!😀)

@OR13
Copy link
Contributor Author

OR13 commented Jul 15, 2023

I am interested in this part:

a processing engine (human or digital) can "follow its nose", i.e., dereference a URL to find out more of the term. This isn't the case if the fragid is wrong.

Why is it not the case? Because a browser does not scroll to the fragment?

Isn't that making the web browser the "processing engine" ?

Isn't processing the fragment "media type specific" ?

For example, if I search the page for the text iss using a browser, I immediately find the entry I am looking for.

How is this not good enough for the "follow your nose" criteria, and what actual W3C process asserts this requirement?

@OR13
Copy link
Contributor Author

OR13 commented Jul 15, 2023

@brentzundel @Sakurann A few proposals for the next opportunity:

PROPOSAL: The normative v2 context will not define any terms related to vc-jose-cose.

PROPOSAL: URLs to the IANA JOSE registry with fragments will be used to identify the terms related to vc-jose-cose in the normative v2 context.

PROPOSAL: URLs to the W3C Credentials Vocabulary will be used to identify the terms related to vc-jose-cose in the normative v2 context.

@brentzundel
Copy link
Member

@TallTed It looks like your requests for changes were all addressed, can you confirm?

@msporny msporny requested a review from TallTed July 19, 2023 18:30
OR13 and others added 2 commits July 19, 2023 20:54
@@ -5,6 +5,26 @@

"id": "@id",
"type": "@type",
"kid": {
Copy link

@alenhorvat alenhorvat Jul 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OR13 kid always appears in the header.
This means that the context should be used outside of the payload where it appears?

Same comment for all JWS claims. (jku, x*, ...)

Copy link
Contributor Author

@OR13 OR13 Jul 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@context is also allowed in JOSE headers, when its present, should the terms have definitions like this or not?

The core data model states a "VerifiableCredential" has "proof" and "metadata"... headers and signatures seem like proof and metadata to me... also... kid appears in payloads... see:

https://datatracker.ietf.org/doc/html/draft-ietf-oauth-selective-disclosure-jwt-05#appendix-A.3

https://www.rfc-editor.org/rfc/rfc7800.html#section-3.4

     {
      "iss": "https://server.example.com",
      "sub": "17760704",
      "aud": "https://client.example.org",
      "exp": 1440804813,
      "cnf":{
        "jku": "https://keys.example.net/pop-keys.json",
        "kid": "2015-08-28"
       }
     }

Omitting kid from this PR would cause cnf "RDF graph" to be built improperly for key bound SD-JWT Verifiable Credentials that are trying to leverage JSON-LD.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Fine. Let's continue under 1175 -- it's just a matter of keeping everything in 1 file or splitting it per securing mechanism.

Note that SD-JWT is defining only _sd, _sd_alg, and is not requiring cnf or any jwt claims; hence these should be defined in a separate document. (which was not the case in the previous version of sd-jwt specs)

wrt context in the jose headers -- let's discuss this in the context of JAdES that introduced custom header claims

@iherman
Copy link
Member

iherman commented Jul 20, 2023

The issue was discussed in a meeting on 2023-07-19

  • no resolutions were taken
View the transcript

1.1. Add graph node identifiers for registered claims (pr vc-data-model#1149)

See github pull request vc-data-model#1149.

Manu Sporny: 1149 is waiting on TallTed to provide a review.
… The PR maps certain terms to a registry.

@alenhorvat
Copy link

Since the v2 context contains terms specific to data integrity proofs.

And I have closed the PR attempting to remove those terms: #1091

It stands to reason we will need to add terms related to all proof formats to the v2 context, in the spirit of making things easier for developers.

This is an important design decision. IMO the core vocabulary should not define securing-related vocabulary (except for the top level proof property).

Why?
There are 3 signature "topologies"

  • detached (e.g., compact serialised JWS)
  • enveloping (JSON serialised JWS)
  • enveloped (Data Integrity Proofs)

JWS (not JWT) and DIP are not adding any additional claims to the payload itself. DIP is defining and managing the claims inside of the proof property - that's a property of enveloped signatures.

Proposal is to keep the securing independent of the content to decouple the two. It is a design decision.
Vocabulary also contains some other default claims, such as statuslist2021, ...

One option would be to define a minimal vocabulary for the core claim set and a separate vocabulary for a "profile" that would contain definitions for that profile (JsonSchema2023, VerifiableCredentialSchema2023, StatusList2021Credential, StatusList2021, StatusList2021Entry, DataIntegrityProof)

Whether or not this is a viable proposal depends on the purpose and design decision of the core vocabulary.

This is partially reopening the #1091 discussion (partially)

If the decision of the WG is to keep the data integrity vocabulary in the VCDM specs, it would only be appropriate for the enveloped signatures, since for the detached and enveloped the securing-related claims reside outside of the payload.

@OR13
Copy link
Contributor Author

OR13 commented Jul 20, 2023

This is an important design decision. IMO the core vocabulary should not define securing-related vocabulary (except for the top level proof property).

I agree with you, but the working group could not establish consensus to split them up, and has defaulted to bundling them all together (as we did in v1 and v1.1).

I guess the question remains, do we reopen that debate and try to split things up, especially in light of:

#1175

Which was opened after I capitulated to the group regarding this design choice last time?

@OR13
Copy link
Contributor Author

OR13 commented Jul 20, 2023

@davidlehn I believe your suggestions have been addressed, can you re-review?

@iherman Are you objecting to merging this PR, can you address my question on #1149 (comment)

@alenhorvat Lets move the debate you are reopening to #1175 , are you willing to approve the PR while we continue to discuss how we might factor these terms into separate contexts in the future?

@alenhorvat
Copy link

@alenhorvat Lets move the debate you are reopening to #1175 , are you willing to approve the PR while we continue to discuss how we might factor these terms into separate contexts in the future?

Yes, we can continue the discussion under #1175. As said, it is a design decision and my purpose is not to block any PR.

@msporny
Copy link
Member

msporny commented Jul 23, 2023

Normative, multiple reviews, changes requested and made, no objections, merging.

@msporny msporny merged commit c1f8a63 into main Jul 23, 2023
@msporny msporny deleted the update-identifier-aliases branch July 23, 2023 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.