Make the vocabulary URLs and values normative #1159

Merged
msporny merged 4 commits into main from msporny-vocab-normative
Jun 30, 2023

Conversation

Member

@msporny msporny commented Jun 19, 2023

This PR is an attempt to address issue #1103 by making the vocabulary URLs and content at those URLs normative. I will note that this PR is experimental, as it's not clear if it addresses all of the concerns raised in #1103.

One thing the PR does not do is refer to each vocabulary by hash (like we do for context files). It doesn't do this because doing so with vocabularies like schema.org is not possible (because the vocabulary is updated on a regular basis and thus the hashes would change on a regular basis). The same is true for the WG's vocabulary files (over time). If we were to refer to them by hash, then the next iteration of the vocabulary would break previous releases. The same is true if we included the vocabulary, verbatim, in the core specification.
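To illustrate the concern above, here is a minimal Python sketch (not part of the PR) of why pinning a vocabulary by hash breaks as soon as a term is added. The two vocabulary snapshots and the term `newTerm` are hypothetical and heavily abbreviated, and hashing a sorted JSON serialization is an assumption made for the sketch; a real reference would more likely hash the published file bytes.

```python
import hashlib
import json


def digest(doc: dict) -> str:
    """SHA-256 hex digest over a deterministic JSON serialization."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical, abbreviated vocabulary snapshots (not the real files).
vocab_v2 = {"terms": ["evidence", "issuer"]}
vocab_v3 = {"terms": vocab_v2["terms"] + ["newTerm"]}

# The digest a v2.0 specification would pin.
pinned = digest(vocab_v2)

print(digest(vocab_v2) == pinned)  # True: unchanged content still matches
print(digest(vocab_v3) == pinned)  # False: adding one term changes the hash
```

Any consumer that enforces the pinned value would reject the updated vocabulary, even though no existing term changed meaning.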

So, this PR attempts to make it not matter what the URLs used in the context file are -- the WG specifies exactly which vocabulary each URL included in the context should refer to such that the meaning of the terms (both human readable and machine readable) is under the control of the WG (or delegated to an entity that the WG feels is adequate).


Member

iherman commented Jun 19, 2023

Several comments

  • To your question:

    I am not sure this PR fully addresses Vocabulary normative, context isn't? #1103.

    In Vocabulary normative, context isn't? #1103 (comment) I asked the question:

    The question that needs an answer is: where is it normatively defined that the official URL of the evidence property (as defined in the VCDM spec) is https://w3.org/2018/credentials#evidence?

    and I do not believe this PR answers this question. I.e., no, it does not address Vocabulary normative, context isn't? #1103 in my view. What is fixed normatively are the references to the vocabularies and not the content. In my view (I know, I am a broken record!) the VCDM document should include the content of https://www.w3.org/2018/credentials/index.jsonld in a normative appendix (but I am fine to keep this table in some way or other).

  • I believe that, in the table you added in B2, we must differentiate between those vocabularies that we can (and we should) refer to normatively and with a hash, and those that we cannot. The basis of the decision is who controls that vocabulary. The credential and the security vocabularies are in the former category, schema.org in the latter. (Future versions of this or the security vocabulary may actually end up referring to other external vocabularies, like, say, Dublin Core or the provenance vocabulary…). This does create a certain level of uncertainty, I realize that, but I do not think we have a choice.

  • Editorially, I think it is a bit of overkill to have a table with separate lines for the HTML and the JSON-LD versions of a vocabulary. This makes the table difficult to digest. I think that it is perfectly fine to refer to the JSON-LD versions of each of those in the "normative" sense (i.e., with a hash), and just mention somehow that a human-readable version of the vocabulary is also available (the fact that the HTML version contains the same vocabulary in RDFa and that there is also a Turtle version is just the cherry on the birthday cake for Linked Data aficionados).

Member Author

msporny commented Jun 19, 2023

@iherman wrote:

refer to normatively and with a hash

If we refer to the vocabulary normatively, with a hash, then what happens to the normative statement when we (inevitably) add a new term to the vocabulary in v3.0? Does that not break the hash and make the entire v2.0 ecosystem non-conformant?

Member

iherman commented Jun 19, 2023

@iherman wrote:

refer to normatively and with a hash

If we refer to the vocabulary normatively, with a hash, then what happens to the normative statement when we (inevitably) add a new term to the vocabulary in v3.0? Does that not break the hash and make the entire v2.0 ecosystem non-conformant?

Good point; as long as we have the normative vocabulary in the spec for V3.0 as well (and we are careful not to remove a term) we may be fine without a hash. Forget my remark.
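The "careful not to remove a term" condition can be checked mechanically. A minimal Python sketch, assuming the term names of each vocabulary version have already been extracted into sets; the term lists and `someNewTerm` are hypothetical and not taken from the actual vocabulary files:

```python
def removed_terms(old_terms: set[str], new_terms: set[str]) -> set[str]:
    """Return the terms present in the old vocabulary but missing from the new one."""
    return old_terms - new_terms


# Hypothetical term sets for two vocabulary versions.
v2_terms = {"evidence", "issuer", "credentialSubject"}
v3_terms = v2_terms | {"someNewTerm"}

# Adding terms is backwards compatible: nothing was removed.
print(removed_terms(v2_terms, v3_terms))  # set()

# Dropping a term would be flagged.
print(removed_terms(v2_terms, v3_terms - {"evidence"}))  # {'evidence'}
```

An empty result means every v2.0 term still exists in the later vocabulary, which is the property this comment relies on instead of a hash.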

Member

@TallTed TallTed left a comment

lgtm

text/html
</td>
<td>
https://www.w3.org/2018/credentials/index.html
Contributor

Is it expected for the section 3. Term definitions to be empty in this doc? See image below.
image

Member

No, it is not... There seems to be a bug in the generation process.

I will look into it. That being said (1) this should not hold up this PR if it is voted to be merged and (2) at some point we agreed that the vocabulary definitions should happen when things get to an equilibrium point; it is impossible to keep the vocabulary in sync with so many things pending...

Thanks for notifying this.

Member

@msporny I have contacted you in a separate mail on this possible bug. There is some mystery surrounding the github.io behavior that leads to the disappearance of a full section content...

Member Author

There is probably some race condition when it comes to publishing the file, or the build process isn't getting executed before publishing for gh-pages, or some other strange thing like that. Usually when something like this happens, I look in the gh-pages build logs to see if I can see what's going wrong... or I look for something that could be a race condition.

Member

Can I trust you to look at that? I am pretty much lost when it comes to this GitHub magic...

</tr>
<tr>
<td>
https://w3id.org/security#
Contributor

Are we sure we want to make this one normative like this? or should we move this to a w3c origin before taking a normative dependency on it?

Member Author

This PR makes it such that it doesn't matter what URL the group decides to make normative, because it explicitly states what vocabulary is the normative one (the W3C published one). This means that the W3C VCWG is effectively taking over how the URL resolves and stating that the only correct resolution is to the W3C Vocabulary (or, whatever vocabulary the W3C VCWG deems is the one it wants to point to, like schema.org).

The benefit here is that we don't have to go around renaming all of the URLs in the ecosystem. What some in the VCWG wanted was to have control over the vocabulary URL and content, this PR does that without introducing any backwards incompatible changes.
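A minimal Python sketch of the idea described here, for illustration only: the specification fixes which document each vocabulary URL refers to, so a consumer looks up the designated document rather than trusting whatever the URL happens to serve. The table name, the function, and the exact pairing of base URLs to documents are assumptions; the URLs themselves are ones quoted in this thread.

```python
# Hypothetical table: vocabulary base URL -> document designated as normative.
NORMATIVE_VOCABULARIES = {
    "https://www.w3.org/2018/credentials#":
        "https://www.w3.org/2018/credentials/index.jsonld",
    "https://w3id.org/security#":
        "https://w3c.github.io/vc-data-integrity/vocab/security/vocabulary.jsonld",
}


def normative_location(term_url: str) -> str:
    """Return the designated vocabulary document for a term URL."""
    for base, location in NORMATIVE_VOCABULARIES.items():
        if term_url.startswith(base):
            return location
    raise KeyError(f"no normative vocabulary designated for {term_url}")


# The term URL stays the same; only the table decides what it refers to.
print(normative_location("https://w3id.org/security#proof"))
```

Changing where a vocabulary is published then means updating one table entry, not renaming term URLs across the ecosystem.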

text/html
</td>
<td>
https://w3c.github.io/vc-data-integrity/vocab/security/vocabulary.html
Contributor

This should be to a w3c origin, not github.

Member Author

I agree, but the appropriate URL doesn't exist at this time.

I would expect that this URL will end up being https://www.w3.org/ns/security, but the VCWG hasn't set up that URL yet. I'll add an issue marker to that effect to the PR.

application/ld+json
</td>
<td>
https://w3c.github.io/vc-data-integrity/vocab/security/vocabulary.jsonld
Contributor

this should be to a w3c origin

Member Author

I agree, but the appropriate URL doesn't exist at this time.

I would expect that this URL will end up being https://www.w3.org/ns/security, but the VCWG hasn't set up that URL yet. I'll add an issue marker to that effect to the PR.

Contributor

@OR13 OR13 left a comment

Can we add an issue marker that lets us change the URL and/or Content for this section, without going through CR again?

That will allow us to fix the github links, IIRC there has been objection to using github.io URLs in w3c documents before.

Contributor

OR13 commented Jun 20, 2023

overall, love the direction of the PR, only a few nits.

Member Author

msporny commented Jun 21, 2023

Can we add an issue marker that lets us change the URL and/or Content for this section, without going through CR again?

Yes, I'll add two issue markers:

  1. That the github.io vocabulary base URL will change to something like https://www.w3.org/ns/security, but the VCWG hasn't set up that URL yet.
  2. That the URLs might change during the CR phase.

That will allow us to fix the github links, IIRC there has been objection to using github.io URLs in w3c documents before.

Yes, agreed, having those links in there long term was not the intent... just needed a placeholder that was "under W3C control" (but really, under Github control, which is not ok).

Member Author

msporny commented Jun 21, 2023

@iherman wrote:

Good point; as long as we have the normative vocabulary in the spec for V3.0 as well

I'm a -0.8 to copy the entire machine-readable vocabulary into the spec. It takes up a ton of space and no one really looks at it (ever). We did this with the @context value in v1.0 and v1.1 and removed it in v2.0 for the same reason. We now refer to the value of the context by hash.

IF we want to freeze the vocabulary for each version (and I'm a -0.75 on doing that), we should snapshot the machine-readable version of it and publish it with the specification. We could then refer to that snapshot via hash (and not take up a ton of space in the spec dumping in text that hardly anyone is going to look at). Thoughts, @iherman ?

Member

iherman commented Jun 22, 2023

@iherman wrote:

Good point; as long as we have the normative vocabulary in the spec for V3.0 as well

I'm a -0.8 to copy the entire machine-readable vocabulary into the spec. It takes up a ton of space and no one really looks at it (ever). We did this with the @context value in v1.0 and v1.1 and removed it in v2.0 for the same reason. We now refer to the value of the context by hash.

The comparison is not entirely fair: I would be against putting the context file into the document normatively for the reasons I have already stated elsewhere, no matter what.

That being said, indeed, the jsonld version of the vocabulary is currently circa 400 lines; with some editorial tricks (making a version of it that does not include the comment fields and the like) it may be reduced to 300. I agree, it is longish.

IF we want to freeze the vocabulary for each version (and I'm a -0.75 on doing that), we should snapshot the machine-readable version of it and publish it with the specification.

Yes.

We could then refer to that snapshot via hash (and not take up a ton of space in the spec dumping in text that hardly anyone is going to look at).

But we should still keep the copy in the /TR directory of the publication, because you cannot refer externally (i.e., outside the /TR subdirectory) to a normative part of the spec.

Alternative to consider: what about using the HTML details element? Using ReSpec to include the jsonld content isn't a big deal (our spec is still tiny compared to HTML, or even compared to EPUB, i.e., I do not really buy the argument of the spec being that large); the only real issue is readability. If there is a normative appendix on the vocabulary that refers to the HTML and Turtle versions and has, inline, the jsonld version in a <details> element, that does not bother anyone who is not interested in the details (sic!), but we are clean spec-wise.

Contributor

OR13 commented Jun 23, 2023

because you cannot refer externally (i.e., outside the /TR subdirectory) to a normative part of the spec.

@iherman can you elaborate on what you want to see (maybe an example directory structure)?

I don't agree that the spec will be "too large".... I know we have some automation that can help build arbitrary length documents, let's use that... and make sure everything we are saying matters, is defined well, and easily readable / linkable.

Member

iherman commented Jun 23, 2023

because you cannot refer externally (i.e., outside the /TR subdirectory) to a normative part of the spec.

@iherman can you elaborate on what you want to see (maybe an example directory structure)?

This means that the json-ld file (if it is the one normatively used by the spec) must be in https://www.w3.org/TR/vc-data-model-2.0 (or a subdirectory thereof). This is true not only for a vocabulary file but, for example, for all graphics files, illustration, etc, that are directly embedded in the spec.

Sounds like an arbitrary rule, but it is related to the secure archival of all specifications.
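As a rough illustration of the rule, here is a minimal Python sketch that checks whether a resource URL sits inside the specification's /TR directory. The function name and the `vocab/credentials.jsonld` path are hypothetical; the base URL and the github.io URL are the ones mentioned in this thread.

```python
from urllib.parse import urlparse

TR_BASE = "https://www.w3.org/TR/vc-data-model-2.0"


def is_under_tr(url: str, base: str = TR_BASE) -> bool:
    """True if url is the base directory itself or a resource below it."""
    u, b = urlparse(url), urlparse(base)
    base_path = b.path.rstrip("/")
    same_origin = (u.scheme, u.netloc) == (b.scheme, b.netloc)
    # Require a "/" boundary so that ".../vc-data-model-2.0-other/" does not match.
    return same_origin and (
        u.path == base_path or u.path.startswith(base_path + "/")
    )


# Hypothetical location inside the /TR directory: acceptable.
print(is_under_tr("https://www.w3.org/TR/vc-data-model-2.0/vocab/credentials.jsonld"))  # True

# The github.io placeholder discussed earlier: outside /TR.
print(is_under_tr("https://w3c.github.io/vc-data-integrity/vocab/security/vocabulary.jsonld"))  # False
```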

Base automatically changed from msporny-context-normative to main June 30, 2023 09:18
@msporny msporny force-pushed the msporny-vocab-normative branch from a924222 to 73d775d Compare June 30, 2023 10:11
@msporny msporny force-pushed the msporny-vocab-normative branch from 6fdf652 to b214247 Compare June 30, 2023 10:23
Member Author

msporny commented Jun 30, 2023

@msporny wrote:

I will add two issue markers:

  • That the github.io vocabulary base URL will change to something like https://www.w3.org/ns/security, but the VCWG hasn't set up that URL yet.

This was done in: 73d775d

  • That the URLs might change during the CR phase.

This was done in: b214247

A future PR can deal with how we intend to refer to the machine-readable vocabulary documents, how they will be packaged up and published, and whether or not we want to refer to them by hash (my preference) or by inclusion in an HTML details element in the spec (-0.75 for doing this if we just refer to them by hash). I have also added an issue marker for this here: f3d32dd

Requesting re-review from @OR13 and @iherman .

@msporny msporny requested a review from OR13 June 30, 2023 10:25
@@ -5216,6 +5216,14 @@ <h3>Vocabularies</h3>
location under W3C control.
</p>

<p class="issue" title="How to normatively refer to vocabulary files">
Contributor

Since there is no issue to leave this comment on, I will leave it here.

hash for vocab seems a bit odd, given we are trusting w3c to maintain both the hash, and the vocab.... an attacker with control of the origin can change both.

Member Author

Yes, I agree... which is one of the reasons why I didn't want to use hash or the vocab content in the spec in the first place. Feels like overkill.

Member Author

msporny commented Jun 30, 2023

Normative, multiple reviews, changes requested and made (issue markers for unresolved issues), no objections, merging.

@msporny msporny merged commit 562b9f3 into main Jun 30, 2023
@msporny msporny deleted the msporny-vocab-normative branch June 30, 2023 13:06
6 participants