Open Community Working Meeting 2022-07-01 #193

Closed
1 task done
Relequestual opened this issue Jun 30, 2022 · 31 comments
Labels
Working Meeting Identify working meetings

Comments

@Relequestual
Member

Relequestual commented Jun 30, 2022

Open Community Working Meeting 2022-07-01 - 12:00 PT

Zoom Meeting link: https://postman.zoom.us/j/89562933116?pwd=OWlsQ0RrcDY4S1JQU2d2Q2M0aFFlZz09

Live minutes and notes (gdoc)

Agenda:

| Topic | Owner | Decision/NextStep |
| --- | --- | --- |
| Review last call's action items | @Relequestual | We should document the current people with permissions and move on from there. |
| Update from JSON Schema's new Full Time Employee! | @jdesrosiers | Catching up mostly. Will focus on IETF HTTPAPI-WG Mediatypes RFC |
| Proposing moving alterschema to the json-schema-org organization | @jviotti | Not at this time, but we will keep an eye on developments. |
| Branch structure and authoring language for the specification repository moving forward | @jdesrosiers | master will be renamed to main, used for future spec development, and any patch releases will be done separately (as ad hoc branches, likely containing the name of the release they patch) |
| Can we align better around backwards compatibility across releases? Do we want to? | @awwright | |
| What are our high level goals for the next release? | @jdesrosiers | |

AOB?
If you want to discuss any other business not on the agenda, please add comments during the meeting.
If we do not complete the agenda, your discussion item will likely be rolled over to the next call.

Action items:

Notes:

Agenda Items rolling over:

  • list

Recording: link

@Relequestual Relequestual added the Working Meeting Identify working meetings label Jun 30, 2022
@jviotti
Member

jviotti commented Jun 30, 2022

We should probably discuss the backwards-compatibility problem and see if we can get closer to agreement?

@jviotti
Member

jviotti commented Jun 30, 2022

Related to that, shall we quickly discuss moving alterschema to the JSON Schema org? I think this ties to the previous point though. If we see alterschema as a potential solution, then it might make sense to maintain it there, and allocate more resources to it. Otherwise, I'll still maintain it myself outside of it for my own use cases!

@Relequestual
Member Author

We should probably discuss the backwards-compatibility problem and see if we can get closer to agreement?

Yes, but I'm timeboxing it to 25 minutes, including current summary. Paging @awwright

@Relequestual
Member Author

Related to that, shall we quickly discuss moving alterschema to the JSON Schema org? - @jviotti
Yes, sure, let's do that.
Please also create a Discussion in this repo.

@Julian
Member

Julian commented Jun 30, 2022

Perhaps it's worth even broadening the discussion to "a json-schema-contrib org with more members and lower bars for contributing repos" (or perhaps that's a separate one we could put off).

@Relequestual
Member Author

Perhaps it's worth even broadening the discussion to "a json-schema-contrib org with more members and lower bars for contributing repos" (or perhaps that's a separate one we could put off).

I think that's something we would struggle to discuss in the time we have. I have a lot of opinions about this.
Might be better as a discussion.

@jdesrosiers
Member

We should probably discuss the backwards-compatibility problem and see if we can get closer to agreement?

I think that might be a bigger discussion, best scheduled as a separate meeting.

@jdesrosiers
Member

We should discuss roadmap/goals for next release.

@handrews

We should probably discuss the backwards-compatibility problem and see if we can get closer to agreement?

Yes, but I'm timeboxing it to 25 minutes, including current summary. Paging @awwright

@Relequestual @jviotti @awwright what I most want to get out of this 25 minutes is an understanding of who is demanding this sort of cross-draft behavior that @awwright is proposing, and what their use cases and expectations are for it.

I have also laid out how I see different eras of features and support across drafts in #192 "Guidance around previous draft support"

@jdesrosiers
Member

We should talk about converting the spec to something that isn't associated with IETF (probably plain markdown). If we need to convert everything, we should do it sooner rather than later.

@jdesrosiers
Member

Let's discuss what approach we want to take for getting draft-next in sync with the 2020-12 patch changes and where we want to continue work (continue on draft-next or work on master)

Also, now is a good time to rename master to main.

@awwright
Member

I can take a little time to discuss reverse compatibility & interoperability, but what I am looking to propose is that we come up with a set of "Use cases & requirements":

  • What "business functions" should all implementations be able to do, e.g. "assert the value is a string with at least one character" — this would mostly be a representative sample of our test suite
    • Anything that can be processed by a pushdown automaton should be required, which is most keywords.
    • What is out of scope — e.g. "verify that this integer is a user ID in my database" is out of scope
  • What sort of extensions should JSON Schema support — can I refer to an extension that describes, for example, a MongoDB ObjectId (stringified)?
  • ... should it be guaranteed that if the extension cannot be loaded or is not understood, that I can guarantee a result that's not misleading (two different implementations will not produce valid & invalid, respectively? But an error is OK.)
  • If there's a technical reason that an implementation cannot implement a keyword (namely: uniqueItems, our usual suspect), that it will not produce a misleading result? (It shall produce an error?)
  • How should we support validators that use a lossy JSON parser? (i.e. how do you validate IEEE floats in a cross-platform manner?)
  • How should we handle forward compatibility? Can a validator ditch support for all existing schemas once a new draft is published?

... I think we should aim for 100% interoperability, but that may be a lofty goal, so start with some concrete cases that we want to support. Let's start by describing the features we already support and want to keep support for.
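One way to make the "not misleading" requirement in the list above concrete is to treat every validation result as one of three outcomes and to count only a valid/invalid split as a conflict. This is a minimal sketch under that assumption; the `Outcome` type and `isMisleadingDisagreement` function are hypothetical names for illustration, not anything defined by the specification.

```typescript
// Hypothetical three-valued result: an implementation either reaches a
// decision ("valid" / "invalid") or reports that it could not ("error").
type Outcome = "valid" | "invalid" | "error";

// Two implementations disagree in a misleading way only when one says
// "valid" and the other says "invalid". An "error" from either side is
// treated as acceptable, matching "But an error is OK" in the list above.
function isMisleadingDisagreement(a: Outcome, b: Outcome): boolean {
  return a !== "error" && b !== "error" && a !== b;
}

console.log(isMisleadingDisagreement("valid", "invalid"));   // true
console.log(isMisleadingDisagreement("valid", "error"));     // false
console.log(isMisleadingDisagreement("invalid", "invalid")); // false
```

Under this framing, a use-case document could list inputs for which a misleading disagreement must never occur, and leave everything else explicitly out of scope.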

These use cases are often published as formal documents, e.g. https://www.w3.org/TR/webrtc-nv-use-cases/

@handrews

@awwright you are talking about this like we just "come up with" use cases. Unless those align with actual problems, I don't find that process compelling as far as driving further work. They might be compelling in terms of adding new test cases to ensure that any use cases that currently ought to be supported really are, but that's a different thing.

Generally, when we've added complex or significantly different new functionality, it is in response to a well-established problem.

  • unevaluated* resolved the multi-year argument over different approaches like $merge or "strict mode" for functionality many had asked for (or assumed incorrectly that additional* already did)
  • $vocabulary made extensibility interoperable (or closer to interoperable), which solved the problem of people asking for obscure functionality as keywords because only things in the spec ever got implemented in an interoperable way. The approach was validated with the OAS community, with their extensions as a test case, rather than being an abstract offering.
  • $recursive* (later $dynamic*) solved the problem that extending meta-schemas was infeasible due to the need to copy-and-paste, which only got worse the more meta-schemas you were trying to work with. And we needed extending meta-schemas to be easy if $vocabulary had any chance of working.
  • Splitting $id and $anchor happened in part because $id was a sticking point for OAS compatibility because of $id's complexity and how some things you could do with it (specifically put a JSON Pointer fragment in it) could produce contradictory URIs. Likewise items/additionalItems becoming prefixItems/items, which was another sticking point (OAS previously forbade the array form of items)
  • Making $ref a regular applicator keyword was a response to the steady stream of people, over the years, being confused by the replacement of adjacent keywords.

All of these changes were in response to long-established, well-documented problems, and/or were done through engagement with a major consumer of JSON Schema with its own large community.
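To illustrate the `$ref` item in the list above, here is a toy sketch of the two behaviours: siblings of `$ref` being ignored (draft-07 and earlier) versus `$ref` acting as an ordinary applicator next to them (2019-09 and later). The `ToySchema` type, the `validateString` function, and the `refReplacesSiblings` flag are assumptions made for illustration; the sketch handles only string length keywords and `#/$defs/...` references, and uses `$defs` in both modes for simplicity.

```typescript
// Toy schema shape for illustration only; real schemas have many more keywords.
interface ToySchema {
  $ref?: string; // only references of the form "#/$defs/<name>" are handled
  minLength?: number;
  maxLength?: number;
  $defs?: Record<string, ToySchema>;
}

// Look up a "#/$defs/<name>" reference in the root schema.
function resolveRef(root: ToySchema, ref: string): ToySchema {
  const key = ref.replace("#/$defs/", "");
  const target = root.$defs?.[key];
  if (target === undefined) throw new Error(`Unresolvable $ref: ${ref}`);
  return target;
}

// refReplacesSiblings = true  -> keywords next to $ref are ignored
// refReplacesSiblings = false -> $ref is applied together with its siblings
// Note: .length counts UTF-16 code units, which is enough for this ASCII demo.
function validateString(
  schema: ToySchema,
  instance: string,
  root: ToySchema,
  refReplacesSiblings: boolean
): boolean {
  if (schema.$ref !== undefined) {
    const target = resolveRef(root, schema.$ref);
    const refOk = validateString(target, instance, root, refReplacesSiblings);
    if (refReplacesSiblings) return refOk;
    if (!refOk) return false;
  }
  if (schema.minLength !== undefined && instance.length < schema.minLength) return false;
  if (schema.maxLength !== undefined && instance.length > schema.maxLength) return false;
  return true;
}

const root: ToySchema = {
  $defs: { short: { maxLength: 5 } },
  $ref: "#/$defs/short",
  minLength: 3,
};

console.log(validateString(root, "ab", root, true));  // true: minLength is ignored
console.log(validateString(root, "ab", root, false)); // false: minLength is applied
```

The same schema and instance give opposite answers depending on the mode, which is the kind of surprise the change was meant to remove.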

I don't see anything like that in your list, particularly regarding single schema resources drawing on functionality for multiple drafts. Which seems to be some sort of litmus test for "interoperability" for you, although it's honestly still hard to tell what you're going for there.

I'm not interested in solutions looking for problems. I am interested in actual problems that clearly (to all of us) need solutions. Can you point to well-documented problems in the community that you are trying to address?

Another angle is that many of these problems had workarounds of sorts implemented in the community. $merge and "strict mode" were (and possibly still are) supported by some implementations. Many implementations offered extension capabilities, but they were all different. OAS worked around some of the problems by forbidding certain keywords. That sort of thing. Can you point to where people are working around the problems that you want to solve?

@handrews

  • What sort of extensions should JSON Schema support — can I refer to an extension that describes, for example, a MongoDB ObjectId (stringified)?

I have a very detailed presentation on this that I had hoped to have finished in time for people to watch it before the meeting, but that has not been possible. I hope to have it finished early next week (as a video / slides+script so folks can consume it at their own pace before our next call).


  • ... should it be guaranteed that if the extension cannot be loaded or is not understood, that I can guarantee a result that's not misleading (two different implementations will not produce valid & invalid, respectively? But an error is OK.)
  • If there's a technical reason that an implementation cannot implement a keyword (namely: uniqueItems, our usual suspect), that it will not produce a misleading result? (It shall produce an error?)

Both of these are fully addressed by $vocabulary today. It is frustrating that you are asking us to go back to first principles rather than working with what has already been done, or at least showing some curiosity about it.
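For readers unfamiliar with the mechanism, the 2020-12 `$vocabulary` keyword maps vocabulary URIs to booleans: `true` means an implementation that does not recognise the vocabulary must refuse to process the schema, and `false` means it may proceed without it. The following is a minimal sketch of that rule, not a real implementation; `selectVocabularies` and the MongoDB ObjectId vocabulary URI are hypothetical.

```typescript
// Only "$vocabulary" matters for this sketch of a meta-schema.
interface MetaSchema {
  $vocabulary?: Record<string, boolean>;
}

// Returns the vocabularies to apply, or throws if a required one is unknown.
// Throwing corresponds to "produce an error rather than a misleading result".
function selectVocabularies(meta: MetaSchema, supported: Set<string>): string[] {
  const active: string[] = [];
  for (const [uri, required] of Object.entries(meta.$vocabulary ?? {})) {
    if (supported.has(uri)) {
      active.push(uri);
    } else if (required) {
      throw new Error(`Unsupported required vocabulary: ${uri}`);
    }
    // Unknown but optional vocabularies are skipped.
  }
  return active;
}

const CORE = "https://json-schema.org/draft/2020-12/vocab/core";
const VALIDATION = "https://json-schema.org/draft/2020-12/vocab/validation";
// Hypothetical extension vocabulary, invented for this example.
const OBJECT_ID = "https://example.com/vocab/mongodb-objectid";

const supported = new Set<string>([CORE, VALIDATION]);

// Optional extension: processing continues with the known vocabularies.
const optionalMeta: MetaSchema = { $vocabulary: { [CORE]: true, [OBJECT_ID]: false } };
console.log(selectVocabularies(optionalMeta, supported).length); // 1

// Required extension: the implementation refuses instead of guessing.
const requiredMeta: MetaSchema = { $vocabulary: { [CORE]: true, [OBJECT_ID]: true } };
try {
  selectVocabularies(requiredMeta, supported);
} catch (e) {
  console.log((e as Error).message);
}
```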


How should we support validators that use a lossy JSON parser? (i.e. how do you validate IEEE floats in a cross-platform manner?)

We have always focused on the data model. To expand the scope of the project to JSON text would be a radical change.

@awwright
Member

awwright commented Jul 1, 2022

@handrews I'm not trying to peddle any single change with my list (be it "interoperability" or "functionality for multiple drafts" (which I think should be obsolete)). I was trying to be comprehensive. Maybe you have very different ideas that I haven't thought of, that's OK.

I'm pointing out we so far have not adopted a list of use-cases, and that we should. Obviously, the use cases should be representative of actual problems that we wish to solve. Most of them are going to be problems we've already solved, that's OK. I started my list with "business functions" to emphasize this will be mostly a list of things we already support.

The motivation for "Use cases & requirements" is because I'm having trouble understanding what is in scope, and it seems other people don't either. There seems to be some disagreement on when it's OK for different implementations to return different results on the same input (valid, invalid, error, or otherwise).

Several times now, I've read a suggestion like "JSON Schema must be 100% predictable," so I'll identify a case where two implementations return opposite results; and I get a response like "That's intentional". And then someone else will name it as a thing to be fixed anyways.

If JSON Schema should be 100% predictable, then let's come up with some comprehensive use cases. If it's OK for implementations to vary somewhat, let's decide what's out-of-scope.

But we need goalposts.

We have always focused on the data model. To expand the scope of the project to JSON text would be a radical change.

As far as I know, I have the only ECMAScript implementation that correctly handles arbitrary precision decimals. All the other implementations use the lossy JSON.parse. I'm not saying right now that we should adopt or reject such a use case. I'm saying it's something we must decide one way or the other: is this in scope or not?
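The precision loss in question is easy to reproduce. A small sketch: two different JSON number texts parse to the same ECMAScript `number`, so any keyword that compares values (for example `const`, `enum`, or `uniqueItems`) cannot tell them apart after `JSON.parse`. The `losesPrecision` helper is a hypothetical name and only works for plain integer literals without leading zeros, exponents, or fractions.

```typescript
const textA = "9007199254740993"; // 2^53 + 1, not representable as a double
const textB = "9007199254740992"; // 2^53

const parsedA: number = JSON.parse(textA);
const parsedB: number = JSON.parse(textB);

// Distinct JSON texts, identical values in the parsed data model.
console.log(parsedA === parsedB); // true

// Hypothetical check: does a plain integer literal survive a round trip
// through JSON.parse unchanged?
function losesPrecision(integerText: string): boolean {
  return String(JSON.parse(integerText)) !== integerText;
}

console.log(losesPrecision(textA)); // true
console.log(losesPrecision(textB)); // false
```

Whether a validator is expected to detect this, report an error, or silently accept the rounded value is the scoping question being raised here.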

What do you think of the "Use case" example? Here's one for SHACL, essentially schema validation for RDF: https://www.w3.org/TR/shacl-ucr/ (It even lists things like HTML form generation!)

@Relequestual
Member Author

Relequestual commented Jul 1, 2022

I think I see what you're saying here @awwright. I agree we need a holistic approach to define scope which must impact how we move forward.

I don't think it's worth trying to even START that sort of work till you've seen @handrews observations and proposal. For me, it frames JSON Schema into another era.

How we operate and plan stems further than just the specification itself. JSON Schema as an org and a project is growing, and we have new ground to chart. This means we need to have a more holistic review of vision overall.

Once we have done that, we will have the right context in which to frame the current specification documents we have. Right now, I feel we have the right context, and I'd hate for you to waste your time making something that ultimately isn't useful.

All that being said, I think what I'm hearing from the rest of the core team is that we would benefit from having some processes in place to help guide the changes we make more immediately.

To clarify, I do not want to be discussing "Use cases & requirements" during this call.

I do want to discuss compatibility, with the proviso that we are talking about "from the next version of the specification", and not including previous versions in that discussion.

(I say this as I feel this reflects the general consensus of what the team do and do not want to discuss. Please do correct me if anyone feels this is not accurate.)

@Relequestual
Member Author

Also, now is a good time to rename master to main. - @jdesrosiers

I think we said we were just going to remove it and use draft-next and draft-patch?
This is something we should maybe discuss.

@jviotti
Member

jviotti commented Jul 1, 2022

till you've seen @handrews observations and proposal. For me, it frames JSON Schema into another era.

Where can I see/read this?

@Relequestual
Member Author

till you've seen @handrews observations and proposal. For me, it frames JSON Schema into another era.

Where can I see/read this?

It's not yet finished work. See #193 (comment)

@Relequestual Relequestual pinned this issue Jul 1, 2022
@handrews

handrews commented Jul 1, 2022

@jviotti it's been a bit delayed, I'd hoped to have had it out at the beginning of this week but external events have interfered (this is all volunteer work for me).

@handrews

handrews commented Jul 2, 2022

Sorry I ended up not being able to make this one - could someone post a summary of anything that was decided?

@handrews

handrews commented Jul 2, 2022

@awwright I've mentioned this to a few others, but since @Relequestual asked you to wait until seeing my presentation, I want to clarify in this issue that the presentation is a starting point for discussion. While I'll be thrilled if everyone is as enthusiastic about it as Ben was about the outline that I pitched to him, that's not my expectation. I expect others will offer alternative approaches and/or modifications. These could come in the form of a use case document such as the one you link for SHACL, and I agree that having such a document would be useful.

My main reason for not wanting to start with use cases gets back to my comment about some of your questions: some things are already solved. I don't want to ignore what we have and just re-evaluate everything from first principles. I want us to have a common understanding of what we have so we can talk about it coherently.

Assuming no other external delays occur (and sadly I can't guarantee that; I lost the last two days to another unexpected thing), I expect to have the finished video up within a week. It will go to the core collaborators first, and then based on whatever feedback it gets I may make changes before posting it publicly.

@Julian
Member

Julian commented Jul 3, 2022

Sorry I ended up not being able to make this one - could someone post a summary of anything that was decided?

There's a doc somewhere with more notes, but long story short: no to alterschema adoption; as for branches, development of the next spec will now be done on the default branch, and patch releases will use ad hoc additional branches if needed.

Nothing further decided on the other items.

@Relequestual
Member Author

On the call there was some discussion about how we previously decided not to continue to publish through the IETF process.

There's a larger discussion to be had about how we DO move forward. I'd like to share this FAQ answer from the WHATWG:

What's this I hear about 2022?
Back before the Living Standard development model, we were planning to put the contents of the HTML Standard through the W3C process. This was before we understood the fatal flaws of such a snapshot-based development mode.

At the time, the W3C Recommendation label had high standards, such as 100% test coverage of two complete and fully interoperable implementations. In 2008, the editor estimated it would take another 14 years to reach that point, based on comparing it to the amount of work done for HTML4 and other large specifications like CSS2/2.1.

Since then, we've realized that much like the waterfall model is not a good fit for software development, it is also not a good way of developing standards. These days we keep the HTML Standard continually under development, adding tests as we go and verifying them against implementations, per our working mode. So, the year 2022 is no longer relevant.

Personally, I do not want JSON Schema to become a "living standard" like HTML is now. Today's HTML spec seems to be mostly documentation from the major browser vendors... and that means mostly just Chromium now. Such a situation is one of the things the IETF and W3C aim to avoid, and is one of the reasons I was so keen for JSON Schema to join the OpenJS Foundation.

We help avoid this by owning no implementations as an organization. We avoid the "14 years" rhetoric by having a reasonably progressively scheduled release plan (although this is quite rough).

We should take great care to avoid this sort of situation: https://claroty.com/wp-content/uploads/2022/01/Exploiting-URL-Parsing-Confusion.pdf

As an aside, here's an extract from the WHATWG URL spec (for today, at least):

Goals
...
Standardize on the term URL. URI and IRI are just confusing. In practice a single algorithm is used for both so keeping them distinct is not helping anyone. URL also easily wins the search result popularity contest.

😬 Looks like a case of not considering other usecases to me.

Also of interest: https://url.spec.whatwg.org/#url-apis-elsewhere

...Names such as "URL", "URI", and "IRI" should not be used.

Essentially, use "URL" for everything, even if it's only an identifier, because the only usecase that really "matters" is the one that happens in the browser. 🤷‍♂️

@handrews

handrews commented Jul 4, 2022

Today's HTML spec seems to be mostly documentation from the major browser vendors

Yeah, that's exactly what WHATWG is all about. They don't care about anything else, and their insistence on "URL" for everything is a direct manifestation of that.

The "living standard" more-or-less works for them because they make it conform to whatever they are doing with their implementations. And as you noted, they own all of the implementations that matter in the space.

We do not want to be WHATWG.

@jdesrosiers
Member

Personally, I do not want JSON Schema to become a "living standard" like HTML is now.

This needs a deeper conversation. Just because there are things we don't like about how HTML's living standard works doesn't mean living standards are bad. Nothing that you mentioned has anything to do with the living standard approach. We don't have to emulate HTML. A better example is ECMAScript.

Roughly, the way I see it working is that new features are added to the spec as they are approved but flagged as "experimental" or "unstable" or "proposed" or something like that. While in this state, changes can be made or the feature can be removed altogether if we change our minds. Once a year, we do a release where we remove the experimental flag from any features we think are stable, and any implementation claiming support for that release must support those features. For backward compatibility, no released features can be removed, but they can be marked as deprecated. Patches (clarifications and errors) can be done directly and don't have to wait for a release. It's essentially trunk-based development applied to a spec.
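The lifecycle described above can be sketched as a small state model. This is only an illustration of the rules as stated in this comment, not an agreed process; the `Feature` type, the function names, and the keyword names are all hypothetical.

```typescript
type Status = "experimental" | "stable" | "deprecated";

interface Feature {
  keyword: string;
  status: Status;
}

// Annual release: the experimental flag is removed from the chosen features.
function cutRelease(features: Feature[], promote: string[]): Feature[] {
  return features.map((f): Feature =>
    f.status === "experimental" && promote.includes(f.keyword)
      ? { ...f, status: "stable" }
      : f
  );
}

// Only experimental features may be removed; released ones must stay.
function removeFeature(features: Feature[], keyword: string): Feature[] {
  const target = features.find((f) => f.keyword === keyword);
  if (target === undefined) throw new Error(`Unknown feature: ${keyword}`);
  if (target.status !== "experimental") {
    throw new Error(`"${keyword}" has been released; deprecate it instead`);
  }
  return features.filter((f) => f.keyword !== keyword);
}

// Released features can be marked as deprecated but are never dropped.
function deprecateFeature(features: Feature[], keyword: string): Feature[] {
  return features.map((f): Feature =>
    f.keyword === keyword && f.status === "stable" ? { ...f, status: "deprecated" } : f
  );
}

const draft: Feature[] = [
  { keyword: "exampleKeywordA", status: "experimental" },
  { keyword: "exampleKeywordB", status: "experimental" },
];

const released = cutRelease(draft, ["exampleKeywordA"]);
console.log(released.map((f) => `${f.keyword}:${f.status}`).join(", "));
// exampleKeywordA:stable, exampleKeywordB:experimental
```

In this model, `removeFeature(released, "exampleKeywordB")` succeeds, while the same call for `exampleKeywordA` throws, which mirrors the backward-compatibility rule in the paragraph above.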

Goals
...
Standardize on the term URL. URI and IRI are just confusing. In practice a single algorithm is used for both so keeping them distinct is not helping anyone. URL also easily wins the search result popularity contest.

😬 Looks like a case of not considering other usecases to me.

I don't think this is a case of excluding other usecases. They're just trying to make things less confusing by not having three names for essentially the same thing. Standardizing on the term URL doesn't mean every URI or IRI must be network accessible or they don't care about use cases where they aren't. It just means that it's confusing to have a separate term for the same thing just because it's network accessible. I see this as an attempt to reduce the jargon people need to know. It's an anti-gatekeeping move.

@handrews

handrews commented Jul 5, 2022

@jdesrosiers what you describe is not a living standard. "living standard" means they just update stuff whenever, without any more notification than a "last updated" date.

What you're talking about is just a standard with revisions, which is far more reasonable than how WHATWG behaves.

"anti-gatekeeping" is not a phrase I would use to describe the exclusionary WHATWG oligarchy.

I'm open to any number of possible approaches, as long as they are not based on WHATWG.

@jdesrosiers
Member

what you describe is not a living standard

I think it can be described as a living standard. What I described is pretty close to what ECMAScript does and they call what they do a living standard. However, maybe we should avoid the term if it has too much baggage.

"anti-gatekeeping" is not a phrase I would use to describe the exclusionary WHATWG oligarchy.

😆 Fair point

@handrews

handrews commented Jul 5, 2022

However, maybe we should avoid the term if it has too much baggage.

Yeah, it has pretty negative connotations for me, for a variety of reasons (not just WHATWG, although they are a significant factor). While I am totally confident that this is not what you (@jdesrosiers) mean, in my experience it's a phrase people use to avoid having to properly manage changes and expectations. I know at least one or two other folks in this project have similar associations with the term.

I don't think anyone is advocating for taking all of what's currently deemed "JSON Schema" and reaching a totally frozen "done" state. If we take that and the phrase "living standard" off the table, then we're having a discussion about how to manage change, and what kinds of change to allow, which I think is a good discussion to have.

@Relequestual
Member Author

Relequestual commented Jul 6, 2022

The term "living standard" has a lot of baggage, in particular with WHATWG, for me.
I did some digging on Ecma's process, and they call it a living standard... which I guess it IS, and I understand why, and IMHO we should likely follow their approach. I haven't dug into all the details, but I'm pretty happy if we end up with something similar.

Resources for my reference which might be useful to others too:

I'm sure there's a better place for dropping the above, but I'm low on time today so it's here or not at all, for now.

@Relequestual Relequestual unpinned this issue Jul 15, 2022
@Relequestual
Member Author

I'm closing this issue as the action items for the call are resolved, not because I feel the discussion above is resolved.


6 participants