Avoid the "undefined behavior" term #1204

Closed
ioggstream opened this issue Apr 1, 2022 · 8 comments

@ioggstream
Contributor

I suggest

  • to avoid using the term "undefined behavior" and to state explicitly that some non-interoperable patterns, such as duplicate JSON keywords like the one below, are valid JSON Schema documents:
{ "foo": 1, "foo": 2 }

Note

While many implementers think that "undefined behavior" means "do whatever you want", it actually means "don't do it". I suggest avoiding the term and explicitly stating what is valid / in scope for the specification and what isn't.

This approach will make the document easier to follow and allow the reader to focus on the relevant parts.
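
For illustration only, here is what one widely used parser, Python's standard `json` module, does with the duplicate-key document above. This is a single implementation's behavior, not something the specification mandates, and the variable names are placeholders.

```python
import json

text = '{ "foo": 1, "foo": 2 }'

# By default the json module accepts the document and keeps only the
# last value seen for a repeated name.
default = json.loads(text)
print(default)  # {'foo': 2}

# object_pairs_hook receives every name/value pair in document order,
# so the duplicate is still visible before it is collapsed into a dict.
pairs = json.loads(text, object_pairs_hook=list)
print(pairs)  # [('foo', 1), ('foo', 2)]
```

Other parsers may keep the first value or reject the document, which is why the pattern is not interoperable.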

@ioggstream changed the title from 'Forbid duplicate json keywords' to 'Avoid the "undefined behavior" term' on Apr 1, 2022
@gregsdennis
Member

gregsdennis commented Apr 1, 2022

You're right: "undefined behavior" doesn't necessarily mean "do whatever you want," but it also doesn't necessarily mean "don't do it," either.

We use this in various ways.

  • A scenario may be outside of what an implementation is capable of handling, like the duplicate key case you mentioned. While some languages have data models that can handle such scenarios, a lot don't. Because of this, it's unfair to those implementations that can't handle this for us to prescribe a behavior, so we let the implementation decide what to do.
  • The most common way we use this language is in messaging to schema and meta-schema authors rather than to implementors. For example, trying to specify false for the core vocabulary is meaningless. An implementation still needs to be able to handle when authors do things like this. Since it's invalid anyway, it doesn't need to be interoperable, so we leave it to the implementation to decide how to handle it and we don't define a behavior.
  • There are cases where we just don't have an answer for how something should work. The example here is URIs that are also URLs and retrieving data from those locations. In this case, we state that the behavior is undefined, but "reserved for future use," which means future versions of the spec may define this.

In all, "undefined" appears 10 times in the core spec (one is just in a release note) and doesn't appear in the validation spec. I think these usages are warranted.
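
As a sketch of what "let the implementation decide" can look like for the duplicate-key case: the `load_json` helper and the `DuplicateKeys` policy names below are hypothetical and not taken from any JSON Schema implementation. They only show that rejecting the document, keeping the first value, and keeping the last value are all reasonable choices an implementation could build on Python's `json.loads` and its `object_pairs_hook` parameter.

```python
import json
from enum import Enum


class DuplicateKeys(Enum):
    """Hypothetical policies an implementation might choose between."""
    REJECT = "reject"
    KEEP_FIRST = "keep_first"
    KEEP_LAST = "keep_last"


def load_json(text, policy=DuplicateKeys.KEEP_LAST):
    """Parse JSON text, applying the chosen duplicate-key policy to every object."""

    def build_object(pairs):
        result = {}
        for key, value in pairs:
            if key in result:
                if policy is DuplicateKeys.REJECT:
                    raise ValueError(f"duplicate key: {key!r}")
                if policy is DuplicateKeys.KEEP_FIRST:
                    continue  # ignore later values for a repeated key
            result[key] = value  # KEEP_LAST overwrites earlier values
        return result

    return json.loads(text, object_pairs_hook=build_object)


doc = '{ "foo": 1, "foo": 2 }'
first = load_json(doc, DuplicateKeys.KEEP_FIRST)
last = load_json(doc, DuplicateKeys.KEEP_LAST)
```

With `DuplicateKeys.REJECT`, the same call raises `ValueError` instead of returning a value.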

@gregsdennis
Member

Related to json-schema-org/community#189

@awwright
Member

awwright commented Apr 2, 2022

I'd like to suggest there are a few different terms that all prescribe the same behavior (that any behavior might be legal, with respect to the specification document), but with slightly different usages:

  • "undefined" implies that the behavior may be specified by another specification, and defining it here could over-constrain it. For example, the fact that JSON objects can have repeated keys; defining a behavior here would potentially conflict with implementations.
  • "implementation specific" means the behavior is outside the scope of the specification; because different implementations have legitimately different needs.
  • "reserved" implies nobody should use the feature at all. While currently it must be ignored, other behaviors may become legal in the future.

And like @gregsdennis says I think most of our usage seems to be roughly correct. The only usage that I think is baffling is json-schema-org/community#189 (maybe I should update the title there)

@ioggstream
Contributor Author

@awwright I think the distinctions you made are relevant. Especially, it is key to define whether:

  • something is beyond the scope of the document (e.g. is left to the implementers or to future specifications);
  • it's not interoperable, bad design or actually unexpected.

My experience is that, in the second case, specifications will eventually NOT RECOMMEND or FORBID those behaviors; e.g., see content with GET in the latest spec.

@awwright
Member

awwright commented Apr 4, 2022

@ioggstream I think most of our rationale for undefined behavior should be self-evident. Is there one in particular that's confusing?

What do you mean by "NOT RECOMMEND or FORBID"? Putting the terms in capital lettering is generally done when the term is being used according to a specific definition, but NOT RECOMMEND and FORBID are not defined terms afaik. I don't think we should use RFC 2119 Key Words because that could over-constrain the specification (the behavior is already specified by JSON, which is a normative reference).

@ioggstream
Contributor Author

> most of our rationale for undefined behavior should be self-evident

imho "Explicit is better than implicit" :P

> what do you mean by "NOT RECOMMEND or FORBID"?

You are right: I meant "SHOULD NOT" or "MUST NOT" ;)
As for over-constraining, consider that JSON in RFC 8259 added a MUST for UTF-8 when transmitting over the network.

Anyway, it's my 2¢ :)

@awwright
Member

awwright commented Apr 6, 2022

> Explicit is better than implicit

There's also a balance we have to strike with brevity. Maybe in some cases it's worth explaining the situation with a sentence though. Can you quote specific passages you find confusing?

@handrews
Contributor

It's been four months without a clear response to the twice-asked question of specific concerns, so I'm closing this. Please do feel free to file new issues for each specific example of unclear wording / usage of "undefined behavior."
