(Design) Document the open questions around standalone markup #542

Merged: aphillips merged 5 commits into main from standalone-open-questions on Dec 13, 2023

Collaborator

@stasm stasm commented Nov 29, 2023

See #535 (comment).

Right now, the doc recognizes that standalone elements are still an open question, but the documented proposal goes one step further: it suggests using the registry to encode the standalone aspect.

In this PR, I list all previously suggested alternatives on an equal level, since no decision has been made about them yet.

Collaborator

@eemeli eemeli left a comment

Overall it's fine, though see inline comments. On the standalone alternatives, it would be good to point out somewhere that the second and third are valid no matter what we decide about the syntax.

@@ -178,7 +183,7 @@ This is {#strong}bold{/strong} and this is {#img alt=|an image|}.
Markup names are _namespaced_ by their use of the pound sign `#` and the forward slash `/` sigils.
They are distinct from `$variables`, `:functions`, and `|literals|`.

- This allows for placeholders like `{#b}`, `{#img}`, and `{#a title=|Link tooltip|}`.
+ This allows for placeholders like `{#b}`, and `{#a title=|Link tooltip|}`.
Collaborator

The proposal does allow for {#img}. It just doesn't differentiate the formatted parts of e.g. {#b} and {#img} from each other except by the name.

Suggested change
- This allows for placeholders like `{#b}`, and `{#a title=|Link tooltip|}`.
+ This allows for placeholders like `{#b}`, `{#img}`, and `{#a title=|Link tooltip|}`.

Collaborator

I would still prefer to have this change reversed. The currently proposed design does factually allow for a placeholder like {#img}, even if it does not strictly communicate in the syntax that this markup element does not expect a closing element {/img}.

@@ -195,13 +200,40 @@ Markup is not valid in _declarations_ or _selectors_.

* Introduces two new sigils, the pound sign `#` and the forward slash `/`.

Collaborator

@eemeli eemeli Nov 29, 2023

Something like this should still be included as a con of the currently-proposed solution.

Suggested change
* As in HTML, differentiating "open" and "standalone" placeholders requires
additional information not encoded in the bare syntax.

Member

I would not say "elements" (use placeholder instead). HTML has elements, but we don't.

A different way of saying this is:

Suggested change
There is no difference between an "open" and a "standalone" placeholder in terms of syntax.
It is up to the implementation as to whether a given placeholder is treated
as an opening item or an isolated item in the output.

Collaborator

Updated my suggestions above with s/elements/placeholders/.

Collaborator Author

I disagree with this suggestion, at least until we discuss the open questions related to standalone elements. The whole point of this PR is to emphasize that the current proposal is incomplete without our answering these questions.

Note that the "hash and slash" proposal did include a way to express standalone markup, and so did the "plus and minus" proposal. I'm concerned that suddenly we're discussing a different proposal in which standalone markup has been removed.

How about we add the following con instead of the ones you both proposed?

Suggested change
- Doesn't answer how to handle standalone markup; see below.

Member

@stasm I don't think that's quite correct. The proposal does say how to handle standalone markup, but not in a syntactically distinct way.

I think there is a case to be made for separate standalone syntax, as it allows relatively naive implementations to be created that don't have a list of standalone vs. spanning markup.

Thus I think your "con" would be better phrased as:

Suggested change
- Standalone placeholders are not syntactically distinct from opening placeholders;
implementations require out-of-band information to distinguish standalone
from opening placeholders.
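
To make the "out-of-band information" point concrete, here is a minimal sketch, not taken from the design doc. It assumes a simplified placeholder grammar matched with a regular expression, and `STANDALONE_NAMES` is a hypothetical lookup table supplied by the implementation; neither is part of the proposal.

```python
import re

# Hypothetical out-of-band knowledge; nothing in the message syntax provides it.
STANDALONE_NAMES = frozenset({"img", "br", "hr"})

# Simplified pattern (an assumption): captures only the sigil and the name
# of a {#name ...} or {/name} placeholder and skips over any options.
PLACEHOLDER = re.compile(r"\{([#/])([A-Za-z][\w:-]*)[^{}]*\}")


def classify(message, standalone_names=frozenset()):
    """Return (name, kind) pairs for each markup placeholder in message."""
    kinds = []
    for sigil, name in PLACEHOLDER.findall(message):
        if sigil == "/":
            kind = "close"
        elif name in standalone_names:
            kind = "standalone"
        else:
            # Without extra information, every {#name} looks like an opening item.
            kind = "open"
        kinds.append((name, kind))
    return kinds


msg = "This is {#strong}bold{/strong} and this is {#img alt=|an image|}."
print(classify(msg))                    # syntax only
print(classify(msg, STANDALONE_NAMES))  # with the hypothetical table
```

With the syntax alone, `{#img}` is reported as an opening placeholder; it is only reported as standalone once the extra table is passed in.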

I hope I don't seem contrary but £ is called a pound sign. # is called an octothorpe.
Coming from a country that used to have pounds as a unit of currency.

Member

@kenguest In the specification it would be U+0023 NUMBER SIGN # because that is its Unicode name. Octothorpe and pound sign are both alternate names 😉

Collaborator Author

@stasm stasm Dec 6, 2023

> The proposal does say how to handle standalone markup, but not in a syntactically distinct way.

But it’s not the same proposal that was called “hash and slash” which was added in #517.

I object to how #535 removed the standalone syntax from that proposal, quoting consensus which was not officially discussed.

Collaborator

I remain somewhat confused by the desire to remove this documented "con" of the currently proposed design. This PR and our discussions around this issue have made it clear that there's still desire to discuss standalone markup, but the proposed design itself is not changed here. It still does not differentiate in syntax standalone and open placeholders. Why can't we mention that? Either with the original text, or that proposed by @aphillips or myself earlier in this thread.

On a somewhat meta level, in reply to @stasm from #542 (comment):

> The whole point of this PR is to emphasize that the current proposal is incomplete without our answering these questions.

This approach makes it seem like this PR is attempting to re-define how we do design and how we work with design documents. Thus far, we've been able to iterate on these docs and their proposed solutions while keeping them internally coherent. Witness for instance the changes to this document and its proposed design. Why can't we continue doing so?

We clearly intend to discuss standalone markup further; I don't see how leaving this out helps that. Or are you suggesting that "standalone" and "open" using the same syntax is not a disadvantage of the proposed design?

Collaborator Author

> I remain somewhat confused by the desire to remove this documented "con" of the currently proposed design. This PR and our discussions around this issue have made it clear that there's still desire to discuss standalone markup, but the proposed design itself is not changed here. It still does not differentiate in syntax standalone and open placeholders.

It used to, however, right? And then #535 changed it, and I’m trying to object to how it did it. See my comment in that PR.

I appreciate the attempt to limit the scope of the proposal to just open and close, but I also note that by doing so, we effectively make a decision about standalone.

I’d prefer it if the doc included {#foo/}, as it used to previously. But I also acknowledge that we will discuss this further on Monday.

Since I’m away this week, I’ll defer to you and @aphillips. And since I’m typing this on my phone, would you mind taking over so that we’re able to merge this PR soon? Thanks!

Member

@aphillips aphillips left a comment

Some suggestions


* Introduce additional syntax inspired by XML: `{#foo/}`.

This approach encodes the _standalone_ aspect in the data model, and doesn't rely on external sources of truth,
Member

Not just the data model: it's in the syntax too!

Collaborator Author

This is mentioned in the same sentence, which continues onto the next line: at the expense of introducing new syntax, which is admittedly a bit clunky.
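
As an illustration of what encoding the standalone aspect in the syntax would give a tool, here is a minimal sketch. It assumes the XML-inspired `{#foo/}` form discussed in this thread and a simplified regular-expression grammar; `kinds_from_syntax` is a hypothetical helper, not part of any implementation.

```python
import re

# Sketch grammar (an assumption): {#name ...} opens, {/name} closes,
# and {#name .../} stands alone.
MARKUP = re.compile(
    r"\{(?P<sigil>[#/])(?P<name>[A-Za-z][\w:-]*)(?P<rest>[^{}]*)\}"
)


def kinds_from_syntax(message):
    """Classify markup placeholders using nothing but the message text."""
    result = []
    for m in MARKUP.finditer(message):
        if m["sigil"] == "/":
            kind = "close"
        elif m["rest"].rstrip().endswith("/"):
            # The trailing slash alone marks the placeholder as standalone.
            # Simplification: an unquoted option value ending in "/" would
            # be misread by this check.
            kind = "standalone"
        else:
            kind = "open"
        result.append((m["name"], kind))
    return result


print(kinds_from_syntax("{#b}bold{/b} and {#img alt=|an image|/}"))
```

No registry or list of known names is consulted: the classification comes entirely from the text of the message.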

(e.g. the structure of the source to which the message is applied).

This approach offers a cleaner syntax at the expense of relying on external sources of truth for telling open and standalone elements apart.
Without the access to these sources of truth (e.g. a custom registry), standalone markup will be interpreted as open elements.
Member

Without access to some implementation, all forms of markup are just so much noise.

I'm slightly sad that we didn't show using namespaces in the design document, since I suspect that {#html:span id=foo}I am spanned{/html:span} is friendlier to tools, given {#ttml:span id=foo}I am timed text{/ttml:span} exists too. Namespacing suggests how implementations would solve the problem of "external knowledge". There's no way that any of this works without some form of external information anyway.

Collaborator Author

Yes, but you can get a lot of mileage without an implementation, just working with the data model, especially in tooling.

Collaborator Author

I agree that html:span would be much friendlier to tools, and probably also to translators. Note that I do mention namespacing in the following bullet, the one about re-using regular functions for standalone markup (:html:img).

Member

I think namespaces are important for markup regimes to work. They won't be built-in to the default registry, so there needs to be a way to plug them in and for tools and implementations to know what they mean.
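
One way to picture "plugging in" a markup regime is a per-namespace table that a tool consults for names such as `html:span`. This is only a sketch under assumptions: the `NAMESPACES` table, its contents, and the `resolve` helper are all hypothetical and purely illustrative.

```python
# Hypothetical tables of standalone names, one per namespace, supplied by
# the host application rather than by any default registry.
NAMESPACES = {
    "html": {"img", "br", "hr"},
    "ttml": {"br"},
}


def resolve(markup_name, namespaces=NAMESPACES):
    """Return 'standalone' or 'spanning', or None if nothing is known."""
    namespace, sep, local = markup_name.partition(":")
    if not sep:
        return None  # no namespace prefix, so no table to consult
    known = namespaces.get(namespace)
    if known is None:
        return None  # unknown namespace: the tool has no information
    return "standalone" if local in known else "spanning"


print(resolve("html:img"))
print(resolve("ttml:span"))
print(resolve("foo:bar"))
```

The namespace tells the tool which table to look in, but the table itself is still external knowledge.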

Comment on lines 235 to 236
In this approach there's also no way to enforce the lack of operands to standalone markup at the syntax level.
Instead, access to a custom registry is required to know that `:img` does not accept any operands.
Member

I would say this differently.

Suggested change
- In this approach there's also no way to enforce the lack of operands to standalone markup at the syntax level.
- Instead, access to a custom registry is required to know that `:img` does not accept any operands.
+ In this approach, there's no way to distinguish standalone markup from regular functions.
+ This also means that there is no way to enforce a lack of operands for standalone
+ (as we do for open/close).
+ The open/close placeholders are default ignorable in certain formatting operations
+ because they can be distinguished from normal placeholders.
+ With `:img`, the placeholder would appear in all formatting operations, possibly as the
+ replacement "logo" when unsupported.

Collaborator Author

> In this approach, there's no way to distinguish standalone markup from regular functions.

There is, though: the registry. This is why it's important to call out that there's no way at the syntax level (aka the data model level).

> The open/close placeholders are default ignorable in certain formatting operations
> because they can be distinguished from normal placeholders.

The ignoring is performed at runtime, which means that it can be implemented by functions rather than the implementation. For example, {:html:img} could be a function call which formats to an empty string, rather than be skipped by the implementation during formatting. So we can still distinguish standalone markup from regular functions, but only by calling them.

So I think the real con here is that {:html:img} needs to be called at runtime for the implementation to determine that it's a standalone markup placeholder.

How about this:

Suggested change
- In this approach there's also no way to enforce the lack of operands to standalone markup at the syntax level.
- Instead, access to a custom registry is required to know that `:img` does not accept any operands.
+ In this approach there's also no way to enforce the lack of operands to standalone markup at the syntax level, as we do for open/close.
+ Instead, access to a custom registry is required to know that `:img` does not accept any operands.
+ Similarly, the only way to know at runtime that an expression produces markup would be to call the expression's function.
+ It's not possible to deduce this fact from the data model prior to the formatting operation.
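
The runtime-only nature of this check can be sketched as follows. Everything here is hypothetical: the `Formatted` result type, the `REGISTRY` mapping, and the two functions are illustrative stand-ins, not an API of any MessageFormat implementation.

```python
from dataclasses import dataclass


@dataclass
class Formatted:
    text: str
    is_markup: bool = False


def html_img(options):
    # Hypothetical stand-in for {:html:img}: formats to an empty string
    # and flags its result as markup.
    return Formatted(text="", is_markup=True)


def upper(options):
    # An ordinary (hypothetical) formatting function.
    return Formatted(text=options.get("value", "").upper())


REGISTRY = {"html:img": html_img, "upper": upper}

# In the data model both expressions look alike: a function name plus options.
expressions = [
    ("html:img", {"alt": "an image"}),
    ("upper", {"value": "hi"}),
]


def find_markup(expressions, registry):
    """The only way to learn which expressions are markup is to call them."""
    return [
        name for name, options in expressions
        if registry[name](options).is_markup
    ]


print(find_markup(expressions, REGISTRY))
```

A tool that only has the data model, and no registry to call into, cannot tell the two expressions apart.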

Member

I'd call out that nothing would or should prevent developers from writing the function :html:img (or any other element), should they so desire. So what we're really debating is the addition of syntactically distinct spannable placeholders.

@aphillips aphillips added syntax Issues related with syntax or ABNF design Design document or issues related to design labels Dec 1, 2023
@aphillips aphillips changed the title Document the open questions around standalone markup (Design) Document the open questions around standalone markup Dec 3, 2023
@stasm stasm self-assigned this Dec 3, 2023
... but using Unicode code points and names instead of being super informal. This is strictly an editorial change.
@stasm
Collaborator Author

stasm commented Dec 6, 2023

@aphillips @eemeli Are you satisfied with my answers to your comments? Can we merge this to better explain the controversy around the standalone markup before the next meeting?

@stasm stasm mentioned this pull request Dec 6, 2023
Collaborator

@eemeli eemeli left a comment

I still have two open questions here; see inline threads.

@mihnita
Collaborator

mihnita commented Dec 9, 2023

I really think that we need a standalone concept.


Traditionally HTML got away with it because it was "permissive" by design.
And because every browser hard-coded somewhere the "knowledge" that img, and br, and hr and so on are in fact standalone.

And once you have a tree, then a node without kids is "standalone"


But this does not work for unknown tags, which MF2 allows.
Even HTML allows custom tags now.
So what does `This is <foo>` mean? Is `foo` standalone or not?

Does it matter?
Yes, it does.

For example, `This is <foo lang=ar> something` or `This is <foo dir=rtl> something`?
Is "something" Arabic, or RTL?
We don't know, because we don't know anything about `<foo>`.
So we don't know how to post-process the text after `<foo>`, for example.

We also don't know how to leverage existing translations.
If a week ago we translated `text1 <b> text2 </b> text3`, and now we get `text1 <i> text2 </i> text3` for translation, we know it is safe to take the old translation and algorithmically change the `<b>...</b>` to `<i>...</i>`,
because we replace open-close with open-close.

But it is unsafe to replace open with standalone, or standalone with open.
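
The leveraging rule above can be sketched as a comparison of markup "shapes". This is an illustration under assumptions: the tag pattern is a simplified HTML-like one, and `shape` and `safe_to_leverage` are hypothetical helpers that rely on an externally supplied set of standalone names.

```python
import re

# Simplified HTML-like tag pattern (an assumption): captures the optional
# closing slash and the tag name.
TAG = re.compile(r"<(/?)([A-Za-z][\w-]*)[^<>]*>")


def shape(text, standalone_names=frozenset()):
    """Return the sequence of open/close/standalone kinds, ignoring tag names."""
    kinds = []
    for slash, name in TAG.findall(text):
        if slash:
            kinds.append("close")
        elif name in standalone_names:
            kinds.append("standalone")
        else:
            kinds.append("open")
    return kinds


def safe_to_leverage(old, new, standalone_names=frozenset()):
    """An old translation is reusable only if the markup shapes match."""
    return shape(old, standalone_names) == shape(new, standalone_names)


print(safe_to_leverage("text1 <b> text2 </b> text3",
                       "text1 <i> text2 </i> text3"))
print(safe_to_leverage("text1 <b> text2", "text1 <img> text2", {"img"}))
# Without knowing that <img> is standalone, the mismatch goes unnoticed:
print(safe_to_leverage("text1 <b> text2", "text1 <img> text2"))
```

The last call shows the failure mode: when nothing marks a tag as standalone, an open tag and a standalone tag have the same shape and the unsafe replacement passes the check.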


We know that with translation the sloppier we are the less we can trust that leveraging and validation work properly.

And we also know that we want to be able to use mf2 markup with more than html.
So we should not prevent certain use cases "by design", because we emulated html.

Collaborator

@eemeli eemeli left a comment

@stasm I merged from main to resolve a conflict, and have re-added below the two changes I've been asking for in previous threads. Applying them would let us merge this PR, and then follow this with a next one that proposes some actual change to the proposed design.

As is, this PR effectively blocks concrete standalone syntax proposals from being presented, and that seems contrary to what you're looking for?

@stasm
Collaborator Author

stasm commented Dec 13, 2023

Thanks, @eemeli.

As I've expressed previously, I feel that something was lost in our communication or perhaps in translation — I don't really understand why the currently proposed solution (after your suggestions) doesn't include the standalone syntax. It was part of both the plus/minus alternative and the hash/slash alternative before #535.

However, it seems that discussing this further won't resolve this, and I'm more interested in seeing this PR land and then following up with updates to the design doc.

@aphillips aphillips merged commit 2a3b88e into main Dec 13, 2023
@aphillips aphillips deleted the standalone-open-questions branch December 13, 2023 23:07
@aphillips
Member

From on- and offline conversation, merging these changes. Don't feel that we can't iterate the design doc from here 😉

Labels
design Design document or issues related to design syntax Issues related with syntax or ABNF
5 participants