Do we need to parse the grammar at all in the Word converter? #494

Closed
jskeet opened this issue Mar 15, 2022 · 2 comments
Labels
meeting: discuss (This issue should be discussed at the next TC49-TG2 meeting)

Comments

jskeet commented Mar 15, 2022

C# code in the standard isn't parsed - it's just colorized.

The ANTLR grammar, however, is parsed. I suspect that's for historical reasons, back when the Word converter would also gather all the grammar rules together and perform a certain amount of validation.

I'm wondering whether we could fix #394 (and simplify the code!) just by not parsing the ANTLR at all. It might be tricky to get it colorized that way, given that Colorize doesn't have any built-in support, but I suspect that "leave it as formatted in Markdown, but as if it's plain text" is actually better than trying to parse and handle it ourselves.

Will give that a try...

jskeet added the "meeting: discuss" label Mar 15, 2022

jskeet commented Mar 15, 2022

Looks like we also use the parsed grammar to link to bookmarks. We could potentially still just use the plain text and keep parsing it to get the production names... or lose the bookmarks. Will prototype both ways.
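
As a rough illustration of the "use the plain text and keep parsing it to get the production names" option, here is a minimal Python sketch. It is not the converter's code: the helper name `production_names`, the regular expression, and the sample grammar are all assumptions made for illustration. It assumes each rule name starts at the beginning of a line, with the colon on the same line or the next one.

```python
import re

# Hypothetical helper, not part of the Word converter. Assumes every ANTLR
# rule starts at column 0 as an identifier (optionally preceded by the
# "fragment" keyword), followed by optional whitespace and a colon.
RULE_START = re.compile(
    r"^(?:fragment\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*:",
    re.MULTILINE,
)

def production_names(grammar_text: str) -> list:
    """Return rule names in order of appearance, ignoring rule bodies."""
    return RULE_START.findall(grammar_text)

# Made-up sample in the style of the standard's grammar blocks.
sample = """\
compilation_unit
    : extern_alias_directive* using_directive*
    ;

fragment Letter
    : [a-zA-Z]
    ;
"""

print(production_names(sample))
```

A scan like this only finds where rules begin, so the rule bodies could still be emitted exactly as written in the Markdown. It would be fooled by unusual layouts (for example, a rule body line that starts at column 0), which is one reason to prefer a separate validator for anything stricter.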

jskeet added a commit to jskeet/csharpstandard that referenced this issue Mar 15, 2022
(See dotnet#494)

This means we lose bookmarks, but it's really straightforward.

Note: the tests won't pass together yet as there's global state to
fix, but I'll sort that separately.

jskeet commented Mar 15, 2022

Ah - no, we only use the parsed grammar production if the language is "antlr" rather than "ANTLR", which it never is. (See #342.)
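
The effect described here is what an exact, case-sensitive check on the code fence's language tag would produce. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the check is a plain string comparison rather than showing the converter's actual code.

```python
# Illustrative only: these names are hypothetical, not the converter's API.
def uses_parsed_grammar(fence_language: str) -> bool:
    # Exact, case-sensitive match, so only lowercase "antlr" qualifies.
    return fence_language == "antlr"

def uses_parsed_grammar_ignoring_case(fence_language: str) -> bool:
    # Case-insensitive variant, shown only for contrast.
    return fence_language.casefold() == "antlr"

print(uses_parsed_grammar("ANTLR"))                # False
print(uses_parsed_grammar_ignoring_case("ANTLR"))  # True
```

If every grammar block is tagged "ANTLR", the first check never succeeds, so the code path that depends on the parsed production is never taken.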

So we really don't lose anything by not parsing it at all, given that we've got a separate validator. Let's go with that simple option.

jskeet closed this as completed Mar 16, 2022