Closed
Description
It would be nice to have confidence about what sort of grammar Rust has. One possible way we could do that:
- Extract EBNF from the manual
- Use a script to convert it to whatever format antlr wants
- Run the entire test suite through both the rustc parser and the antlr parser. Where one fails, so should the other.
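The comparison in the last step can be sketched as a small differential-testing harness. This is only a rough sketch: `find_disagreements`, the two toy acceptors, and the sample names are all hypothetical stand-ins, not part of rustc or antlr. In practice each acceptor would wrap a real parser invocation and report whether the input parsed.

```python
from typing import Callable, Iterable, List, Tuple

# (sample name, reference verdict, candidate verdict)
Verdict = Tuple[str, bool, bool]


def find_disagreements(
    samples: Iterable[Tuple[str, str]],
    accepts_reference: Callable[[str], bool],
    accepts_candidate: Callable[[str], bool],
) -> List[Verdict]:
    """Return every sample on which the two parsers give different verdicts."""
    disagreements = []
    for name, text in samples:
        ref = accepts_reference(text)
        cand = accepts_candidate(text)
        if ref != cand:
            disagreements.append((name, ref, cand))
    return disagreements


# Toy stand-ins (assumptions for illustration only): two "parsers" that
# check brace balance, one of which is deliberately too permissive.
def balanced_strict(text: str) -> bool:
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def balanced_by_count(text: str) -> bool:
    return text.count("{") == text.count("}")


SAMPLES = [
    ("ok.rs", "fn main() { }"),
    ("unclosed.rs", "fn main() {"),
    ("reversed.rs", "} fn main() {"),
]

for name, ref, cand in find_disagreements(SAMPLES, balanced_strict, balanced_by_count):
    print(f"{name}: reference={ref} candidate={cand}")
```

Any sample reported here points at a bug in one parser or in the grammar, so the list of disagreements doubles as a to-do list.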
Activity
nikomatsakis commented on Apr 18, 2012
How many negative parsing tests do we have? I'm guessing not many... but we could make some.
brson commented on Apr 18, 2012
I also would assume that coverage of failure cases in the parser is not great, so we could kill two birds with one stone.
brson commented on Apr 18, 2012
Speaking of coverage, today I realized that #690 is unblocked. It might be possible for us to measure test coverage.
graydon commented on Apr 18, 2012
Antlr is a possibility, but I think it is willing to handle a lot more grammar ambiguity, and it tends to blur lexing and parsing rules. I want us to remain in the classical regular-lexing + LL(1)-parsing space.
I picked the EBNF in the manual for compatibility with LLnextgen (http://os.ghalkes.nl/LLnextgen/). I got partway into wiring up the rules for extracting and testing the grammar but didn't finish in time for 0.1, and haven't come back to it yet.
I figured we could feed the lexical rules to quex (http://quex.sourceforge.net/), but other possibilities exist. It just seems like the current leader in the space we're interested in.
brson commented on Apr 19, 2012
We can use the fuzzer to generate an arbitrary number of random samples to feed to both parsers.
nikomatsakis commented on Apr 19, 2012
good idea...
graydon commented on Apr 25, 2013
nominating for well-defined
emberian commented on Jul 7, 2013
Still relevant
fhahn commented on Sep 15, 2013
I had a look at this issue and started working on a parser using Flex and LLnextgen (https://github.com/fhahn/rust-grammer). Right now the parser supports only a tiny bit of the Rust grammar, but I wanted to make sure my approach is valid before continuing.
One main difference from the grammar specification in the documentation is that Flex uses regular expressions for token definitions, not EBNF, so I started converting the EBNF from section 3 "Lexical structure" to regular expressions for Flex.
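As a rough illustration of that kind of conversion, the sketch below turns one very narrow EBNF shape, an alternation of single-character literals, into a regex character class. The function name and the example rule body are hypothetical and not taken from the manual or from the linked repository. The escaping follows Python's `re` module, so a Flex pattern may need different escapes.

```python
import re


def alternation_to_char_class(rule_body: str) -> str:
    """Convert an EBNF body like "'0' | '1' | '2'" into a character class.

    Only alternations of quoted single characters are handled; anything
    else (including a quoted '|' literal) raises ValueError rather than
    producing a wrong pattern.
    """
    chars = []
    for part in rule_body.split("|"):
        literal = part.strip()
        if len(literal) != 3 or literal[0] != "'" or literal[2] != "'":
            raise ValueError(f"unsupported EBNF fragment: {literal!r}")
        # Escape characters such as ']' or '^' that are special in a class.
        chars.append(re.escape(literal[1]))
    return "[" + "".join(chars) + "]"


# Illustrative rule body (an assumption, not quoted from the manual).
pattern = alternation_to_char_class("'0' | '1' | '2'")
print(pattern)
```

Rules that use repetition, grouping, or references to other rules need a real EBNF parser; this only covers the simplest leaf rules.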
huonw commented on Sep 15, 2013
Note that the grammar in the manual is highly likely to be incorrect (which is presumably exactly what this issue is aiming to address).
Kimundi commented on Sep 15, 2013
Note that someone already completed a grammar months ago; it just hasn't been used for anything yet. No idea where to find it, though.