Grammar tests #2234

Closed

Labels

A-grammar, A-testsuite, P-medium

Contributor

It would be nice to have confidence about what sort of grammar Rust has. One possible way we could do that:

Extract EBNF from the manual
Use a script to convert it to whatever format antlr wants
Run the entire test suite through both the rustc parser and the antlr parser. Where one fails, so should the other.
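
The "where one fails so should the other" check could be sketched roughly as below. This is only an illustration: each parser is modelled as a plain function returning True when it accepts the input, and `balanced_braces` / `counts_braces` are hypothetical stand-ins, not rustc or a generated parser. Wiring in the real tools is left open.

```python
from typing import Callable, Iterable, List, Tuple

# A "parser" here is anything that says whether it accepts a source string.
Parser = Callable[[str], bool]


def differential_check(
    samples: Iterable[Tuple[str, str]],
    reference: Parser,
    candidate: Parser,
) -> List[Tuple[str, bool, bool]]:
    """Return (name, reference_ok, candidate_ok) for every sample where the two disagree."""
    disagreements = []
    for name, source in samples:
        ref_ok = reference(source)
        cand_ok = candidate(source)
        if ref_ok != cand_ok:
            disagreements.append((name, ref_ok, cand_ok))
    return disagreements


# Hypothetical stand-in for the reference parser: braces must nest properly.
def balanced_braces(source: str) -> bool:
    depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# Hypothetical stand-in for the candidate parser: deliberately weaker,
# it only compares totals, so a stray leading "}" slips through.
def counts_braces(source: str) -> bool:
    return source.count("{") == source.count("}")


SAMPLES = [
    ("ok.rs", "fn main() { }"),
    ("unclosed.rs", "fn main() {"),
    ("reversed.rs", "} fn main() {"),
]

if __name__ == "__main__":
    for name, ref_ok, cand_ok in differential_check(SAMPLES, balanced_braces, counts_braces):
        print(f"{name}: reference={ref_ok} candidate={cand_ok}")
```

With the toy inputs above, only `reversed.rs` is reported, since that is the one sample the two stand-ins disagree on.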

Contributor

How many negative parsing tests do we have? I'm guessing not many... but we could make some.

Contributor, Author

I also would assume that coverage of failure cases in the parser is not great, so we could kill two birds with one stone.
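
One cheap way to "make some" negative tests might be to damage inputs that are known to parse. The sketch below is a guess at such a helper, not something from the test suite: `drop_each_closer` is a hypothetical name, and it assumes the removed delimiter is not inside a string literal or comment (where removing it would not necessarily break the parse).

```python
from typing import List

# Closing delimiters whose removal usually leaves the input unbalanced.
CLOSERS = ")]}"


def drop_each_closer(source: str) -> List[str]:
    """Return one variant per closing delimiter, with that single delimiter removed."""
    variants = []
    for index, ch in enumerate(source):
        if ch in CLOSERS:
            variants.append(source[:index] + source[index + 1:])
    return variants


if __name__ == "__main__":
    for variant in drop_each_closer("fn f() { }"):
        print(repr(variant))
```

Each variant is only a candidate negative test; it would still need to be confirmed as a rejection by the reference parser before being checked in.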

Contributor, Author

Speaking of coverage, today I realized that #690 is unblocked. It might be possible for us to measure test coverage.

Contributor

Antlr is a possibility but I think it is willing to handle a lot more grammar ambiguity, and it tends to blur lexing and parsing rules. I want us to remain in the classical regular-lexing + LL(1)-parsing space.

I picked the EBNF in the manual for compatibility with LLnextgen (http://os.ghalkes.nl/LLnextgen/). I got partway into wiring up the rules for extracting and testing the grammar but didn't finish in time for 0.1, and I haven't come back to it yet.

Lexical rules I figured we could feed to quex http://quex.sourceforge.net/ but other possibilities exist. It just seems like the current leader in the space we're interested in.
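
To make the LL(1) constraint mentioned above a bit more concrete, here is a small sketch of one part of the check: finding alternatives of the same rule that can start with the same token (a FIRST/FIRST conflict). It is a simplification under stated assumptions: no alternative may be empty, so FOLLOW sets are not needed, and the `TOY` grammar is made up for illustration and is not taken from the manual. A real tool such as LLnextgen does considerably more.

```python
from typing import Dict, List, Set, Tuple

# nonterminal -> list of alternatives, each alternative a non-empty list of symbols
Grammar = Dict[str, List[List[str]]]


def first_sets(grammar: Grammar) -> Dict[str, Set[str]]:
    """Compute FIRST sets by fixpoint iteration (assumes no empty alternatives)."""
    first: Dict[str, Set[str]] = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                head = alt[0]
                # A nonterminal contributes its FIRST set; a terminal contributes itself.
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def first_first_conflicts(grammar: Grammar) -> List[Tuple[str, str, int, int]]:
    """Return (rule, terminal, earlier_alt, later_alt) for each overlapping start token."""
    first = first_sets(grammar)
    conflicts = []
    for nt, alternatives in grammar.items():
        seen: Dict[str, int] = {}  # terminal -> index of the first alternative starting with it
        for index, alt in enumerate(alternatives):
            head = alt[0]
            starts = first[head] if head in grammar else {head}
            for terminal in sorted(starts):
                if terminal in seen:
                    conflicts.append((nt, terminal, seen[terminal], index))
                else:
                    seen[terminal] = index
    return conflicts


# Made-up toy grammar: "expr" has two alternatives that both start with "ident".
TOY: Grammar = {
    "item": [["fn", "ident"], ["struct", "ident"]],
    "expr": [["ident"], ["ident", "(", ")"]],
}

if __name__ == "__main__":
    print(first_first_conflicts(TOY))
```

For the toy grammar this reports a single conflict on `expr`, which one token of lookahead cannot resolve without left-factoring the rule.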

Contributor, Author

We can use the fuzzer to generate an arbitrary number of random samples to feed to both parsers.
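
As a rough picture of what feeding random samples could look like, the sketch below generates reproducible token soup from a fixed seed. It is not the project's fuzzer: `TOKENS` is a hypothetical alphabet and `random_samples` a made-up helper, and most of what it produces will be rejected, which is the point when comparing rejections across two parsers.

```python
import random
from typing import List

# Hypothetical token alphabet; a real run would draw from the actual lexer's tokens.
TOKENS = ["fn", "let", "x", "1", "(", ")", "{", "}", ";", "="]


def random_samples(seed: int, count: int, max_tokens: int = 8) -> List[str]:
    """Generate `count` space-separated token sequences, reproducible for a given seed."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    samples = []
    for _ in range(count):
        length = rng.randint(1, max_tokens)
        samples.append(" ".join(rng.choice(TOKENS) for _ in range(length)))
    return samples


if __name__ == "__main__":
    for sample in random_samples(seed=0, count=5):
        print(sample)
```

Keeping the seed explicit means any disagreement between the parsers can be replayed later from the seed and sample index alone.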

Contributor

good idea...

mentioned this

on Jul 25, 2012

Generate gcov coverage data #690

mentioned this

on Sep 24, 2012

Miscellaneous Rust projects Mozilla-Student-Projects/Projects-Tracker#38

Contributor

nominating for well-defined

mentioned this

on Apr 25, 2013

Number grammar #1589

mentioned this

| with the pat macro fragment specifier #4581

Member

Still relevant

Contributor

I had a look at this issue and started working on a parser using Flex and LLnextgen (https://github.com/fhahn/rust-grammer). Right now the parser supports only a tiny bit of the Rust grammar, but I wanted to make sure my approach is valid before continuing.

One main difference from the grammar specification in the documentation is that Flex uses regular expressions for token definitions, not EBNF, so I started converting the EBNF from section 3, "Lexical structure", to regular expressions for Flex.
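
For a sense of what that conversion looks like, here is a small sketch using Python's `re` module rather than Flex. The two EBNF rules in the comment are illustrative only and are not copied from the manual; the token names and the `tokenize` helper are likewise made up for this example.

```python
import re
from typing import List, Tuple

# Illustrative EBNF (not copied from the manual):
#   ident   : ( letter | '_' ) ( letter | digit | '_' ) * ;
#   dec_lit : digit ( digit | '_' ) * ;
# The same rules written as regular expressions:
TOKEN_PATTERNS = [
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("DEC_LIT", r"[0-9][0-9_]*"),
    ("WS", r"[ \t\n]+"),
]

# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS))


def tokenize(source: str) -> List[Tuple[str, str]]:
    """Split source into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"no token matches at offset {pos}: {source[pos]!r}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("foo_1 42 _x"))
```

One difference worth keeping in mind: Python's alternation takes the first alternative that matches, while Flex picks the longest match, so rule order matters more in this sketch than it would in a Flex file.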

Member

Note that the grammar in the manual is highly likely to be incorrect (which is presumably exactly what this issue is aiming to address).

Member

Note that someone already completed a grammar months ago; it just hasn't been used for anything yet. No idea where to find it, though.
