Grammar tests #2234

Closed
@brson

Description

@brson
Contributor

It would be nice to have confidence about what sort of grammar Rust has. One possible way we could do that:

  • Extract EBNF from the manual
  • Use a script to convert it to whatever format antlr wants
  • Run the entire test suite through both the rustc parser and the antlr parser. Where one fails, so should the other.
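
The last step is a differential test: feed the same files to two parsers and flag every file that one accepts and the other rejects. A minimal sketch follows, assuming each parser can be run as a command that exits with status 0 on success. The names `command_acceptor` and `find_disagreements` are hypothetical, and the actual command lines for rustc and the generated parser are not given in this thread, so they are left as placeholders.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple

# An acceptor answers one question: does this parser accept this file?
Acceptor = Callable[[str], bool]


def command_acceptor(argv_prefix: List[str]) -> Acceptor:
    """Wrap an external parser command as an acceptor.

    argv_prefix is a placeholder such as ["some-parser", "--some-flag"];
    the file path is appended as the last argument. Assumption: the
    command exits with status 0 exactly when the file parses.
    """
    def accepts(path: str) -> bool:
        result = subprocess.run(argv_prefix + [path], capture_output=True)
        return result.returncode == 0
    return accepts


def find_disagreements(
    paths: Iterable[str], reference: Acceptor, candidate: Acceptor
) -> List[Tuple[str, bool, bool]]:
    """Return (path, reference_ok, candidate_ok) for each file the parsers disagree on."""
    disagreements = []
    for path in paths:
        ref_ok = reference(path)
        cand_ok = candidate(path)
        if ref_ok != cand_ok:
            disagreements.append((path, ref_ok, cand_ok))
    return disagreements
```

An empty result means the two parsers agree on every file in the suite. That says nothing about inputs the suite does not contain, which is why negative tests matter here.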

Activity

nikomatsakis commented on Apr 18, 2012

@nikomatsakis
Contributor

How many negative parsing tests do we have? I'm guessing not many... but we could make some.

brson commented on Apr 18, 2012

@brson
Contributor · Author

I also would assume that coverage of failure cases in the parser is not great, so we could kill two birds with one stone.

brson commented on Apr 18, 2012

@brson
Contributor · Author

Speaking of coverage, today I realized that #690 is unblocked. It might be possible for us to measure test coverage.

graydon commented on Apr 18, 2012

@graydon
Contributor

Antlr is a possibility but I think it is willing to handle a lot more grammar ambiguity, and it tends to blur lexing and parsing rules. I want us to remain in the classical regular-lexing + LL(1)-parsing space.

I picked the EBNF in the manual for compatibility with LLnextgen (http://os.ghalkes.nl/LLnextgen/). I got partway into wiring up the rules for extracting and testing the grammar but didn't finish in time for 0.1, and I haven't come back to it yet.

Lexical rules I figured we could feed to quex (http://quex.sourceforge.net/), but other possibilities exist. quex just seems like the current leader in the space we're interested in.
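
To make "regular lexing + LL(1) parsing" concrete, here is a toy sketch: a lexer built from one regular expression, followed by a recursive-descent parser that decides every step by looking at a single token. The grammar (`expr : term { "+" term }`, `term : atom { "*" atom }`, `atom : NUM | IDENT | "(" expr ")"`) and all names are invented for illustration. This is not Rust's grammar and does not use LLnextgen or quex.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

# Regular lexer: every token kind is described by a regular expression.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<NUM>[0-9]+)|(?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)|(?P<OP>[+*()]))"
)


def lex(src: str) -> List[Token]:
    """Split src into tokens without any knowledge of the parser's state."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"cannot lex input near offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    tokens.append(("EOF", ""))
    return tokens


class Parser:
    """LL(1) parser: each decision inspects only the next token."""

    def __init__(self, tokens: List[Token]) -> None:
        self.tokens = tokens
        self.i = 0

    def peek(self) -> Token:
        return self.tokens[self.i]

    def expect(self, kind: str, text: str = None) -> None:
        k, t = self.peek()
        if k != kind or (text is not None and t != text):
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        self.i += 1

    def expr(self) -> None:  # expr : term { "+" term }
        self.term()
        while self.peek() == ("OP", "+"):
            self.expect("OP", "+")
            self.term()

    def term(self) -> None:  # term : atom { "*" atom }
        self.atom()
        while self.peek() == ("OP", "*"):
            self.expect("OP", "*")
            self.atom()

    def atom(self) -> None:  # atom : NUM | IDENT | "(" expr ")"
        kind, text = self.peek()
        if kind in ("NUM", "IDENT"):
            self.expect(kind)
        elif (kind, text) == ("OP", "("):
            self.expect("OP", "(")
            self.expr()
            self.expect("OP", ")")
        else:
            raise SyntaxError(f"unexpected token {self.peek()!r}")


def accepts(src: str) -> bool:
    """Return True if src is a complete expression in the toy grammar."""
    try:
        parser = Parser(lex(src))
        parser.expr()
        parser.expect("EOF")
        return True
    except SyntaxError:
        return False
```

The point of the split is that `lex` never consults the parser and the parser never backtracks, so both halves stay easy to check against a written grammar.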

brson commented on Apr 19, 2012

@brson
Contributor · Author

We can use the fuzzer to produce an arbitrary number of random samples to feed to both parsers.
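
A rough sketch of that idea: generate random inputs from a fixed seed, run both parsers on each, and keep the inputs on which they disagree. Everything here is hypothetical (`ALPHABET`, `random_samples`, `fuzz_compare`), and it does not reflect how the existing Rust fuzzer works. It only shows the comparison loop, with the parsers passed in as plain functions from source text to accept/reject.

```python
import random
from typing import Callable, List, Tuple

Acceptor = Callable[[str], bool]

# Toy alphabet; a real run would draw from the language's actual tokens.
ALPHABET = ["a", "1", "+", "*", "(", ")", " "]


def random_samples(count: int, max_len: int, seed: int) -> List[str]:
    """Produce `count` random strings; a fixed seed makes failures reproducible."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        length = rng.randint(1, max_len)
        samples.append("".join(rng.choice(ALPHABET) for _ in range(length)))
    return samples


def fuzz_compare(
    samples: List[str], reference: Acceptor, candidate: Acceptor
) -> List[Tuple[str, bool, bool]]:
    """Return (sample, reference_ok, candidate_ok) wherever the two parsers disagree."""
    mismatches = []
    for sample in samples:
        ref_ok = reference(sample)
        cand_ok = candidate(sample)
        if ref_ok != cand_ok:
            mismatches.append((sample, ref_ok, cand_ok))
    return mismatches
```

Purely random strings are mostly rejected by both parsers, so in practice the samples would more usefully come from mutating known-good test files.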

nikomatsakis commented on Apr 19, 2012

@nikomatsakis
Contributor

good idea...

graydon commented on Apr 25, 2013

@graydon
Contributor

nominating for well-defined

emberian commented on Jul 7, 2013

@emberian
Member

Still relevant

fhahn commented on Sep 15, 2013

@fhahn
Contributor

I had a look at this issue and started working on a parser using Flex and LLnextgen (https://github.com/fhahn/rust-grammer). Right now, the parser supports only a tiny tiny bit of the Rust grammar, but I wanted to make sure my approach is valid before continuing.

One main difference from the grammar specification in the documentation is that Flex uses regular expressions for token definitions, not EBNF, so I started converting the EBNF from section 3, "Lexical structure", to regular expressions for Flex.
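
As an illustration of that kind of conversion, the sketch below turns a small EBNF subset into a regular expression. It is not the code from the linked repository. It assumes an ISO-style notation (quoted terminals, `|` for alternatives, `[ ]` for optional, `{ }` for repetition, `( )` for grouping), which may differ from the manual's notation, and it emits Python `re` syntax, which would need adapting for Flex. References to other rules are looked up in a dictionary of already converted rules, so recursive rules are not supported; lexical rules should not need them.

```python
import re
from typing import Dict, List

_TOKEN = re.compile(r'''\s*("[^"]*"|'[^']*'|[A-Za-z_][A-Za-z0-9_]*|[|\[\]{}()])''')

# Opening bracket -> (closing bracket, regex suffix).
_BRACKETS = {"[": ("]", "?"), "{": ("}", "*"), "(": (")", "")}


def tokenize(rule: str) -> List[str]:
    """Split an EBNF right-hand side into terminals, rule names, and punctuation."""
    tokens, pos = [], 0
    rule = rule.strip()
    while pos < len(rule):
        m = _TOKEN.match(rule, pos)
        if m is None:
            raise ValueError(f"cannot tokenize EBNF near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def ebnf_to_regex(rule: str, known: Dict[str, str]) -> str:
    """Convert one EBNF right-hand side to a regex, using `known` for rule references."""
    tokens = tokenize(rule)
    pos = 0

    def alternation() -> str:
        nonlocal pos
        branches = [sequence()]
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            branches.append(sequence())
        return "|".join(branches)

    def sequence() -> str:
        nonlocal pos
        parts = []
        while pos < len(tokens) and tokens[pos] not in ("|", "]", "}", ")"):
            tok = tokens[pos]
            pos += 1
            if tok in _BRACKETS:
                close, suffix = _BRACKETS[tok]
                inner = alternation()
                if pos >= len(tokens) or tokens[pos] != close:
                    raise ValueError(f"expected {close!r}")
                pos += 1
                parts.append(f"(?:{inner}){suffix}")
            elif tok[0] in "\"'":
                parts.append(re.escape(tok[1:-1]))  # quoted terminal
            else:
                parts.append(f"(?:{known[tok]})")   # reference to an earlier rule
        return "".join(parts)

    regex = alternation()
    if pos != len(tokens):
        raise ValueError(f"unexpected {tokens[pos]!r}")
    return regex


# Usage: convert rules in dependency order (toy rules, not the manual's).
rules: Dict[str, str] = {}
rules["digit"] = ebnf_to_regex(
    '"0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"', rules
)
rules["dec_lit"] = ebnf_to_regex('digit { digit | "_" }', rules)
```

Checking the generated pattern against a handful of accepted and rejected literals with `re.fullmatch` is a cheap way to catch conversion mistakes before they reach the lexer generator.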

huonw commented on Sep 15, 2013

@huonw
Member

Note that the grammar in the manual is highly likely to be incorrect (which is presumably exactly what this issue is aiming to address).

Kimundi commented on Sep 15, 2013

@Kimundi
Member

Note that someone already completed a grammar months ago; it just hasn't been used for anything yet. No idea where to find it, though.

22 remaining items

Metadata

    Labels

    A-grammar (Area: The grammar of Rust)
    A-testsuite (Area: The testsuite used to check the correctness of rustc)
    P-medium (Medium priority)

        Participants

        @graydon, @steveklabnik, @Arcnor, @brson, @nikomatsakis

          Grammar tests · Issue #2234 · rust-lang/rust