Configurations checking #2292

davidalber · 2017-12-19T08:34:00Z

This PR adds a test that extracts Rust code blocks from Configurations.md and attempts to format them. This is the automation part of #1845. If we get through this, the next step will be fixing the formatting failures.

Change Discussion

Some details on how this works:

Rust code blocks are identified by lines beginning with "```rust".
One explicit configuration setting is supported per code block.
Rust code blocks with no configuration setting are illegal and cause an assertion failure.
Configuration names in Configurations.md must be in the form of "## `NAME`".
Configuration values in Configurations.md must be in the form of "#### `VALUE`".
Each extracted block is formatted.
Mismatches are preceded by a message pointing to the line in Configurations.md where the diff begins.
Failures due to parse errors are followed by a message indicating the first line of the block in Configurations.md that could not be parsed. Example:

☝🏽 Failed to format block starting at Line 1676 in Configurations.md
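
The extraction rules listed above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR (which uses the regex crate): the names `ExtractedBlock`, `backticked_heading`, and `extract_rust_blocks` are hypothetical, and the sketch assumes headings never appear inside non-Rust fenced blocks.

```rust
/// One Rust code block extracted from a Markdown document.
#[derive(Debug, PartialEq)]
struct ExtractedBlock {
    config_name: Option<String>,
    config_value: Option<String>,
    code: String,
    start_line: usize, // 1-based line number of the opening fence
}

/// Parses headings such as "## `NAME`" or "#### `"VALUE"`" and returns the
/// text between the backticks, minus any surrounding double quotes.
fn backticked_heading(line: &str, prefix: &str) -> Option<String> {
    let inner = line.strip_prefix(prefix)?.strip_prefix('`')?;
    let end = inner.find('`')?;
    let text = inner[..end].trim_matches('"');
    if text.is_empty() {
        None
    } else {
        Some(text.to_string())
    }
}

/// Hypothetical helper: walks the document once, remembering the most
/// recent configuration name and value, and records each Rust block.
fn extract_rust_blocks(markdown: &str) -> Vec<ExtractedBlock> {
    let mut blocks = Vec::new();
    let mut config_name: Option<String> = None;
    let mut config_value: Option<String> = None;
    // `Some((start_line, code_so_far))` while inside a Rust block.
    let mut open_block: Option<(usize, String)> = None;

    for (index, line) in markdown.lines().enumerate() {
        if let Some((start_line, mut code)) = open_block.take() {
            if line.starts_with("```") {
                // Closing fence: the block is complete.
                blocks.push(ExtractedBlock {
                    config_name: config_name.clone(),
                    config_value: config_value.clone(),
                    code,
                    start_line,
                });
            } else {
                code.push_str(line);
                code.push('\n');
                open_block = Some((start_line, code));
            }
        } else if line.starts_with("```rust") {
            open_block = Some((index + 1, String::new()));
        } else if let Some(name) = backticked_heading(line, "## ") {
            // A new option heading resets the remembered value.
            config_name = Some(name);
            config_value = None;
        } else if let Some(value) = backticked_heading(line, "#### ") {
            config_value = Some(value);
        }
    }
    blocks
}
```

Keeping `start_line` with each block is what makes a message like the one above possible when a block fails to format.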

Message Printing

rustfmt potentially prints in a couple of different ways:

via writeln! to term::stdout
via println!

This changeset tries to ensure that mismatch and configuration failure messages are printed via the same approach (see the changes in rustfmt_diff.rs, for starters), but parsing errors do not follow the same logic as mismatch printing. When fancy diff printing isn't selected, the diff reporting function uses println!. Parsing failures are not falling back to println! in the same way, however, so in that case the failure message does not appear adjacent to the parsing failure message.

Examples

The test is currently marked #[ignore] because there are a lot of formatting failures at the moment. You can run the test with the --ignored flag.

cargo test -- --ignored
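
For readers unfamiliar with the attribute, here is a minimal sketch of an ignored test. The helper `is_idempotent` and the test name are hypothetical, and the formatter passed in is a placeholder, not rustfmt.

```rust
/// Hypothetical stand-in for the real check: true when formatting
/// leaves the snippet unchanged.
fn is_idempotent(snippet: &str, format: fn(&str) -> String) -> bool {
    format(snippet) == snippet
}

#[test]
#[ignore] // skipped by plain `cargo test`; runs with `cargo test -- --ignored`
fn configuration_snippets_are_idempotent() {
    // The closure is a placeholder formatter that returns its input as-is.
    assert!(is_idempotent("fn main() {}\n", |s| s.to_string()));
}
```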

You can see the fancy diff printing case here. You can see a non-fancy diff printing case here. That was triggered by redirecting test output to a file (cargo test -- --ignored > foo.log 2>&1).

davidalber · 2017-12-19T08:53:19Z

Oh, and I am not necessarily looking for this to be merged right now, although I would be fine with that. Mostly, I want to get feedback on the direction before starting to fix the formatting failures.

davidalber · 2017-12-23T09:51:13Z

I've just noticed that, at first glance, stdin_formatting_smoke_test appears to provide an example of a way to avoid all of this messy temp file business. I'll look into that a little bit.

topecongiro · 2017-12-24T07:25:40Z

Thank you for the PR! It would be nice if we could avoid using temp files.

I think most of the format failures are caused by not updating the config option value (e.g. setting IndentStyle = "Visual").

davidalber · 2017-12-25T00:05:21Z

I've updated this to not use temp files for formatting the code blocks extracted from Configurations.md.

davidalber · 2017-12-25T00:06:00Z

I also updated the text at the top of the PR to reflect the new implementation.

nrc

Looks good, thanks! I left a few comments inline, mostly minor code style issues.

nrc · 2017-12-27T01:24:53Z

tests/system.rs

+    let config_value_regex = regex::Regex::new(r#"^#### `"?([^`"]+)"?`"#)
+        .expect("Failed creating configuration value pattern");
+
+    let mut code_blocks = Vec::new();


I think this whole parsing function could be nicely refactored into a struct and a set of functions, rather than being a big function with lots of local, mutable state.

nrc · 2017-12-27T01:27:23Z

tests/system.rs

+        }
+    }
+
+    // Display results.


Could you move the println and assertion to the caller please? That feels like it makes this a more self-contained function.

nrc · 2017-12-27T01:28:22Z

tests/system.rs

+fn check_blocks_idempotency(blocks: &Vec<CodeBlock>) {
+    let mut failures = 0;
+
+    for block in blocks {


Optional: this loop might work better as a fold if you can factor out some of the code into little functions, not sure though.
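
A rough sketch of what the fold suggestion could look like, assuming the per-block check is factored into its own function. The `CodeBlock` fields and the placeholder check `formatted_correctly` are assumptions made for illustration, not the code under review.

```rust
struct CodeBlock {
    code: String,
    start_line: u32,
}

/// Placeholder check (an assumption for this sketch): a block "passes"
/// when no line carries trailing whitespace.
fn formatted_correctly(block: &CodeBlock) -> bool {
    block.code.lines().all(|line| line == line.trim_end())
}

/// Counts failing blocks with a fold instead of a mutable counter.
fn count_failures(blocks: &[CodeBlock]) -> usize {
    blocks.iter().fold(0, |failures, block| {
        if formatted_correctly(block) {
            failures
        } else {
            println!(
                "Failed to format block starting at Line {}",
                block.start_line
            );
            failures + 1
        }
    })
}
```

Taking `&[CodeBlock]` rather than `&Vec<CodeBlock>` lets callers pass any slice; a `Vec` still works through deref coercion.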

davidalber · 2018-01-01T19:08:45Z

I did a large refactor of the previous iteration. Almost all of the logic has been moved to be encapsulated in an enum and a struct.

Used enum to extract sections from Configurations.md.
Used struct to handle the extracted sections from Configurations.md, attempt formatting, and report errors.
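
The split between the two types might look roughly like the sketch below. Only the names `ConfigurationSection`, `ConfigCodeBlock`, and the `CodeBlock((String, u32))` variant appear in this PR; the other variants, the struct fields, and `collect_blocks` are assumptions made for illustration.

```rust
/// A section pulled out of Configurations.md.
enum ConfigurationSection {
    CodeBlock((String, u32)), // (block of code, line number of block start)
    ConfigName(String),       // assumed variant: text of a "## `NAME`" heading
    ConfigValue(String),      // assumed variant: text of a "#### `VALUE`" heading
}

/// A code block together with the configuration setting it illustrates.
#[derive(Debug, PartialEq)]
struct ConfigCodeBlock {
    config_name: Option<String>,
    config_value: Option<String>,
    code_block: String,
    code_block_start: u32,
}

/// Hypothetical consumer: pairs each code block with the most recently
/// seen configuration name and value.
fn collect_blocks<I>(sections: I) -> Vec<ConfigCodeBlock>
where
    I: IntoIterator<Item = ConfigurationSection>,
{
    let mut blocks = Vec::new();
    let mut name: Option<String> = None;
    let mut value: Option<String> = None;

    for section in sections {
        match section {
            ConfigurationSection::ConfigName(n) => {
                // A new option starts; forget the previous option's value.
                name = Some(n);
                value = None;
            }
            ConfigurationSection::ConfigValue(v) => value = Some(v),
            ConfigurationSection::CodeBlock((code, line)) => blocks.push(ConfigCodeBlock {
                config_name: name.clone(),
                config_value: value.clone(),
                code_block: code,
                code_block_start: line,
            }),
        }
    }
    blocks
}
```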

Additional comments:

I would have liked to separate checking whether a given block fails the format test from the printing of the outcome, but the diff printing infrastructure doesn't have a way to just get the failure string without printing. There are alternatives, but I didn't think it was worth pursuing further right now.
Added lazy_static as an explicit dependency so we don't have to keep building the regexes. This is already a transitive dependency, so I thought this is probably all right.

nrc

Looks good, thanks! Could you add a few comments where I requested and squash the commits please?

nrc · 2018-01-04T02:03:17Z

src/rustfmt_diff.rs

@@ -97,15 +97,27 @@ pub fn make_diff(expected: &str, actual: &str, context_size: usize) -> Vec<Misma
    results
 }

+pub enum PrintType {


Could you document what these variants entail?

nrc · 2018-01-04T02:04:41Z

tests/system.rs

@@ -485,3 +505,230 @@ fn string_eq_ignore_newline_repr_test() {
    assert!(string_eq_ignore_newline_repr("a\r\n\r\n\r\nb", "a\n\n\nb"));
    assert!(!string_eq_ignore_newline_repr("a\r\nbcd", "a\nbcdefghijk"));
 }
+
+enum ConfigurationSection {
+    CodeBlock((String, u32)),


Could you document what these two args mean please?

davidalber · 2018-01-04T05:07:21Z

tests/system.rs

+//   "## `NAME`".
+// - Configuration values in Configurations.md must be in the form of
+//   "#### `VALUE`".
+fn get_code_blocks() -> Vec<ConfigCodeBlock> {


@nrc: I don't expect to use this function elsewhere, which makes me want to limit its scope. My instinct is to nest it inside configuration_snippet_tests (below). What do you think about that? Is that a Rusty sort of thing to do? I haven't seen heavy use of nested functions in Rust code (not that I have seen a lot of Rust code).

Nested functions are fine to use. They are not super-popular because unless the nested function is very small it tends to just make things hard to read.
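
As a small illustration of the scoping being discussed (the function names here are hypothetical), a nested function is visible only inside the function that contains it:

```rust
fn count_rust_fences(lines: &[&str]) -> usize {
    // Nested helper: it cannot be called from outside `count_rust_fences`.
    fn is_rust_fence(line: &str) -> bool {
        line.starts_with("```rust")
    }

    lines.iter().copied().filter(|line| is_rust_fence(line)).count()
}
```

Unlike a closure, the nested function cannot capture local variables from the enclosing function; anything it needs must be passed as an argument.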

davidalber · 2018-01-04T08:15:29Z

src/rustfmt_diff.rs

+// A representation of how to write output.
+pub enum PrintType {
+    Fancy, // want to output color and the terminal supports it
+    Basic, // do not want to output color or the terminal does not support color


@nrc: Here's the first set of comments you requested.

After this PR, I'm going to spin up a new one to merge print_diff_fancy and print_diff_basic by adding more implementation to the enum. I think the enum is too abstract like this.
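
One possible shape for "more implementation on the enum", sketched under stated assumptions: the methods `choose` and `write_line` are hypothetical, and raw ANSI escape codes stand in for the term crate that rustfmt actually uses for colored output.

```rust
use std::io::{self, Write};

// A representation of how to write output.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PrintType {
    Fancy, // want to output color and the terminal supports it
    Basic, // do not want to output color or the terminal does not support color
}

impl PrintType {
    /// Hypothetical constructor: picks a variant from the two conditions
    /// named in the comments above.
    pub fn choose(color_requested: bool, terminal_supports_color: bool) -> PrintType {
        if color_requested && terminal_supports_color {
            PrintType::Fancy
        } else {
            PrintType::Basic
        }
    }

    /// Hypothetical method: writes one diff line, so callers no longer
    /// need separate fancy and basic printing functions.
    pub fn write_line(self, out: &mut dyn Write, added: bool, text: &str) -> io::Result<()> {
        let sign = if added { '+' } else { '-' };
        match self {
            PrintType::Fancy => {
                // Green for added lines, red for removed ones, then reset.
                let color = if added { "\x1b[32m" } else { "\x1b[31m" };
                writeln!(out, "{}{}{}\x1b[0m", color, sign, text)
            }
            PrintType::Basic => writeln!(out, "{}{}", sign, text),
        }
    }
}
```

Writing through a `Write` handle in both variants also keeps mismatch and failure messages on the same stream, which is the ordering concern raised in the PR description.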

davidalber · 2018-01-04T08:17:04Z

tests/system.rs

+// with its starting line number, the name of a rustfmt configuration option, or the value of a
+// rustfmt configuration option.
+enum ConfigurationSection {
+    CodeBlock((String, u32)), // (String: block of code, u32: line number of code block start)


@nrc: Here's the other comment you requested. I added commentary above the enum, as well.

davidalber · 2018-01-04T08:17:45Z

tests/system.rs

+    // - Configuration names in Configurations.md must be in the form of
+    //   "## `NAME`".
+    // - Configuration values in Configurations.md must be in the form of
+    //   "#### `VALUE`".


This commentary was moved here since pre-squash.

davidalber · 2018-01-04T08:18:46Z

tests/system.rs

+fn configuration_snippet_tests() {
+    // Read Configurations.md and build a `Vec` of `ConfigCodeBlock` structs with one
+    // entry for each Rust code block found.
+    fn get_code_blocks() -> Vec<ConfigCodeBlock> {


I've moved this function here (it was previously not nested) since pre-squash. I also modified the loop to be a while let.

davidalber · 2018-01-04T08:21:53Z

@nrc: I addressed your comments, squashed, and rebased. I added inline comments to point at the changes made.

I'm happy to rebase again after #2338 is resolved.

killercup · 2018-01-04T13:51:59Z

Uh, any reason not to use a real markdown parser? (Or am I missing something?)

davidalber · 2018-01-04T17:31:04Z

I don't think you are missing something. I thought that taking a dependency and using a full Markdown parser was overkill for this situation. Since most of the changeset is handling the semantics of how the file is organized, not Markdown syntax, I don't think using a parser will make the changes much more compact, although it may make them easier to read.

I'm completely willing to try it out, though. Maybe in a competing PR so we can compare. It looks like pulldown-cmark is the most popular option here, and looking at the documentation it seems like it will work here (there's some details that I don't see documented, so I need to try it out to be sure). Do you have other suggestions for Markdown parsers that you think would be a better way to go?

killercup · 2018-01-04T17:38:53Z

It's probably fine to not use an actual parser here, I was just wondering. And if this was my code I'd fear ending up with an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a markdown parser in the end (as the saying goes) ;)

I'd personally use pulldown-cmark, it's pretty easy (it gives you an iterator of markdown events, so you can look for headlines and code blocks with the correct lang attribute; see my waltz and docstrings repo for examples).

davidalber · 2018-01-05T05:12:09Z

It does seem pretty easy. A hangup appears to be that I don't have a way to get the number of a line from which an event is extracted in pulldown-cmark. I need the line number to provide useful test failure messages. I guess it's not surprising, given normal use cases, that line numbers aren't returned. If I completely got that wrong and there's a way to get line numbers (or ranges), please let me know.
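
If a parser can report a byte offset into the source for each event (an assumption; whether the pulldown-cmark version in use offers this would need checking), a line number can be recovered by counting newlines before that offset. The helper name `line_number_at` is hypothetical.

```rust
/// Returns the 1-based line number containing `byte_offset` in `text`.
/// Offsets past the end of the text map to the last line.
fn line_number_at(text: &str, byte_offset: usize) -> usize {
    let end = byte_offset.min(text.len());
    // Counting bytes avoids slicing the string on a non-character boundary.
    text.as_bytes()[..end]
        .iter()
        .filter(|&&byte| byte == b'\n')
        .count()
        + 1
}
```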

nrc · 2018-01-09T05:39:20Z

Thank you!

davidalber force-pushed the configurations-checking branch from ebbabe1 to 52df03a Compare December 24, 2017 23:46

davidalber mentioned this pull request Dec 26, 2017

Add options blank_lines_{lower|upper}_bound to Configurations.md #2313

Merged

Adding test to verify code block idempotency in Configurations.md

85ccb98

davidalber force-pushed the configurations-checking branch from 0cbc1d2 to 85ccb98 Compare January 4, 2018 08:13

nrc merged commit d60a695 into rust-lang:master Jan 9, 2018

davidalber deleted the configurations-checking branch January 9, 2018 07:53

This was referenced Jan 10, 2018

Modifying failure messages to be consistent with mismatch message #2349

Merged

Consolidating the logic for printing output #2353

Merged

Getting binop_separator="Back" snippet in Configurations.md to pass #2361

Merged
