Skip to content

Configurations checking #2292

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 9, 2018
Merged

Conversation

davidalber
Copy link
Contributor

@davidalber davidalber commented Dec 19, 2017

This PR adds a test that extracts Rust code blocks from Configurations.md and attempts to format them. This is the automation part of #1845. If we get through this, the next step will be fixing the formatting failures.

Change Discussion

Some details on how this works:

  • Rust code blocks are identifed by lines beginning with "```rust".
  • One explicit configuration setting is supported per code block.
  • Rust code blocks with no configuration setting are illegal and cause an assertion failure.
  • Configuration names in Configurations.md must be in the form of "## `NAME`".
  • Configuration values in Configurations.md must be in the form of "#### `VALUE`".
  • Each extracted block is formatted.
  • Mismatches are prepended by a message pointing to the line in Configurations.md where the diff begins.
  • Failures due to parse errors are followed by a message indicating the first line of the block in Configurations.md that could not be parsed. Example:

    ☝🏽 Failed to format block starting at Line 1676 in Configurations.md

Message Printing

rustfmt potentially prints a couple different ways:

  • via writeln! to term::stdout
  • via println!

This changeset tries to ensure that mismatch and configuration failure messages are printed via the same approach (see the changes in rustfmt_diff.rs, for starters), but parsing errors do not follow the same logic as mismatch printing. When fancy diff printing isn't selected, the diff reporting function uses println!. Parsing failures are not falling back to println! in the same way, however, so in that case the failure message does not appear adjacent to the parsing failure message.

Examples

The test is currently marked #[ignore] because there's a lot of formatting failures at the moment. You can run the test with the --ignored flag.

cargo test -- --ignored

You can see the fancy diff printing case here. You can see a non-fancy diff printing case here. That was triggered by redirecting test output to a file (cargo test -- --ignored > foo.log 2>&1).

@davidalber
Copy link
Contributor Author

Oh, and I am not necessarily looking for this to be merged right now, although I would be fine with that. Mostly, I want to get feedback on the direction before starting to fix the formatting failures.

@davidalber
Copy link
Contributor Author

I've just noticed it appears that, on first glance, stdin_formatting_smoke_test provides an example of a way to avoid all of this messy temp file business. I'll look at that a little bit.

@topecongiro
Copy link
Contributor

Thank you for the PR! It is nice if we could avoid using tempfile.

I think most of the format failures are caused by not updating the config option value (e.g. setting IndentStyle = "Visual").

@davidalber davidalber force-pushed the configurations-checking branch from ebbabe1 to 52df03a Compare December 24, 2017 23:46
@davidalber
Copy link
Contributor Author

I've updated this to not use temp files for formatting the code blocks extracted from Configurations.md.

@davidalber
Copy link
Contributor Author

I also updated the text at the top of the PR to reflect the new implementation.

Copy link
Member

@nrc nrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! I left a few comments inline, mostly minor code style issues.

tests/system.rs Outdated
let config_value_regex = regex::Regex::new(r#"^#### `"?([^`"]+)"?`"#)
.expect("Failed creating configuration value pattern");

let mut code_blocks = Vec::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole parsing function could be nicely refactored into a struct and a set of functions, rather than being a big function with lots of local, mutable state.

tests/system.rs Outdated
}
}

// Display results.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the println and assertion to the caller please? That feels like it makes this a more self-contained function.

tests/system.rs Outdated
fn check_blocks_idempotency(blocks: &Vec<CodeBlock>) {
let mut failures = 0;

for block in blocks {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: this loop might work better as a fold if you can factor out some of the code into little functions, not sure though.

@davidalber
Copy link
Contributor Author

I did a large refactor of the previous iteration. Almost all of the logic has been moved to be encapsulated in an enum and a struct.

  • Used enum to extract sections from Configurations.md.
  • Used struct to handle the extracted sections from Configurations.md, attempt formatting, and report errors.

Additional comments:

  • I would liked to have separated checking whether a given block fails the format test from the printing of the outcome, but the diff printing infrastructure doesn't have a way to just get the failure string without printing. There are alternatives, but I didn't think it was worth pursuing further right now.
  • Added lazy_static as an explicit dependency so we don't have to keep building the regexes. This is already a transitive dependency, so I thought this is probably all right.

Copy link
Member

@nrc nrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Could you add a few comments where I requested and squash the commits please?

@@ -97,15 +97,27 @@ pub fn make_diff(expected: &str, actual: &str, context_size: usize) -> Vec<Misma
results
}

pub enum PrintType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you document what these variants entale

tests/system.rs Outdated
@@ -485,3 +505,230 @@ fn string_eq_ignore_newline_repr_test() {
assert!(string_eq_ignore_newline_repr("a\r\n\r\n\r\nb", "a\n\n\nb"));
assert!(!string_eq_ignore_newline_repr("a\r\nbcd", "a\nbcdefghijk"));
}

enum ConfigurationSection {
CodeBlock((String, u32)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you document what these two args mean please?

tests/system.rs Outdated
// "## `NAME`".
// - Configuration values in Configurations.md must be in the form of
// "#### `VALUE`".
fn get_code_blocks() -> Vec<ConfigCodeBlock> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nrc: I don't expect to use this function elsewhere, which makes me want to limit its scope. My instinct is to nest it inside configuration_snippet_tests (below). What do you think about that? Is that a Rusty sort of thing to do? I haven't seen heavy use of nested functions in Rust code (not that I have seen a lot of Rust code).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nested functions are fine to use. They are not super-popular because unless the nested function is very small it tends to just make things hard to read.

@davidalber davidalber force-pushed the configurations-checking branch from 0cbc1d2 to 85ccb98 Compare January 4, 2018 08:13
// A representation of how to write output.
pub enum PrintType {
Fancy, // want to output color and the terminal supports it
Basic, // do not want to output color or the terminal does not support color
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nrc: Here's the first set of comments you requested.

After this PR, I'm going to spin up a new one to merge print_diff_fancy and print_diff_basic by adding more implementation to the enum. I think the enum is too abstract like this.

// with its starting line number, the name of a rustfmt configuration option, or the value of a
// rustfmt configuration option.
enum ConfigurationSection {
CodeBlock((String, u32)), // (String: block of code, u32: line number of code block start)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nrc: Here's the other comment you requested. I added commentary above the enum, as well.

// - Configuration names in Configurations.md must be in the form of
// "## `NAME`".
// - Configuration values in Configurations.md must be in the form of
// "#### `VALUE`".
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commentary was moved here since pre-squash.

fn configuration_snippet_tests() {
// Read Configurations.md and build a `Vec` of `ConfigCodeBlock` structs with one
// entry for each Rust code block found.
fn get_code_blocks() -> Vec<ConfigCodeBlock> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved this function here (it was previously not nested) since pre-squash. I also modified the loop to be a while let.

@davidalber
Copy link
Contributor Author

@nrc: I addressed your comments, squashed, and rebased. I added inline comments to point at the changes made.

I'm happy to rebase again after #2338 is resolved.

@killercup
Copy link
Member

Uh, any reason not to use a real markdown parser? (Or am I missing something?)

@davidalber
Copy link
Contributor Author

I don't think you are missing something. I thought that taking a dependency and using a full Markdown parser was overkill for this situation. Since most of the changeset is handling the semantics of how the file is organized, not Markdown syntax, I don't think using a parser will make the changes much more compact, although it may make them easier to read.

I'm completely willing to try it out, though. Maybe in a competing PR so we can compare. It looks like pulldown-cmark is the most popular option here, and looking at the documentation it seems like it will work here (there's some details that I don't see documented, so I need to try it out to be sure). Do you have other suggestions for Markdown parsers that you think would be a better way to go?

@killercup
Copy link
Member

killercup commented Jan 4, 2018 via email

@davidalber
Copy link
Contributor Author

It does seem pretty easy. A hangup appears to be that I don't have a way to get the number of a line from which an event is extracted in pulldown-cmark. I need the line number to provide useful test failure messages. I guess it's not surprising, given normal use cases, that line numbers aren't returned. If I completely got that wrong and there's a way to get line numbers (or ranges), please let me know.

@nrc nrc merged commit d60a695 into rust-lang:master Jan 9, 2018
@nrc
Copy link
Member

nrc commented Jan 9, 2018

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants