Skip to content

[WIP] Format macro def with repeat #2390

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

topecongiro
Copy link
Contributor

@topecongiro topecongiro commented Jan 24, 2018

This PR tries to handle macro def with repeat (e.g. $( /* .. */ )+).

The basic strategy is similar to what we do against macro variables. We replace $( /* .. */ )+ with ({ /*.. */ }) and keep going. If the formatting succeeds, we put $( and )+ back to the right place.

I chose to replace $( and )+ with ({ and )} because this lets us format repeat with both expressions and statements. Downside is that this way we cannot format repeat with a trailing comma in it (for example, please take a look at failing test).

This PR is still a work in progress. Still, I would like to share this to avoid duplicated effort. Also, I would greatly appreciate it if you kindly give me some feedback 😄

cc #2388.

This commit lets rustfmt to format macro def with repeat in it.

The strategy is to  replace `$(..)*` or `$(..)+` with `({..})` and proceed
formatting. This works well when repeated content is or statements.
The downside is that we fail to format when the repeated content ends with
a trailing comma.

This commit is *VERY* messy...
I am plannig to refactor this and add more tests.
@RReverser
Copy link
Contributor

I'm starting to wonder whether it could be potentially simpler to just temporarily replace $ anywhere in macro with _? It would solve both vars and repetition cases in a much simpler ways.

@topecongiro
Copy link
Contributor Author

@RReverser We cannot do that because _() is invalid rust code.

@RReverser
Copy link
Contributor

Yeah thought about that after writing. What about z for such generic escaping? The biggest problem remains preserving width when escaping existing characters, but other than that, should work.

@RReverser
Copy link
Contributor

Actually I see what you mean, things with partial tokens like Err() $(.or_else($func))* still won't work with either of these approaches.

@RReverser
Copy link
Contributor

RReverser commented Jan 26, 2018

I wonder if there's a way to expand any repetition as one occurence - Err().or_else(func) for the example above - format, and then map $( back somehow...

If Rustfmt wouldn't be producing/removing significant tokens and would touch only whitespace, this could be relatively easy to achieve with reparsing and mapping token positions, but my understanding that it's not the case?

@nrc
Copy link
Member

nrc commented Jan 29, 2018

Hmm, so (...)* could be replaced by /* */ of the correct length, since it can be expanded zero times. In theory, that doesn't help with (...)+, however, I suspect that in practice most uses of the (...)+ tokens could occur zero times (at least for parsing) and so that would work in most cases.

I'm not sure how good the formatting would look though - I think we treat comments somewhat differently from code. OTOH, perhaps token interpolation also looks better with special treatment?

@RReverser
Copy link
Contributor

@nrc Actually I started thinking about comments yesterday too, but rather as start/end markers. Then we can still leverage single repetition (which should be syntactically allowed in its context pretty much always).

For example, if we do Err() $(.or_else($func))* - > Err() /*REP_START*/.or_else(zfunc)/*REP_END*/ (actual comment contents might be entirely different, these are just for demonstration purposes), then entire expression can be formatted as it would look like after expansion, and in the end you can restore $( groups from these preserved comments.

@RReverser
Copy link
Contributor

RReverser commented Feb 5, 2018

I've been thining about this a bit more, and came to realising it probably will never be possible to format any macro body as regular AST. You can substitute tokens in pretty much any position, and replacing them with identifiers or anything similar will often result in invalid code. Some examples from complex macros:

$(# $attrs)*
pub struct S {}
pub enum $enum_name { $($enum_variant,)* }
impl $(< $($param),* >)* $name $(< $($param),* >)*

Maybe it would be more efficient / less error-prone to try and format based on pure token trees, and use keywords we find in between to know when to split lines?

@nrc
Copy link
Member

nrc commented Feb 11, 2018

I think it is fine to solve some macro formatting issues without solving all of them. I'd be happy if we could address 80% of macros and leave the rest. I worry that a token-based approach would end up being subtly different from the rest of the rustfmt code and we'd end up with essentially two different formatters.

@RReverser
Copy link
Contributor

@nrc The problem is, maximal munch is quite popular in token macros as it's the only way to resolve various conflicts and build type definitions iteratively, so kinds of repetitions like above will appear in many macro declarations.

@topecongiro topecongiro deleted the macro2.0 branch May 23, 2018 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants