Skip to content

Ignore non-semantic tokens for 'probably_eq' streams. #55971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 19, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions src/libsyntax/parse/token.rs
Original file line number Diff line number Diff line change
Expand Up @@ -570,8 +570,9 @@ impl Token {
//
// Instead the "probably equal" check here is "does each token
// recursively have the same discriminant?" We basically don't look at
// the token values here and assume that such fine grained modifications
// of token streams doesn't happen.
// the token values here and assume that such fine grained token stream
// modifications, including adding/removing typically non-semantic
// tokens such as extra braces and commas, don't happen.
if let Some(tokens) = tokens {
if tokens.probably_equal_for_proc_macro(&tokens_for_real) {
return tokens
Expand Down
30 changes: 26 additions & 4 deletions src/libsyntax/tokenstream.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ use syntax_pos::{BytePos, Mark, Span, DUMMY_SP};
use ext::base;
use ext::tt::{macro_parser, quoted};
use parse::Directory;
use parse::token::{self, Token};
use parse::token::{self, DelimToken, Token};
use print::pprust;
use serialize::{Decoder, Decodable, Encoder, Encodable};
use util::RcVec;
Expand All @@ -38,7 +38,7 @@ use std::{fmt, iter, mem};
#[derive(Clone, PartialEq, RustcEncodable, RustcDecodable, Debug)]
pub struct Delimited {
/// The type of delimiter
pub delim: token::DelimToken,
pub delim: DelimToken,
/// The delimited sequence of token trees
pub tts: ThinTokenStream,
}
Expand Down Expand Up @@ -368,8 +368,30 @@ impl TokenStream {
// This is otherwise the same as `eq_unspanned`, only recursing with a
// different method.
pub fn probably_equal_for_proc_macro(&self, other: &TokenStream) -> bool {
let mut t1 = self.trees();
let mut t2 = other.trees();
// When checking for `probably_eq`, we ignore certain tokens that aren't
// preserved in the AST. Because they are not preserved, the pretty
// printer arbitrarily adds or removes them when printing as token
// streams, making a comparison between a token stream generated from an
// AST and a token stream which was parsed into an AST more reliable.
fn semantic_tree(tree: &TokenTree) -> bool {
match tree {
// The pretty printer tends to add trailing commas to
// everything, and in particular, after struct fields.
| TokenTree::Token(_, Token::Comma)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you throw a comment here for why we're ignoring all comma tokens?

One case this may get tripped up is (x) vs (x,), though :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be the most commonly hit span-loss in practice (see rwf2/Rocket#811, my own comment in #43081 (comment), and dtolnay/syn#482). In general, the pretty printer inserts trailing commas everywhere, and users don't. So all three of the following lose span information:

struct Foo { a: usize }
let x = Foo { a: 10 };
match x { Foo(..) => 10 }

// The pretty printer emits `NoDelim` as whitespace.
| TokenTree::Token(_, Token::OpenDelim(DelimToken::NoDelim))
| TokenTree::Token(_, Token::CloseDelim(DelimToken::NoDelim))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be slightly worried about ignoring these because couldn't they affect precedence? The token streams 1 + (2 * 3) vs (1 + 2) * 3 for example (subbing parens for "no delimiter"). Does this come up in practice though for one of the tests below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that be a DelimToken::Paren? I didn't filter those as I was worried about the same thing.

// The pretty printer collapses many semicolons into one.
| TokenTree::Token(_, Token::Semi)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above, could a comment be added here why this isn't necessary to compare?

I'm also a little uneasy about this one like commas though because it seems like we're saying that semicolons are optional in Rust, but I feel like there's something that can parse one way or another with and without a semicolon

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but the thing to consider here is whether there's going to be a modification to the AST that add/removes semicolons without preserving semantics. I'd be surprised if that was the case.

// The pretty printer collapses whitespace arbitrarily and can
// introduce whitespace from `NoDelim`.
| TokenTree::Token(_, Token::Whitespace) => false,
_ => true
}
}

let mut t1 = self.trees().filter(semantic_tree);
let mut t2 = other.trees().filter(semantic_tree);
for (t1, t2) in t1.by_ref().zip(t2.by_ref()) {
if !t1.probably_equal_for_proc_macro(&t2) {
return false;
Expand Down
12 changes: 12 additions & 0 deletions src/test/ui-fulldeps/proc-macro/auxiliary/span-preservation.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
// no-prefer-dynamic

#![crate_type = "proc-macro"]

extern crate proc_macro;

use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn foo(_: TokenStream, input: TokenStream) -> TokenStream {
input.into_iter().collect()
}
51 changes: 51 additions & 0 deletions src/test/ui-fulldeps/proc-macro/span-preservation.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
// aux-build:span-preservation.rs

// For each of these, we should get the appropriate type mismatch error message,
// and the function should be echoed.

extern crate span_preservation as foo;

use foo::foo;

#[foo]
fn a() {
let x: usize = "hello";;;;;
}

#[foo]
fn b(x: Option<isize>) -> usize {
match x {
Some(x) => { return x },
None => 10
}
}

#[foo]
fn c() {
struct Foo {
a: usize
}

struct Bar {
a: usize,
b: usize
}

let x = Foo { a: 10isize };
let y = Foo { a: 10, b: 10isize };
}

// FIXME: This doesn't work at the moment. See the one below. The pretty-printer
// injects a "C" between `extern` and `fn` which causes a "probably_eq"
// `TokenStream` mismatch. The lack of `"C"` should be preserved in the AST.
#[foo]
extern fn bar() {
0
}

#[foo]
extern "C" fn baz() {
0
}

fn main() {}
49 changes: 49 additions & 0 deletions src/test/ui-fulldeps/proc-macro/span-preservation.stderr
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
error[E0308]: mismatched types
|
= note: expected type `()`
found type `{integer}`

error[E0308]: mismatched types
--> $DIR/span-preservation.rs:12:20
|
LL | let x: usize = "hello";;;;;
| ^^^^^^^ expected usize, found reference
|
= note: expected type `usize`
found type `&'static str`

error[E0308]: mismatched types
--> $DIR/span-preservation.rs:18:29
|
LL | Some(x) => { return x },
| ^ expected usize, found isize

error[E0308]: mismatched types
--> $DIR/span-preservation.rs:34:22
|
LL | let x = Foo { a: 10isize };
| ^^^^^^^ expected usize, found isize

error[E0560]: struct `c::Foo` has no field named `b`
--> $DIR/span-preservation.rs:35:26
|
LL | let y = Foo { a: 10, b: 10isize };
| ^ `c::Foo` does not have this field
|
= note: available fields are: `a`

error[E0308]: mismatched types
--> $DIR/span-preservation.rs:48:5
|
LL | extern "C" fn baz() {
| - possibly return type missing here?
LL | 0
| ^ expected (), found integral variable
|
= note: expected type `()`
found type `{integer}`

error: aborting due to 6 previous errors

Some errors occurred: E0308, E0560.
For more information about an error, try `rustc --explain E0308`.