Preconvert in Color::parse to avoid codesize impact #115

Closed
Manishearth wants to merge 1 commit

@Manishearth commented Feb 2, 2017
Member

Do not land.

We currently ascii-lowercase each time. We shouldn't do that in large matches.

We should probably have a Cow-returning method here instead of an unconditional allocation, so that we retain the speed of not allocating in the common case.

cc @mbrubeck does this improve codesize?

r? @SimonSapin


@SimonSapin
Member

I don’t think repeatedly lowercasing is what hurts code size. (Though it might hurt runtime perf.) It’s rather generating code for each color keyword. I think having a static table and a loop that uses eq_ignore_ascii_case can also reduce code size. I’ll make a PR now.
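A minimal sketch of the static-table idea described above, not the code from the actual branch. The `Rgb` type, the `COLOR_KEYWORDS` table, and the `lookup_color_keyword` function are hypothetical names chosen for illustration, and the table holds only a handful of keywords. The point is that one loop body is compiled no matter how many entries the table has, whereas a `match` produces code for every arm.

```rust
/// Hypothetical color value for this sketch; the real crate has its own color type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rgb(pub u8, pub u8, pub u8);

/// A few CSS color keywords, stored lower-case. A real table would be much longer.
pub static COLOR_KEYWORDS: &[(&str, Rgb)] = &[
    ("black", Rgb(0, 0, 0)),
    ("red", Rgb(255, 0, 0)),
    ("rebeccapurple", Rgb(102, 51, 153)),
    ("white", Rgb(255, 255, 255)),
];

/// Linear scan over the table, comparing without regard to ASCII case.
/// No allocation and no per-keyword code: only the data grows with the table.
pub fn lookup_color_keyword(ident: &str) -> Option<Rgb> {
    COLOR_KEYWORDS
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(ident))
        .map(|&(_, color)| color)
}
```

The trade-off in this sketch is a linear scan per lookup; it addresses code size, not necessarily runtime.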

@Manishearth
Member Author

Yeah, that's what dmajor suggested. I also wanted to try this because this may improve perf (once used with COW, that is)

@SimonSapin
Member

Here is the static table: https://github.com/servo/rust-cssparser/compare/color-keyword-static-table

I’ve written the Cow lower case thing before but didn’t end up using it. Here it is:

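use std::borrow::Cow;

/// Like AsciiExt::to_ascii_lowercase, but avoids allocating when the input is already lower-case.
pub fn cow_into_ascii_lowercase<'a, S: Into<Cow<'a, str>>>(s: S) -> Cow<'a, str> {
    let mut cow = s.into();
    // Find the first ASCII upper-case byte, if any; everything before it is left untouched.
    let first_uppercase = cow.bytes().position(|byte| byte.is_ascii_uppercase());
    if let Some(first_uppercase) = first_uppercase {
        // A borrowed input is cloned into an owned String only at this point.
        cow.to_mut()[first_uppercase..].make_ascii_lowercase();
    }
    cow
}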
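To illustrate when the function above allocates, here is a small sketch. The function is repeated (lightly restructured) so the block compiles on its own; `lowercasing_allocated` is a hypothetical helper added only for this demonstration and is not part of the crate.

```rust
use std::borrow::Cow;

/// Same behavior as the function quoted above, repeated so this block is self-contained.
pub fn cow_into_ascii_lowercase<'a, S: Into<Cow<'a, str>>>(s: S) -> Cow<'a, str> {
    let mut cow = s.into();
    let first_uppercase = cow.bytes().position(|byte| byte.is_ascii_uppercase());
    if let Some(first_uppercase) = first_uppercase {
        // `to_mut` clones a borrowed input into a String; an owned input is reused.
        cow.to_mut()[first_uppercase..].make_ascii_lowercase();
    }
    cow
}

/// Hypothetical helper: reports whether lower-casing a borrowed `&str` had to allocate.
pub fn lowercasing_allocated(input: &str) -> bool {
    matches!(cow_into_ascii_lowercase(input), Cow::Owned(_))
}
```

Input that is already lower-case comes back as `Cow::Borrowed`, so the common case costs one scan and no allocation. Mixed-case input comes back as `Cow::Owned`.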

@Manishearth
Member Author

Should we merge that branch instead?

@Manishearth
Member Author

r=me if you want to land it

@bors-servo
Contributor

☔ The latest upstream changes (presumably #118) made this pull request unmergeable. Please resolve the merge conflicts.

bors-servo pushed a commit that referenced this pull request Feb 25, 2017
Make match_ignore_ascii_case more efficient, add ascii_case_insensitive_phf_map

This improves the performance of `match_ignore_ascii_case!` by replacing calls to `str::eq_ignore_ascii_case` with plain string equality. The input string is converted to ASCII lower-case once, using a stack-allocated buffer. A `proc_macro_derive` is used internally to ensure that string patterns are already lower-case, and to compute the size of the buffer (the length of the longest pattern).
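The mechanism described above can be sketched by hand roughly as follows. This is an illustration under stated assumptions, not the macro's actual expansion: `MAX_LEN`, `to_lowercase_in`, and `keyword_kind` are hypothetical names, and the buffer size is written out manually where the real macro computes it at compile time.

```rust
/// Length of the longest pattern matched below ("transparent").
/// Assumption: written by hand here; the macro derives it from the patterns.
pub const MAX_LEN: usize = 11;

/// Lower-cases `input` into `buffer` without heap allocation.
/// Returns `None` when the input is longer than every pattern, so it cannot match.
pub fn to_lowercase_in<'a>(input: &str, buffer: &'a mut [u8; MAX_LEN]) -> Option<&'a str> {
    let bytes = buffer.get_mut(..input.len())?;
    bytes.copy_from_slice(input.as_bytes());
    bytes.make_ascii_lowercase();
    // ASCII lower-casing only changes bytes in b'A'..=b'Z', so the copy is still valid UTF-8.
    std::str::from_utf8(bytes).ok()
}

/// Lower-case once, then compare against lower-case patterns with plain equality.
pub fn keyword_kind(input: &str) -> &'static str {
    let mut buffer = [0u8; MAX_LEN];
    match to_lowercase_in(input, &mut buffer) {
        Some("inherit") | Some("initial") => "css-wide",
        Some("transparent") => "color",
        _ => "unknown",
    }
}
```

Because the arms are ordinary `match` arms, they can still contain control flow such as `return` or `continue`.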

This should improve runtime performance, but the amount of generated MIR or LLVM IR in release mode still looks proportional to the number of patterns, so this by itself won’t help much with the code bloat in `parse_color_keyword`.

To deal with that, this PR also adds an `ascii_case_insensitive_phf_map!` macro that reuses the same stack-allocated buffer mechanism to lower-case an input string, and combines it with a [`phf`](https://github.com/sfackler/rust-phf) static hash map.

The two macros are similar but make different tradeoffs. PHF probably generates faster and less bloated code, but the map’s values need to be given as a string that contains Rust syntax, due to limitations of procedural macros in current stable Rust. On the other hand, generating a `match` expression allows using control flow statements like `return` or `continue` in its match arms.

Fixes #115.
