unicode.py refactor part 1 #50922

Closed
wants to merge 7 commits into from

clarfonthey
Contributor

@clarfonthey clarfonthey commented May 20, 2018

I will follow this up with more changes in future PRs, but I figure the work so far is enough for an initial PR. The first commit here is just #50920, which is simple enough that it can be merged immediately; as such, this should be merged after #50920.

The goal of this is to simplify this file so that it's easier to understand and maintain. Moving the conversion logic into mapping_table.rs should make it easier to modify the case mapping tables for size and/or performance, which I also plan to do in a future PR.

The contents of tables.rs don't actually change until the last commit.
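
For illustration only, here is a minimal sketch of what a case mapping table with its own lookup logic might look like. It is not the code from this PR: the names `MappingTable`, `Lookup`, `Mapped`, and `LOWERCASE`, the `'\0'` padding, and the two sample entries are all assumptions made for the example.

```rust
use std::slice;

/// Hypothetical table: entries sorted by key, each mapping one char to up
/// to three replacement chars, padded with '\0' (an assumed layout).
pub struct MappingTable {
    entries: &'static [(char, [char; 3])],
}

/// Iterator returned by a lookup (hypothetical shape).
pub enum Lookup {
    /// No table entry: the char maps to itself and is yielded once.
    Same(Option<char>),
    /// Table entry found: yields the mapped chars without the padding.
    Mapped(slice::Iter<'static, char>),
}

impl MappingTable {
    pub fn lookup(&self, c: char) -> Lookup {
        let entries: &'static [(char, [char; 3])] = self.entries;
        // Binary search works because the entries are sorted by key.
        match entries.binary_search_by_key(&c, |&(key, _)| key) {
            Ok(i) => {
                let mapped = &entries[i].1;
                // Length up to and including the last non-padding char.
                let len = mapped
                    .iter()
                    .rposition(|&m| m != '\0')
                    .map_or(0, |p| p + 1);
                Lookup::Mapped(mapped[..len].iter())
            }
            Err(_) => Lookup::Same(Some(c)),
        }
    }
}

impl Iterator for Lookup {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        match self {
            // `take` leaves `None` behind, so the char is yielded only once.
            Lookup::Same(c) => c.take(),
            Lookup::Mapped(iter) => iter.next().copied(),
        }
    }
}

/// Two sample entries only; a real table would be generated from the
/// Unicode data files.
pub const LOWERCASE: MappingTable = MappingTable {
    entries: &[
        ('A', ['a', '\0', '\0']),
        ('\u{130}', ['i', '\u{307}', '\0']),
    ],
};
```

With this sketch, `LOWERCASE.lookup('A')` yields `'a'`, `LOWERCASE.lookup('\u{130}')` yields `'i'` followed by `'\u{307}'`, and a char with no entry yields itself. Keeping the search and padding logic next to the table type means the table representation could change without touching callers.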

@clarfonthey clarfonthey force-pushed the unicodepy-refactor1 branch from 0633b35 to 33ebfa4 Compare May 20, 2018 18:47
@rust-highfive
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

[00:03:06]    Compiling cc v1.0.15
[00:03:06]    Compiling core v0.0.0 (file:///checkout/src/libcore)
[00:03:06]    Compiling build_helper v0.1.0 (file:///checkout/src/build_helper)
[00:03:06]    Compiling unwind v0.0.0 (file:///checkout/src/libunwind)
[00:03:06] error: incorrect close delimiter: `)`
[00:03:06]    --> libcore/char/methods.rs:782:57
[00:03:06]     |
[00:03:06] 782 |         ToLowercase(conversions::Lowercase.lookup(self)))
[00:03:06]     |
[00:03:06] note: unclosed delimiter
[00:03:06]    --> libcore/char/methods.rs:781:46
[00:03:06]     |
[00:03:06]     |
[00:03:06] 781 |     pub fn to_lowercase(self) -> ToLowercase {
[00:03:06] 
[00:03:06] 
[00:03:06] error: incorrect close delimiter: `)`
[00:03:06]    --> libcore/char/methods.rs:868:57
[00:03:06]     |
[00:03:06] 868 |         ToUppercase(conversions::Uppercase.lookup(self)))
[00:03:06]     |
[00:03:06] note: unclosed delimiter
[00:03:06]    --> libcore/char/methods.rs:867:46
[00:03:06]     |
[00:03:06]     |
[00:03:06] 867 |     pub fn to_uppercase(self) -> ToUppercase {
[00:03:06] 
[00:03:06] 
[00:03:06] error: unexpected close delimiter: `}`
[00:03:06]    --> libcore/char/methods.rs:869:5
[00:03:06] 869 |     }
[00:03:06]     |     ^
[00:03:06] 
[00:03:06] error: aborting due to 3 previous errors
[00:03:06] 
[00:03:06] error: Could not compile `core`.
[00:03:06] 
[00:03:06] Caused by:
[00:03:06]   process didn't exit successfully: `/checkout/obj/build/bootstrap/debug/rustc --crate-name core libcore/lib.rs --color always --error-format json --crate-type lib --emit=dep-info,link -C opt-level=3 -C metadata=0d1ebef792b1d9ca -C extra-filename=-0d1ebef792b1d9ca --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps --target x86_64-unknown-linux-gnu -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/release/deps` (exit code: 101)
[00:03:06] warning: build failed, waiting for other jobs to finish...
[00:03:12] error: build failed
[00:03:12] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "--message-format" "json"
[00:03:12] expected success, got: exit code: 101
[00:03:12] thread 'main' panicked at 'cargo must succeed', bootstrap/compile.rs:1091:9

[00:03:12] travis_time:end:stage0-std:start=1526842376752063012,finish=1526842384083215718,duration=7331152706

[00:03:12] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[00:03:12] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:03:12] Build completed unsuccessfully in 0:00:08
[00:03:12] Makefile:79: recipe for target 'tidy' failed
[00:03:12] make: *** [tidy] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:08c4f195
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@clarfonthey clarfonthey force-pushed the unicodepy-refactor1 branch from 33ebfa4 to ee35e1a Compare May 20, 2018 18:59
@rust-highfive
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

[00:02:54]    Compiling cc v1.0.15
[00:02:54]    Compiling core v0.0.0 (file:///checkout/src/libcore)
[00:02:54]    Compiling build_helper v0.1.0 (file:///checkout/src/build_helper)
[00:02:54]    Compiling unwind v0.0.0 (file:///checkout/src/libunwind)
[00:02:56] error[E0432]: unresolved import `unicode::mapping_table`
[00:02:56]   --> libcore/char/mod.rs:61:14
[00:02:56] 61 | use unicode::mapping_table::Lookup;
[00:02:56]    |              ^^^^^^^^^^^^^ Could not find `mapping_table` in `unicode`
[00:02:56] 
[00:02:56] error[E0432]: unresolved import `unicode::mapping_table`
[00:02:56]   --> libcore/unicode/tables.rs:17:14
[00:02:56]    |
[00:02:56] 17 | use unicode::mapping_table::MappingTable;
[00:02:56]    |              ^^^^^^^^^^^^^ Could not find `mapping_table` in `unicode`
[00:02:56] 
[00:02:56] error[E0412]: cannot find type `CaseMappingIter` in this scope
[00:02:56]    --> libcore/char/mod.rs:436:24
[00:02:56]     |
[00:02:56] 436 | pub struct ToUppercase(CaseMappingIter);
[00:02:56] 
[00:02:56] 
[00:02:56] error[E0422]: cannot find struct, variant or union type `MappingTable` in this scope
[00:02:56]     --> libcore/unicode/tables.rs:1143:44
[00:02:56]      |
[00:02:56] 1143 |     const Lowercase: super::MappingTable = MappingTable {
[00:02:56] 
[00:02:56] 
[00:02:56] error[E0422]: cannot find struct, variant or union type `MappingTable` in this scope
[00:02:56]     --> libcore/unicode/tables.rs:1773:44
[00:02:56]      |
[00:02:56] 1773 |     const Uppercase: super::MappingTable = MappingTable {
[00:02:56] 
[00:02:56] 
[00:02:56] error[E0603]: constant `Lowercase` is private
[00:02:56]    --> libcore/char/methods.rs:782:21
[00:02:56]     |
[00:02:56] 782 |         ToLowercase(conversions::Lowercase.lookup(self))
[00:02:56] 
[00:02:56] 
[00:02:56] error[E0603]: constant `Uppercase` is private
[00:02:56]    --> libcore/char/methods.rs:868:21
[00:02:56]     |
[00:02:56] 868 |         ToUppercase(conversions::Uppercase.lookup(self))
[00:02:56] 
[00:02:56]    Compiling std v0.0.0 (file:///checkout/src/libstd)
[00:03:00]    Compiling compiler_builtins v0.0.0 (file:///checkout/src/rustc/compiler_builtins_shim)
[00:03:00]    Compiling cmake v0.1.30
---
[00:03:08] Caused by:
[00:03:08]   process didn't exit successfully: `/checkout/obj/build/bootstrap/debug/rustc --crate-name core libcore/lib.rs --color always --error-format json --crate-type lib --emit=dep-info,link -C opt-level=3 -C metadata=0d1ebef792b1d9ca -C extra-filename=-0d1ebef792b1d9ca --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps --target x86_64-unknown-linux-gnu -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/release/deps` (exit code: 101)
[00:03:08] warning: build failed, waiting for other jobs to finish...
[00:03:18] error: build failed
[00:03:18] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "--message-format" "json"
[00:03:18] expected success, got: exit code: 101
[00:03:18] thread 'main' panicked at 'cargo must succeed', bootstrap/compile.rs:1091:9
[00:03:18] travis_fold:end:stage0-std

[00:03:18] travis_time:end:stage0-std:start=1526843051271625051,finish=1526843075740658075,duration=24469033024


[00:03:18] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:03:18] Build completed unsuccessfully in 0:00:25
[00:03:18] make: *** [tidy] Error 1
[00:03:18] Makefile:79: recipe for target 'tidy' failed

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:1c22feb8
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@clarfonthey clarfonthey force-pushed the unicodepy-refactor1 branch 2 times, most recently from fd0b699 to dc4902a Compare May 20, 2018 19:17
@rust-highfive
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

[00:03:22]    Compiling rustc_lsan v0.0.0 (file:///checkout/src/librustc_lsan)
[00:03:22]    Compiling rustc_asan v0.0.0 (file:///checkout/src/librustc_asan)
[00:03:23]    Compiling rustc_msan v0.0.0 (file:///checkout/src/librustc_msan)
[00:03:23]    Compiling rustc_tsan v0.0.0 (file:///checkout/src/librustc_tsan)
[00:03:25] error[E0599]: no variant named `same` found for type `unicode::mapping_table::LookupInner` in the current scope
[00:03:25]   --> libcore/unicode/mapping_table.rs:22:28
[00:03:25]    |
[00:03:25] 22 |             None => Lookup(LookupInner::same(c)),
[00:03:25]    |                            ^^^^^^^^^^^^^^^^^ variant not found in `unicode::mapping_table::LookupInner`
[00:03:25] ...
[00:03:25] 35 | pub enum LookupInner {
[00:03:25]    | -------------------- variant `same` not found here
[00:03:25]    = note: did you mean `variant::Same`?
[00:03:25] 
[00:03:25] error[E0308]: mismatched types
[00:03:25]   --> libcore/unicode/mapping_table.rs:25:52
[00:03:25]    |
[00:03:25] 25 |                 match s.iter().rposition(|&c| c == 0) {
[00:03:25]    |                                                    ^ expected char, found u8
[00:03:25] 
[00:03:25] error[E0599]: no associated item named `Same` found for type `unicode::mapping_table::Lookup` in the current scope
[00:03:25]   --> libcore/unicode/mapping_table.rs:50:13
[00:03:25]    |
[00:03:25] 42 | pub struct Lookup(LookupInner);
[00:03:25]    | ------------------------------- associated item `Same` not found for this
[00:03:25] ...
[00:03:25] 50 |             Lookup::Same(c) => {
[00:03:25]    |             ^^^^^^^^^^^^^^^ associated item not found in `unicode::mapping_table::Lookup`
[00:03:25] 
[00:03:25] error[E0599]: no associated item named `Iter` found for type `unicode::mapping_table::Lookup` in the current scope
[00:03:25]   --> libcore/unicode/mapping_table.rs:54:13
[00:03:25]    |
[00:03:25] 42 | pub struct Lookup(LookupInner);
[00:03:25]    | ------------------------------- associated item `Iter` not found for this
[00:03:25] ...
[00:03:25] 54 |             Lookup::Iter(iter) => iter.next(),
[00:03:25]    |             ^^^^^^^^^^^^^^^^^^ associated item not found in `unicode::mapping_table::Lookup`
[00:03:25] 
[00:03:25] error[E0599]: no associated item named `Iter` found for type `unicode::mapping_table::Lookup` in the current scope
[00:03:25]   --> libcore/unicode/mapping_table.rs:51:25
[00:03:25]    |
[00:03:25] 42 | pub struct Lookup(LookupInner);
[00:03:25]    | ------------------------------- associated item `Iter` not found for this
[00:03:25] ...
[00:03:25] 51 |                 *self = Lookup::Iter([].iter());
[00:03:25]    |                         ^^^^^^^^^^^^ associated item not found in `unicode::mapping_table::Lookup`
[00:03:25] 
[00:03:25] error[E0599]: no associated item named `Same` found for type `unicode::mapping_table::Lookup` in the current scope
[00:03:25]   --> libcore/unicode/mapping_table.rs:61:13
[00:03:25]    |
[00:03:25] 42 | pub struct Lookup(LookupInner);
[00:03:25]    | ------------------------------- associated item `Same` not found for this
[00:03:25] ...
[00:03:25] 61 |             Lookup::Same(_) => (1, Some(1)),
[00:03:25]    |             ^^^^^^^^^^^^^^^ associated item not found in `unicode::mapping_table::Lookup`
[00:03:25] 
[00:03:25] error[E0599]: no associated item named `Iter` found for type `unicode::mapping_table::Lookup` in the current scope
[00:03:25]   --> libcore/unicode/mapping_table.rs:62:13
[00:03:25]    |
[00:03:25] 42 | pub struct Lookup(LookupInner);
[00:03:25]    | ------------------------------- associated item `Iter` not found for this
[00:03:25] ...
[00:03:25] 62 |             Lookup::Iter(iter) => iter.size_hint(),
[00:03:25]    |             ^^^^^^^^^^^^^^^^^^ associated item not found in `unicode::mapping_table::Lookup`
[00:03:25] error: aborting due to 7 previous errors
[00:03:25] 
[00:03:25] Some errors occurred: E0308, E0599.
[00:03:25] For more information about an error, try `rustc --explain E0308`.
[00:03:25] error: Could not compile `core`.
[00:03:25] 
[00:03:25] Caused by:
[00:03:25]   process didn't exit successfully: `/checkout/obj/build/bootstrap/debug/rustc --crate-name core libcore/lib.rs --color always --error-format json --crate-type lib --emit=dep-info,link -C opt-level=3 -C metadata=0d1ebef792b1d9ca -C extra-filename=-0d1ebef792b1d9ca --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps --target x86_64-unknown-linux-gnu -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/release/deps` (exit code: 101)
[00:03:25] warning: build failed, waiting for other jobs to finish...
[00:03:35] error: build failed
[00:03:35] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "--message-format" "json"
[00:03:35] expected success, got: exit code: 101
[00:03:35] thread 'main' panicked at 'cargo must succeed', bootstrap/compile.rs:1091:9
[00:03:35] travis_fold:end:stage0-std

[00:03:35] travis_time:end:stage0-std:start=1526844149130064434,finish=1526844175583372211,duration=26453307777


[00:03:35] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:03:35] Build completed unsuccessfully in 0:00:27
[00:03:35] Makefile:79: recipe for target 'tidy' failed
[00:03:35] make: *** [tidy] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:002961fc
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@clarfonthey clarfonthey force-pushed the unicodepy-refactor1 branch from dc4902a to 08a0f6c Compare May 20, 2018 19:24
@rust-highfive
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.


[00:04:59] travis_fold:start:tidy
travis_time:start:tidy
tidy check
[00:04:59] tidy error: /checkout/src/libcore/unicode/mapping_table.rs: missing trailing newline
[00:05:01] some tidy checks failed
[00:05:01] 
[00:05:01] 
[00:05:01] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor" "--quiet"
[00:05:01] 
[00:05:01] 
[00:05:01] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:05:01] Build completed unsuccessfully in 0:01:58
[00:05:01] Makefile:79: recipe for target 'tidy' failed
[00:05:01] make: *** [tidy] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:199109b8
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@clarfonthey clarfonthey force-pushed the unicodepy-refactor1 branch from 08a0f6c to 8eee38c Compare May 20, 2018 20:54
@pietroalbini
Member

Picking someone from libs, r? @alexcrichton

@pietroalbini pietroalbini added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 21, 2018
@SimonSapin
Contributor

Sorry to only say this after you’ve done all this work, but what would you think of removing this Python code entirely and using https://github.com/BurntSushi/ucd-generate instead? Parts of the Python script have been copy-pasted at different points into a number of different crates; it would be nice to have a shared solution for the ecosystem instead.

@alexcrichton
Member

r? @SimonSapin

@clarfonthey
Contributor Author

@SimonSapin I agree that a fully-Rust version is ideal long term, but the main purpose of this PR is to make smaller changes that are easier to merge than a complete rewrite. I knew of ucd-generate and unic's build scripts but decided to go with this anyway, as it's a better short-term solution.

@SimonSapin
Contributor

In short, I don’t get the point of refactoring code if we have plans to entirely replace that same code.

@bors
Collaborator

bors commented May 22, 2018

☔ The latest upstream changes (presumably #49283) made this pull request unmergeable. Please resolve the merge conflicts.

@pietroalbini
Member

Ping from triage @SimonSapin! What should we do with this PR then? Close it?

@SimonSapin
Contributor

I was hoping to discuss this some more with @clarcharr so that we find consensus, rather than closing unilaterally. @clarcharr should we hop on IRC at some point?

@clarfonthey
Contributor Author

I'd definitely be interested in talking more about this on IRC. I've been a bit busy this week and haven't had much time to get around to this.

I'll open another PR after having some more discussion about this, but I'm not sure how long that'll be. I'll close this for now.

@clarfonthey clarfonthey closed this Jun 1, 2018
@clarfonthey clarfonthey deleted the unicodepy-refactor1 branch January 29, 2022 22:28
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
6 participants