Skip to content
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit 35cfde1

Browse files
authoredJun 15, 2020
Rollup merge of #73341 - jonas-schievink:matchdoc, r=davidtwco
_match.rs: fix module doc comment It was applied to a `use` item, not to the module
2 parents ec0cb57 + c7ad3ad commit 35cfde1

File tree

1 file changed

+271
-271
lines changed

1 file changed

+271
-271
lines changed
 

‎src/librustc_mir_build/hair/pattern/_match.rs

Lines changed: 271 additions & 271 deletions
Original file line numberDiff line numberDiff line change
@@ -1,274 +1,274 @@
1-
/// Note: most of the tests relevant to this file can be found (at the time of writing) in
2-
/// src/tests/ui/pattern/usefulness.
3-
///
4-
/// This file includes the logic for exhaustiveness and usefulness checking for
5-
/// pattern-matching. Specifically, given a list of patterns for a type, we can
6-
/// tell whether:
7-
/// (a) the patterns cover every possible constructor for the type [exhaustiveness]
8-
/// (b) each pattern is necessary [usefulness]
9-
///
10-
/// The algorithm implemented here is a modified version of the one described in:
11-
/// http://moscova.inria.fr/~maranget/papers/warn/index.html
12-
/// However, to save future implementors from reading the original paper, we
13-
/// summarise the algorithm here to hopefully save time and be a little clearer
14-
/// (without being so rigorous).
15-
///
16-
/// # Premise
17-
///
18-
/// The core of the algorithm revolves about a "usefulness" check. In particular, we
19-
/// are trying to compute a predicate `U(P, p)` where `P` is a list of patterns (we refer to this as
20-
/// a matrix). `U(P, p)` represents whether, given an existing list of patterns
21-
/// `P_1 ..= P_m`, adding a new pattern `p` will be "useful" (that is, cover previously-
22-
/// uncovered values of the type).
23-
///
24-
/// If we have this predicate, then we can easily compute both exhaustiveness of an
25-
/// entire set of patterns and the individual usefulness of each one.
26-
/// (a) the set of patterns is exhaustive iff `U(P, _)` is false (i.e., adding a wildcard
27-
/// match doesn't increase the number of values we're matching)
28-
/// (b) a pattern `P_i` is not useful if `U(P[0..=(i-1), P_i)` is false (i.e., adding a
29-
/// pattern to those that have come before it doesn't increase the number of values
30-
/// we're matching).
31-
///
32-
/// # Core concept
33-
///
34-
/// The idea that powers everything that is done in this file is the following: a value is made
35-
/// from a constructor applied to some fields. Examples of constructors are `Some`, `None`, `(,)`
36-
/// (the 2-tuple constructor), `Foo {..}` (the constructor for a struct `Foo`), and `2` (the
37-
/// constructor for the number `2`). Fields are just a (possibly empty) list of values.
38-
///
39-
/// Some of the constructors listed above might feel weird: `None` and `2` don't take any
40-
/// arguments. This is part of what makes constructors so general: we will consider plain values
41-
/// like numbers and string literals to be constructors that take no arguments, also called "0-ary
42-
/// constructors"; they are the simplest case of constructors. This allows us to see any value as
43-
/// made up from a tree of constructors, each having a given number of children. For example:
44-
/// `(None, Ok(0))` is made from 4 different constructors.
45-
///
46-
/// This idea can be extended to patterns: a pattern captures a set of possible values, and we can
47-
/// describe this set using constructors. For example, `Err(_)` captures all values of the type
48-
/// `Result<T, E>` that start with the `Err` constructor (for some choice of `T` and `E`). The
49-
/// wildcard `_` captures all values of the given type starting with any of the constructors for
50-
/// that type.
51-
///
52-
/// We use this to compute whether different patterns might capture a same value. Do the patterns
53-
/// `Ok("foo")` and `Err(_)` capture a common value? The answer is no, because the first pattern
54-
/// captures only values starting with the `Ok` constructor and the second only values starting
55-
/// with the `Err` constructor. Do the patterns `Some(42)` and `Some(1..10)` intersect? They might,
56-
/// since they both capture values starting with `Some`. To be certain, we need to dig under the
57-
/// `Some` constructor and continue asking the question. This is the main idea behind the
58-
/// exhaustiveness algorithm: by looking at patterns constructor-by-constructor, we can efficiently
59-
/// figure out if some new pattern might capture a value that hadn't been captured by previous
60-
/// patterns.
61-
///
62-
/// Constructors are represented by the `Constructor` enum, and its fields by the `Fields` enum.
63-
/// Most of the complexity of this file resides in transforming between patterns and
64-
/// (`Constructor`, `Fields`) pairs, handling all the special cases correctly.
65-
///
66-
/// Caveat: this constructors/fields distinction doesn't quite cover every Rust value. For example
67-
/// a value of type `Rc<u64>` doesn't fit this idea very well, nor do various other things.
68-
/// However, this idea covers most of the cases that are relevant to exhaustiveness checking.
69-
///
70-
///
71-
/// # Algorithm
72-
///
73-
/// Recall that `U(P, p)` represents whether, given an existing list of patterns (aka matrix) `P`,
74-
/// adding a new pattern `p` will cover previously-uncovered values of the type.
75-
/// During the course of the algorithm, the rows of the matrix won't just be individual patterns,
76-
/// but rather partially-deconstructed patterns in the form of a list of fields. The paper
77-
/// calls those pattern-vectors, and we will call them pattern-stacks. The same holds for the
78-
/// new pattern `p`.
79-
///
80-
/// For example, say we have the following:
81-
/// ```
82-
/// // x: (Option<bool>, Result<()>)
83-
/// match x {
84-
/// (Some(true), _) => {}
85-
/// (None, Err(())) => {}
86-
/// (None, Err(_)) => {}
87-
/// }
88-
/// ```
89-
/// Here, the matrix `P` starts as:
90-
/// [
91-
/// [(Some(true), _)],
92-
/// [(None, Err(()))],
93-
/// [(None, Err(_))],
94-
/// ]
95-
/// We can tell it's not exhaustive, because `U(P, _)` is true (we're not covering
96-
/// `[(Some(false), _)]`, for instance). In addition, row 3 is not useful, because
97-
/// all the values it covers are already covered by row 2.
98-
///
99-
/// A list of patterns can be thought of as a stack, because we are mainly interested in the top of
100-
/// the stack at any given point, and we can pop or apply constructors to get new pattern-stacks.
101-
/// To match the paper, the top of the stack is at the beginning / on the left.
102-
///
103-
/// There are two important operations on pattern-stacks necessary to understand the algorithm:
104-
/// 1. We can pop a given constructor off the top of a stack. This operation is called
105-
/// `specialize`, and is denoted `S(c, p)` where `c` is a constructor (like `Some` or
106-
/// `None`) and `p` a pattern-stack.
107-
/// If the pattern on top of the stack can cover `c`, this removes the constructor and
108-
/// pushes its arguments onto the stack. It also expands OR-patterns into distinct patterns.
109-
/// Otherwise the pattern-stack is discarded.
110-
/// This essentially filters those pattern-stacks whose top covers the constructor `c` and
111-
/// discards the others.
112-
///
113-
/// For example, the first pattern above initially gives a stack `[(Some(true), _)]`. If we
114-
/// pop the tuple constructor, we are left with `[Some(true), _]`, and if we then pop the
115-
/// `Some` constructor we get `[true, _]`. If we had popped `None` instead, we would get
116-
/// nothing back.
117-
///
118-
/// This returns zero or more new pattern-stacks, as follows. We look at the pattern `p_1`
119-
/// on top of the stack, and we have four cases:
120-
/// 1.1. `p_1 = c(r_1, .., r_a)`, i.e. the top of the stack has constructor `c`. We
121-
/// push onto the stack the arguments of this constructor, and return the result:
122-
/// r_1, .., r_a, p_2, .., p_n
123-
/// 1.2. `p_1 = c'(r_1, .., r_a')` where `c ≠ c'`. We discard the current stack and
124-
/// return nothing.
125-
/// 1.3. `p_1 = _`. We push onto the stack as many wildcards as the constructor `c` has
126-
/// arguments (its arity), and return the resulting stack:
127-
/// _, .., _, p_2, .., p_n
128-
/// 1.4. `p_1 = r_1 | r_2`. We expand the OR-pattern and then recurse on each resulting
129-
/// stack:
130-
/// S(c, (r_1, p_2, .., p_n))
131-
/// S(c, (r_2, p_2, .., p_n))
132-
///
133-
/// 2. We can pop a wildcard off the top of the stack. This is called `D(p)`, where `p` is
134-
/// a pattern-stack.
135-
/// This is used when we know there are missing constructor cases, but there might be
136-
/// existing wildcard patterns, so to check the usefulness of the matrix, we have to check
137-
/// all its *other* components.
138-
///
139-
/// It is computed as follows. We look at the pattern `p_1` on top of the stack,
140-
/// and we have three cases:
141-
/// 1.1. `p_1 = c(r_1, .., r_a)`. We discard the current stack and return nothing.
142-
/// 1.2. `p_1 = _`. We return the rest of the stack:
143-
/// p_2, .., p_n
144-
/// 1.3. `p_1 = r_1 | r_2`. We expand the OR-pattern and then recurse on each resulting
145-
/// stack.
146-
/// D((r_1, p_2, .., p_n))
147-
/// D((r_2, p_2, .., p_n))
148-
///
149-
/// Note that the OR-patterns are not always used directly in Rust, but are used to derive the
150-
/// exhaustive integer matching rules, so they're written here for posterity.
151-
///
152-
/// Both those operations extend straightforwardly to a list or pattern-stacks, i.e. a matrix, by
153-
/// working row-by-row. Popping a constructor ends up keeping only the matrix rows that start with
154-
/// the given constructor, and popping a wildcard keeps those rows that start with a wildcard.
155-
///
156-
///
157-
/// The algorithm for computing `U`
158-
/// -------------------------------
159-
/// The algorithm is inductive (on the number of columns: i.e., components of tuple patterns).
160-
/// That means we're going to check the components from left-to-right, so the algorithm
161-
/// operates principally on the first component of the matrix and new pattern-stack `p`.
162-
/// This algorithm is realised in the `is_useful` function.
163-
///
164-
/// Base case. (`n = 0`, i.e., an empty tuple pattern)
165-
/// - If `P` already contains an empty pattern (i.e., if the number of patterns `m > 0`),
166-
/// then `U(P, p)` is false.
167-
/// - Otherwise, `P` must be empty, so `U(P, p)` is true.
168-
///
169-
/// Inductive step. (`n > 0`, i.e., whether there's at least one column
170-
/// [which may then be expanded into further columns later])
171-
/// We're going to match on the top of the new pattern-stack, `p_1`.
172-
/// - If `p_1 == c(r_1, .., r_a)`, i.e. we have a constructor pattern.
173-
/// Then, the usefulness of `p_1` can be reduced to whether it is useful when
174-
/// we ignore all the patterns in the first column of `P` that involve other constructors.
175-
/// This is where `S(c, P)` comes in:
176-
/// `U(P, p) := U(S(c, P), S(c, p))`
177-
/// This special case is handled in `is_useful_specialized`.
178-
///
179-
/// For example, if `P` is:
180-
/// [
181-
/// [Some(true), _],
182-
/// [None, 0],
183-
/// ]
184-
/// and `p` is [Some(false), 0], then we don't care about row 2 since we know `p` only
185-
/// matches values that row 2 doesn't. For row 1 however, we need to dig into the
186-
/// arguments of `Some` to know whether some new value is covered. So we compute
187-
/// `U([[true, _]], [false, 0])`.
188-
///
189-
/// - If `p_1 == _`, then we look at the list of constructors that appear in the first
190-
/// component of the rows of `P`:
191-
/// + If there are some constructors that aren't present, then we might think that the
192-
/// wildcard `_` is useful, since it covers those constructors that weren't covered
193-
/// before.
194-
/// That's almost correct, but only works if there were no wildcards in those first
195-
/// components. So we need to check that `p` is useful with respect to the rows that
196-
/// start with a wildcard, if there are any. This is where `D` comes in:
197-
/// `U(P, p) := U(D(P), D(p))`
198-
///
199-
/// For example, if `P` is:
200-
/// [
201-
/// [_, true, _],
202-
/// [None, false, 1],
203-
/// ]
204-
/// and `p` is [_, false, _], the `Some` constructor doesn't appear in `P`. So if we
205-
/// only had row 2, we'd know that `p` is useful. However row 1 starts with a
206-
/// wildcard, so we need to check whether `U([[true, _]], [false, 1])`.
207-
///
208-
/// + Otherwise, all possible constructors (for the relevant type) are present. In this
209-
/// case we must check whether the wildcard pattern covers any unmatched value. For
210-
/// that, we can think of the `_` pattern as a big OR-pattern that covers all
211-
/// possible constructors. For `Option`, that would mean `_ = None | Some(_)` for
212-
/// example. The wildcard pattern is useful in this case if it is useful when
213-
/// specialized to one of the possible constructors. So we compute:
214-
/// `U(P, p) := ∃(k ϵ constructors) U(S(k, P), S(k, p))`
215-
///
216-
/// For example, if `P` is:
217-
/// [
218-
/// [Some(true), _],
219-
/// [None, false],
220-
/// ]
221-
/// and `p` is [_, false], both `None` and `Some` constructors appear in the first
222-
/// components of `P`. We will therefore try popping both constructors in turn: we
223-
/// compute U([[true, _]], [_, false]) for the `Some` constructor, and U([[false]],
224-
/// [false]) for the `None` constructor. The first case returns true, so we know that
225-
/// `p` is useful for `P`. Indeed, it matches `[Some(false), _]` that wasn't matched
226-
/// before.
227-
///
228-
/// - If `p_1 == r_1 | r_2`, then the usefulness depends on each `r_i` separately:
229-
/// `U(P, p) := U(P, (r_1, p_2, .., p_n))
230-
/// || U(P, (r_2, p_2, .., p_n))`
231-
///
232-
/// Modifications to the algorithm
233-
/// ------------------------------
234-
/// The algorithm in the paper doesn't cover some of the special cases that arise in Rust, for
235-
/// example uninhabited types and variable-length slice patterns. These are drawn attention to
236-
/// throughout the code below. I'll make a quick note here about how exhaustive integer matching is
237-
/// accounted for, though.
238-
///
239-
/// Exhaustive integer matching
240-
/// ---------------------------
241-
/// An integer type can be thought of as a (huge) sum type: 1 | 2 | 3 | ...
242-
/// So to support exhaustive integer matching, we can make use of the logic in the paper for
243-
/// OR-patterns. However, we obviously can't just treat ranges x..=y as individual sums, because
244-
/// they are likely gigantic. So we instead treat ranges as constructors of the integers. This means
245-
/// that we have a constructor *of* constructors (the integers themselves). We then need to work
246-
/// through all the inductive step rules above, deriving how the ranges would be treated as
247-
/// OR-patterns, and making sure that they're treated in the same way even when they're ranges.
248-
/// There are really only four special cases here:
249-
/// - When we match on a constructor that's actually a range, we have to treat it as if we would
250-
/// an OR-pattern.
251-
/// + It turns out that we can simply extend the case for single-value patterns in
252-
/// `specialize` to either be *equal* to a value constructor, or *contained within* a range
253-
/// constructor.
254-
/// + When the pattern itself is a range, you just want to tell whether any of the values in
255-
/// the pattern range coincide with values in the constructor range, which is precisely
256-
/// intersection.
257-
/// Since when encountering a range pattern for a value constructor, we also use inclusion, it
258-
/// means that whenever the constructor is a value/range and the pattern is also a value/range,
259-
/// we can simply use intersection to test usefulness.
260-
/// - When we're testing for usefulness of a pattern and the pattern's first component is a
261-
/// wildcard.
262-
/// + If all the constructors appear in the matrix, we have a slight complication. By default,
263-
/// the behaviour (i.e., a disjunction over specialised matrices for each constructor) is
264-
/// invalid, because we want a disjunction over every *integer* in each range, not just a
265-
/// disjunction over every range. This is a bit more tricky to deal with: essentially we need
266-
/// to form equivalence classes of subranges of the constructor range for which the behaviour
267-
/// of the matrix `P` and new pattern `p` are the same. This is described in more
268-
/// detail in `split_grouped_constructors`.
269-
/// + If some constructors are missing from the matrix, it turns out we don't need to do
270-
/// anything special (because we know none of the integers are actually wildcards: i.e., we
271-
/// can't span wildcards using ranges).
1+
//! Note: most of the tests relevant to this file can be found (at the time of writing) in
2+
//! src/tests/ui/pattern/usefulness.
3+
//!
4+
//! This file includes the logic for exhaustiveness and usefulness checking for
5+
//! pattern-matching. Specifically, given a list of patterns for a type, we can
6+
//! tell whether:
7+
//! (a) the patterns cover every possible constructor for the type [exhaustiveness]
8+
//! (b) each pattern is necessary [usefulness]
9+
//!
10+
//! The algorithm implemented here is a modified version of the one described in:
11+
//! http://moscova.inria.fr/~maranget/papers/warn/index.html
12+
//! However, to save future implementors from reading the original paper, we
13+
//! summarise the algorithm here to hopefully save time and be a little clearer
14+
//! (without being so rigorous).
15+
//!
16+
//! # Premise
17+
//!
18+
//! The core of the algorithm revolves about a "usefulness" check. In particular, we
19+
//! are trying to compute a predicate `U(P, p)` where `P` is a list of patterns (we refer to this as
20+
//! a matrix). `U(P, p)` represents whether, given an existing list of patterns
21+
//! `P_1 ..= P_m`, adding a new pattern `p` will be "useful" (that is, cover previously-
22+
//! uncovered values of the type).
23+
//!
24+
//! If we have this predicate, then we can easily compute both exhaustiveness of an
25+
//! entire set of patterns and the individual usefulness of each one.
26+
//! (a) the set of patterns is exhaustive iff `U(P, _)` is false (i.e., adding a wildcard
27+
//! match doesn't increase the number of values we're matching)
28+
//! (b) a pattern `P_i` is not useful if `U(P[0..=(i-1), P_i)` is false (i.e., adding a
29+
//! pattern to those that have come before it doesn't increase the number of values
30+
//! we're matching).
31+
//!
32+
//! # Core concept
33+
//!
34+
//! The idea that powers everything that is done in this file is the following: a value is made
35+
//! from a constructor applied to some fields. Examples of constructors are `Some`, `None`, `(,)`
36+
//! (the 2-tuple constructor), `Foo {..}` (the constructor for a struct `Foo`), and `2` (the
37+
//! constructor for the number `2`). Fields are just a (possibly empty) list of values.
38+
//!
39+
//! Some of the constructors listed above might feel weird: `None` and `2` don't take any
40+
//! arguments. This is part of what makes constructors so general: we will consider plain values
41+
//! like numbers and string literals to be constructors that take no arguments, also called "0-ary
42+
//! constructors"; they are the simplest case of constructors. This allows us to see any value as
43+
//! made up from a tree of constructors, each having a given number of children. For example:
44+
//! `(None, Ok(0))` is made from 4 different constructors.
45+
//!
46+
//! This idea can be extended to patterns: a pattern captures a set of possible values, and we can
47+
//! describe this set using constructors. For example, `Err(_)` captures all values of the type
48+
//! `Result<T, E>` that start with the `Err` constructor (for some choice of `T` and `E`). The
49+
//! wildcard `_` captures all values of the given type starting with any of the constructors for
50+
//! that type.
51+
//!
52+
//! We use this to compute whether different patterns might capture a same value. Do the patterns
53+
//! `Ok("foo")` and `Err(_)` capture a common value? The answer is no, because the first pattern
54+
//! captures only values starting with the `Ok` constructor and the second only values starting
55+
//! with the `Err` constructor. Do the patterns `Some(42)` and `Some(1..10)` intersect? They might,
56+
//! since they both capture values starting with `Some`. To be certain, we need to dig under the
57+
//! `Some` constructor and continue asking the question. This is the main idea behind the
58+
//! exhaustiveness algorithm: by looking at patterns constructor-by-constructor, we can efficiently
59+
//! figure out if some new pattern might capture a value that hadn't been captured by previous
60+
//! patterns.
61+
//!
62+
//! Constructors are represented by the `Constructor` enum, and its fields by the `Fields` enum.
63+
//! Most of the complexity of this file resides in transforming between patterns and
64+
//! (`Constructor`, `Fields`) pairs, handling all the special cases correctly.
65+
//!
66+
//! Caveat: this constructors/fields distinction doesn't quite cover every Rust value. For example
67+
//! a value of type `Rc<u64>` doesn't fit this idea very well, nor do various other things.
68+
//! However, this idea covers most of the cases that are relevant to exhaustiveness checking.
69+
//!
70+
//!
71+
//! # Algorithm
72+
//!
73+
//! Recall that `U(P, p)` represents whether, given an existing list of patterns (aka matrix) `P`,
74+
//! adding a new pattern `p` will cover previously-uncovered values of the type.
75+
//! During the course of the algorithm, the rows of the matrix won't just be individual patterns,
76+
//! but rather partially-deconstructed patterns in the form of a list of fields. The paper
77+
//! calls those pattern-vectors, and we will call them pattern-stacks. The same holds for the
78+
//! new pattern `p`.
79+
//!
80+
//! For example, say we have the following:
81+
//! ```
82+
//! // x: (Option<bool>, Result<()>)
83+
//! match x {
84+
//! (Some(true), _) => {}
85+
//! (None, Err(())) => {}
86+
//! (None, Err(_)) => {}
87+
//! }
88+
//! ```
89+
//! Here, the matrix `P` starts as:
90+
//! [
91+
//! [(Some(true), _)],
92+
//! [(None, Err(()))],
93+
//! [(None, Err(_))],
94+
//! ]
95+
//! We can tell it's not exhaustive, because `U(P, _)` is true (we're not covering
96+
//! `[(Some(false), _)]`, for instance). In addition, row 3 is not useful, because
97+
//! all the values it covers are already covered by row 2.
98+
//!
99+
//! A list of patterns can be thought of as a stack, because we are mainly interested in the top of
100+
//! the stack at any given point, and we can pop or apply constructors to get new pattern-stacks.
101+
//! To match the paper, the top of the stack is at the beginning / on the left.
102+
//!
103+
//! There are two important operations on pattern-stacks necessary to understand the algorithm:
104+
//! 1. We can pop a given constructor off the top of a stack. This operation is called
105+
//! `specialize`, and is denoted `S(c, p)` where `c` is a constructor (like `Some` or
106+
//! `None`) and `p` a pattern-stack.
107+
//! If the pattern on top of the stack can cover `c`, this removes the constructor and
108+
//! pushes its arguments onto the stack. It also expands OR-patterns into distinct patterns.
109+
//! Otherwise the pattern-stack is discarded.
110+
//! This essentially filters those pattern-stacks whose top covers the constructor `c` and
111+
//! discards the others.
112+
//!
113+
//! For example, the first pattern above initially gives a stack `[(Some(true), _)]`. If we
114+
//! pop the tuple constructor, we are left with `[Some(true), _]`, and if we then pop the
115+
//! `Some` constructor we get `[true, _]`. If we had popped `None` instead, we would get
116+
//! nothing back.
117+
//!
118+
//! This returns zero or more new pattern-stacks, as follows. We look at the pattern `p_1`
119+
//! on top of the stack, and we have four cases:
120+
//! 1.1. `p_1 = c(r_1, .., r_a)`, i.e. the top of the stack has constructor `c`. We
121+
//! push onto the stack the arguments of this constructor, and return the result:
122+
//! r_1, .., r_a, p_2, .., p_n
123+
//! 1.2. `p_1 = c'(r_1, .., r_a')` where `c ≠ c'`. We discard the current stack and
124+
//! return nothing.
125+
//! 1.3. `p_1 = _`. We push onto the stack as many wildcards as the constructor `c` has
126+
//! arguments (its arity), and return the resulting stack:
127+
//! _, .., _, p_2, .., p_n
128+
//! 1.4. `p_1 = r_1 | r_2`. We expand the OR-pattern and then recurse on each resulting
129+
//! stack:
130+
//! S(c, (r_1, p_2, .., p_n))
131+
//! S(c, (r_2, p_2, .., p_n))
132+
//!
133+
//! 2. We can pop a wildcard off the top of the stack. This is called `D(p)`, where `p` is
134+
//! a pattern-stack.
135+
//! This is used when we know there are missing constructor cases, but there might be
136+
//! existing wildcard patterns, so to check the usefulness of the matrix, we have to check
137+
//! all its *other* components.
138+
//!
139+
//! It is computed as follows. We look at the pattern `p_1` on top of the stack,
140+
//! and we have three cases:
141+
//! 1.1. `p_1 = c(r_1, .., r_a)`. We discard the current stack and return nothing.
142+
//! 1.2. `p_1 = _`. We return the rest of the stack:
143+
//! p_2, .., p_n
144+
//! 1.3. `p_1 = r_1 | r_2`. We expand the OR-pattern and then recurse on each resulting
145+
//! stack.
146+
//! D((r_1, p_2, .., p_n))
147+
//! D((r_2, p_2, .., p_n))
148+
//!
149+
//! Note that the OR-patterns are not always used directly in Rust, but are used to derive the
150+
//! exhaustive integer matching rules, so they're written here for posterity.
151+
//!
152+
//! Both those operations extend straightforwardly to a list or pattern-stacks, i.e. a matrix, by
153+
//! working row-by-row. Popping a constructor ends up keeping only the matrix rows that start with
154+
//! the given constructor, and popping a wildcard keeps those rows that start with a wildcard.
155+
//!
156+
//!
157+
//! The algorithm for computing `U`
158+
//! -------------------------------
159+
//! The algorithm is inductive (on the number of columns: i.e., components of tuple patterns).
160+
//! That means we're going to check the components from left-to-right, so the algorithm
161+
//! operates principally on the first component of the matrix and new pattern-stack `p`.
162+
//! This algorithm is realised in the `is_useful` function.
163+
//!
164+
//! Base case. (`n = 0`, i.e., an empty tuple pattern)
165+
//! - If `P` already contains an empty pattern (i.e., if the number of patterns `m > 0`),
166+
//! then `U(P, p)` is false.
167+
//! - Otherwise, `P` must be empty, so `U(P, p)` is true.
168+
//!
169+
//! Inductive step. (`n > 0`, i.e., whether there's at least one column
170+
//! [which may then be expanded into further columns later])
171+
//! We're going to match on the top of the new pattern-stack, `p_1`.
172+
//! - If `p_1 == c(r_1, .., r_a)`, i.e. we have a constructor pattern.
173+
//! Then, the usefulness of `p_1` can be reduced to whether it is useful when
174+
//! we ignore all the patterns in the first column of `P` that involve other constructors.
175+
//! This is where `S(c, P)` comes in:
176+
//! `U(P, p) := U(S(c, P), S(c, p))`
177+
//! This special case is handled in `is_useful_specialized`.
178+
//!
179+
//! For example, if `P` is:
180+
//! [
181+
//! [Some(true), _],
182+
//! [None, 0],
183+
//! ]
184+
//! and `p` is [Some(false), 0], then we don't care about row 2 since we know `p` only
185+
//! matches values that row 2 doesn't. For row 1 however, we need to dig into the
186+
//! arguments of `Some` to know whether some new value is covered. So we compute
187+
//! `U([[true, _]], [false, 0])`.
188+
//!
189+
//! - If `p_1 == _`, then we look at the list of constructors that appear in the first
190+
//! component of the rows of `P`:
191+
//! + If there are some constructors that aren't present, then we might think that the
192+
//! wildcard `_` is useful, since it covers those constructors that weren't covered
193+
//! before.
194+
//! That's almost correct, but only works if there were no wildcards in those first
195+
//! components. So we need to check that `p` is useful with respect to the rows that
196+
//! start with a wildcard, if there are any. This is where `D` comes in:
197+
//! `U(P, p) := U(D(P), D(p))`
198+
//!
199+
//! For example, if `P` is:
200+
//! [
201+
//! [_, true, _],
202+
//! [None, false, 1],
203+
//! ]
204+
//! and `p` is [_, false, _], the `Some` constructor doesn't appear in `P`. So if we
205+
//! only had row 2, we'd know that `p` is useful. However row 1 starts with a
206+
//! wildcard, so we need to check whether `U([[true, _]], [false, 1])`.
207+
//!
208+
//! + Otherwise, all possible constructors (for the relevant type) are present. In this
209+
//! case we must check whether the wildcard pattern covers any unmatched value. For
210+
//! that, we can think of the `_` pattern as a big OR-pattern that covers all
211+
//! possible constructors. For `Option`, that would mean `_ = None | Some(_)` for
212+
//! example. The wildcard pattern is useful in this case if it is useful when
213+
//! specialized to one of the possible constructors. So we compute:
214+
//! `U(P, p) := ∃(k ϵ constructors) U(S(k, P), S(k, p))`
215+
//!
216+
//! For example, if `P` is:
217+
//! [
218+
//! [Some(true), _],
219+
//! [None, false],
220+
//! ]
221+
//! and `p` is [_, false], both `None` and `Some` constructors appear in the first
222+
//! components of `P`. We will therefore try popping both constructors in turn: we
223+
//! compute U([[true, _]], [_, false]) for the `Some` constructor, and U([[false]],
224+
//! [false]) for the `None` constructor. The first case returns true, so we know that
225+
//! `p` is useful for `P`. Indeed, it matches `[Some(false), _]` that wasn't matched
226+
//! before.
227+
//!
228+
//! - If `p_1 == r_1 | r_2`, then the usefulness depends on each `r_i` separately:
229+
//! `U(P, p) := U(P, (r_1, p_2, .., p_n))
230+
//! || U(P, (r_2, p_2, .., p_n))`
231+
//!
232+
//! Modifications to the algorithm
233+
//! ------------------------------
234+
//! The algorithm in the paper doesn't cover some of the special cases that arise in Rust, for
235+
//! example uninhabited types and variable-length slice patterns. These are drawn attention to
236+
//! throughout the code below. I'll make a quick note here about how exhaustive integer matching is
237+
//! accounted for, though.
238+
//!
239+
//! Exhaustive integer matching
240+
//! ---------------------------
241+
//! An integer type can be thought of as a (huge) sum type: 1 | 2 | 3 | ...
242+
//! So to support exhaustive integer matching, we can make use of the logic in the paper for
243+
//! OR-patterns. However, we obviously can't just treat ranges x..=y as individual sums, because
244+
//! they are likely gigantic. So we instead treat ranges as constructors of the integers. This means
245+
//! that we have a constructor *of* constructors (the integers themselves). We then need to work
246+
//! through all the inductive step rules above, deriving how the ranges would be treated as
247+
//! OR-patterns, and making sure that they're treated in the same way even when they're ranges.
248+
//! There are really only four special cases here:
249+
//! - When we match on a constructor that's actually a range, we have to treat it as if we would
250+
//! an OR-pattern.
251+
//! + It turns out that we can simply extend the case for single-value patterns in
252+
//! `specialize` to either be *equal* to a value constructor, or *contained within* a range
253+
//! constructor.
254+
//! + When the pattern itself is a range, you just want to tell whether any of the values in
255+
//! the pattern range coincide with values in the constructor range, which is precisely
256+
//! intersection.
257+
//! Since when encountering a range pattern for a value constructor, we also use inclusion, it
258+
//! means that whenever the constructor is a value/range and the pattern is also a value/range,
259+
//! we can simply use intersection to test usefulness.
260+
//! - When we're testing for usefulness of a pattern and the pattern's first component is a
261+
//! wildcard.
262+
//! + If all the constructors appear in the matrix, we have a slight complication. By default,
263+
//! the behaviour (i.e., a disjunction over specialised matrices for each constructor) is
264+
//! invalid, because we want a disjunction over every *integer* in each range, not just a
265+
//! disjunction over every range. This is a bit more tricky to deal with: essentially we need
266+
//! to form equivalence classes of subranges of the constructor range for which the behaviour
267+
//! of the matrix `P` and new pattern `p` are the same. This is described in more
268+
//! detail in `split_grouped_constructors`.
269+
//! + If some constructors are missing from the matrix, it turns out we don't need to do
270+
//! anything special (because we know none of the integers are actually wildcards: i.e., we
271+
//! can't span wildcards using ranges).
272272
use self::Constructor::*;
273273
use self::SliceKind::*;
274274
use self::Usefulness::*;

0 commit comments

Comments
 (0)
Please sign in to comment.