diff --git a/README.md b/README.md index 9a0b5856700..c956700f0bd 100644 --- a/README.md +++ b/README.md @@ -254,12 +254,12 @@ consensus and community norms, not impose more structure than necessary. ## License [License]: #license -This repository is currently in the process of being licensed under either of +Licensed under either of * Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0) * MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT) -at your option. Some parts of the repository are already licensed according to those terms. For more see [RFC 2044](https://github.com/rust-lang/rfcs/pull/2044) and its [tracking issue](https://github.com/rust-lang/rust/issues/43461). +at your option. ### Contributions diff --git a/text/0066-better-temporary-lifetimes.md b/text/0066-better-temporary-lifetimes.md index 7f2fdad2b24..a64e56dee1c 100644 --- a/text/0066-better-temporary-lifetimes.md +++ b/text/0066-better-temporary-lifetimes.md @@ -1,62 +1 @@ -- Start Date: 2014-05-04 -- RFC PR: [rust-lang/rfcs#66](https://github.com/rust-lang/rfcs/pull/66) -- Rust Issue: [rust-lang/rust#15023](https://github.com/rust-lang/rust/issues/15023) - -# Summary - -Temporaries live for the enclosing block when found in a let-binding. This only -holds when the reference to the temporary is taken directly. This logic should -be extended to extend the cleanup scope of any temporary whose lifetime ends up -in the let-binding. - -For example, the following doesn't work now, but should: - -```rust -use std::os; - -fn main() { - let x = os::args().slice_from(1); - println!("{}", x); -} -``` - -# Motivation - -Temporary lifetimes are a bit confusing right now. Sometimes you can keep -references to them, and sometimes you get the dreaded "borrowed value does not -live long enough" error. Sometimes one operation works but an equivalent -operation errors, e.g. 
autoref of `~[T]` to `&[T]` works but calling -`.as_slice()` doesn't. In general it feels as though the compiler is simply -being overly restrictive when it decides the temporary doesn't live long -enough. - -# Drawbacks - -I can't think of any drawbacks. - -# Detailed design - -When a reference to a temporary is passed to a function (either as a regular -argument or as the `self` argument of a method), and the function returns a -value with the same lifetime as the temporary reference, the lifetime of the -temporary should be extended the same way it would if the function was not -invoked. - -For example, `~[T].as_slice()` takes `&'a self` and returns `&'a [T]`. Calling -`as_slice()` on a temporary of type `~[T]` will implicitly take a reference -`&'a ~[T]` and return a value `&'a [T]` This return value should be considered -to extend the lifetime of the `~[T]` temporary just as taking an explicit -reference (and skipping the method call) would. - -# Alternatives - -Don't do this. We live with the surprising borrowck errors and the ugly workarounds that look like - -```rust -let x = os::args(); -let x = x.slice_from(1); -``` - -# Unresolved questions - -None that I know of. +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0066-better-temporary-lifetimes.md). diff --git a/text/0114-closures.md b/text/0114-closures.md index adc63f44f0c..9cd1954f0d8 100644 --- a/text/0114-closures.md +++ b/text/0114-closures.md @@ -80,7 +80,7 @@ invocation of one of the following traits: } trait FnOnce { - fn call_once(self, args: A) -> R; + fn call_once(self, args: A) -> R; } Essentially, `a(b, c, d)` becomes sugar for one of the following: @@ -199,8 +199,8 @@ more programs successfully typecheck. 
### By-reference closures A *by-reference* closure is a convenience form in which values used in -the closure are converted into references before being captured. -By-reference closures are always rewritable into by-value closures if +the closure are converted into references before being captured. +By-reference closures are always rewritable into by-value closures if desired, but the rewrite can often be cumbersome and annoying. Here is a (rather artificial) example of a by-reference closure in @@ -368,7 +368,7 @@ TBD. pcwalton is working furiously as we speak. # Unresolved questions -**What relationship should there be between the closure +**What, if any, relationship should there be between the closure traits?** On the one hand, there is clearly a relationship between the traits. For example, given a `FnShare`, one can easily implement `Fn`: diff --git a/text/0195-associated-items.md b/text/0195-associated-items.md index 4540b3a3904..178b20b1ca5 100644 --- a/text/0195-associated-items.md +++ b/text/0195-associated-items.md @@ -1,1444 +1 @@ -- Start Date: 2014-08-04 -- RFC PR #: [rust-lang/rfcs#195](https://github.com/rust-lang/rfcs/pull/195) -- Rust Issue #: [rust-lang/rust#17307](https://github.com/rust-lang/rust/issues/17307) - -# Summary - -This RFC extends traits with *associated items*, which make generic programming -more convenient, scalable, and powerful. In particular, traits will consist of a -set of methods, together with: - -* Associated functions (already present as "static" functions) -* Associated consts -* Associated types -* Associated lifetimes - -These additions make it much easier to group together a set of related types, -functions, and constants into a single package. - -This RFC also provides a mechanism for *multidispatch* traits, where the `impl` -is selected based on multiple types. The connection to associated items will -become clear in the detailed text below. 
- -*Note: This RFC was originally accepted before RFC 246 introduced the -distinction between const and static items. The text has been updated to clarify -that associated consts will be added rather than statics, and to provide a -summary of restrictions on the initial implementation of associated -consts. Other than that modification, the proposal has not been changed to -reflect newer Rust features or syntax.* - -# Motivation - -A typical example where associated items are helpful is data structures like -graphs, which involve at least three types: nodes, edges, and the graph itself. - -In today's Rust, to capture graphs as a generic trait, you have to take the -additional types associated with a graph as _parameters_: - -```rust -trait Graph<N, E> { - fn has_edge(&self, &N, &N) -> bool; - ... -} -``` - -The fact that the node and edge types are parameters is confusing, since any -concrete graph type is associated with a *unique* node and edge type. It is also -inconvenient, because code working with generic graphs is likewise forced to -parameterize, even when not all of the types are relevant: - -```rust -fn distance<N, E, G: Graph<N, E>>(graph: &G, start: &N, end: &N) -> uint { ... } -``` - -With associated types, the graph trait can instead make clear that the node and -edge types are determined by any `impl`: - -```rust -trait Graph { - type N; - type E; - fn has_edge(&self, &N, &N) -> bool; -} -``` - -and clients can abstract over them all at once, referring to them through the -graph type: - -```rust -fn distance<G: Graph>(graph: &G, start: &G::N, end: &G::N) -> uint { ... } -``` - -The following subsections expand on the above benefits of associated items, as -well as some others. - -## Associated types: engineering benefits for generics - -As the graph example above illustrates, associated _types_ do not increase the -expressiveness of traits _per se_, because you can always use extra type -parameters to a trait instead. 
However, associated types provide several -engineering benefits: - -* **Readability and scalability** - - Associated types make it possible to abstract over a whole family of types at - once, without having to separately name each of them. This improves the - readability of generic code (like the `distance` function above). It also - makes generics more "scalable": traits can incorporate additional associated - types without imposing an extra burden on clients that don't care about those - types. - - In today's Rust, by contrast, adding additional generic parameters to a - trait often feels like a very "heavyweight" move. - -* **Ease of refactoring/evolution** - - Because users of a trait do not have to separately parameterize over its - associated types, new associated types can be added without breaking all - existing client code. - - In today's Rust, by contrast, associated types can only be added by adding - more type parameters to a trait, which breaks all code mentioning the trait. - -## Clearer trait matching - -Type parameters to traits can either be "inputs" or "outputs": - -* **Inputs**. An "input" type parameter is used to _determine_ which `impl` to - use. - -* **Outputs**. An "output" type parameter is uniquely determined _by_ the - `impl`, but plays no role in selecting the `impl`. - -Input and output types play an important role for type inference and trait -coherence rules, which is described in more detail later on. - -In the vast majority of current libraries, the only input type is the `Self` -type implementing the trait, and all other trait type parameters are outputs. -For example, the trait `Iterator` takes a type parameter `A` for the elements -being iterated over, but this type is always determined by the concrete `Self` -type (e.g. `Items`) implementing the trait: `A` is typically an output. 
- -Additional input type parameters are useful for cases like binary operators, -where you may want the `impl` to depend on the types of *both* -arguments. For example, you might want a trait - -```rust -trait Add<Rhs, Sum> { - fn add(&self, rhs: &Rhs) -> Sum; -} -``` - -to view the `Self` and `Rhs` types as inputs, and the `Sum` type as an output -(since it is uniquely determined by the argument types). This would allow -`impl`s to vary depending on the `Rhs` type, even though the `Self` type is the same: - -```rust -impl Add<int, int> for int { ... } -impl Add<Complex, Complex> for int { ... } -``` - -Today's Rust does not make a clear distinction between input and output type -parameters to traits. If you attempted to provide the two `impl`s above, you -would receive an error like: - -``` -error: conflicting implementations for trait `Add` -``` - -This RFC clarifies trait matching by: - -* Treating all trait type parameters as *input* types, and -* Providing associated types, which are *output* types. - -In this design, the `Add` trait would be written and implemented as follows: - -```rust -// Self and Rhs are *inputs* -trait Add<Rhs> { - type Sum; // Sum is an *output* - fn add(&self, &Rhs) -> Sum; -} - -impl Add<int> for int { - type Sum = int; - fn add(&self, rhs: &int) -> int { ... } -} - -impl Add<Complex> for int { - type Sum = Complex; - fn add(&self, rhs: &Complex) -> Complex { ... } -} -``` - -With this approach, a trait declaration like `trait Add<Rhs> { ... }` is really -defining a *family* of traits, one for each choice of `Rhs`. One can then -provide a distinct `impl` for every member of this family. - -## Expressiveness - -Associated types, lifetimes, and functions can already be expressed in today's -Rust, though it is unwieldy to do so (as argued above). - -But associated _consts_ cannot be expressed using today's traits. 
- -For example, today's Rust includes a variety of numeric traits, including -`Float`, which must currently expose constants as static functions: - -```rust -trait Float { - fn nan() -> Self; - fn infinity() -> Self; - fn neg_infinity() -> Self; - fn neg_zero() -> Self; - fn pi() -> Self; - fn two_pi() -> Self; - ... -} -``` - -Because these functions cannot be used in constant expressions, the modules for -float types _also_ export a separate set of constants as consts, not using -traits. - -Associated constants would allow the consts to live directly on the traits: - -```rust -trait Float { - const NAN: Self; - const INFINITY: Self; - const NEG_INFINITY: Self; - const NEG_ZERO: Self; - const PI: Self; - const TWO_PI: Self; - ... -} -``` - -## Why now? - -The above motivations aside, it may not be obvious why adding associated types -*now* (i.e., pre-1.0) is important. There are essentially two reasons. - -First, the design presented here is *not* backwards compatible, because it -re-interprets trait type parameters as inputs for the purposes of trait -matching. The input/output distinction has several ramifications on coherence -rules, type inference, and resolution, which are all described later on in the -RFC. - -Of course, it might be possible to give a somewhat less ideal design where -associated types can be added later on without changing the interpretation of -existing trait type parameters. For example, type parameters could be explicitly -marked as inputs, and otherwise assumed to be outputs. That would be -unfortunate, since associated types would *also* be outputs -- leaving the -language with two ways of specifying output types for traits. - -But the second reason is for the library stabilization process: - -* Since most existing uses of trait type parameters are intended as outputs, - they should really be associated types instead. 
Making promises about these APIs - as they currently stand risks locking the libraries into a design that will seem - obsolete as soon as associated items are added. Again, this risk could probably - be mitigated with a different, backwards-compatible associated item design, but - at the cost of cruft in the language itself. - -* The binary operator traits (e.g. `Add`) should be multidispatch. It does not - seem possible to stabilize them *now* in a way that will support moving to - multidispatch later. - -* There are some thorny problems in the current libraries, such as the `_equiv` - methods accumulating in `HashMap`, that can be solved using associated - items. (See "Defaults" below for more on this specific example.) Additional - examples include traits for error propagation and for conversion (to be - covered in future RFCs). Adding these traits would improve the quality and - consistency of our 1.0 library APIs. - -# Detailed design - -## Trait headers - -Trait headers are written according to the following grammar: - -``` -TRAIT_HEADER = - 'trait' IDENT [ '<' INPUT_PARAMS '>' ] [ ':' BOUNDS ] [ WHERE_CLAUSE ] - -INPUT_PARAMS = INPUT_TY { ',' INPUT_TY }* [ ',' ] -INPUT_PARAM = IDENT [ ':' BOUNDS ] - -BOUNDS = BOUND { '+' BOUND }* [ '+' ] -BOUND = IDENT [ '<' ARGS '>' ] - -ARGS = INPUT_ARGS - | OUTPUT_CONSTRAINTS - | INPUT_ARGS ',' OUTPUT_CONSTRAINTS - -INPUT_ARGS = TYPE { ',' TYPE }* - -OUTPUT_CONSTRAINTS = OUTPUT_CONSTRAINT { ',' OUTPUT_CONSTRAINT }* -OUTPUT_CONSTRAINT = IDENT '=' TYPE -``` - -**NOTE**: The grammar for `WHERE_CLAUSE` and `BOUND` is explained in detail in - the subsection "Constraining associated types" below. - -All type parameters to a trait are considered inputs, and can be used to select -an `impl`; conceptually, each distinct instantiation of the types yields a -distinct trait. More details are given in the section "The input/output type -distinction" below. 
- -## Trait bodies: defining associated items - -Trait bodies are expanded to include three new kinds of items: consts, types, -and lifetimes: - -``` -TRAIT = TRAIT_HEADER '{' TRAIT_ITEM* '}' -TRAIT_ITEM = - ... - | 'const' IDENT ':' TYPE [ '=' CONST_EXP ] ';' - | 'type' IDENT [ ':' BOUNDS ] [ WHERE_CLAUSE ] [ '=' TYPE ] ';' - | 'lifetime' LIFETIME_IDENT ';' -``` - -Traits already support associated functions, which had previously been called -"static" functions. - -The `BOUNDS` and `WHERE_CLAUSE` on associated types are *obligations* for the -implementor of the trait, and *assumptions* for users of the trait: - -```rust -trait Graph { - type N: Show + Hash; - type E: Show + Hash; - ... -} - -impl Graph for MyGraph { - // Both MyNode and MyEdge must implement Show and Hash - type N = MyNode; - type E = MyEdge; - ... -} - -fn print_nodes(g: &G) { - // here, can assume G::N implements Show - ... -} -``` - -### Namespacing/shadowing for associated types - -Associated types may have the same name as existing types in scope, *except* for -type parameters to the trait: - -```rust -struct Foo { ... } - -trait Bar { - type Foo; // this is allowed - fn into_foo(self) -> Foo; // this refers to the trait's Foo - - type Input; // this is NOT allowed -} -``` - -By not allowing name clashes between input and output types, -keep open the possibility of later allowing syntax like: - -```rust -Bar -``` - -where both input and output parameters are constrained by name. And anyway, -there is no use for clashing input/output names. - -In the case of a name clash like `Foo` above, if the trait needs to refer to the -outer `Foo` for some reason, it can always do so by using a `type` alias -external to the trait. - -### Defaults - -Notice that associated consts and types both permit defaults, just as trait -methods and functions can provide defaults. 
- -Defaults are useful both as a code reuse mechanism, and as a way to expand the -items included in a trait without breaking all existing implementors of the -trait. - -Defaults for associated types, however, present an interesting question: can -default methods assume the default type? In other words, is the following -allowed? - -```rust -trait ContainerKey : Clone + Hash + Eq { - type Query: Hash = Self; - fn compare(&self, other: &Query) -> bool { self == other } - fn query_to_key(q: &Query) -> Self { q.clone() }; -} - -impl ContainerKey for String { - type Query = str; - fn compare(&self, other: &str) -> bool { - self.as_slice() == other - } - fn query_to_key(q: &str) -> String { - q.into_string() - } -} - -impl HashMap where K: ContainerKey { - fn find(&self, q: &K::Query) -> &V { ... } -} -``` - -In this example, the `ContainerKey` trait is used to associate a "`Query`" type -(for lookups) with an owned key type. This resolves the thorny "equiv" problem -in `HashMap`, where the hash map keys are `String`s but you want to index the -hash map with `&str` values rather than `&String` values, i.e. you want the -following to work: - -```rust -// H: HashMap -H.find("some literal") -``` - -rather than having to write - -```rust -H.find(&"some literal".to_string())` -``` - -The current solution involves duplicating the API surface with `_equiv` methods -that use the somewhat subtle `Equiv` trait, but the associated type approach -makes it easy to provide a simple, single API that covers the same use cases. - -The defaults for `ContainerKey` just assume that the owned key and lookup key -types are the same, but the default methods have to assume the default -associated types in order to work. - -For this to work, it must *not* be possible for an implementor of `ContainerKey` -to override the default `Query` type while leaving the default methods in place, -since those methods may no longer typecheck. 
- -We deal with this in a very simple way: - -* If a trait implementor overrides any default associated types, they must also - override *all* default functions and methods. - -* Otherwise, a trait implementor can selectively override individual default - methods/functions, as they can today. - -## Trait implementations - -Trait `impl` syntax is much the same as before, except that const, type, and -lifetime items are allowed: - -``` -IMPL_ITEM = - ... - | 'const' IDENT ':' TYPE '=' CONST_EXP ';' - | 'type' IDENT' '=' 'TYPE' ';' - | 'lifetime' LIFETIME_IDENT '=' LIFETIME_REFERENCE ';' -``` - -Any `type` implementation must satisfy all bounds and where clauses in the -corresponding trait item. - -## Referencing associated items - -Associated items are referenced through paths. The expression path grammar was -updated as part of [UFCS](https://github.com/rust-lang/rfcs/pull/132), but to -accommodate associated types and lifetimes we need to update the type path -grammar as well. - -The full grammar is as follows: - -``` -EXP_PATH - = EXP_ID_SEGMENT { '::' EXP_ID_SEGMENT }* - | TYPE_SEGMENT { '::' EXP_ID_SEGMENT }+ - | IMPL_SEGMENT { '::' EXP_ID_SEGMENT }+ -EXP_ID_SEGMENT = ID [ '::' '<' TYPE { ',' TYPE }* '>' ] - -TY_PATH - = TY_ID_SEGMENT { '::' TY_ID_SEGMENT }* - | TYPE_SEGMENT { '::' TY_ID_SEGMENT }* - | IMPL_SEGMENT { '::' TY_ID_SEGMENT }+ - -TYPE_SEGMENT = '<' TYPE '>' -IMPL_SEGMENT = '<' TYPE 'as' TRAIT_REFERENCE '>' -TRAIT_REFERENCE = ID [ '<' TYPE { ',' TYPE * '>' ] -``` - -Here are some example paths, along with what they might be referencing - -```rust -// Expression paths /////////////////////////////////////////////////////////////// - -a::b::c // reference to a function `c` in module `a::b` -a:: // the function `a` instantiated with type arguments `T1`, `T2` -Vec::::new // reference to the function `new` associated with `Vec` - as SomeTrait>::some_fn - // reference to the function `some_fn` associated with `SomeTrait`, - // as implemented by `Vec` 
-T::size_of // the function `size_of` associated with the type or trait `T` -::size_of // the function `size_of` associated with `T` _viewed as a type_ -::size_of - // the function `size_of` associated with `T`'s impl of `SizeOf` - -// Type paths ///////////////////////////////////////////////////////////////////// - -a::b::C // reference to a type `C` in module `a::b` -A // type A instantiated with type arguments `T1`, `T2` -Vec::Iter // reference to the type `Iter` associated with `Vec - as SomeTrait>::SomeType - // reference to the type `SomeType` associated with `SomeTrait`, - // as implemented by `Vec` -``` - -### Ways to reference items - -Next, we'll go into more detail on the meaning of each kind of path. - -For the sake of discussion, we'll suppose we've defined a trait like the -following: - -```rust -trait Container { - type E; - fn empty() -> Self; - fn insert(&mut self, E); - fn contains(&self, &E) -> bool where E: PartialEq; - ... -} - -impl Container for Vec { - type E = T; - fn empty() -> Vec { Vec::new() } - ... -} -``` - -#### Via an `ID_SEGMENT` prefix - -##### When the prefix resolves to a type - -The most common way to get at an associated item is through a type parameter -with a trait bound: - -```rust -fn pick(c: &C) -> Option<&C::E> { ... } - -fn mk_with_two() -> C where C: Container, C::E = uint { - let mut cont = C::empty(); // reference to associated function - cont.insert(0); - cont.insert(1); - cont -} -``` - -For these references to be valid, the type parameter must be known to implement -the relevant trait: - -```rust -// Knowledge via bounds -fn pick(c: &C) -> Option<&C::E> { ... } - -// ... or equivalently, where clause -fn pick(c: &C) -> Option<&C::E> where C: Container { ... } - -// Knowledge via ambient constraints -struct TwoContainers(C1, C2); -impl TwoContainers { - fn pick_one(&self) -> Option<&C1::E> { ... } - fn pick_other(&self) -> Option<&C2::E> { ... 
} -} -``` - -Note that `Vec::E` and `Vec::::empty` are also valid type and function -references, respectively. - -For cases like `C::E` or `Vec::E`, the path begins with an `ID_SEGMENT` -prefix that itself resolves to a _type_: both `C` and `Vec` are types. In -general, a path `PREFIX::REST_OF_PATH` where `PREFIX` resolves to a type is -equivalent to using a `TYPE_SEGMENT` prefix `::REST_OF_PATH`. So, for -example, following are all equivalent: - -```rust -fn pick(c: &C) -> Option<&C::E> { ... } -fn pick(c: &C) -> Option<&::E> { ... } -fn pick(c: &C) -> Option<&<::E>> { ... } -``` - -The behavior of `TYPE_SEGMENT` prefixes is described in the next subsection. - -##### When the prefix resolves to a trait - -However, it is possible for an `ID_SEGMENT` prefix to resolve to a *trait*, -rather than a type. In this case, the behavior of an `ID_SEGMENT` varies from -that of a `TYPE_SEGMENT` in the following way: - -```rust -// a reference Container::insert is roughly equivalent to: -fn trait_insert(c: &C, e: C::E); - -// a reference ::insert is roughly equivalent to: -fn object_insert(c: &Container, e: E); -``` - -That is, if `PREFIX` is an `ID_SEGMENT` that -resolves to a trait `Trait`: - -* A path `PREFIX::REST` resolves to the item/path `REST` defined within - `Trait`, while treating the type implementing the trait as a type parameter. - -* A path `::REST` treats `PREFIX` as a (DST-style) *type*, and is - hence usable only with trait objects. See the - [UFCS RFC](https://github.com/rust-lang/rfcs/pull/132) for more detail. - -Note that a path like `Container::E`, while grammatically valid, will fail to -resolve since there is no way to tell which `impl` to use. A path like -`Container::empty`, however, resolves to a function roughly equivalent to: - -```rust -fn trait_empty() -> C; -``` - -#### Via a `TYPE_SEGMENT` prefix - -> The following text is *slightly changed* from the -> [UFCS RFC](https://github.com/rust-lang/rfcs/pull/132). 
- -When a path begins with a `TYPE_SEGMENT`, it is a type-relative path. If this is -the complete path (e.g., ``), then the path resolves to the specified -type. If the path continues (e.g., `::size_of`) then the next segment is -resolved using the following procedure. The procedure is intended to mimic -method lookup, and hence any changes to method lookup may also change the -details of this lookup. - -Given a path `::m::...`: - -1. Search for members of inherent impls defined on `T` (if any) with - the name `m`. If any are found, the path resolves to that item. - -2. Otherwise, let `IN_SCOPE_TRAITS` be the set of traits that are in - scope and which contain a member named `m`: - - Let `IMPLEMENTED_TRAITS` be those traits from `IN_SCOPE_TRAITS` - for which an implementation exists that (may) apply to `T`. - - There can be ambiguity in the case that `T` contains type inference - variables. - - If `IMPLEMENTED_TRAITS` is not a singleton set, report an ambiguity - error. Otherwise, let `TRAIT` be the member of `IMPLEMENTED_TRAITS`. - - If `TRAIT` is ambiguously implemented for `T`, report an - ambiguity error and request further type information. - - Otherwise, rewrite the path to `::m::...` and - continue. - -#### Via a `IMPL_SEGMENT` prefix - -> The following text is *somewhat different* from the -> [UFCS RFC](https://github.com/rust-lang/rfcs/pull/132). - -When a path begins with an `IMPL_SEGMENT`, it is a reference to an item defined -from a trait. Note that such paths must always have a follow-on member `m` (that -is, `` is not a complete path, but `::m` is). - -To resolve the path, first search for an applicable implementation of `Trait` -for `T`. If no implementation can be found -- or the result is ambiguous -- then -report an error. Note that when `T` is a type parameter, a bound `T: Trait` -guarantees that there is such an implementation, but does not count for -ambiguity purposes. 
- -Otherwise, resolve the path to the member of the trait with the substitution -`Self => T` and continue. - -This apparently straightforward algorithm has some subtle consequences, as -illustrated by the following example: - -```rust -trait Foo { - type T; - fn as_T(&self) -> &T; -} - -// A blanket impl for any Show type T -impl Foo for T { - type T = T; - fn as_T(&self) -> &T { self } -} - -fn bounded(u: U) where U::T: Show { - // Here, we just constrain the associated type directly - println!("{}", u.as_T()) -} - -fn blanket(u: U) { - // the blanket impl applies to U, so we know that `U: Foo` and - // ::T = U (and, of course, U: Show) - println!("{}", u.as_T()) -} - -fn not_allowed(u: U) { - // this will not compile, since ::T is not known to - // implement Show - println!("{}", u.as_T()) -} -``` - -This example includes three generic functions that make use of an associated -type; the first two will typecheck, while the third will not. - -* The first case, `bounded`, places a `Show` constraint directly on the - otherwise-abstract associated type `U::T`. Hence, it is allowed to assume that - `U::T: Show`, even though it does not know the concrete implementation of - `Foo` for `U`. - -* The second case, `blanket`, places a `Show` constraint on the type `U`, which - means that the blanket `impl` of `Foo` applies even though we do not know the - *concrete* type that `U` will be. That fact means, moreover, that we can - compute exactly what the associated type `U::T` will be, and know that it will - satisfy `Show. Coherence guarantees that that the blanket `impl` is the only - one that could apply to `U`. (See the section "Impl specialization" under - "Unresolved questions" for a deeper discussion of this point.) - -* The third case assumes only that `U: Foo`, and therefore nothing is known - about the associated type `U::T`. In particular, the function cannot assume - that `U::T: Show`. 
- -The resolution rules also interact with instantiation of type parameters in an -intuitive way. For example: - -```rust -trait Graph { - type N; - type E; - ... -} - -impl Graph for MyGraph { - type N = MyNode; - type E = MyEdge; - ... -} - -fn pick_node(t: &G) -> &G::N { - // the type G::N is abstract here - ... -} - -let G = MyGraph::new(); -... -pick_node(G) // has type: ::N = MyNode -``` - -Assuming there are no blanket implementations of `Graph`, the `pick_node` -function knows nothing about the associated type `G::N`. However, a *client* of -`pick_node` that instantiates it with a particular concrete graph type will also -know the concrete type of the value returned from the function -- here, `MyNode`. - -## Scoping of `trait` and `impl` items - -Associated types are frequently referred to in the signatures of a trait's -methods and associated functions, and it is natural and convenient to refer to -them directly. - -In other words, writing this: - -```rust -trait Graph { - type N; - type E; - fn has_edge(&self, &N, &N) -> bool; - ... -} -``` - -is more appealing than writing this: - -```rust -trait Graph { - type N; - type E; - fn has_edge(&self, &Self::N, &Self::N) -> bool; - ... -} -``` - -This RFC proposes to treat both `trait` and `impl` bodies (both -inherent and for traits) the same way we treat `mod` bodies: *all* -items being defined are in scope. In particular, methods are in scope -as UFCS-style functions: - -```rust -trait Foo { - type AssocType; - lifetime 'assoc_lifetime; - const ASSOC_CONST: uint; - fn assoc_fn() -> Self; - - // Note: 'assoc_lifetime and AssocType in scope: - fn method(&self, Self) -> &'assoc_lifetime AssocType; - - fn default_method(&self) -> uint { - // method in scope UFCS-style, assoc_fn in scope - let _ = method(self, assoc_fn()); - ASSOC_CONST // in scope - } -} - -// Same scoping rules for impls, including inherent impls: -struct Bar; -impl Bar { - fn foo(&self) { ... 
} - fn bar(&self) { - foo(self); // foo in scope UFCS-style - ... - } -} -``` - -Items from super traits are *not* in scope, however. See -[the discussion on super traits below](#super-traits) for more detail. - -These scope rules provide good ergonomics for associated types in -particular, and a consistent scope model for language constructs that -can contain items (like traits, impls, and modules). In the long run, -we should also explore imports for trait items, i.e. `use -Trait::some_method`, but that is out of scope for this RFC. - -Note that, according to this proposal, associated types/lifetimes are *not* in -scope for the optional `where` clause on the trait header. For example: - -```rust -trait Foo - // type parameters in scope, but associated types are not: - where Bar: Encodable { - - type Output; - ... -} -``` - -This setup seems more intuitive than allowing the trait header to refer directly -to items defined within the trait body. - -It's also worth noting that *trait-level* `where` clauses are never needed for -constraining associated types anyway, because associated types also have `where` -clauses. Thus, the above example could (and should) instead be written as -follows: - -```rust -trait Foo { - type Output where Bar: Encodable; - ... -} -``` - -## Constraining associated types - -Associated types are not treated as parameters to a trait, but in some cases a -function will want to constrain associated types in some way. For example, as -explained in the Motivation section, the `Iterator` trait should treat the -element type as an output: - -```rust -trait Iterator { - type A; - fn next(&mut self) -> Option; - ... -} -``` - -For code that works with iterators generically, there is no need to constrain -this type: - -```rust -fn collect_into_vec(iter: I) -> Vec { ... } -``` - -But other code may have requirements for the element type: - -* That it implements some traits (bounds). -* That it unifies with a particular type. 
- -These requirements can be imposed via `where` clauses: - -```rust -fn print_iter(iter: I) where I: Iterator, I::A: Show { ... } -fn sum_uints(iter: I) where I: Iterator, I::A = uint { ... } -``` - -In addition, there is a shorthand for equality constraints: - -```rust -fn sum_uints>(iter: I) { ... } -``` - -In general, a trait like: - -```rust -trait Foo { - type Output1; - type Output2; - lifetime 'a; - const C: bool; - ... -} -``` - -can be written in a bound like: - -``` -T: Foo -T: Foo -T: Foo -T: Foo -T: Foo>(t: T) // this is valid -fn consume_obj(t: Box>) // this is NOT valid - -// but this IS valid: -fn consume_obj(t: Box; // what is the lifetime here? - fn iter<'a>(&'a self) -> I; // and how to connect it to self? -} -``` - -The problem is that, when implementing this trait, the return type `I` of `iter` -must generally depend on the *lifetime* of self. For example, the corresponding -method in `Vec` looks like the following: - -```rust -impl Vec { - fn iter(&'a self) -> Items<'a, T> { ... } -} -``` - -This means that, given a `Vec`, there isn't a *single* type `Items` for -iteration -- rather, there is a *family* of types, one for each input lifetime. -In other words, the associated type `I` in the `Iterable` needs to be -"higher-kinded": not just a single type, but rather a family: - -```rust -trait Iterable { - type A; - type I<'a>: Iterator<&'a A>; - fn iter<'a>(&self) -> I<'a>; -} -``` - -In this case, `I` is parameterized by a lifetime, but in other cases (like -`map`) an associated type needs to be parameterized by a type. - -In general, such higher-kinded types (HKTs) are a much-requested feature for -Rust, and they would extend the reach of associated types. But the design and -implementation of higher-kinded types is, by itself, a significant investment. 
-The point of view of this RFC is that associated items bring the most important -changes needed to stabilize our existing traits (and add a few key others), -while HKTs will allow us to define important traits in the future but are not -necessary for 1.0. - -### Encoding higher-kinded types - -That said, it's worth pointing out that variants of higher-kinded types can be -encoded in the system being proposed here. - -For example, the `Iterable` example above can be written in the following -somewhat contorted style: - -```rust -trait IterableOwned { - type A; - type I: Iterator; - fn iter_owned(self) -> I; -} - -trait Iterable { - fn iter<'a>(&'a self) -> <&'a Self>::I where &'a Self: IterableOwned { - IterableOwned::iter_owned(self) - } -} -``` - -The idea here is to define a trait that takes, as input type/lifetimes -parameters, the parameters to any HKTs. In this case, the trait is implemented -on the type `&'a Self`, which includes the lifetime parameter. - -We can in fact generalize this technique to encode arbitrary HKTs: - -```rust -// The kind * -> * -trait TypeToType { - type Output; -} -type Apply where Name: TypeToType = Name::Output; - -struct Vec_; -struct DList_; - -impl TypeToType for Vec_ { - type Output = Vec; -} - -impl TypeToType for DList_ { - type Output = DList; -} - -trait Mappable -{ - type E; - type HKT where Apply = Self; - - fn map(self, f: E -> F) -> Apply; -} -``` - -While the above demonstrates the versatility of associated types and `where` -clauses, it is probably too much of a hack to be viable for use in `libstd`. - -### Associated consts in generic code - -If the value of an associated const depends on a type parameter (including -`Self`), it cannot be used in a constant expression. This restriction will -almost certainly be lifted in the future, but this raises questions outside the -scope of this RFC. - -# Staging - -Associated lifetimes are probably not necessary for the 1.0 timeframe. 
While we -currently have a few traits that are parameterized by lifetimes, most of these -can go away once DST lands. - -On the other hand, associated lifetimes are probably trivial to implement once -associated types have been implemented. - -# Other interactions - -## Interaction with implied bounds - -As part of the -[implied bounds](http://smallcultfollowing.com/babysteps/blog/2014/07/06/implied-bounds/) -idea, it may be desirable for this: - -```rust -fn pick_node(g: &G) -> &::N -``` - -to be sugar for this: - -```rust -fn pick_node(g: &G) -> &::N -``` - -But this feature can easily be added later, as part of a general implied bounds RFC. - -## Future-proofing: specialization of `impl`s - -In the future, we may wish to relax the "overlapping instances" rule so that one -can provide "blanket" trait implementations and then "specialize" them for -particular types. For example: - -```rust -trait Sliceable { - type Slice; - // note: not using &self here to avoid need for HKT - fn as_slice(self) -> Slice; -} - -impl<'a, T> Sliceable for &'a T { - type Slice = &'a T; - fn as_slice(self) -> &'a T { self } -} - -impl<'a, T> Sliceable for &'a Vec { - type Slice = &'a [T]; - fn as_slice(self) -> &'a [T] { self.as_slice() } -} -``` - -But then there's a difficult question: - -``` -fn dice(a: &A) -> &A::Slice where &A: Slicable { - a // is this allowed? -} -``` - -Here, the blanket and specialized implementations provide incompatible -associated types. When working with the trait generically, what can we assume -about the associated type? If we assume it is the blanket one, the type may -change during monomorphization (when specialization takes effect)! - -The RFC *does* allow generic code to "see" associated types provided by blanket -implementations, so this is a potential problem. - -Our suggested strategy is the following. If at some later point we wish to add -specialization, traits would have to *opt in* explicitly. 
For such traits, we -would *not* allow generic code to "see" associated types for blanket -implementations; instead, output types would only be visible when all input -types were concretely known. This approach is backwards-compatible with the RFC, -and is probably a good idea in any case. - -# Alternatives - -## Multidispatch through tuple types - -This RFC clarifies trait matching by making trait type parameters inputs to -matching, and associated types outputs. - -A more radical alternative would be to *remove type parameters from traits*, and -instead support multiple input types through a separate multidispatch mechanism. - -In this design, the `Add` trait would be written and implemented as follows: - -```rust -// Lhs and Rhs are *inputs* -trait Add for (Lhs, Rhs) { - type Sum; // Sum is an *output* - fn add(&Lhs, &Rhs) -> Sum; -} - -impl Add for (int, int) { - type Sum = int; - fn add(left: &int, right: &int) -> int { ... } -} - -impl Add for (int, Complex) { - type Sum = Complex; - fn add(left: &int, right: &Complex) -> Complex { ... } -} -``` - -The `for` syntax in the trait definition is used for multidispatch traits, here -saying that `impl`s must be for pairs of types which are bound to `Lhs` and -`Rhs` respectively. The `add` function can then be invoked in UFCS style by -writing - -```rust -Add::add(some_int, some_complex) -``` - -*Advantages of the tuple approach*: - -- It does not force a distinction between `Self` and other input types, which in - some cases (including binary operators like `Add`) can be artificial. - -- Makes it possible to specify input types without specifying the trait: - `<(A, B)>::Sum` rather than `>::Sum`. - -*Disadvantages of the tuple approach*: - -- It's more painful when you *do* want a method rather than a function. - -- Requires `where` clauses when used in bounds: `where (A, B): Trait` rather - than `A: Trait`. 
- -- It gives two ways to write single dispatch: either without `for`, or using - `for` with a single-element tuple. - -- There's a somewhat jarring distinction between single/multiple dispatch - traits, making the latter feel "bolted on". - -- The tuple syntax is unusual in acting as a binder of its types, as opposed to - the `Trait` syntax. - -- Relatedly, the generics syntax for traits is immediately understandable (a - family of traits) based on other uses of generics in the language, while the - tuple notation stands alone. - -- Less clear story for trait objects (although the fact that `Self` is the only - erased input type in this RFC may seem somewhat arbitrary). - -On balance, the generics-based approach seems like a better fit for the language -design, especially in its interaction with methods and the object system. - -## A backwards-compatible version - -Yet another alternative would be to allow trait type parameters to be either -inputs or outputs, marking the inputs with a keyword `in`: - -```rust -trait Add { - fn add(&Lhs, &Rhs) -> Sum; -} -``` - -This would provide a way of adding multidispatch now, and then adding associated -items later on without breakage. If, in addition, output types had to come after -all input types, it might even be possible to migrate output type parameters -like `Sum` above into associated types later. - -This is perhaps a reasonable fallback, but it seems better to introduce a clean -design with both multidispatch and associated items together. - -# Unresolved questions - -## Super traits - -This RFC largely ignores super traits. - -Currently, the implementation of super traits treats them identically to a -`where` clause that bounds `Self`, and this RFC does not propose to change -that. However, a follow-up RFC should clarify that this is the intended -semantics for super traits. 
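The equivalence between a super trait and a `where` clause that bounds `Self` can be sketched in present-day Rust, which postdates this RFC and may differ from the compiler it was written against. All names here (`Named`, `Greeter`, `GreeterWhere`, `Person`) are hypothetical and chosen only for illustration. The first trait uses super trait syntax and the second spells the same requirement as a `where` clause. In both, the inherited method is reached through `self` or the trait path rather than as a bare name in scope.

```rust
// Hypothetical traits for illustration; none of these names come from the RFC.
trait Named {
    fn name(&self) -> String;
}

// Super trait syntax: every `Greeter` must also be `Named`.
trait Greeter: Named {
    fn greet(&self) -> String {
        // `name` belongs to the super trait; it is called through `self`,
        // not as a bare item that is in scope inside this trait body.
        format!("hello, {}", self.name())
    }
}

// The same requirement written as a `where` clause that bounds `Self`.
trait GreeterWhere
where
    Self: Named,
{
    fn greet_where(&self) -> String {
        // Path-qualified call to the item from the bounding trait.
        format!("hello, {}", Named::name(self))
    }
}

struct Person;

impl Named for Person {
    fn name(&self) -> String {
        String::from("Ada")
    }
}

// Both impls rely only on the default methods.
impl Greeter for Person {}
impl GreeterWhere for Person {}
```

Calling `Person.greet()` and `Person.greet_where()` goes through the same `Named` bound in both cases, which is the behaviour the paragraph above describes for the implementation at the time.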
- -Note that this treatment of super traits is, in particular, consistent with the -proposed scoping rules, which do not bring items from super traits into scope in -the body of a subtrait; they must be accessed via `Self::item_name`. - -## Equality constraints in `where` clauses - -This RFC allows equality constraints on types for associated types, but does not -propose a similar feature for `where` clauses. That will be the subject of a -follow-up RFC. - -## Multiple trait object bounds for the same trait - -The design here makes it possible to write bounds or trait objects that mention -the same trait, multiple times, with different inputs: - -```rust -fn mulit_add + Add>(t: T) -> T { ... } -fn mulit_add_obj(t: Box + Add>) -> Box + Add> { ... } -``` - -This seems like a potentially useful feature, and should be unproblematic for -bounds, but may have implications for vtables that make it problematic for trait -objects. Whether or not such trait combinations are allowed will likely depend -on implementation concerns, which are not yet clear. - -## Generic associated consts in match patterns - -It seems desirable to allow constants that depend on type parameters in match -patterns, but it's not clear how to do so while still checking exhaustiveness -and reachability of the match arms. Most likely this requires new forms of -where clause, to constrain associated constant values. - -For now, we simply defer the question. - -## Generic associated consts in array sizes - -It would be useful to be able to use trait-associated constants in generic code. - -```rust -// Shouldn't this be OK? -const ALIAS_N: usize = ::N; -let x: [u8; ::N] = [0u8; ALIAS_N]; -// Or... -let x: [u8; T::N + 1] = [0u8; T::N + 1]; -``` - -However, this causes some problems. What should we do with the following case in -type checking, where we need to prove that a generic is valid for any `T`? 
- -```rust -let x: [u8; T::N + T::N] = [0u8; 2 * T::N]; -``` - -We would like to handle at least some obvious cases (e.g. proving that -`T::N == T::N`), but without trying to prove arbitrary statements about -arithmetic. The question of how to do this is deferred. +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0195-associated-items.md). diff --git a/text/0469-feature-gate-box-patterns.md b/text/0469-feature-gate-box-patterns.md index a3a0b39c68b..d48ee436150 100644 --- a/text/0469-feature-gate-box-patterns.md +++ b/text/0469-feature-gate-box-patterns.md @@ -1,33 +1 @@ -- Start Date: 2014-11-17 -- RFC PR: [rust-lang/rfcs#469](https://github.com/rust-lang/rfcs/pull/469) -- Rust Issue: [rust-lang/rust#21931](https://github.com/rust-lang/rust/issues/21931) - -# Summary - -Move `box` patterns behind a feature gate. - -# Motivation - -A recent RFC (https://github.com/rust-lang/rfcs/pull/462) proposed renaming `box` patterns to `deref`. The discussion that followed indicates that while the language community may be in favour of some sort of renaming, there is no significant consensus around any concrete proposal, including the original one or any that emerged from the discussion. - -This RFC proposes moving `box` patterns behind a feature gate to postpone that discussion and decision to when it becomes more clear how `box` patterns should interact with types other than `Box`. - -In addition, in the future `box` patterns are expected to be made more general by enabling them to destructure any type that implements one of the `Deref` family of traits. As such a generalisation may potentially lead to some currently valid programs being rejected due to the interaction with type inference or other language features, it is desirable that this particular feature stays feature gated until then. 
- -# Detailed design - -A feature gate `box_patterns` will be defined and all uses of the `box` pattern will require said gate to be enabled. - -# Drawbacks - -Some currently valid Rust programs will have to opt in to another feature gate. - -# Alternatives - -Pursue https://github.com/rust-lang/rfcs/pull/462 before 1.0 and stabilise `box patterns` without a feature gate. - -Leave `box` patterns as-is without putting them behind a feature gate. - -# Unresolved questions - -None. +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0469-feature-gate-box-patterns.md). diff --git a/text/0509-collections-reform-part-2.md b/text/0509-collections-reform-part-2.md index eb43c9c6062..f770f32e5fe 100644 --- a/text/0509-collections-reform-part-2.md +++ b/text/0509-collections-reform-part-2.md @@ -1,362 +1 @@ -- Start Date: 2014-12-18 -- RFC PR: https://github.com/rust-lang/rfcs/pull/509 -- Rust Issue: https://github.com/rust-lang/rust/issues/19986 - -# Summary - -This RFC shores up the finer details of collections reform. In particular, where the -[previous RFC][part1] -focused on general conventions and patterns, this RFC focuses on specific APIs. It also patches -up any errors that were found during implementation of [part 1][part1]. Some of these changes -have already been implemented, and simply need to be ratified. - -# Motivation - -Collections reform stabilizes "standard" interfaces, but there's a lot that still needs to be -hashed out. 
- -# Detailed design - -## The fate of entire collections: - -* Stable: Vec, RingBuf, HashMap, HashSet, BTreeMap, BTreeSet, DList, BinaryHeap -* Unstable: Bitv, BitvSet, VecMap -* Move to [collect-rs](https://github.com/Gankro/collect-rs/) for incubation: -EnumSet, bitflags!, LruCache, TreeMap, TreeSet, TrieMap, TrieSet - -The stable collections have solid implementations, well-maintained APIs, are non-trivial, -fundamental, and clearly useful. - -The unstable collections are effectively "on probation". They're ok, but they need some TLC and -further consideration before we commit to having them in the standard library *forever*. Bitv in -particular won't have *quite* the right API without IndexGet *and* IndexSet. - -The collections being moved out are in poor shape. EnumSet is weird/trivial, bitflags is awkward, -LruCache is niche. Meanwhile Tree\* and Trie\* have simply bit-rotted for too long, without anyone -clearly stepping up to maintain them. Their code is scary, and their APIs are out of date. Their -functionality can also already reasonably be obtained through either HashMap or BTreeMap. - -Of course, instead of moving them out-of-tree, they could be left `experimental`, but that would -perhaps be a fate *worse* than death, as it would mean that these collections would only be -accessible to those who opt into running the Rust nightly. This way, these collections will be -available for everyone through the cargo ecosystem. Putting them in `collect-rs` also gives them -a chance to still benefit from a network effect and active experimentation. If they thrive there, -they may still return to the standard library at a later time. - -## Add the following methods: - -* To all collections -``` -/// Moves all the elements of `other` into `Self`, leaving `other` empty. -pub fn append(&mut self, other: &mut Self) -``` - -Collections know everything about themselves, and can therefore move data more -efficiently than any more generic mechanism. 
Vec's can safely trust their own capacity -and length claims. DList and TreeMap can also reuse nodes, avoiding allocating. - -This is by-ref instead of by-value for a couple reasons. First, it adds symmetry (one doesn't have -to be owned). Second, in the case of array-based structures, it allows `other`'s capacity to be -reused. This shouldn't have much expense in the way of making `other` valid, as almost all of our -collections are basically a no-op to make an empty version of if necessary (usually it amounts to -zeroing a few words of memory). BTree is the only exception the author is aware of (root is pre- -allocated -to avoid an Option). - -* To DList, Vec, RingBuf, BitV: -``` -/// Splits the collection into two at the given index. Useful for similar reasons as `append`. -pub fn split_off(&mut self, at: uint) -> Self; -``` - -* To all other "sorted" collections -``` -/// Splits the collection into two at the given key. Returns everything after the given key, -/// including the key. -pub fn split_off>(&mut self, at: B) -> Self; -``` - -Similar reasoning to `append`, although perhaps even more needed, as there's *no* other mechanism -for moving an entire subrange of a collection efficiently like this. `into_iterator` consumes -the whole collection, and using `remove` methods will do a lot of unnecessary work. For instance, -in the case of `Vec`, using `pop` and `push` will involve many length changes, bounds checks, -unwraps, and ultimately produce a *reversed* Vec. - -* To BitvSet, VecMap: - -``` -/// Reserves capacity for an element to be inserted at `len - 1` in the given -/// collection. The collection may reserve more space to avoid frequent reallocations. -pub fn reserve_len(&mut self, len: uint) - -/// Reserves the minimum capacity for an element to be inserted at `len - 1` in the given -/// collection. 
-pub fn reserve_len_exact(&mut self, len: uint) -``` - -The "capacity" of these two collections isn't really strongly related to the -number of elements they hold, but rather the largest index an element is stored at. -See Errata and Alternatives for extended discussion of this design. - -* For Ringbuf: -``` -/// Gets two slices that cover the whole range of the RingBuf. -/// The second one may be empty. Otherwise, it continues *after* the first. -pub fn as_slices(&'a self) -> (&'a [T], &'a [T]) -``` - -This provides some amount of support for viewing the RingBuf like a slice. Unfortunately -the RingBuf may be wrapped, making this impossible. See Alternatives for other designs. - -There is an implementation of this at rust-lang/rust#19903. - -* For Vec: -``` -/// Resizes the `Vec` in-place so that `len()` equals to `new_len`. -/// -/// Calls either `grow()` or `truncate()` depending on whether `new_len` -/// is larger than the current value of `len()` or not. -pub fn resize(&mut self, new_len: uint, value: T) where T: Clone -``` - -This is actually easy to implement out-of-tree on top of the current Vec API, but it has -been frequently requested. - -* For Vec, RingBuf, BinaryHeap, HashMap and HashSet: -``` -/// Clears the container, returning its owned contents as an iterator, but keeps the -/// allocated memory for reuse. -pub fn drain(&mut self) -> Drain; -``` - -This provides a way to grab elements out of a collection by value, without -deallocating the storage for the collection itself. - -There is a partial implementation of this at rust-lang/rust#19946. - -============== -## Deprecate - -* `Vec::from_fn(n, f)` use `(0..n).map(f).collect()` -* `Vec::from_elem(n, v)` use `repeat(v).take(n).collect()` -* `Vec::grow` use `extend(repeat(v).take(n))` -* `Vec::grow_fn` use `extend((0..n).map(f))` -* `dlist::ListInsertion` in favour of inherent methods on the iterator - -============== - -## Misc Stabilization: - -* Rename `BinaryHeap::top` to `BinaryHeap::peek`. 
`peek` is a more clear name than `top`, and is -already used elsewhere in our APIs. - -* `Bitv::get`, `Bitv::set`, where `set` panics on OOB, and `get` returns an Option. `set` may want -to wait on IndexSet being a thing (see Alternatives). - -* Rename SmallIntMap to VecMap. (already done) - -* Stabilize `front`/`back`/`front_mut`/`back_mut` for peeking on the ends of Deques - -* Explicitly specify HashMap's iterators to be non-deterministic between iterations. This would -allow e.g. `next_back` to be implemented as `next`, reducing code complexity. This can be undone -in the future backwards-compatibly, but the reverse does not hold. - -* Move `Vec` from `std::vec` to `std::collections::vec`. - -* Stabilize RingBuf::swap - -============== - -## Clarifications and Errata from Part 1 - -* Not every collection can implement every kind of iterator. This RFC simply wishes to clarify -that iterator implementation should be a "best effort" for what makes sense for the collection. - -* Bitv was marked as having *explicit* growth capacity semantics, when in fact it is implicit -growth. It has the same semantics as Vec. - -* BitvSet and VecMap are part of a surprise *fourth* capacity class, which isn't really based on -the number of elements contained, but on the maximum index stored. This RFC proposes the name of -*maximum growth*. - -* `reserve(x)` should specifically reserve space for `x + len()` elements, as opposed to e.g. `x + -capacity()` elements. - -* Capacity methods should be based on a "best effort" model: - - * `capacity()` can be regarded as a *lower bound* on the number of elements that can be - inserted before a resize occurs. It is acceptable for more elements to be insertable. A - collection may also randomly resize before capacity is met if highly degenerate behaviour - occurs. This is relevant to HashMap, which due to its use of integer multiplication cannot - precisely compute its "true" capacity. 
It also may wish to resize early if a long chain of - collisions occurs. Note that Vec should make *clear* guarantees about the precision of - capacity, as this is important for `unsafe` usage. - - * `reserve_exact` may be subverted by the collection's own requirements (e.g. many collections - require a capacity related to a power of two for fast modular arithmetic). The allocator may - also give the collection more space than it requests, in which case it may as well use that - space. It will still give you at least as much capacity as you request. - - * `shrink_to_fit` may not shrink to the true minimum size for similar reasons as - `reserve_exact`. - - * Neither `reserve` nor `reserve_exact` can be trusted to reliably produce a specific - capacity. At best you can guarantee that there will be space for the number you ask for. - Although even then `capacity` itself may return a smaller number due to its own fuzziness. - -============== - -## Entry API V2.0 - -The old Entry API: -``` -impl Map { - fn entry<'a>(&'a mut self, key: K) -> Entry<'a, K, V> -} - -pub enum Entry<'a, K: 'a, V: 'a> { - Occupied(OccupiedEntry<'a, K, V>), - Vacant(VacantEntry<'a, K, V>), -} - -impl<'a, K, V> VacantEntry<'a, K, V> { - fn set(self, value: V) -> &'a mut V -} - -impl<'a, K, V> OccupiedEntry<'a, K, V> { - fn get(&self) -> &V - fn get_mut(&mut self) -> &mut V - fn into_mut(self) -> &'a mut V - fn set(&mut self, value: V) -> V - fn take(self) -> V -} -``` - -Based on feedback and collections reform landing, this RFC proposes the following new API: - -``` -impl Map { - fn entry<'a, O: ToOwned>(&'a mut self, key: &O) -> Entry<'a, O, V> -} - -pub enum Entry<'a, O: 'a, V: 'a> { - Occupied(OccupiedEntry<'a, O, V>), - Vacant(VacantEntry<'a, O, V>), -} - -impl Entry<'a, O: 'a, V:'a> { - fn get(self) -> Result<&'a mut V, VacantEntry<'a, O, V>> -} - -impl<'a, K, V> VacantEntry<'a, K, V> { - fn insert(self, value: V) -> &'a mut V -} - -impl<'a, K, V> OccupiedEntry<'a, K, V> { - fn get(&self) 
-> &V - fn get_mut(&mut self) -> &mut V - fn into_mut(self) -> &'a mut V - fn insert(&mut self, value: V) -> V - fn remove(self) -> V -} -``` - -Replacing get/get_mut with Deref is simply a nice ergonomic improvement. Renaming `set` and `take` -to `insert` and `remove` brings the API more inline with other collection APIs, and makes it -more clear what they do. The convenience method on Entry itself makes it just nicer to use. -Permitting the following `map.entry(key).get().or_else(|vacant| vacant.insert(Vec::new()))`. - -This API should be stabilized for 1.0 with the exception of the impl on Entry itself. - -# Alternatives - -## Traits vs Inherent Impls on Entries -The Entry API as proposed would leave Entry and its two variants defined by each collection. We -could instead make the actual concrete VacantEntry/OccupiedEntry implementors implement a trait. -This would allow Entry to be hoisted up to root of collections, with utility functions implemented -once, as well as only requiring one import when using multiple collections. This *would* require -that the traits be imported, unless we get inherent trait implementations. - -These traits can of course be introduced later. - -============== - -## Alternatives to ToOwned on Entries -The Entry API currently is a bit wasteful in the by-value key case. If, for instance, a user of a -`HashMap` happens to have a String they don't mind losing, they can't pass the String by --value to the Map. They must pass it by-reference, and have it get cloned. - -One solution to this is to actually have the bound be IntoCow. This will potentially have some -runtime overhead, but it should be dwarfed by the cost of an insertion anyway, and would be a -clear win in the by-value case. - -Another alternative would be an *IntoOwned* trait, which would have the signature `(self) -> -Owned`, as opposed to the current ToOwned `(&self) -> Owned`. 
IntoOwned more closely matches the -semantics we actually want for our entry keys, because we really don't care about preserving them -after the conversion. This would allow us to dispatch to either a no-op or a full clone as -necessary. This trait would also be appropriate for the CoW type, and in fact all of our current -uses of the type. However the relationship between FromBorrow and IntoOwned is currently awkward -to express with our type system, as it would have to be implemented e.g. for `&str` instead of -`str`. IntoOwned also has trouble co-existing "fully" with ToOwned due to current lack of negative -bounds in where clauses. That is, we would want a blanket impl of IntoOwned for ToOwned, but this -can't be properly expressed for coherence reasons. - -This RFC does not propose either of these designs in favour of choosing the conservative ToOwned -now, with the possibility of "upgrading" into IntoOwned, IntoCow, or something else when we have a -better view of the type-system landscape. - -============== - -## Don't stabilize `Bitv::set` - -We could wait for IndexSet, Or make `set` return a result. -`set` really is redundant with an IndexSet implementation, and we -don't like to provide redundant APIs. On the other hand, it's kind of weird to have only `get`. - -============== - -## `reserve_index` vs `reserve_len` - -`reserve_len` is primarily motivated by BitvSet and VecMap, whose capacity semantics are largely -based around the largest index they have set, and not the number of elements they contain. This -design was chosen for its equivalence to `with_capacity`, as well as possible -future-proofing for adding it to other collections like `Vec` or `RingBuf`. - -However one could instead opt for `reserve_index`, which are effectively the same method, -but with an off-by-one. That is, `reserve_len(x) == reserve_index(x - 1)`. This more closely -matches the intent (let me have index `7`), but has tricky off-by-one with `capacity`. 
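The off-by-one relationship between the two spellings can be made concrete with a small sketch. `Slots` below is a hypothetical stand-in for an index-addressed collection such as `VecMap`, written against today's `Vec` API. It is not the RFC's implementation, and it assumes capacity is tracked by a backing `Vec`.

```rust
// Hypothetical index-addressed collection; a stand-in for VecMap-like types.
struct Slots {
    slots: Vec<Option<u32>>,
}

impl Slots {
    fn new() -> Self {
        Slots { slots: Vec::new() }
    }

    /// Reserves room so that an element can be stored at index `len - 1`.
    fn reserve_len(&mut self, len: usize) {
        // `Vec::reserve` takes the number of *additional* elements wanted
        // beyond the current length, so convert the target length first.
        let additional = len.saturating_sub(self.slots.len());
        self.slots.reserve(additional);
    }

    /// The same request phrased as "let me have index `index`".
    /// Assumes `index < usize::MAX`; a real API would need to handle overflow.
    fn reserve_index(&mut self, index: usize) {
        self.reserve_len(index + 1);
    }

    fn capacity(&self) -> usize {
        self.slots.capacity()
    }
}
```

With this shape, `reserve_index(7)` and `reserve_len(8)` make the same request, and comparing either against `capacity()` shows where the off-by-one has to be remembered: index `7` is usable once the capacity is at least `8`.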
- -Alternatively `reserve_len` could just be called `reserve_capacity`. - -============== - -## RingBuf `as_slice` - -Other designs for this usecase were considered: - -``` -/// Attempts to get a slice over all the elements in the RingBuf, but may instead -/// have to return two slices, in the case that the elements aren't contiguous. -pub fn as_slice(&'a self) -> RingBufSlice<'a, T> - -enum RingBufSlice<'a, T> { - Contiguous(&'a [T]), - Split((&'a [T], &'a [T])), -} -``` - -``` -/// Gets a slice over all the elements in the RingBuf. This may require shifting -/// all the elements to make this possible. -pub fn to_slice(&mut self) -> &[T] -``` - -The one settled on had the benefit of being the simplest. In particular, having the enum wasn't -very helpful, because most code would just create an empty slice anyway in the contiguous case -to avoid code-duplication. - -# Unresolved questions - -`reserve_index` vs `reserve_len` and `Ringbuf::as_slice` are the two major ones. - -[part1]: https://github.com/rust-lang/rfcs/blob/master/text/0235-collections-conventions.md +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0509-collections-reform-part-2.md). diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index a7f21e2440b..62ea53d4b51 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1,1914 +1 @@ -- Start Date: 2014-12-07 -- RFC PR: [rust-lang/rfcs#517](https://github.com/rust-lang/rfcs/pull/517) -- Rust Issue: [rust-lang/rust#21070](https://github.com/rust-lang/rust/issues/21070) - -# Summary -[Summary]: #summary - -This RFC proposes a significant redesign of the `std::io` and `std::os` modules -in preparation for API stabilization. The specific problems addressed by the -redesign are given in the [Problems] section below, and the key ideas of the -design are given in [Vision for IO]. 
- -# Note about RFC structure - -This RFC was originally posted as a single monolithic file, which made -it difficult to discuss different parts separately. - -It has now been split into a skeleton that covers (1) the problem -statement, (2) the overall vision and organization, and (3) the -`std::os` module. - -Other parts of the RFC are marked with `(stub)` and will be filed as -follow-up PRs against this RFC. - -# Table of contents -[Table of contents]: #table-of-contents -* [Summary] -* [Table of contents] -* [Problems] - * [Atomicity and the `Reader`/`Writer` traits] - * [Timeouts] - * [Posix and libuv bias] - * [Unicode] - * [stdio] - * [Overly high-level abstractions] - * [The error chaining pattern] -* [Detailed design] - * [Vision for IO] - * [Goals] - * [Design principles] - * [What cross-platform means] - * [Relation to the system-level APIs] - * [Platform-specific opt-in] - * [Proposed organization] - * [Revising `Reader` and `Writer`] - * [Read] - * [Write] - * [String handling] - * [Key observations] - * [The design: `os_str`] - * [The future] - * [Deadlines] (stub) - * [Splitting streams and cancellation] (stub) - * [Modules] - * [core::io] - * [Adapters] - * [Free functions] - * [Seeking] - * [Buffering] - * [Cursor] - * [The std::io facade] - * [Errors] - * [Channel adapters] - * [stdin, stdout, stderr] - * [Printing functions] - * [std::env] - * [std::fs] - * [Free functions] - * [Files] - * [File kinds] - * [File permissions] - * [std::net] - * [TCP] - * [UDP] - * [Addresses] - * [std::process] - * [Command] - * [Child] - * [std::os] - * [Odds and ends] - * [The io prelude] -* [Drawbacks] -* [Alternatives] -* [Unresolved questions] - -# Problems -[Problems]: #problems - -The `io` and `os` modules are the last large API surfaces of `std` that need to -be stabilized. While the basic functionality offered in these modules is -*largely* traditional, many problems with the APIs have emerged over time. 
The -RFC discusses the most significant problems below. - -This section only covers specific problems with the current library; see -[Vision for IO] for a higher-level view. - -## Atomicity and the `Reader`/`Writer` traits -[Atomicity and the `Reader`/`Writer` traits]: #atomicity-and-the-readerwriter-traits - -One of the most pressing -- but also most subtle -- problems with `std::io` is -the lack of *atomicity* in its `Reader` and `Writer` traits. - -For example, the `Reader` trait offers a `read_to_end` method: - -```rust -fn read_to_end(&mut self) -> IoResult<Vec<u8>> -``` - -Executing this method may involve many calls to the underlying `read` -method. And it is possible that the first several calls succeed, and then a call -returns an `Err` -- which, like `TimedOut`, could represent a transient -problem. Unfortunately, given the above signature, there is no choice but to -simply _throw this data away_. - -The `Writer` trait suffers from a more fundamental problem, since its primary -method, `write`, may actually involve several calls to the underlying system -- -and if a failure occurs, there is no indication of how much was written. - -Existing blocking APIs all have to deal with this problem, and Rust -can and should follow the existing tradition here. See -[Revising `Reader` and `Writer`] for the proposed solution. - -## Timeouts -[Timeouts]: #timeouts - -The `std::io` module supports "timeouts" on virtually all IO objects via a -`set_timeout` method. In this design, every IO object (file, socket, etc.) has -an optional timeout associated with it, and `set_timeout` mutates the associated -timeout. All subsequent blocking operations are implicitly subject to this timeout. - -This API choice suffers from two problems, one cosmetic and the other deeper: - -* The "timeout" is - [actually a *deadline*](https://github.com/rust-lang/rust/issues/15802) and - should be named accordingly.
- -* The stateful API has poor composability: when passing a mutable reference of - an IO object to another function, it's possible that the deadline has been - changed. In other words, users of the API can easily interfere with each other - by accident. - -See [Deadlines] for the proposed solution. - -## Posix and libuv bias -[Posix and libuv bias]: #posix-and-libuv-bias - -The current `io` and `os` modules were originally designed when `librustuv` was -providing IO support, and to some extent they reflect the capabilities and -conventions of `libuv` -- which in turn are loosely based on Posix. - -As such, the modules are not always ideal from a cross-platform standpoint, both -in terms of forcing Windows programming into a Posix mold, and also of offering -APIs that are not actually usable on all platforms. - -The modules have historically also provided *no* platform-specific APIs. - -Part of the goal of this RFC is to set out a clear and extensible story for both -cross-platform and platform-specific APIs in `std`. See [Design principles] for -the details. - -## Unicode -[Unicode]: #unicode - -Rust has followed the [utf8 everywhere](http://utf8everywhere.org/) approach to -its strings. However, at the borders to platform APIs, it is revealed that the -world is not, in fact, UTF-8 (or even Unicode) everywhere. - -Currently our story for platform APIs is that we either assume they can take or -return Unicode strings (suitably encoded) or an uninterpreted byte -sequence. Sadly, this approach does *not* actually cover all platform needs, and -is also not highly ergonomic as presently implemented. (Consider `os::getenv` -which introduces replacement characters (!) versus `os::getenv_as_bytes` which -yields a `Vec<u8>`; neither is ideal.) - -This topic was covered in some detail in the -[Path Reform RFC](https://github.com/rust-lang/rfcs/pull/474), but this RFC -gives a more general account in [String handling].
- -## `stdio` -[stdio]: #stdio - -The `stdio` module provides access to readers/writers for `stdin`, `stdout` and -`stderr`, which is essential functionality. However, it *also* provides a means -of changing e.g. "stdout" -- but there is no connection between these two! In -particular, `set_stdout` affects only the writer that `println!` and friends -use, while `set_stderr` affects `panic!`. - -This module needs to be clarified. See [The std::io facade] and -[Functionality moved elsewhere] for the detailed design. - -## Overly high-level abstractions -[Overly high-level abstractions]: #overly-high-level-abstractions - -There are a few places where `io` provides high-level abstractions over system -services without also providing more direct access to the service as-is. For example: - -* The `Writer` trait's `write` method -- a cornerstone of IO -- actually - corresponds to an unbounded number of invocations of writes to the underlying - IO object. This RFC changes `write` to follow more standard, lower-level - practice; see [Revising `Reader` and `Writer`]. - -* Objects like `TcpStream` are `Clone`, which involves a fair amount of - supporting infrastructure. This RFC tackles the problems that `Clone` was - trying to solve more directly; see [Splitting streams and cancellation]. - -The motivation for going lower-level is described in [Design principles] below. - -## The error chaining pattern -[The error chaining pattern]: #the-error-chaining-pattern - -The `std::io` module is somewhat unusual in that most of the functionality it -provides is used through a few key traits (like `Reader`) and these traits are in -turn "lifted" over `IoResult`: - -```rust -impl<R: Reader> Reader for IoResult<R> { ...
} -``` - -This lifting and others makes it possible to chain IO operations that might -produce errors, without any explicit mention of error handling: - -```rust -File::open(some_path).read_to_end() - ^~~~~~~~~~~ can produce an error - ^~~~ can produce an error -``` - -The result of such a chain is either `Ok` of the outcome, or `Err` of the first -error. - -While this pattern is highly ergonomic, it does not fit particularly well into -our evolving error story -([interoperation](https://github.com/rust-lang/rfcs/pull/201) or -[try blocks](https://github.com/rust-lang/rfcs/pull/243)), and it is the only -module in `std` to follow this pattern. - -Eventually, we would like to write - -```rust -File::open(some_path)?.read_to_end() -``` - -to take advantage of the `FromError` infrastructure, hook into error handling -control flow, and to provide good chaining ergonomics throughout *all* Rust APIs --- all while keeping this handling a bit more explicit via the `?` -operator. (See https://github.com/rust-lang/rfcs/pull/243 for the rough direction). - -In the meantime, this RFC proposes to phase out the use of impls for -`IoResult`. This will require use of `try!` for the time being. - -(Note: this may put some additional pressure on at least landing the basic use -of `?` instead of today's `try!` before 1.0 final.) - -# Detailed design -[Detailed design]: #detailed-design - -There's a lot of material here, so the RFC starts with high-level goals, -principles, and organization, and then works its way through the various modules -involved. - -## Vision for IO -[Vision for IO]: #vision-for-io - -Rust's IO story had undergone significant evolution, starting from a -`libuv`-style pure green-threaded model to a dual green/native model and now to -a [pure native model](https://github.com/rust-lang/rfcs/pull/230). 
Given that -history, it's worthwhile to set out explicitly what is, and is not, in scope for -`std::io`. - -### Goals -[Goals]: #goals - -For Rust 1.0, the aim is to: - -* Provide a *blocking* API based directly on the services provided by the native - OS for native threads. - - These APIs should cover the basics (files, basic networking, basic process - management, etc) and suffice to write servers following the classic Apache - thread-per-connection model. They should impose essentially zero cost over the - underlying OS services; the core APIs should map down to a single syscall - unless more are needed for cross-platform compatibility. - -* Provide basic blocking abstractions and building blocks (various stream and - buffer types and adapters) based on traditional blocking IO models but adapted - to fit well within Rust. - -* Provide hooks for integrating with low-level and/or platform-specific APIs. - -* Ensure reasonable forwards-compatibility with future async IO models. - -It is explicitly *not* a goal at this time to support asynchronous programming -models or nonblocking IO, nor is it a goal for the blocking APIs to eventually -be used in a nonblocking "mode" or style. - -Rather, the hope is that the basic abstractions of files, paths, sockets, and so -on will eventually be usable directly within an async IO programming model and/or -with nonblocking APIs. This is the case for most existing languages, which offer -multiple interoperating IO models. - -The *long term* intent is certainly to support async IO in some form, -but doing so will require new research and experimentation. - -### Design principles -[Design principles]: #design-principles - -Now that the scope has been clarified, it's important to lay out some broad -principles for the `io` and `os` modules. Many of these principles are already -being followed to some extent, but this RFC makes them more explicit and applies -them more uniformly.
- -#### What cross-platform means -[What cross-platform means]: #what-cross-platform-means - -Historically, Rust's `std` has always been "cross-platform", but as discussed in -[Posix and libuv bias] this hasn't always played out perfectly. The proposed -policy is below. **With these policies, the APIs should largely feel like part of -"Rust" rather than part of any legacy, and they should enable truly portable -code**. - -Except for an explicit opt-in (see [Platform-specific opt-in] below), all APIs -in `std` should be cross-platform: - -* The APIs should **only expose a service or a configuration if it is supported on - all platforms**, and if the semantics on those platforms is or can be made - loosely equivalent. (The latter requires exercising some - judgment). Platform-specific functionality can be handled separately - ([Platform-specific opt-in]) and interoperate with normal `std` abstractions. - - This policy rules out functions like `chown` which have a clear meaning on - Unix and no clear interpretation on Windows; the ownership and permissions - models are *very* different. - -* The APIs should **follow Rust's conventions**, including their naming, which - should be platform-neutral. - - This policy rules out names like `fstat` that are the legacy of a particular - platform family. - -* The APIs should **never directly expose the representation** of underlying - platform types, even if they happen to coincide on the currently-supported - platforms. Cross-platform types in `std` should be newtyped. - - This policy rules out exposing e.g. error numbers directly as an integer type. - -The next subsection gives detail on what these APIs should look like in relation -to system services. - -#### Relation to the system-level APIs -[Relation to the system-level APIs]: #relation-to-the-system-level-apis - -How should Rust APIs map into system services? This question breaks down along -several axes which are in tension with one another: - -* **Guarantees**.
The APIs provided in the mainline `io` modules should be - predominantly safe, aside from the occasional `unsafe` function. In - particular, the representation should be sufficiently hidden that most use - cases are safe by construction. Beyond memory safety, though, the APIs should - strive to provide a clear multithreaded semantics (using the `Send`/`Sync` - kinds), and should use Rust's type system to rule out various kinds of bugs - when it is reasonably ergonomic to do so (following the usual Rust - conventions). - -* **Ergonomics**. The APIs should present a Rust view of things, making use of - the trait system, newtypes, and so on to make system services fit well with - the rest of Rust. - -* **Abstraction/cost**. On the other hand, the abstractions introduced in `std` - must not induce significant costs over the system services -- or at least, - there must be a way to safely access the services directly without incurring - this penalty. When useful abstractions would impose an extra cost, they must - be pay-as-you-go. - -Putting the above bullets together, **the abstractions must be safe, and they -should be as high-level as possible without imposing a tax**. - -* **Coverage**. Finally, the `std` APIs should over time strive for full - coverage of non-niche, cross-platform capabilities. - -#### Platform-specific opt-in -[Platform-specific opt-in]: #platform-specific-opt-in - -Rust is a systems language, and as such it should expose seamless, no/low-cost -access to system services. In many cases, however, this cannot be done in a -cross-platform way, either because a given service is only available on some -platforms, or because providing a cross-platform abstraction over it would be -costly. - -This RFC proposes *platform-specific opt-in*: submodules of `os` that are named -by platform, and made available via `#[cfg]` switches. 
For example, `os::unix` -can provide APIs only available on Unix systems, and `os::linux` can drill -further down into Linux-only APIs. (You could even imagine subdividing by OS -versions.) This is "opt-in" in the sense that, like the `unsafe` keyword, it is -very easy to audit for potential platform-specificity: just search for -`os::anyplatform`. Moreover, by separating out subsets like `linux`, it's clear -exactly how specific the platform dependency is. - -The APIs in these submodules are intended to have the same flavor as other `io` -APIs and should interoperate seamlessly with cross-platform types, but: - -* They should be named according to the underlying system services when there is - a close correspondence. - -* They may reveal the underlying OS type if there is nothing to be gained by - hiding it behind an abstraction. - -For example, the `os::unix` module could provide a `stat` function that takes a -standard `Path` and yields a custom struct. More interestingly, `os::linux` -might include an `epoll` function that could operate *directly* on many `io` -types (e.g. various socket types), without any explicit conversion to a file -descriptor; that's what "seamless" means. - -Each of the platform modules will offer a custom `prelude` submodule, -intended for glob import, that includes all of the extension traits -applied to standard IO objects. - -The precise design of these modules is in the very early stages and will likely -remain `#[unstable]` for some time. - -### Proposed organization -[Proposed organization]: #proposed-organization - -The `io` module is currently the biggest in `std`, with an entire hierarchy -nested underneath; it mixes general abstractions/tools with specific IO objects. -The `os` module is currently a bit of a dumping ground for facilities that don't -fit into the `io` category. 
- -This RFC proposes to revamp the organization by flattening out the hierarchy -and clarifying the role of each module: - -``` -std - env environment manipulation - fs file system - io core io abstractions/adapters - prelude the io prelude - net networking - os - unix platform-specific APIs - linux .. - windows .. - os_str platform-sensitive string handling - process process management -``` - -In particular: - -* The contents of `os` will largely move to `env`, a new module for -inspecting and updating the "environment" (including environment variables, CPU -counts, arguments to `main`, and so on). - -* The `io` module will include things like `Reader` and `BufferedWriter` -- - cross-cutting abstractions that are needed throughout IO. - - The `prelude` submodule will export all of the traits and most of the types - for IO-related APIs; a single glob import should suffice to set you up for - working with IO. (Note: this goes hand-in-hand with *removing* the bits of - `io` currently in the prelude, as - [recently proposed](https://github.com/rust-lang/rfcs/pull/503).) - -* The root `os` module is used purely to house the platform submodules discussed - [above](#platform-specific-opt-in). - -* The `os_str` module is part of the solution to the Unicode problem; see - [String handling] below. - -* The `process` module over time will grow to include querying/manipulating - already-running processes, not just spawning them. - -## Revising `Reader` and `Writer` -[Revising `Reader` and `Writer`]: #revising-reader-and-writer - -The `Reader` and `Writer` traits are the backbone of IO, representing -the ability to (respectively) pull bytes from and push bytes to an IO -object. The core operations provided by these traits follow a very -long tradition for blocking IO, but they are still surprisingly subtle --- and they need to be revised. - -* **Atomicity and data loss**.
As discussed - [above](#atomicity-and-the-readerwriter-traits), the `Reader` and - `Writer` traits currently expose methods that involve multiple - actual reads or writes, and data is lost when an error occurs after - some (but not all) operations have completed. - - The proposed strategy for `Reader` operations is to (1) separate out - various deserialization methods into a distinct framework, (2) - *never* have the internal `read` implementations loop on errors, (3) - cut down on the number of non-atomic read operations and (4) adjust - the remaining operations to provide more flexibility when possible. - - For writers, the main - change is to make `write` only perform a single underlying write - (returning the number of bytes written on success), and provide a - separate `write_all` method. - -* **Parsing/serialization**. The `Reader` and `Writer` traits - currently provide a large number of default methods for - (de)serialization of various integer types to bytes with a given - endianness. Unfortunately, these operations pose atomicity problems - as well (e.g., a read could fail after reading two of the bytes - needed for a `u32` value). - - Rather than complicate the signatures of these methods, the - (de)serialization infrastructure is removed entirely -- in favor of - instead eventually introducing a much richer - parsing/formatting/(de)serialization framework that works seamlessly - with `Reader` and `Writer`. - - Such a framework is out of scope for this RFC, but the - endian-sensitive functionality will be provided elsewhere - (likely out of tree). - -With those general points out of the way, let's look at the details. - -### `Read` -[Read]: #read - -The updated `Reader` trait (and its extension) is as follows: - -```rust -trait Read { - fn read(&mut self, buf: &mut [u8]) -> Result<uint, Error>; - - fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<(), Error> { ... } - fn read_to_string(&mut self, buf: &mut String) -> Result<(), Error> { ...
} -} - -// extension trait needed for object safety -trait ReadExt: Read { - fn bytes(&mut self) -> Bytes { ... } - - ... // more to come later in the RFC -} -impl ReadExt for R {} -``` - -Following the -[trait naming conventions](https://github.com/rust-lang/rfcs/pull/344), -the trait is renamed to `Read` reflecting the clear primary method it -provides. - -The `read` method should not involve internal looping (even over -errors like `EINTR`). It is intended to faithfully represent a single -call to an underlying system API. - -The `read_to_end` and `read_to_string` methods now take explicit -buffers as input. This has multiple benefits: - -* Performance. When it is known that reading will involve some large - number of bytes, the buffer can be preallocated in advance. - -* "Atomicity" concerns. For `read_to_end`, it's possible to use this - API to retain data collected so far even when a `read` fails in the - middle. For `read_to_string`, this is not the case, because UTF-8 - validity cannot be ensured in such cases; but if intermediate - results are wanted, one can use `read_to_end` and convert to a - `String` only at the end. - -Convenience methods like these will retry on `EINTR`. This is partly -under the assumption that in practice, EINTR will *most often* arise -when interfacing with other code that changes a signal handler. Due to -the global nature of these interactions, such a change can suddenly -cause your own code to get an error irrelevant to it, and the code -should probably just retry in those cases. In the case where you are -using EINTR explicitly, `read` and `write` will be available to handle -it (and you can always build your own abstractions on top). - -#### Removed methods - -The proposed `Read` trait is much slimmer than today's `Reader`. The vast -majority of removed methods are parsing/deserialization, which were -discussed above. 
- -The remaining methods (`read_exact`, `read_at_least`, `push`, -`push_at_least`) were removed for various reasons: - -* `read_exact`, `read_at_least`: these are somewhat more obscure - conveniences that are not particularly robust due to lack of - atomicity. - -* `push`, `push_at_least`: these are special-cases for working with - `Vec`, which this RFC proposes to replace with a more general - mechanism described next. - -To provide some of this functionality in a more compositional way, -extend `Vec` with an unsafe method: - -```rust -unsafe fn with_extra(&mut self, n: uint) -> &mut [T]; -``` - -This method is equivalent to calling `reserve(n)` and then providing a -slice to the memory starting just after `len()` entries. Using this -method, clients of `Read` can easily recover the `push` method. - -### `Write` -[Write]: #write - -The `Writer` trait is cut down to even smaller size: - -```rust -trait Write { - fn write(&mut self, buf: &[u8]) -> Result<uint, Error>; - fn flush(&mut self) -> Result<(), Error>; - - fn write_all(&mut self, buf: &[u8]) -> Result<(), Error> { .. } - fn write_fmt(&mut self, fmt: &fmt::Arguments) -> Result<(), Error> { .. } -} -``` - -The biggest change here is to the semantics of `write`. Instead of -repeatedly writing to the underlying IO object until all of `buf` is -written, it attempts a *single* write and on success returns the -number of bytes written. This follows the long tradition of blocking -IO, and is a more fundamental building block than the looping write we -currently have. Like `read`, it will propagate EINTR. - -For convenience, `write_all` recovers the behavior of today's `write`, -looping until either the entire buffer is written or an error -occurs. To meaningfully recover from an intermediate error and keep -writing, code should work with `write` directly. Like the `Read` -conveniences, `EINTR` results in a retry. - -The `write_fmt` method, like `write_all`, will loop until its entire -input is written or an error occurs.
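The split between a single-shot `write` and a looping `write_all` can be sketched against today's `std::io::Write`, whose signatures (`io::Result<usize>`) differ slightly from the ones proposed in this RFC. `ChunkyWriter` and `write_all_sketch` are hypothetical names used only for illustration; the loop is an assumption about how such a convenience might be built, not the library's actual implementation.

```rust
use std::io::{self, Write};

/// Hypothetical writer that accepts at most `max` bytes per call,
/// mimicking an OS-level write that completes only partially.
struct ChunkyWriter {
    out: Vec<u8>,
    max: usize,
}

impl Write for ChunkyWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // A single underlying "write": report how many bytes were taken.
        let n = buf.len().min(self.max);
        self.out.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Illustrative loop in the spirit of `write_all`: keep calling `write`
/// until the buffer is drained, retrying when the call was interrupted.
fn write_all_sketch<W: Write>(w: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match w.write(buf) {
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "writer accepted zero bytes",
                ))
            }
            Ok(n) => buf = &buf[n..],
            // EINTR surfaces as `Interrupted`; the convenience retries.
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => {}
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

Code that needs to know exactly how much was written before a failure would call `write` directly and track the returned counts itself.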
- -The other methods include endian conversions (covered by -serialization) and a few conveniences like `write_str` for other basic -types. The latter, at least, is already uniformly (and extensibly) -covered via the `write!` macro. The other helpers, as with `Read`, -should migrate into a more general (de)serialization library. - -## String handling -[String handling]: #string-handling - -The fundamental problem with Rust's full embrace of UTF-8 strings is that not -all strings taken or returned by system APIs are Unicode, let alone UTF-8 -encoded. - -In the past, `std` has assumed that all strings are *either* in some form of -Unicode (Windows), *or* are simply `u8` sequences (Unix). Unfortunately, this is -wrong, and the situation is more subtle: - -* Unix platforms do indeed work with arbitrary `u8` sequences (without interior - nulls) and today's platforms usually interpret them as UTF-8 when displayed. - -* Windows, however, works with *arbitrary `u16` sequences* that are roughly - interpreted as UTF-16, but may not actually be valid UTF-16 -- an "encoding" - often called UCS-2; see http://justsolve.archiveteam.org/wiki/UCS-2 for a bit - more detail. - -What this means is that all of Rust's platforms go beyond Unicode, but they do -so in different and incompatible ways. - -The current solution of providing both `str` and `[u8]` versions of -APIs is therefore problematic for multiple reasons. For one, **the -`[u8]` versions are not actually cross-platform** -- even today, they -panic on Windows when given non-UTF-8 data, a platform-specific -behavior. But they are also incomplete, because on Windows you should -be able to work directly with UCS-2 data. - -### Key observations -[Key observations]: #key-observations - -Fortunately, there is a solution that fits well with Rust's UTF-8 strings *and* -offers the possibility of platform-specific APIs. - -**Observation 1**: it is possible to re-encode UCS-2 data in a way that is also - compatible with UTF-8.
This is the - [WTF-8 encoding format](http://simonsapin.github.io/wtf-8/) proposed by Simon - Sapin. This encoding has some remarkable properties: - -* Valid UTF-8 data is valid WTF-8 data. When decoded to UCS-2, the result is - exactly what would be produced by going straight from UTF-8 to UTF-16. In - other words, making up some methods: - - ```rust - my_utf8_data.to_wtf8().to_ucs2().as_u16_slice() == my_utf8_data.to_utf16().as_u16_slice() - ``` - -* Valid UTF-16 data re-encoded as WTF-8 produces the corresponding UTF-8 data: - - ```rust - my_utf16_data.to_wtf8().as_bytes() == my_utf16_data.to_utf8().as_bytes() - ``` - -These two properties mean that, when working with Unicode data, the WTF-8 -encoding is highly compatible with both UTF-8 *and* UTF-16. In particular, the -conversion from a Rust string to a WTF-8 string is a no-op, and the conversion -in the other direction is just a validation. - -**Observation 2**: all platforms can *consume* Unicode data (suitably - re-encoded), and it's also possible to validate the data they produce as - Unicode and extract it. - -**Observation 3**: the non-Unicode spaces on various platforms are deeply - incompatible: there is no standard way to port non-Unicode data from one to - another. Therefore, the only cross-platform APIs are those that work entirely - with Unicode. - -### The design: `os_str` -[The design: `os_str`]: #the-design-os_str - -The observations above lead to a somewhat radical new treatment of strings, -first proposed in the -[Path Reform RFC](https://github.com/rust-lang/rfcs/pull/474). This RFC proposes -to introduce new string and string slice types that (opaquely) represent -*platform-sensitive strings*, housed in the `std::os_str` module. - -The `OsString` type is analogous to `String`, and `OsStr` is analogous to `str`.
-Their backing implementation is platform-dependent, but they offer a -cross-platform API: - -```rust -pub mod os_str { - /// Owned OS strings - struct OsString { - inner: imp::Buf - } - /// Slices into OS strings - struct OsStr { - inner: imp::Slice - } - - // Platform-specific implementation details: - #[cfg(unix)] - mod imp { - type Buf = Vec; - type Slice = [u8]; - ... - } - - #[cfg(windows)] - mod imp { - type Buf = Wtf8Buf; // See https://github.com/SimonSapin/rust-wtf8 - type Slice = Wtf8; - ... - } - - impl OsString { - pub fn from_string(String) -> OsString; - pub fn from_str(&str) -> OsString; - pub fn as_slice(&self) -> &OsStr; - pub fn into_string(Self) -> Result; - pub fn into_string_lossy(Self) -> String; - - // and ultimately other functionality typically found on vectors, - // but CRUCIALLY NOT as_bytes - } - - impl Deref for OsString { ... } - - impl OsStr { - pub fn from_str(value: &str) -> &OsStr; - pub fn as_str(&self) -> Option<&str>; - pub fn to_string_lossy(&self) -> CowString; - - // and ultimately other functionality typically found on slices, - // but CRUCIALLY NOT as_bytes - } - - trait IntoOsString { - fn into_os_str_buf(self) -> OsString; - } - - impl IntoOsString for OsString { ... } - impl<'a> IntoOsString for &'a OsStr { ... } - - ... -} -``` - -These APIs make OS strings appear roughly as opaque vectors (you -cannot see the byte representation directly), and can always be -produced starting from Unicode data. They make it possible to collapse -functions like `getenv` and `getenv_as_bytes` into a single function -that produces an OS string, allowing the client to decide how (or -whether) to extract Unicode data. It will be possible to do things -like concatenate OS strings without ever going through Unicode. - -It will also likely be possible to do things like search for Unicode -substrings. The exact details of the API are left open and are likely -to grow over time. 
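As a rough illustration of working opaquely with OS strings, here is a minimal sketch using the types as they exist in today's standard library under `std::ffi`. That module path and the `to_str` method name are assumptions relative to this RFC, which proposes `std::os_str` and `as_str`. The helper names `join_os` and `describe` are hypothetical.

```rust
use std::ffi::{OsStr, OsString};

/// Concatenates two OS strings without ever converting to Unicode.
fn join_os(a: &OsStr, b: &OsStr) -> OsString {
    let mut out = a.to_os_string();
    out.push(b);
    out
}

/// Lets the caller decide how (or whether) to extract Unicode data.
fn describe(s: &OsStr) -> String {
    match s.to_str() {
        // The data happened to be valid Unicode: borrow it directly.
        Some(valid) => format!("unicode: {}", valid),
        // Otherwise fall back to a lossy rendering with replacement characters.
        None => format!("lossy: {}", s.to_string_lossy()),
    }
}
```

For example, `join_os(OsStr::new("foo"), OsStr::new("bar"))` yields an `OsString` that `into_string()` converts back to `Ok("foobar".to_string())`, since both inputs started out as Unicode.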
- -In addition to APIs like the above, there will also be -platform-specific ways of viewing or constructing OS strings that -reveal more about the space of possible values: - -```rust -pub mod os { - #[cfg(unix)] - pub mod unix { - trait OsStringExt { - fn from_vec(Vec<u8>) -> Self; - fn into_vec(Self) -> Vec<u8>; - } - - impl OsStringExt for os_str::OsString { ... } - - trait OsStrExt { - fn as_byte_slice(&self) -> &[u8]; - fn from_byte_slice(&[u8]) -> &Self; - } - - impl OsStrExt for os_str::OsStr { ... } - - ... - } - - #[cfg(windows)] - pub mod windows{ - // The following extension traits provide a UCS-2 view of OS strings - - trait OsStringExt { - fn from_wide_slice(&[u16]) -> Self; - } - - impl OsStringExt for os_str::OsString { ... } - - trait OsStrExt { - fn to_wide_vec(&self) -> Vec<u16>; - } - - impl OsStrExt for os_str::OsStr { ... } - - ... - } - - ... -} -``` - -By placing these APIs under `os`, using them requires a clear *opt in* -to platform-specific functionality. - -### The future -[The future]: #the-future - -Introducing an additional string type is a bit daunting, since many -existing APIs take and consume only standard Rust strings. Today's -solution demands that strings coming from the OS be assumed or turned -into Unicode, and the proposed API continues to allow that (with more -explicit and finer-grained control). - -In the long run, however, robust applications are likely to work -opaquely with OS strings far beyond the boundary to the system to -avoid data loss and ensure maximal compatibility. If this situation -becomes common, it should be possible to introduce an abstraction over -various string types and generalize most functions that work with -`String`/`str` to instead work generically. This RFC does *not* -propose taking any such steps now -- but it's important that we *can* -do so later if Rust's standard strings turn out to not be sufficient -and OS strings become commonplace.
- -## Deadlines -[Deadlines]: #deadlines - -> To be added in a follow-up PR. - -## Splitting streams and cancellation -[Splitting streams and cancellation]: #splitting-streams-and-cancellation - -> To be added in a follow-up PR. - -## Modules -[Modules]: #modules - -Now that we've covered the core principles and techniques used -throughout IO, we can go on to explore the modules in detail. - -### `core::io` -[core::io]: #coreio - -Ideally, the `io` module will be split into the parts that can live in -`libcore` (most of it) and the parts that are added in the `std::io` -facade. This part of the organization is non-normative, since it -requires changes to today's `IoError` (which currently references -`String`); if these changes cannot be performed, everything here will -live in `std::io`. - -#### Adapters -[Adapters]: #adapters - -The current `std::io::util` module offers a number of `Reader` and -`Writer` "adapters". This RFC refactors the design to more closely -follow `std::iter`. Along the way, it generalizes the `by_ref` adapter: - -```rust -trait ReadExt: Read { - // ... eliding the methods already described above - - // Postfix version of `(&mut self)` - fn by_ref(&mut self) -> &mut Self { ... } - - // Read everything from `self`, then read from `next` - fn chain(self, next: R) -> Chain { ... } - - // Adapt `self` to yield only the first `limit` bytes - fn take(self, limit: u64) -> Take { ... } - - // Whenever reading from `self`, push the bytes read to `out` - #[unstable] // uncertain semantics of errors "halfway through the operation" - fn tee(self, out: W) -> Tee { ... } -} - -trait WriteExt: Write { - // Postfix version of `(&mut self)` - fn by_ref<'a>(&'a mut self) -> &mut Self { ... } - - // Whenever bytes are written to `self`, write them to `other` as well - #[unstable] // uncertain semantics of errors "halfway through the operation" - fn broadcast(self, other: W) -> Broadcast { ... } -} - -// An adaptor converting an `Iterator` to `Read`. 
-pub struct IterReader { ... } -``` - -As with `std::iter`, these adapters are object unsafe and hence placed -in an extension trait with a blanket `impl`. - -#### Free functions -[Free functions]: #free-functions - -The current `std::io::util` module also includes a number of primitive -readers and writers, as well as `copy`. These are updated as follows: - -```rust -// A reader that yields no bytes -fn empty() -> Empty; // in theory just returns `impl Read` - -impl Read for Empty { ... } - -// A reader that yields `byte` repeatedly (generalizes today's ZeroReader) -fn repeat(byte: u8) -> Repeat; - -impl Read for Repeat { ... } - -// A writer that ignores the bytes written to it (/dev/null) -fn sink() -> Sink; - -impl Write for Sink { ... } - -// Copies all data from a `Read` to a `Write`, returning the amount of data -// copied. -pub fn copy(r: &mut R, w: &mut W) -> Result -``` - -Like `write_all`, the `copy` method will discard the amount of data already -written on any error and also discard any partially read data on a `write` -error. This method is intended to be a convenience and `write` should be used -directly if this is not desirable. - -#### Seeking -[Seeking]: #seeking - -The seeking infrastructure is largely the same as today's, except that -`tell` is removed and the `seek` signature is refactored with more precise -types: - -```rust -pub trait Seek { - // returns the new position after seeking - fn seek(&mut self, pos: SeekFrom) -> Result; -} - -pub enum SeekFrom { - Start(u64), - End(i64), - Current(i64), -} -``` - -The old `tell` function can be regained via `seek(SeekFrom::Current(0))`. 
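A minimal sketch of the `tell` equivalence described above, assuming today's `std::io` (where `seek` returns `io::Result<u64>`). The `tell` helper is a hypothetical name used only for illustration and is not part of the proposal:

```rust
use std::io::{self, Cursor, Seek, SeekFrom};

// Hypothetical helper: recovers the old `tell` behaviour by seeking
// zero bytes relative to the current position.
fn tell<S: Seek>(stream: &mut S) -> io::Result<u64> {
    stream.seek(SeekFrom::Current(0))
}

fn main() -> io::Result<()> {
    // An in-memory buffer stands in for a real file here.
    let mut cursor = Cursor::new(vec![1u8, 2, 3, 4, 5]);

    cursor.seek(SeekFrom::Start(2))?;
    assert_eq!(tell(&mut cursor)?, 2);

    // `End` takes a signed offset, so -1 lands on the last byte.
    cursor.seek(SeekFrom::End(-1))?;
    assert_eq!(tell(&mut cursor)?, 4);
    Ok(())
}
```

Because the offset is zero, the call only reports the position and leaves it unchanged.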
- -#### Buffering -[Buffering]: #buffering - -The current `Buffer` trait will be renamed to `BufRead` for -clarity (and to open the door to `BufWrite` at some later -point): - -```rust -pub trait BufRead: Read { - fn fill_buf(&mut self) -> Result<&[u8], Error>; - fn consume(&mut self, amt: uint); - - fn read_until(&mut self, byte: u8, buf: &mut Vec) -> Result<(), Error> { ... } - fn read_line(&mut self, buf: &mut String) -> Result<(), Error> { ... } -} - -pub trait BufReadExt: BufRead { - // Split is an iterator over Result, Error> - fn split(&mut self, byte: u8) -> Split { ... } - - // Lines is an iterator over Result - fn lines(&mut self) -> Lines { ... }; - - // Chars is an iterator over Result - fn chars(&mut self) -> Chars { ... } -} -``` - -The `read_until` and `read_line` methods are changed to take explicit, -mutable buffers, for similar reasons to `read_to_end`. (Note that -buffer reuse is particularly common for `read_line`). These functions -include the delimiters in the strings they produce, both for easy -cross-platform compatibility (in the case of `read_line`) and for ease -in copying data without loss (in particular, distinguishing whether -the last line included a final delimiter). - -The `split` and `lines` methods provide iterator-based versions of -`read_until` and `read_line`, and *do not* include the delimiter in -their output. This matches conventions elsewhere (like `split` on -strings) and is usually what you want when working with iterators. 
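The delimiter behaviour described above can be illustrated with a short sketch against today's `std::io`. Note the assumption: in current Rust `read_line` returns `io::Result<usize>` rather than the `Result<(), Error>` shown in the signatures above. The function names `first_line_raw` and `all_lines` are hypothetical:

```rust
use std::io::{BufRead, Cursor};

// `read_line` appends to a caller-supplied buffer and keeps the
// trailing newline, so no information about the input is lost.
fn first_line_raw(input: &str) -> String {
    let mut reader = Cursor::new(input.as_bytes());
    let mut buf = String::new();
    reader.read_line(&mut buf).expect("in-memory read cannot fail");
    buf
}

// `lines` is the iterator-based counterpart and strips the delimiter.
fn all_lines(input: &str) -> Vec<String> {
    Cursor::new(input.as_bytes())
        .lines()
        .map(|line| line.expect("in-memory read cannot fail"))
        .collect()
}

fn main() {
    assert_eq!(first_line_raw("alpha\nbeta\n"), "alpha\n");
    assert_eq!(
        all_lines("alpha\nbeta\n"),
        vec!["alpha".to_string(), "beta".to_string()]
    );
}
```

With `read_line`, a caller can tell whether the final line ended in a newline by inspecting the buffer; with `lines`, that distinction is deliberately dropped.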
- -The `BufReader`, `BufWriter` and `BufStream` types stay -essentially as they are today, except that for streams and writers the -`into_inner` method yields the structure back in the case of a write error, -and its behavior is clarified to writing out the buffered data without -flushing the underlying reader: -```rust -// If writing fails, you get the unwritten data back -fn into_inner(self) -> Result>; - -pub struct IntoInnerError(W, Error); - -impl IntoInnerError { - pub fn error(&self) -> &Error { ... } - pub fn into_inner(self) -> W { ... } -} -impl FromError> for Error { ... } -``` - -#### `Cursor` -[Cursor]: #cursor - -Many applications want to view in-memory data as either an implementor of `Read` -or `Write`. This is often useful when composing streams or creating test cases. -This functionality primarily comes from the following implementations: - -```rust -impl<'a> Read for &'a [u8] { ... } -impl<'a> Write for &'a mut [u8] { ... } -impl Write for Vec { ... } -``` - -While efficient, none of these implementations support seeking (via an -implementation of the `Seek` trait). The implementations of `Read` and `Write` -for these types is not quite as efficient when `Seek` needs to be used, so the -`Seek`-ability will be opted-in to with a new `Cursor` structure with the -following API: - -```rust -pub struct Cursor { - pos: u64, - inner: T, -} -impl Cursor { - pub fn new(inner: T) -> Cursor; - pub fn into_inner(self) -> T; - pub fn get_ref(&self) -> &T; -} - -// Error indicating that a negative offset was seeked to. -pub struct NegativeOffset; - -impl Seek for Cursor> { ... } -impl<'a> Seek for Cursor<&'a [u8]> { ... } -impl<'a> Seek for Cursor<&'a mut [u8]> { ... } - -impl Read for Cursor> { ... } -impl<'a> Read for Cursor<&'a [u8]> { ... } -impl<'a> Read for Cursor<&'a mut [u8]> { ... } - -impl BufRead for Cursor> { ... } -impl<'a> BufRead for Cursor<&'a [u8]> { ... } -impl<'a> BufRead for Cursor<&'a mut [u8]> { ... 
} - -impl<'a> Write for Cursor<&'a mut [u8]> { ... } -impl Write for Cursor> { ... } -``` - -A sample implementation can be found in [a gist][cursor-impl]. Using one -`Cursor` structure allows to emphasize that the only ability added is an -implementation of `Seek` while still allowing all possible I/O operations for -various types of buffers. - -[cursor-impl]: https://gist.github.com/alexcrichton/8224f57ed029929447bd - -It is not currently proposed to unify these implementations via a trait. For -example a `Cursor>` is a reasonable instance to have, but it will not -have an implementation listed in the standard library to start out. It is -considered a backwards-compatible addition to unify these various `impl` blocks -with a trait. - -The following types will be removed from the standard library and replaced as -follows: - -* `MemReader` -> `Cursor>` -* `MemWriter` -> `Cursor>` -* `BufReader` -> `Cursor<&[u8]>` or `Cursor<&mut [u8]>` -* `BufWriter` -> `Cursor<&mut [u8]>` - -### The `std::io` facade -[The std::io facade]: #the-stdio-facade - -The `std::io` module will largely be a facade over `core::io`, but it -will add some functionality that can live only in `std`. - -#### `Errors` -[Errors]: #error - -The `IoError` type will be renamed to `std::io::Error`, following our -[non-prefixing convention](https://github.com/rust-lang/rfcs/pull/356). -It will remain largely as it is today, but its fields will be made -private. It may eventually grow a field to track the underlying OS -error code. - -The `std::io::IoErrorKind` type will become `std::io::ErrorKind`, and -`ShortWrite` will be dropped (it is no longer needed with the new -`Write` semantics), which should decrease its footprint. The -`OtherIoError` variant will become `Other` now that `enum`s are -namespaced. Other variants may be added over time, such as `Interrupted`, -as more errors are classified from the system. 
- -The `EndOfFile` variant will be removed in favor of returning `Ok(0)` -from `read` on end of file (or `write` on an empty slice for example). This -approach clarifies the meaning of the return value of `read`, matches Posix -APIs, and makes it easier to use `try!` in the case that a "real" error should -be bubbled out. (The main downside is that higher-level operations that might -use `Result` with some `T != usize` may need to wrap `IoError` in a -further enum if they wish to forward unexpected EOF.) - -#### Channel adapters -[Channel adapters]: #channel-adapters - -The `ChanReader` and `ChanWriter` adapters will be left as they are today, and -they will remain `#[unstable]`. The channel adapters currently suffer from a few -problems today, some of which are inherent to the design: - -* Construction is somewhat unergonomic. First a `mpsc` channel pair must be - created and then each half of the reader/writer needs to be created. -* Each call to `write` involves moving memory onto the heap to be sent, which - isn't necessarily efficient. -* The design of `std::sync::mpsc` allows for growing more channels in the - future, but it's unclear if we'll want to continue to provide a reader/writer - adapter for each channel we add to `std::sync`. - -These types generally feel as if they're from a different era of Rust (which -they are!) and may take some time to fit into the current standard library. They -can be reconsidered for stabilization after the dust settles from the I/O -redesign as well as the recent `std::sync` redesign. At this time, however, this -RFC recommends they remain unstable. 
- -#### `stdin`, `stdout`, `stderr` -[stdin, stdout, stderr]: #stdin-stdout-stderr - -The current `stdio` module will be removed in favor of these constructors in the -`io` module: - -```rust -pub fn stdin() -> Stdin; -pub fn stdout() -> Stdout; -pub fn stderr() -> Stderr; -``` - -* `stdin` - returns a handle to a **globally shared** standard input of - the process which is buffered as well. Due to the globally shared nature of - this handle, all operations on `Stdin` directly will acquire a lock internally - to ensure access to the shared buffer is synchronized. This implementation - detail is also exposed through a `lock` method where the handle can be - explicitly locked for a period of time so relocking is not necessary. - - The `Read` trait will be implemented directly on the returned `Stdin` handle - but the `BufRead` trait will not be (due to synchronization concerns). The - locked version of `Stdin` (`StdinLock`) will provide an implementation of - `BufRead`. - - The design will largely be the same as is today with the `old_io` module. - - ```rust - impl Stdin { - fn lock(&self) -> StdinLock; - fn read_line(&mut self, into: &mut String) -> io::Result<()>; - fn read_until(&mut self, byte: u8, into: &mut Vec) -> io::Result<()>; - } - impl Read for Stdin { ... } - impl Read for StdinLock { ... } - impl BufRead for StdinLock { ... } - ``` - -* `stderr` - returns a **non buffered** handle to the standard error output - stream for the process. Each call to `write` will roughly translate to a - system call to output data when written to `stderr`. This handle is locked - like `stdin` to ensure, for example, that calls to `write_all` are atomic with - respect to one another. There will also be an RAII guard to lock the handle - and use the result as an instance of `Write`. - - ```rust - impl Stderr { - fn lock(&self) -> StderrLock; - } - impl Write for Stderr { ... } - impl Write for StderrLock { ... 
} - ``` - -* `stdout` - returns a **globally buffered** handle to the standard output of - the current process. The amount of buffering can be decided at runtime to - allow for different situations such as being attached to a TTY or being - redirected to an output file. The `Write` trait will be implemented for this - handle, and like `stderr` it will be possible to lock it and then use the - result as an instance of `Write` as well. - - ```rust - impl Stdout { - fn lock(&self) -> StdoutLock; - } - impl Write for Stdout { ... } - impl Write for StdoutLock { ... } - ``` - -#### Windows and stdio -[Windows stdio]: #windows-and-stdio - -On Windows, standard input and output handles can work with either arbitrary -`[u8]` or `[u16]` depending on the state at runtime. For example a program -attached to the console will work with arbitrary `[u16]`, but a program attached -to a pipe would work with arbitrary `[u8]`. - -To handle this difference, the following behavior will be enforced for the -standard primitives listed above: - -* If attached to a pipe then no attempts at encoding or decoding will be done, - the data will be ferried through as `[u8]`. - -* If attached to a console, then `stdin` will attempt to interpret all input as - UTF-16, re-encoding into UTF-8 and returning the UTF-8 data instead. This - implies that data will be buffered internally to handle partial reads/writes. - Invalid UTF-16 will simply be discarded returning an `io::Error` explaining - why. - -* If attached to a console, then `stdout` and `stderr` will attempt to interpret - input as UTF-8, re-encoding to UTF-16. If the input is not valid UTF-8 then an - error will be returned and no data will be written. - -#### Raw stdio -[Raw stdio]: #raw-stdio - -> **Note**: This section is intended to be a sketch of possible raw stdio -> support, but it is not planned to implement or stabilize this -> implementation at this time. 
- -The above standard input/output handles all involve some form of locking or -buffering (or both). This cost is not always wanted, and hence raw variants will -be provided. Due to platform differences across unix/windows, the following -structure will be supported: - -```rust -mod os { - mod unix { - mod stdio { - struct Stdio { .. } - - impl Stdio { - fn stdout() -> Stdio; - fn stderr() -> Stdio; - fn stdin() -> Stdio; - } - - impl Read for Stdio { ... } - impl Write for Stdio { ... } - } - } - - mod windows { - mod stdio { - struct Stdio { ... } - struct StdioConsole { ... } - - impl Stdio { - fn stdout() -> io::Result; - fn stderr() -> io::Result; - fn stdin() -> io::Result; - } - // same constructors StdioConsole - - impl Read for Stdio { ... } - impl Write for Stdio { ... } - - impl StdioConsole { - // returns slice of what was read - fn read<'a>(&self, buf: &'a mut OsString) -> io::Result<&'a OsStr>; - // returns remaining part of `buf` to be written - fn write<'a>(&self, buf: &'a OsStr) -> io::Result<&'a OsStr>; - } - } - } -} -``` - -There are some key differences from today's API: - -* On unix, the API has not changed much except that the handles have been - consolidated into one type which implements both `Read` and `Write` (although - writing to stdin is likely to generate an error). -* On windows, there are two sets of handles representing the difference between - "console mode" and not (e.g. a pipe). When not a console the normal I/O traits - are implemented (delegating to `ReadFile` and `WriteFile`. The console mode - operations work with `OsStr`, however, to show how they work with UCS-2 under - the hood. - -#### Printing functions -[Printing functions]: #printing-functions - -The current `print`, `println`, `print_args`, and `println_args` functions will -all be "removed from the public interface" by [prefixing them with `__` and -marking `#[doc(hidden)]`][gh22607]. 
These are all implementation details of the -`print!` and `println!` macros and don't need to be exposed in the public -interface. - -[gh22607]: https://github.com/rust-lang/rust/issues/22607 - -The `set_stdout` and `set_stderr` functions will be removed with no replacement -for now. It's unclear whether these functions should indeed control a thread -local handle instead of a global handle as whether they're justified in the -first place. It is a backwards-compatible extension to allow this sort of output -to be redirected and can be considered if the need arises. - -### `std::env` -[std::env]: #stdenv - -Most of what's available in `std::os` today will move to `std::env`, -and the signatures will be updated to follow this RFC's -[Design principles] as follows. - -**Arguments**: - -* `args`: change to yield an iterator rather than vector if possible; in any - case, it should produce an `OsString`. - -**Environment variables**: - -* `vars` (renamed from `env`): yields a vector of `(OsString, OsString)` pairs. -* `var` (renamed from `getenv`): take a value bounded by `AsOsStr`, - allowing Rust strings and slices to be ergonomically passed in. Yields an - `Option`. -* `var_string`: take a value bounded by `AsOsStr`, returning `Result` where `VarError` represents a non-unicode `OsString` or a "not - present" value. -* `set_var` (renamed from `setenv`): takes two `AsOsStr`-bounded values. -* `remove_var` (renamed from `unsetenv`): takes a `AsOsStr`-bounded value. - -* `join_paths`: take an `IntoIterator` where `T: AsOsStr`, yield a - `Result`. -* `split_paths` take a `AsOsStr`, yield an `Iterator`. - -**Working directory**: - -* `current_dir` (renamed from `getcwd`): yields a `PathBuf`. -* `set_current_dir` (renamed from `change_dir`): takes an `AsPath` value. 
- -**Important locations**: - -* `home_dir` (renamed from `homedir`): returns home directory as a `PathBuf` -* `temp_dir` (renamed from `tmpdir`): returns a temporary directory as a `PathBuf` -* `current_exe` (renamed from `self_exe_name`): returns the full path - to the current binary as a `PathBuf` in an `io::Result` instead of an - `Option`. - -**Exit status**: - -* `get_exit_status` and `set_exit_status` stay as they are, but with - updated docs that reflect that these only affect the return value of - `std::rt::start`. These will remain `#[unstable]` for now and a future RFC - will determine their stability. - -**Architecture information**: - -* `num_cpus`, `page_size`: stay as they are, but remain `#[unstable]`. A future - RFC will determine their stability and semantics. - -**Constants**: - -* Stabilize `ARCH`, `DLL_PREFIX`, `DLL_EXTENSION`, `DLL_SUFFIX`, - `EXE_EXTENSION`, `EXE_SUFFIX`, `FAMILY` as they are. -* Rename `SYSNAME` to `OS`. -* Remove `TMPBUF_SZ`. - -This brings the constants into line with our naming conventions elsewhere. - -#### Items to move to `os::platform` - -* `pipe` will move to `os::unix`. It is currently primarily used for - hooking to the IO of a child process, which will now be done behind - a trait object abstraction. - -#### Removed items - -* `errno`, `error_string` and `last_os_error` provide redundant, - platform-specific functionality and will be removed for now. They - may reappear later in `os::unix` and `os::windows` in a modified - form. -* `dll_filename`: deprecated in favor of working directly with the constants. -* `_NSGetArgc`, `_NSGetArgv`: these should never have been public. -* `self_exe_path`: deprecated in favor of `current_exe` plus path operations. -* `make_absolute`: deprecated in favor of explicitly joining with the working directory. 
-* all `_as_bytes` variants: deprecated in favor of yielding `OsString` values - -### `std::fs` -[std::fs]: #stdfs - -The `fs` module will provide most of the functionality it does today, -but with a stronger cross-platform orientation. - -Note that all path-consuming functions will now take an -`AsPath`-bounded parameter for ergonomic reasons (this will allow -passing in Rust strings and literals directly, for example). - -#### Free functions -[Free functions]: #free-functions - -**Files**: - -* `copy`. Take `AsPath` bound. -* `rename`. Take `AsPath` bound. -* `remove_file` (renamed from `unlink`). Take `AsPath` bound. - -* `metadata` (renamed from `stat`). Take `AsPath` bound. Yield a new - struct, `Metadata`, with no public fields, but `len`, `is_dir`, - `is_file`, `perms`, `accessed` and `modified` accessors. The various - `os::platform` modules will offer extension methods on this - structure. - -* `set_perms` (renamed from `chmod`). Take `AsPath` bound, and a - `Perms` value. The `Perms` type will be revamped - as a struct with private implementation; see below. - -**Directories**: - -* `create_dir` (renamed from `mkdir`). Take `AsPath` bound. -* `create_dir_all` (renamed from `mkdir_recursive`). Take `AsPath` bound. -* `read_dir` (renamed from `readdir`). Take `AsPath` bound. Yield a - newtypes iterator, which yields a new type `DirEntry` which has an - accessor for `Path`, but will eventually provide other information - as well (possibly via platform-specific extensions). -* `remove_dir` (renamed from `rmdir`). Take `AsPath` bound. -* `remove_dir_all` (renamed from `rmdir_recursive`). Take - `AsPath` bound. -* `walk_dir`. Take `AsPath` bound. Yield an iterator over `IoResult`. - -**Links**: - -* `hard_link` (renamed from `link`). Take `AsPath` bound. -* `soft_link` (renamed from `symlink`). Take `AsPath` bound. -* `read_link` (renamed from `readlink`). Take `AsPath` bound. 
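As a rough illustration of the renamed free functions listed above, here is a sketch written against today's `std::fs`. It rests on two assumptions: current functions are bounded by `AsRef<Path>` rather than the `AsPath` bound named in this text, and `fs::write` is a later convenience that this text does not propose. The `make_and_inspect` helper is hypothetical:

```rust
use std::fs;
use std::io;
use std::path::Path;

// Creates a nested directory, writes a small file into it, and reports
// what `metadata` says about both before cleaning up.
fn make_and_inspect(root: &Path) -> io::Result<(bool, u64)> {
    let dir = root.join("a").join("b");
    fs::create_dir_all(&dir)?; // formerly `mkdir_recursive`

    let file = dir.join("data.txt");
    fs::write(&file, b"hello")?;

    // `metadata` (formerly `stat`) exposes accessors, not public fields.
    let is_dir = fs::metadata(&dir)?.is_dir();
    let len = fs::metadata(&file)?.len();

    fs::remove_dir_all(root.join("a"))?; // formerly `rmdir_recursive`
    Ok((is_dir, len))
}

fn main() -> io::Result<()> {
    // A per-process scratch directory under the system temp dir.
    let root = std::env::temp_dir().join(format!("fs_sketch_{}", std::process::id()));
    fs::create_dir_all(&root)?;
    let (is_dir, len) = make_and_inspect(&root)?;
    assert!(is_dir);
    assert_eq!(len, 5);
    fs::remove_dir_all(&root)
}
```

Passing `&Path`, `PathBuf`, or a string literal all work with the path-bounded parameters, which is the ergonomic goal stated for this module.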
- -#### Files -[Files]: #files - -The `File` type will largely stay as it is today, except that it will -use the `AsPath` bound everywhere. - -The `stat` method will be renamed to `metadata`, yield a `Metadata` -structure (as described above), and take `&self`. - -The `fsync` method will be renamed to `sync_all`, and `datasync` will be -renamed to `sync_data`. (Although the latter is not available on -Windows, it can be considered an optimization for `flush` and on -Windows behave identically to `sync_all`, just as it does on some Unix -filesystems.) - -The `path` method will remain `#[unstable]`, as we do not yet want to -commit to its API. - -The `open_mode` function will be removed in favor of and will take an -`OpenOptions` struct, which will encompass today's `FileMode` and -`FileAccess` and support a builder-style API. - -#### File kinds -[File kinds]: #file-kinds - -The `FileType` type will be removed. As mentioned above, `is_file` and -`is_dir` will be provided directly on `Metadata`; the other types -need to be audited for compatibility across -platforms. Platform-specific kinds will be relegated to extension -traits in `std::os::platform`. - -It's possible that an -[extensible](https://github.com/rust-lang/rfcs/pull/757) `Kind` will -be added in the future. - -#### File permissions -[File permissions]: #file-permissions - -The permission models on Unix and Windows vary greatly -- even between -different filesystems within the same OS. Rather than offer an API -that has no meaning on some platforms, we will initially provide a -very limited `Perms` structure in `std::fs`, and then rich -extension traits in `std::os::unix` and `std::os::windows`. Over time, -if clear cross-platform patterns emerge for richer permissions, we can -grow the `Perms` structure. - -On the Unix side, the constructors and accessors for `Perms` -will resemble the flags we have today; details are left to the implementation. 
- -On the Windows side, initially there will be no extensions, as Windows -has a very complex permissions model that will take some time to build -out. - -For `std::fs` itself, `Perms` will provide constructors and -accessors for "world readable" -- and that is all. At the moment, that -is all that is known to be compatible across the platforms that Rust -supports. - -#### `PathExt` -[PathExt]: #pathext - -This trait will essentially stay as it is (renamed from -`PathExtensions`), following the same changes made to `fs` free functions. - -#### Items to move to `os::platform` - -* `lstat` will move to `os::unix` and remain `#[unstable]` *for now* - since it is not yet implemented for Windows. - -* `chown` will move to `os::unix` (it currently does *nothing* on - Windows), and eventually `os::windows` will grow support for - Windows's permission model. If at some point a reasonable - intersection is found, we will re-introduce a cross-platform - function in `std::fs`. - -* In general, offer all of the `stat` fields as an extension trait on - `Metadata` (e.g. `os::unix::MetadataExt`). - -### `std::net` -[std::net]: #stdnet - -The contents of `std::io::net` submodules `tcp`, `udp`, `ip` and -`addrinfo` will be retained but moved into a single `std::net` module; -the other modules are being moved or removed and are described -elsewhere. - -#### SocketAddr - -This structure will represent either a `sockaddr_in` or `sockaddr_in6` which is -commonly just a pairing of an IP address and a port. 
- -```rust -enum SocketAddr { - V4(SocketAddrV4), - V6(SocketAddrV6), -} - -impl SocketAddrV4 { - fn new(addr: Ipv4Addr, port: u16) -> SocketAddrV4; - fn ip(&self) -> &Ipv4Addr; - fn port(&self) -> u16; -} - -impl SocketAddrV6 { - fn new(addr: Ipv6Addr, port: u16, flowinfo: u32, scope_id: u32) -> SocketAddrV6; - fn ip(&self) -> &Ipv6Addr; - fn port(&self) -> u16; - fn flowinfo(&self) -> u32; - fn scope_id(&self) -> u32; -} -``` - -#### Ipv4Addr - -Represents a version 4 IP address. It has the following interface: - -```rust -impl Ipv4Addr { - fn new(a: u8, b: u8, c: u8, d: u8) -> Ipv4Addr; - fn any() -> Ipv4Addr; - fn octets(&self) -> [u8; 4]; - fn to_ipv6_compatible(&self) -> Ipv6Addr; - fn to_ipv6_mapped(&self) -> Ipv6Addr; -} -``` - -#### Ipv6Addr - -Represents a version 6 IP address. It has the following interface: - -```rust -impl Ipv6Addr { - fn new(a: u16, b: u16, c: u16, d: u16, e: u16, f: u16, g: u16, h: u16) -> Ipv6Addr; - fn any() -> Ipv6Addr; - fn segments(&self) -> [u16; 8] - fn to_ipv4(&self) -> Option; -} -``` - -#### TCP -[TCP]: #tcp - -The current `TcpStream` struct will be pared back from where it is today to the -following interface: - -```rust -// TcpStream, which contains both a reader and a writer - -impl TcpStream { - fn connect(addr: &A) -> io::Result; - fn peer_addr(&self) -> io::Result; - fn local_addr(&self) -> io::Result; - fn shutdown(&self, how: Shutdown) -> io::Result<()>; - fn try_clone(&self) -> io::Result; -} - -impl Read for TcpStream { ... } -impl Write for TcpStream { ... } -impl<'a> Read for &'a TcpStream { ... } -impl<'a> Write for &'a TcpStream { ... } -#[cfg(unix)] impl AsRawFd for TcpStream { ... } -#[cfg(windows)] impl AsRawSocket for TcpStream { ... } -``` - -* `clone` has been replaced with a `try_clone` function. The implementation of - `try_clone` will map to using `dup` on Unix platforms and - `WSADuplicateSocket` on Windows platforms. The `TcpStream` itself will no - longer be reference counted itself under the hood. 
-* `close_{read,write}` are both removed in favor of binding the `shutdown` - function directly on sockets. This will map to the `shutdown` function on both - Unix and Windows. -* `set_timeout` has been removed for now (as well as other timeout-related - functions). It is likely that this may come back soon as a binding to - `setsockopt` to the `SO_RCVTIMEO` and `SO_SNDTIMEO` options. This RFC does not - currently proposed adding them just yet, however. -* Implementations of `Read` and `Write` are provided for `&TcpStream`. These - implementations are not necessarily ergonomic to call (requires taking an - explicit reference), but they express the ability to concurrently read and - write from a `TcpStream` - -Various other options such as `nodelay` and `keepalive` will be left -`#[unstable]` for now. The `TcpStream` structure will also adhere to both `Send` -and `Sync`. - -The `TcpAcceptor` struct will be removed and all functionality will be folded -into the `TcpListener` structure. Specifically, this will be the resulting API: - -```rust -impl TcpListener { - fn bind(addr: &A) -> io::Result; - fn local_addr(&self) -> io::Result; - fn try_clone(&self) -> io::Result; - fn accept(&self) -> io::Result<(TcpStream, SocketAddr)>; - fn incoming(&self) -> Incoming; -} - -impl<'a> Iterator for Incoming<'a> { - type Item = io::Result; - ... -} -#[cfg(unix)] impl AsRawFd for TcpListener { ... } -#[cfg(windows)] impl AsRawSocket for TcpListener { ... } -``` - -Some major changes from today's API include: - -* The static distinction between `TcpAcceptor` and `TcpListener` has been - removed (more on this in the [socket][Sockets] section). -* The `clone` functionality has been removed in favor of `try_clone` (same - caveats as `TcpStream`). -* The `close_accept` functionality is removed entirely. This is not currently - implemented via `shutdown` (not supported well across platforms) and is - instead implemented via `select`. 
This functionality can return at a later - date with a more robust interface. -* The `set_timeout` functionality has also been removed in favor of returning at - a later date in a more robust fashion with `select`. -* The `accept` function no longer takes `&mut self` and returns `SocketAddr`. - The change in mutability is done to express that multiple `accept` calls can - happen concurrently. -* For convenience the iterator does not yield the `SocketAddr` from `accept`. - -The `TcpListener` type will also adhere to `Send` and `Sync`. - -#### UDP -[UDP]: #udp - -The UDP infrastructure will receive a similar face-lift as the TCP -infrastructure will: - -```rust -impl UdpSocket { - fn bind(addr: &A) -> io::Result; - fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, SocketAddr)>; - fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; - fn local_addr(&self) -> io::Result; - fn try_clone(&self) -> io::Result; -} - -#[cfg(unix)] impl AsRawFd for UdpSocket { ... } -#[cfg(windows)] impl AsRawSocket for UdpSocket { ... } -``` - -Some important points of note are: - -* The `send` and `recv` function take `&self` instead of `&mut self` to indicate - that they may be called safely in concurrent contexts. -* All configuration options such as `multicast` and `ttl` are left as - `#[unstable]` for now. -* All timeout support is removed. This may come back in the form of `setsockopt` - (as with TCP streams) or with a more general implementation of `select`. -* `clone` functionality has been replaced with `try_clone`. - -The `UdpSocket` type will adhere to both `Send` and `Sync`. - -#### Sockets -[Sockets]: #sockets - -The current constructors for `TcpStream`, `TcpListener`, and `UdpSocket` are -largely "convenience constructors" as they do not expose the underlying details -that a socket can be configured before it is bound, connected, or listened on. -One of the more frequent configuration options is `SO_REUSEADDR` which is set by -default for `TcpListener` currently. 
- -This RFC leaves it as an open question how best to implement this -pre-configuration. The constructors today will likely remain no matter what as -convenience constructors and a new structure would implement consuming methods -to transform itself to each of the various `TcpStream`, `TcpListener`, and -`UdpSocket`. - -This RFC does, however, recommend not adding multiple constructors to the -various types to set various configuration options. This pattern is best -expressed via a flexible socket type to be added at a future date. - -#### Addresses -[Addresses]: #addresses - -For the current `addrinfo` module: - -* The `get_host_addresses` should be renamed to `lookup_host`. -* All other contents should be removed. - -For the current `ip` module: - -* The `ToSocketAddr` trait should become `ToSocketAddrs` -* The default `to_socket_addr_all` method should be removed. - -The following implementations of `ToSocketAddrs` will be available: - -```rust -impl ToSocketAddrs for SocketAddr { ... } -impl ToSocketAddrs for SocketAddrV4 { ... } -impl ToSocketAddrs for SocketAddrV6 { ... } -impl ToSocketAddrs for (Ipv4Addr, u16) { ... } -impl ToSocketAddrs for (Ipv6Addr, u16) { ... } -impl ToSocketAddrs for (&str, u16) { ... } -impl ToSocketAddrs for str { ... } -impl ToSocketAddrs for &T { ... } -``` - -### `std::process` -[std::process]: #stdprocess - -Currently `std::io::process` is used only for spawning new -processes. The re-envisioned `std::process` will ultimately support -inspecting currently-running processes, although this RFC does not -propose any immediate support for doing so -- it merely future-proofs -the module. - -#### `Command` -[Command]: #command - -The `Command` type is a builder API for processes, and is largely in -good shape, modulo a few tweaks: - -* Replace `ToCStr` bounds with `AsOsStr`. -* Replace `env_set_all` with `env_clear` -* Rename `cwd` to `current_dir`, take `AsPath`. 
-* Rename `spawn` to `run` -* Move `uid` and `gid` to an extension trait in `os::unix` -* Make `detached` take a `bool` (rather than always setting the - command to detached mode). - -The `stdin`, `stdout`, `stderr` methods will undergo a more -significant change. By default, the corresponding options will be -considered "unset", the interpretation of which depends on how the -process is launched: - -* For `run` or `status`, these will inherit from the current process by default. -* For `output`, these will capture to new readers/writers by default. - -The `StdioContainer` type will be renamed to `Stdio`, and will not be -exposed directly as an enum (to enable growth and change over time). -It will provide a `Capture` constructor for capturing input or output, -an `Inherit` constructor (which just means to use the current IO -object -- it does not take an argument), and a `Null` constructor. The -equivalent of today's `InheritFd` will be added at a later point. - -#### `Child` -[Child]: #child - -We propose renaming `Process` to `Child` so that we can add a -more general notion of non-child `Process` later on (every -`Child` will be able to give you a `Process`). - -* `stdin`, `stdout` and `stderr` will be retained as public fields, - but their types will change to newtyped readers and writers to hide the internal - pipe infrastructure. -* The `kill` method is dropped, and `id` and `signal` will move to `os::platform` extension traits. -* `signal_exit`, `signal_kill`, `wait`, and `forget` will all stay as they are. -* `set_timeout` will be changed to use the `with_deadline` infrastructure. - -There are also a few other related changes to the module: - -* Rename `ProcessOutput` to `Output` -* Rename `ProcessExit` to `ExitStatus`, and hide its - representation. Remove `matches_exit_status`, and add a `status` - method yielding an `Option` -* Remove `MustDieSignal`, `PleaseExitSignal`. -* Remove `EnvMap` (which should never have been exposed). 
- -### `std::os` -[std::os]: #stdos - -Initially, this module will be empty except for the platform-specific -`unix` and `windows` modules. It is expected to grow additional, more -specific platform submodules (like `linux`, `macos`) over time. - -## Odds and ends -[Odds and ends]: #odds-and-ends - -> To be expanded in a follow-up PR. - -### The `io` prelude -[The io prelude]: #the-io-prelude - -The `prelude` submodule will contain most of the traits, types, and -modules discussed in this RFC; it is meant to provide maximal -convenience when working with IO of any kind. The exact contents of -the module are left as an open question. - -# Drawbacks -[Drawbacks]: #drawbacks - -This RFC is largely about cleanup, normalization, and stabilization of -our IO libraries -- work that needs to be done, but that also -represents nontrivial churn. - -However, the actual implementation work involved is estimated to be -reasonably contained, since all of the functionality is already in -place in some form (including `os_str`, due to @SimonSapin's -[WTF-8 implementation](https://github.com/SimonSapin/rust-wtf8)). - -# Alternatives -[Alternatives]: #alternatives - -The main alternative design would be to continue staying with the -Posix tradition in terms of naming and functionality (for which there -is precedent in some other languages). However, Rust is already -well-known for its strong cross-platform compatibility in `std`, and -making the library more Windows-friendly will only increase its appeal. - -More radically different designs (in terms of different design -principles or visions) are outside the scope of this RFC. - -# Unresolved questions -[Unresolved questions]: #unresolved-questions - -> To be expanded in follow-up PRs. - -## Wide string representation - -(Text from @SimonSapin) - -Rather than WTF-8, `OsStr` and `OsString` on Windows could use -potentially-ill-formed UTF-16 (a.k.a. "wide" strings), with a -different cost trade off. 
- -Upside: -* No conversion between `OsStr` / `OsString` and OS calls. - -Downsides: -* More expensive conversions between `OsStr` / `OsString` and `str` / `String`. -* These conversions have inconsistent performance characteristics between platforms. (Need to allocate on Windows, but not on Unix.) -* Some of them return `Cow`, which has some ergonomic hit. - -The API (only parts that differ) could look like: - -```rust -pub mod os_str { - #[cfg(windows)] - mod imp { - type Buf = Vec<u16>; - type Slice = [u16]; - ... - } - - impl OsStr { - pub fn from_str(&str) -> Cow; - pub fn to_string(&self) -> Option; - pub fn to_string_lossy(&self) -> CowString; - } - - #[cfg(windows)] - pub mod windows{ - trait OsStringExt { - fn from_wide_slice(&[u16]) -> Self; - fn from_wide_vec(Vec<u16>) -> Self; - fn into_wide_vec(self) -> Vec<u16>; - } - - trait OsStrExt { - fn from_wide_slice(&[u16]) -> Self; - fn as_wide_slice(&self) -> &[u16]; - } - } -} -``` +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0517-io-os-reform.md). diff --git a/text/0520-new-array-repeat-syntax.md b/text/0520-new-array-repeat-syntax.md index 45a858377f0..ca464883789 100644 --- a/text/0520-new-array-repeat-syntax.md +++ b/text/0520-new-array-repeat-syntax.md @@ -1,179 +1 @@ -- Start Date: 2014-12-13 -- RFC PR: [520](https://github.com/rust-lang/rfcs/pull/520) -- Rust Issue: [19999](https://github.com/rust-lang/rust/issues/19999) - -# Summary - -Under this RFC, the syntax to specify the type of a fixed-length array -containing `N` elements of type `T` would be changed to `[T; N]`. Similarly, the -syntax to construct an array containing `N` duplicated elements of value `x` -would be changed to `[x; N]`. - -# Motivation - -[RFC 439](https://github.com/rust-lang/rfcs/blob/master/text/0439-cmp-ops-reform.md) -(cmp/ops reform) has resulted in an ambiguity that must be resolved.
Previously, -an expression with the form `[x, ..N]` would unambiguously refer to an array -containing `N` identical elements, since there would be no other meaning that -could be assigned to `..N`. However, under RFC 439, `..N` should now desugar to -an object of type `RangeTo<T>`, with `T` being the type of `N`. - -In order to resolve this ambiguity, there must be a change to either the syntax -for creating an array of repeated values, or the new range syntax. This RFC -proposes the former, in order to preserve existing functionality while avoiding -modifications that would make the range syntax less intuitive. - -# Detailed design - -The syntax `[T, ..N]` for specifying array types will be replaced by the new -syntax `[T; N]`. - -In the expression `[x, ..N]`, the `..N` will refer to an expression of type -`RangeTo<T>` (where `T` is the type of `N`). As with any other array of two -elements, `x` will have to be of the same type, and the array expression will be -of type `[RangeTo<T>; 2]`. - -The expression `[x; N]` will be equivalent to the old meaning of the syntax -`[x, ..N]`. Specifically, it will create an array of length `N`, each element of -which has the value `x`. - -The effect will be to convert uses of arrays such as this: - -```rust -let a: [uint, ..2] = [0u, ..2]; -``` - -to this: - -```rust -let a: [uint; 2] = [0u; 2]; -``` - -## Match patterns - -In match patterns, `..` is always interpreted as a wildcard for constructor -arguments (or for slice patterns under the `advanced_slice_patterns` feature -gate). This RFC does not change that. In a match pattern, `..` will always be -interpreted as a wildcard, and never as sugar for a range constructor. - -## Suggested implementation - -While not required by this RFC, one suggested transition plan is as follows: - -- Implement the new syntax for `[T; N]`/`[x; N]` proposed above.
- -- Issue deprecation warnings for code that uses `[T, ..N]`/`[x, ..N]`, allowing - easier identification of code that needs to be transitioned. - -- When RFC 439 range literals are implemented, remove the deprecated syntax and - thus complete the implementation of this RFC. - -# Drawbacks - -## Backwards incompatibility - -- Changing the method for specifying an array size will impact a large amount of - existing code. Code conversion can probably be readily automated, but will - still require some labor. - -## Implementation time - -This proposal is submitted very close to the anticipated release of Rust -1.0. Changing the array repeat syntax is likely to require more work than -changing the range syntax specified in RFC 439, because the latter has not yet -been implemented. - -However, this decision cannot be reasonably postponed. Many users have expressed -a preference for implementing the RFC 439 slicing syntax as currently specified -rather than preserving the existing array repeat syntax. This cannot be resolved -in a backwards-compatible manner if the array repeat syntax is kept. - -# Alternatives - -Inaction is not an alternative due to the ambiguity introduced by RFC 439. Some -resolution must be chosen in order for the affected modules in `std` to be -stabilized. - -## Retain the type syntax only - -In theory, it seems that the type syntax `[T, ..N]` could be retained, while -getting rid of the expression syntax `[x, ..N]`. The problem with this is that, -if this syntax was removed, there is currently no way to define a macro to -replace it. - -Retaining the current type syntax, but changing the expression syntax, would -make the language somewhat more complex and inconsistent overall. There seem to -be no advocates of this alternative so far. - -## Different array repeat syntax - -The comments in [pull request #498](https://github.com/rust-lang/rfcs/pull/498) -mentioned many candidates for new syntax other than the `[x; N]` form in this -RFC. 
The comments on the pull request of this RFC mentioned many more. - -- Instead of using `[x; N]`, use `[x for N]`. - - - This use of `for` would not be exactly analogous to existing `for` loops, - because those accept an iterator rather than an integer. To a new user, - the expression `[x for N]` would resemble a list comprehension - (e.g. Python's syntax is `[expr for i in iter]`), but in fact it does - something much simpler. - - It may be better to avoid uses of `for` that could complicate future - language features, e.g. returning a value other than `()` from loops, or - some other syntactic sugar related to iterators. However, the risk of - actual ambiguity is not that high. - -- Introduce a different symbol to specify array sizes, e.g. `[T # N]`, - `[T @ N]`, and so forth. - -- Introduce a keyword rather than a symbol. There are many other options, e.g. - `[x by N]`. The original version of this proposal was for `[N of x]`, but this - was deemed to complicate parsing too much, since the parser would not know - whether to expect a type or an expression after the opening bracket. - -- Any of several more radical changes. - -## Change the range syntax - -The main problem here is that there are no proposed candidates that seem as -clear and ergonomic as `i..j`. The most common alternative for slicing in other -languages is `i:j`, but in Rust this simply causes an ambiguity with a different -feature, namely type ascription. - -## Limit range syntax to the interior of an index (use `i..j` for slicing only) - -This resolves the issue since indices can be distinguished from arrays. However, -it removes some of the benefits of RFC 439. For instance, it removes the -possibility of using `for i in 1..10` to loop. - -## Remove `RangeTo` from RFC 439 - -The proposal in pull request #498 is to remove the sugar for `RangeTo` (i.e., -`..j`) while retaining other features of RFC 439. This is the simplest -resolution, but removes some convenience from the language. 
It is also -counterintuitive, because `RangeFrom` (i.e. `i..`) is retained, and because `..` -still has several different meanings in the language (ranges, repetition, and -pattern wildcards). - -# Unresolved questions - -## Match patterns - -There will still be two semantically distinct uses of `..`, for the RFC 439 -range syntax and for wildcards in patterns. This could be considered harmful -enough to introduce further changes to separate the two. Or this could be -considered innocuous enough to introduce some additional range-related meaning -for `..` in certain patterns. - -It is possible that the new syntax `[x; N]` could itself be used within -patterns. - -This RFC does not attempt to address any of these issues, because the current -pattern syntax does not allow use of the repeated array syntax, and does not -contain an ambiguity. - -## Behavior of `for` in array expressions - -It may be useful to allow `for` to take on a new meaning in array expressions. -This RFC keeps this possibility open, but does not otherwise propose any -concrete changes to move towards or away from this feature. +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0520-new-array-repeat-syntax.md). diff --git a/text/0601-replace-be-with-become.md b/text/0601-replace-be-with-become.md index 767c8e1f44f..9b92699e251 100644 --- a/text/0601-replace-be-with-become.md +++ b/text/0601-replace-be-with-become.md @@ -1,37 +1 @@ -- Start Date: 2015-01-20 -- RFC PR: [rust-lang/rfcs#601](https://github.com/rust-lang/rfcs/pull/601/) -- Rust Issue: [rust-lang/rust#22141](https://github.com/rust-lang/rust/issues/22141) - -# Summary - -Rename the `be` reserved keyword to `become`. - -# Motivation - -A keyword needs to be reserved to support guaranteed tail calls in a backward-compatible way. 
Currently the keyword reserved for this purpose is `be`, but the `become` alternative was proposed in -the old [RFC](https://github.com/rust-lang/rfcs/pull/81) for guaranteed tail calls, which is now postponed and tracked in [PR#271](https://github.com/rust-lang/rfcs/issues/271). - -Some advantages of the `become` keyword are: - - it provides a clearer indication of its meaning ("this function becomes that function") - - its syntax results in better code alignment (`become` is exactly as long as `return`) - -The expected result is that users will be unable to use `become` as identifier, ensuring that it will be available for future language extensions. - -This RFC is not about implementing tail call elimination, only on whether the `be` keyword should be replaced with `become`. - -# Detailed design - -Rename the `be` reserved word to `become`. This is a very simple find-and-replace. - -# Drawbacks - -Some code might be using `become` as an identifier. - -# Alternatives - -The main alternative is to do nothing, i.e. to keep the `be` keyword reserved for supporting guaranteed tail calls in a backward-compatible way. Using `become` as the keyword for tail calls would not be backward-compatible because it would introduce a new keyword, which might have been used in valid code. - -Another option is to add the `become` keyword, without removing `be`. This would have the same drawbacks as the current proposal (might break existing code), but it would also guarantee that the `become` keyword is available in the future. - -# Unresolved questions - +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0601-replace-be-with-become.md). 
diff --git a/text/0909-move-thread-local-to-std-thread.md b/text/0909-move-thread-local-to-std-thread.md index 937c5dd608f..80c2a3718d8 100644 --- a/text/0909-move-thread-local-to-std-thread.md +++ b/text/0909-move-thread-local-to-std-thread.md @@ -1,47 +1 @@ -- Feature Name: N/A -- Start Date: 2015-02-25 -- RFC PR: https://github.com/rust-lang/rfcs/pull/909 -- Rust Issue: https://github.com/rust-lang/rust/issues/23547 - -# Summary - -Move the contents of `std::thread_local` into `std::thread`. Fully -remove `std::thread_local` from the standard library. - -# Motivation - -Thread locals are directly related to threading. Combining the modules -would reduce the number of top level modules, combine related concepts, -and make browsing the docs easier. It also would have the potential to -slightly reduce the number of `use` statementsl - -# Detailed design - -The contents of`std::thread_local` module would be moved into to -`std::thread::local`. `Key` would be renamed to `LocalKey`, and -`scoped` would also be flattened, providing `ScopedKey`, etc. This -way, all thread related code is combined in one module. - -It would also allow using it as such: - -```rust -use std::thread::{LocalKey, Thread}; -``` - -# Drawbacks - -It's pretty late in the 1.0 release cycle. This is a mostly bike -shedding level of a change. It may not be worth changing it at this -point and staying with two top level modules in `std`. Also, some users -may prefer to have more top level modules. - -# Alternatives - -An alternative (as the RFC originally proposed) would be to bring -`thread_local` in as a submodule, rather than flattening. This was -decided against in an effort to keep hierarchies flat, and because of -the slim contents on the `thread_local` module. 
- -# Unresolved questions - -The exact strategy for moving the contents into `std::thread` +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/0909-move-thread-local-to-std-thread.md). diff --git a/text/1102-rename-connect-to-join.md b/text/1102-rename-connect-to-join.md index 35bae6a7d5f..248f08bc8f2 100644 --- a/text/1102-rename-connect-to-join.md +++ b/text/1102-rename-connect-to-join.md @@ -1,77 +1 @@ -- Feature Name: `rename_connect_to_join` -- Start Date: 2015-05-02 -- RFC PR: [rust-lang/rfcs#1102](https://github.com/rust-lang/rfcs/pull/1102) -- Rust Issue: [rust-lang/rust#26900](https://github.com/rust-lang/rust/issues/26900) - -# Summary - -Rename `.connect()` to `.join()` in `SliceConcatExt`. - -# Motivation - -Rust has a string concatenation method named `.connect()` in `SliceConcatExt`. -However, this does not align with the precedents in other languages. Most -languages use `.join()` for that purpose, as seen later. - -This is probably because, in the ancient Rust, `join` was a keyword to join a -task. However, `join` retired as a keyword in 2011 with the commit -rust-lang/rust@d1857d3. While `.connect()` is technically correct, the name may -not be directly inferred by the users of the mainstream languages. There was [a -question] about this on reddit. 
- -[a question]: http://www.reddit.com/r/rust/comments/336rj3/whats_the_best_way_to_join_strings_with_a_space/ - -The languages that use the name of `join` are: - -- Python: [str.join](https://docs.python.org/3/library/stdtypes.html#str.join) -- Ruby: [Array.join](http://ruby-doc.org/core-2.2.0/Array.html#method-i-join) -- JavaScript: [Array.prototype.join](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/join) -- Go: [strings.Join](https://golang.org/pkg/strings/#Join) -- C#: [String.Join](https://msdn.microsoft.com/en-us/library/dd783876%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396) -- Java: [String.join](http://docs.oracle.com/javase/8/docs/api/java/lang/String.html#join-java.lang.CharSequence-java.lang.Iterable-) -- Perl: [join](http://perldoc.perl.org/functions/join.html) - -The languages not using `join` are as follows. Interestingly, they are -all functional-ish languages. - -- Haskell: [intercalate](http://hackage.haskell.org/package/text-1.2.0.4/docs/Data-Text.html#v:intercalate) -- OCaml: [String.concat](http://caml.inria.fr/pub/docs/manual-ocaml/libref/String.html#VALconcat) -- F#: [String.concat](https://msdn.microsoft.com/en-us/library/ee353761.aspx) - -Note that Rust also has `.concat()` in `SliceConcatExt`, which is a specialized -version of `.connect()` that uses an empty string as a separator. - -Another reason is that the term "join" already has similar usage in the standard -library. There are `std::path::Path::join` and `std::env::join_paths` which are -used to join the paths. - -# Detailed design - -While the `SliceConcatExt` trait is unstable, the `.connect()` method itself is -marked as stable. So we need to: - -1. Deprecate the `.connect()` method. -2. Add the `.join()` method. - -Or, if we are to achieve the [instability guarantee], we may remove the old -method entirely, as it's still pre-1.0. However, the author considers that this -may require even more consensus. 
- -[instability guarantee]: https://github.com/rust-lang/rust/issues/24928 - -# Drawbacks - -Having a deprecated method in a newborn language is not pretty. - -If we do remove the `.connect()` method, the language becomes pretty again, but -it breaks the stability guarantee at the same time. - -# Alternatives - -Keep the status quo. Improving searchability in the docs will help newcomers -find the appropriate method. - -# Unresolved questions - -Are there even more clever names for the method? How about `.homura()`, or -`.madoka()`? +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/1102-rename-connect-to-join.md). diff --git a/text/1679-panic-safe-slicing.md b/text/1679-panic-safe-slicing.md index 11cbd2c4f1b..59e7d959bbc 100644 --- a/text/1679-panic-safe-slicing.md +++ b/text/1679-panic-safe-slicing.md @@ -1,122 +1 @@ -- Feature Name: `panic_safe_slicing` -- Start Date: 2015-10-16 -- RFC PR: [rust-lang/rfcs#1679](https://github.com/rust-lang/rfcs/pull/1679) -- Rust Issue: [rust-lang/rfcs#35729](https://github.com/rust-lang/rust/issues/35729) - -# Summary - -Add "panic-safe" or "total" alternatives to the existing panicking indexing syntax. - -# Motivation - -`SliceExt::get` and `SliceExt::get_mut` can be thought as non-panicking versions of the simple -indexing syntax, `a[idx]`, and `SliceExt::get_unchecked` and `SliceExt::get_unchecked_mut` can -be thought of as unsafe versions with bounds checks elided. However, there is no such equivalent for -`a[start..end]`, `a[start..]`, or `a[..end]`. This RFC proposes such methods to fill the gap. - -# Detailed design - -The `get`, `get_mut`, `get_unchecked`, and `get_unchecked_mut` will be made generic over `usize` -as well as ranges of `usize` like slice's `Index` implementation currently is. This will allow e.g. -`a.get(start..end)` which will behave analagously to `a[start..end]`. 
- -Because methods cannot be overloaded in an ad-hoc manner in the same way that traits may be -implemented, we introduce a `SliceIndex` trait which is implemented by types which can index into a -slice: -```rust -pub trait SliceIndex { - type Output: ?Sized; - - fn get(self, slice: &[T]) -> Option<&Self::Output>; - fn get_mut(self, slice: &mut [T]) -> Option<&mut Self::Output>; - unsafe fn get_unchecked(self, slice: &[T]) -> &Self::Output; - unsafe fn get_mut_unchecked(self, slice: &[T]) -> &mut Self::Output; - fn index(self, slice: &[T]) -> &Self::Output; - fn index_mut(self, slice: &mut [T]) -> &mut Self::Output; -} - -impl SliceIndex for usize { - type Output = T; - // ... -} - -impl SliceIndex for R - where R: RangeArgument -{ - type Output = [T]; - // ... -} -``` - -And then alter the `Index`, `IndexMut`, `get`, `get_mut`, `get_unchecked`, and `get_mut_unchecked` -implementations to be generic over `SliceIndex`: -```rust -impl [T] { - pub fn get(&self, idx: I) -> Option - where I: SliceIndex - { - idx.get(self) - } - - pub fn get_mut(&mut self, idx: I) -> Option - where I: SliceIndex - { - idx.get_mut(self) - } - - pub unsafe fn get_unchecked(&self, idx: I) -> I::Output - where I: SliceIndex - { - idx.get_unchecked(self) - } - - pub unsafe fn get_mut_unchecked(&mut self, idx: I) -> I::Output - where I: SliceIndex - { - idx.get_mut_unchecked(self) - } -} - -impl Index for [T] - where I: SliceIndex -{ - type Output = I::Output; - - fn index(&self, idx: I) -> &I::Output { - idx.index(self) - } -} - -impl IndexMut for [T] - where I: SliceIndex -{ - fn index_mut(&self, idx: I) -> &mut I::Output { - idx.index_mut(self) - } -} -``` - -# Drawbacks - -- The `SliceIndex` trait is unfortunate - it's tuned for exactly the set of methods it's used by. - It only exists because inherent methods cannot be overloaded the same way that trait - implementations can be. It would most likely remain unstable indefinitely. -- Documentation may suffer. 
Rustdoc output currently explicitly shows each of the ways you can - index a slice, while there will simply be a single generic implementation with this change. This - may not be that bad, though. The doc block currently seems to provided the most valuable - information to newcomers rather than the trait bound, and that will still be present with this - change. - -# Alternatives - -- Stay as is. -- A previous version of this RFC introduced new `get_slice` etc methods rather than overloading - `get` etc. This avoids the utility trait but is somewhat less ergonomic. -- Instead of one trait amalgamating all of the required methods, we could have one trait per - method. This would open a more reasonable door to stabilizing those traits, but adds quite a lot - more surface area. Replacing an unstable `SliceIndex` trait with a collection would be - backwards compatible. - -# Unresolved questions - -None +The file for this RFC has been removed, but the RFC is still in force and can be [read on GitHub](https://github.com/rust-lang/rfcs/blob/d046f391fa560839af3569be5b13b477a5aa29f9/text/1679-panic-safe-slicing.md).