Numeric literal expressions and literal suffixes #1177

Merged
merged 24 commits into from
Apr 1, 2022
Commits
34e7e5f
tokens.md: move the link reference definitions to the end of the file
mattheww Mar 5, 2022
e06fea0
Move the general description of literal expressions from tokens.md to…
mattheww Mar 6, 2022
9b32e40
Describe the effect of integer literal suffixes in literal-expr.md ra…
mattheww Mar 6, 2022
8fbbbda
Describe the effect of floating-point literal suffixes in literal-exp…
mattheww Mar 6, 2022
d478e54
Document how the value of an integer literal expression is determined
mattheww Mar 6, 2022
938011b
Say that integer literals out of the u128 range are a parse-time error
mattheww Mar 6, 2022
9495744
Literal expressions: add a Note describing the overflowing_literals lint
mattheww Mar 6, 2022
72a554a
Document how the value of a floating-point literal expression is dete…
mattheww Mar 6, 2022
c8b9a20
Literal expressions: add a Note on infinite and NaN floating-point li…
mattheww Mar 6, 2022
1913a4f
Notes about negated literals
mattheww Mar 6, 2022
46d4a27
Say that out-of-range suffixed integer literals are valid lexer tokens
mattheww Mar 6, 2022
5f81f6a
Make the FLOAT_LITERAL rule mention keywords as well as identifiers
mattheww Mar 6, 2022
71fd6e3
Add the 5f32 case to the text description of floating-point literals
mattheww Mar 6, 2022
e2015cc
Add some examples of possibly confusing hexadecimal literals
mattheww Mar 6, 2022
6a379ac
Add a Lexer rules block for number literals with arbitrary suffixes
mattheww Mar 6, 2022
6e19792
Document reserved forms similar to number literals
mattheww Mar 6, 2022
e5ef69a
tokens.md: add two zero-width spaces to placate linkchecker
mattheww Mar 6, 2022
8aa8b9a
Cover two missing cases of number pseudoliterals
mattheww Mar 21, 2022
7baad0a
Make the FLOAT_LITERAL rule about final `.` more accurate
mattheww Mar 22, 2022
9704aad
Literal expressions: text improvements from ehuss
mattheww Mar 22, 2022
56105c2
tokens.md: add missing superscript markup
mattheww Mar 22, 2022
d1f3e7f
Number pseudoliterals and reserved forms: text improvements from ehuss
mattheww Mar 22, 2022
0c4554f
Literal expressions: use a sublist when describing choice of radix
mattheww Mar 23, 2022
2c78399
Literal expressions: add placeholder sections for types not yet docum…
mattheww Mar 23, 2022
151 changes: 148 additions & 3 deletions src/expressions/literal-expr.md
@@ -8,19 +8,164 @@
>    | [BYTE_LITERAL]\
>    | [BYTE_STRING_LITERAL]\
>    | [RAW_BYTE_STRING_LITERAL]\
>    | [INTEGER_LITERAL]\
>    | [INTEGER_LITERAL][^out-of-range]\
>    | [FLOAT_LITERAL]\
>    | [BOOLEAN_LITERAL]
>
> [^out-of-range]: A value ≥ 2<sup>128</sup> is not allowed.

A _literal expression_ consists of one of the [literal](../tokens.md#literals) forms described earlier.
It directly describes a number, character, string, or boolean value.
A _literal expression_ is an expression consisting of a single token, rather than a sequence of tokens, that immediately and directly denotes the value it evaluates to, rather than referring to it by name or some other evaluation rule.

A literal is a form of [constant expression], so is evaluated (primarily) at compile time.

Each of the lexical [literal][literal tokens] forms described earlier can make up a literal expression.

```rust
"hello"; // string type
'5'; // character type
5; // integer type
```

## Character literal expressions

A character literal expression consists of a single [CHAR_LITERAL] token.

> **Note**: This section is incomplete.

## String literal expressions

A string literal expression consists of a single [STRING_LITERAL] or [RAW_STRING_LITERAL] token.

> **Note**: This section is incomplete.

## Byte literal expressions

A byte literal expression consists of a single [BYTE_LITERAL] token.

> **Note**: This section is incomplete.

## Byte string literal expressions

A byte string literal expression consists of a single [BYTE_STRING_LITERAL] or [RAW_BYTE_STRING_LITERAL] token.

> **Note**: This section is incomplete.

## Integer literal expressions

An integer literal expression consists of a single [INTEGER_LITERAL] token.

If the token has a [suffix], the suffix will be the name of one of the [primitive integer types][numeric types]: `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `u128`, `i128`, `usize`, or `isize`, and the expression has that type.

If the token has no suffix, the expression's type is determined by type inference:

* If an integer type can be _uniquely_ determined from the surrounding program context, the expression has that type.

* If the program context under-constrains the type, it defaults to the signed 32-bit integer `i32`.

* If the program context over-constrains the type, it is considered a static type error.

Examples of integer literal expressions:

```rust
123; // type i32
123i32; // type i32
123u32; // type u32
123_u32; // type u32
let a: u64 = 123; // type u64

0xff; // type i32
0xff_u8; // type u8

0o70; // type i32
0o70_i16; // type i16

0b1111_1111_1001_0000; // type i32
0b1111_1111_1001_0000i64; // type i64

0usize; // type usize
```
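
The inference rules above can be observed with a small sketch. The helpers `type_of` and `takes_u64` are hypothetical names introduced only for this illustration, and the exact strings returned by `std::any::type_name` are not guaranteed by the standard library, so treat this as a demonstration rather than a specification.

```rust
use std::any::type_name;

// Hypothetical helper (not from the reference text): names the type that
// inference assigned to a value.
fn type_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

// Hypothetical function whose parameter pins a literal's type to u64.
fn takes_u64(_: u64) {}

fn under_constrained() -> &'static str {
    // Nothing else constrains `a`, so the literal falls back to i32.
    let a = 123;
    type_of(&a)
}

fn uniquely_determined() -> &'static str {
    // The call below is the only constraint, so the literal is a u64.
    let b = 123;
    takes_u64(b);
    type_of(&b)
}

fn main() {
    println!("{}", under_constrained());
    println!("{}", uniquely_determined());
    // Over-constrained uses are compile-time errors, for example:
    // let c: u8 = 123;
    // takes_u64(c); // mismatched types
}
```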

The value of the expression is determined from the string representation of the token as follows:

* An integer radix is chosen by inspecting the first two characters of the string, as follows:

    * `0b` indicates radix 2
    * `0o` indicates radix 8
    * `0x` indicates radix 16
    * otherwise the radix is 10.

* If the radix is not 10, the first two characters are removed from the string.

* Any underscores are removed from the string.

* The string is converted to a `u128` value as if by [`u128::from_str_radix`] with the chosen radix.
If the value does not fit in `u128`, the expression is rejected by the parser.

* The `u128` value is converted to the expression's type via a [numeric cast].
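
The steps above can be sketched as an ordinary function. This is only an illustration of the described procedure, not how `rustc` is implemented. The name `integer_literal_value` is hypothetical, and the sketch assumes the type suffix has already been stripped from the token text.

```rust
// Hypothetical helper: `digits` is the literal's text without its suffix.
// Returns None where the reference says the parser rejects the expression.
fn integer_literal_value(digits: &str) -> Option<u128> {
    // Choose the radix from the first two characters, and drop the
    // prefix when the radix is not 10.
    let (radix, rest) = match digits.get(..2) {
        Some("0b") => (2, &digits[2..]),
        Some("0o") => (8, &digits[2..]),
        Some("0x") => (16, &digits[2..]),
        _ => (10, digits),
    };
    // Remove any underscores.
    let cleaned: String = rest.chars().filter(|&c| c != '_').collect();
    // Convert to u128; a value that does not fit yields None.
    u128::from_str_radix(&cleaned, radix).ok()
}

fn main() {
    assert_eq!(integer_literal_value("123"), Some(123));
    assert_eq!(integer_literal_value("0xff"), Some(255));
    assert_eq!(integer_literal_value("0o70"), Some(56));
    assert_eq!(integer_literal_value("0b1111_1111_1001_0000"), Some(0xff90));
    // 2^128 is one past u128::MAX, so it is out of range.
    assert_eq!(
        integer_literal_value("340282366920938463463374607431768211456"),
        None
    );
    // The final step is a numeric cast to the expression's type.
    let as_u8 = integer_literal_value("0xff").unwrap() as u8;
    assert_eq!(as_u8, 255u8);
}
```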

> **Note**: The final cast will truncate the value of the literal if it does not fit in the expression's type.
> `rustc` includes a [lint check] named `overflowing_literals`, defaulting to `deny`, which rejects expressions where this occurs.
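
As a sketch of the truncation described in this note, the lint can be allowed locally so that out-of-range literals compile. The function name `truncated_literals` is hypothetical, and the comparison with `as` casts assumes the literal's value is produced by the same numeric cast the text describes.

```rust
// With the lint allowed, the out-of-range literals below compile and
// their values are truncated to the target type.
#[allow(overflowing_literals)]
fn truncated_literals() -> (u8, i8) {
    (256_u8, 128_i8)
}

fn main() {
    let (a, b) = truncated_literals();
    // The same results as explicit numeric casts from u128.
    assert_eq!(a, 256_u128 as u8);
    assert_eq!(b, 128_u128 as i8);
    println!("{a} {b}");
}
```

Without the `allow` attribute, both literals are rejected under the default `deny` level.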

> **Note**: `-1i8`, for example, is an application of the [negation operator] to the literal expression `1i8`, not a single integer literal expression.
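
One visible consequence of this note is operator precedence: a method call binds more tightly than unary negation, so it applies to the literal before the minus sign does. A minimal sketch, with the hypothetical function name `negation_examples`:

```rust
fn negation_examples() -> (i32, i32) {
    // Parsed as -(1i32.pow(2)): the literal expression is `1i32`, and the
    // negation operator is applied to the result of the method call.
    let a = -1i32.pow(2);
    // Parentheses make the negated value the receiver of the call.
    let b = (-1i32).pow(2);
    (a, b)
}

fn main() {
    let (a, b) = negation_examples();
    assert_eq!(a, -1);
    assert_eq!(b, 1);
}
```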

## Floating-point literal expressions

A floating-point literal expression consists of a single [FLOAT_LITERAL] token.

If the token has a [suffix], the suffix will be the name of one of the [primitive floating-point types][floating-point types]: `f32` or `f64`, and the expression has that type.

If the token has no suffix, the expression's type is determined by type inference:

* If a floating-point type can be _uniquely_ determined from the surrounding program context, the expression has that type.

* If the program context under-constrains the type, it defaults to `f64`.

* If the program context over-constrains the type, it is considered a static type error.

Examples of floating-point literal expressions:

```rust
123.0f64; // type f64
0.1f64; // type f64
0.1f32; // type f32
12E+99_f64; // type f64
5f32; // type f32
let x: f64 = 2.; // type f64
```

The value of the expression is determined from the string representation of the token as follows:

* Any underscores are removed from the string.

* The string is converted to the expression's type as if by [`f32::from_str`] or [`f64::from_str`].
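
The two steps above can be sketched as follows. The name `float_literal_value` is hypothetical, only the `f64` case is shown, and the sketch assumes the type suffix has already been stripped from the token text.

```rust
// Hypothetical helper: `text` is the literal's text without its suffix.
fn float_literal_value(text: &str) -> Option<f64> {
    // Remove any underscores.
    let cleaned: String = text.chars().filter(|&c| c != '_').collect();
    // Convert as if by `f64::from_str` (which `str::parse` calls).
    cleaned.parse::<f64>().ok()
}

fn main() {
    assert_eq!(float_literal_value("1_000.5"), Some(1000.5));
    assert_eq!(float_literal_value("2."), Some(2.0));
    assert_eq!(float_literal_value("12E+99"), Some(12E+99));
}
```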

> **Note**: `-1.0`, for example, is an application of the [negation operator] to the literal expression `1.0`, not a single floating-point literal expression.

> **Note**: `inf` and `NaN` are not literal tokens.
> The [`f32::INFINITY`], [`f64::INFINITY`], [`f32::NAN`], and [`f64::NAN`] constants can be used instead of literal expressions.
> In `rustc`, a literal large enough to be evaluated as infinite will trigger the `overflowing_literals` lint check.
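
A short sketch of the constants mentioned in this note, together with the string conversion producing infinity for an input that is too large. The function name `parse_too_large` is hypothetical, and the example uses `str::parse` rather than an actual oversized literal so that it compiles without touching the lint.

```rust
// Hypothetical helper: runs the conversion on a value far beyond f64::MAX.
fn parse_too_large() -> f64 {
    "1e999".parse::<f64>().unwrap()
}

fn main() {
    // The associated constants stand in for the missing literal forms.
    let inf = f64::INFINITY;
    let nan = f32::NAN;
    assert!(inf.is_infinite() && inf > f64::MAX);
    assert!(nan.is_nan());

    // A value too large for f64 converts to positive infinity.
    assert_eq!(parse_too_large(), f64::INFINITY);
}
```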

## Boolean literal expressions

A boolean literal expression consists of a single [BOOLEAN_LITERAL] token.

> **Note**: This section is incomplete.

[constant expression]: ../const_eval.md#constant-expressions
[floating-point types]: ../types/numeric.md#floating-point-types
[lint check]: ../attributes/diagnostics.md#lint-check-attributes
[literal tokens]: ../tokens.md#literals
[numeric cast]: operator-expr.md#numeric-cast
[numeric types]: ../types/numeric.md
[suffix]: ../tokens.md#suffixes
[negation operator]: operator-expr.md#negation-operators
[`f32::from_str`]: ../../core/primitive.f32.md#method.from_str
[`f32::INFINITY`]: ../../core/primitive.f32.md#associatedconstant.INFINITY
[`f32::NAN`]: ../../core/primitive.f32.md#associatedconstant.NAN
[`f64::from_str`]: ../../core/primitive.f64.md#method.from_str
[`f64::INFINITY`]: ../../core/primitive.f64.md#associatedconstant.INFINITY
[`f64::NAN`]: ../../core/primitive.f64.md#associatedconstant.NAN
[`u128::from_str_radix`]: ../../core/primitive.u128.md#method.from_str_radix
[CHAR_LITERAL]: ../tokens.md#character-literals
[STRING_LITERAL]: ../tokens.md#string-literals
[RAW_STRING_LITERAL]: ../tokens.md#raw-string-literals