Commit 4502f1a

Add the repository README
1 parent 0a966e7 commit 4502f1a

1 file changed

+132
-1
lines changed

README.md

Lines changed: 132 additions & 1 deletion
@@ -6,7 +6,138 @@ fast-float
[![Documentation](https://docs.rs/fast-float/badge.svg)](https://docs.rs/fast-float)
[![Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
-[![Rust 1.47+](https://img.shields.io/badge/rustc-1.47+-lightgray.svg)](https://blog.rust-lang.org/2020/10/08/Rust-1.47.html)
+[![Rustc 1.47+](https://img.shields.io/badge/rustc-1.47+-lightgray.svg)](https://blog.rust-lang.org/2020/10/08/Rust-1.47.html)

This crate provides a super-fast decimal number parser from strings into floats.

```toml
[dependencies]
fast-float = "0.1"
```

There are no dependencies, and the crate can be used in a `no_std` context by disabling the `std` feature.

*Compiler support: rustc 1.47+.*

## Usage

There are two top-level functions provided:
[`parse()`](https://docs.rs/fast-float/latest/fast_float/fn.parse.html) and
[`parse_partial()`](https://docs.rs/fast-float/latest/fast_float/fn.parse_partial.html), both taking
either a string or a byte slice and parsing the input into either `f32` or `f64`:

- `parse()` treats the whole string as a decimal number and returns an error if there are
  invalid characters or if the string is empty.
- `parse_partial()` tries to find the longest substring at the beginning of the given input
  string that can be parsed as a decimal number and, in the case of success, returns the parsed
  value along with the number of characters processed; an error is returned if the string doesn't
  start with a decimal number or if it is empty. This function is most useful as a building
  block when constructing more complex parsers, or when parsing streams of data.

Example:

```rust
// Parse the entire string as a decimal number.
let s = "1.23e-02";
let x: f32 = fast_float::parse(s).unwrap();
assert_eq!(x, 0.0123);

// Parse as many characters as possible as a decimal number.
let s = "1.23e-02foo";
let (x, n) = fast_float::parse_partial::<f32, _>(s).unwrap();
assert_eq!(x, 0.0123);
assert_eq!(n, 8);
assert_eq!(&s[n..], "foo");
```
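
To make the "longest prefix" behaviour of `parse_partial()` more concrete, here is a minimal, dependency-free sketch of the idea. It is not this crate's implementation: `decimal_prefix_len` and `parse_prefix` are hypothetical helpers, the accepted grammar is a simplified assumption (optional sign, digits, optional fraction, optional exponent, no `inf`/`nan`), and the final conversion is delegated to the standard library.

```rust
/// Hypothetical helper (not part of fast-float): length in bytes of the longest
/// prefix of `s` that looks like a decimal number, or `None` if there is none.
fn decimal_prefix_len(s: &str) -> Option<usize> {
    let b = s.as_bytes();
    let mut i = 0;
    // Optional sign (accepting '+' is an assumption of this sketch).
    if i < b.len() && (b[i] == b'+' || b[i] == b'-') {
        i += 1;
    }
    // Integer digits.
    let int_start = i;
    while i < b.len() && b[i].is_ascii_digit() {
        i += 1;
    }
    let mut digits = i - int_start;
    // Optional fraction.
    if i < b.len() && b[i] == b'.' {
        let mut j = i + 1;
        while j < b.len() && b[j].is_ascii_digit() {
            j += 1;
        }
        digits += j - (i + 1);
        i = j;
    }
    // At least one digit is required before or after the decimal point.
    if digits == 0 {
        return None;
    }
    // Optional exponent, consumed only if it contains at least one digit.
    if i < b.len() && (b[i] == b'e' || b[i] == b'E') {
        let mut j = i + 1;
        if j < b.len() && (b[j] == b'+' || b[j] == b'-') {
            j += 1;
        }
        let exp_start = j;
        while j < b.len() && b[j].is_ascii_digit() {
            j += 1;
        }
        if j > exp_start {
            i = j;
        }
    }
    Some(i)
}

/// Hypothetical helper: parses the longest numeric prefix and reports how many
/// bytes were consumed, mirroring the shape of `parse_partial()`'s result.
fn parse_prefix(s: &str) -> Option<(f64, usize)> {
    let n = decimal_prefix_len(s)?;
    let x = s[..n].parse::<f64>().ok()?;
    Some((x, n))
}
```

A caller building a larger parser would typically consume `n` bytes and continue with `&s[n..]`, as in the example above.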

## Details

This crate is a direct port of Daniel Lemire's [`fast_float`](https://github.com/fastfloat/fast_float)
C++ library, with some Rust-specific tweaks. Valuable discussions with Daniel while porting it helped
shape the crate and get it to its current performance level. Please see the original
repository for many useful details regarding the algorithm and the implementation.

The parser is locale-independent. The resulting value is the closest floating-point value (using either
`f32` or `f64`), using the "round to even" convention for values that would otherwise fall exactly halfway
between two representable values. That is, we provide exact parsing according to the IEEE standard.
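
The "round to even" convention is easiest to see just above 2^53, where consecutive `f64` values are 2 apart and every odd integer is an exact tie. The sketch below uses the standard library's parser, which is also correctly rounded, as a dependency-free stand-in; `nearest_f64` and `check_round_to_even` are hypothetical names, and the assumption is that `fast_float::parse` returns the same values, as the paragraph above states.

```rust
/// Hypothetical stand-in for `fast_float::parse::<f64, _>`: the standard
/// library parser, used only to keep this sketch dependency-free.
fn nearest_f64(s: &str) -> f64 {
    s.parse::<f64>().expect("input should be a valid decimal number")
}

/// Checks a few inputs around 2^53 = 9007199254740992.
fn check_round_to_even() {
    // Exactly halfway between ...992 and ...994: the tie goes to ...992,
    // whose significand is even.
    assert_eq!(nearest_f64("9007199254740993"), 9007199254740992.0);
    // Exactly halfway between ...994 and ...996: the tie goes to ...996.
    assert_eq!(nearest_f64("9007199254740995"), 9007199254740996.0);
    // Not a tie: slightly above the halfway point, so the closest value wins.
    assert_eq!(nearest_f64("9007199254740993.000000001"), 9007199254740994.0);
}
```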

Infinity and NaN values can be parsed, along with scientific notation.

Both little-endian and big-endian platforms are supported, with extra optimizations enabled
on little-endian architectures.
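
As an illustration of the kind of optimization that a little-endian layout makes cheap, here is a sketch of the well-known trick of processing eight ASCII digits as a single `u64`, modeled on the technique used in the upstream C++ library. The function names are hypothetical and this is not necessarily the exact code in this crate. `u64::from_le_bytes` keeps the sketch correct on any platform; on little-endian targets it amounts to a plain 8-byte load.

```rust
/// Returns true if all eight bytes packed in `val` are ASCII digits ('0'..='9').
fn is_eight_digits(val: u64) -> bool {
    // Adding 0x46 sets the high bit of any byte above '9';
    // subtracting 0x30 sets the high bit of any byte below '0'.
    let a = val.wrapping_add(0x4646_4646_4646_4646);
    let b = val.wrapping_sub(0x3030_3030_3030_3030);
    ((a | b) & 0x8080_8080_8080_8080) == 0
}

/// Converts eight ASCII digits to their numeric value with a handful of
/// integer operations, or returns `None` if any byte is not a digit.
fn parse_eight_digits(chunk: [u8; 8]) -> Option<u32> {
    const MASK: u64 = 0x0000_00FF_0000_00FF;
    const MUL1: u64 = 0x000F_4240_0000_0064; // 100 + (1_000_000 << 32)
    const MUL2: u64 = 0x0000_2710_0000_0001; // 1 + (10_000 << 32)

    let val = u64::from_le_bytes(chunk);
    if !is_eight_digits(val) {
        return None;
    }
    // Each byte now holds a digit value 0..=9 (cannot underflow: all digits).
    let val = val - 0x3030_3030_3030_3030;
    // Bytes 0, 2, 4, 6 now hold the two-digit pairs (each at most 99).
    let val = val * 10 + (val >> 8);
    // Combine the four pairs; the sum ends up in the upper 32 bits.
    let val = (val & MASK)
        .wrapping_mul(MUL1)
        .wrapping_add(((val >> 16) & MASK).wrapping_mul(MUL2));
    Some((val >> 32) as u32)
}
```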

## Performance

In every dataset we have tested so far, this parser beats all other C/C++/Rust float parsers
known to us by a large margin (see the detailed benchmarks below). The only exception is the
original fast_float C++ library, whose performance is within noise bounds of this crate.
On modern machines, parsing throughput can reach up to 1 GB/s.

In particular, it is faster than the Rust standard library's `FromStr::from_str()` by a factor of 2-8x
(larger factor for longer float strings).
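
The 2-8x range can be checked against the `fast-float` and `from_str` rows of the table in the Benchmarks section. The sketch below simply divides the two rows; the constant and function names are hypothetical, and the numbers are copied from that table.

```rust
/// Dataset names, in the column order of the benchmark table.
const DATASETS: [&str; 6] = ["canada", "mesh", "uniform", "iidi", "iei", "rec32"];
/// Average nanoseconds per parsed number, `fast-float` row.
const FAST_FLOAT_NS: [f64; 6] = [22.08, 11.10, 20.04, 40.77, 26.33, 29.84];
/// Average nanoseconds per parsed number, `from_str` row.
const FROM_STR_NS: [f64; 6] = [175.07, 22.58, 103.00, 228.78, 115.76, 211.13];

/// Returns `(dataset, from_str time / fast-float time)` for each dataset.
fn speedups() -> Vec<(&'static str, f64)> {
    DATASETS
        .iter()
        .zip(FAST_FLOAT_NS.iter().zip(FROM_STR_NS.iter()))
        .map(|(name, (fast, std))| (*name, std / fast))
        .collect()
}
```

The smallest ratio comes from `mesh` (about 2x) and the largest from `canada` (just under 8x).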

While various details regarding the algorithm can be found in the repository for the original
C++ library, here are a few brief notes:

- The parser is specialized to work lightning-fast on inputs with at most 19 significant digits
  (which constitutes the so-called "fast path"). We believe that most real-life inputs should
  fall under this category, and we treat longer inputs as "degenerate" edge cases since they
  inevitably cause overflows and loss of precision.
- If the significand happens to be longer than 19 digits, the parser falls back to the "slow path",
  in which case its performance roughly matches that of the top Rust/C++ libraries (and still
  beats them most of the time, although not by a lot).
- On little-endian systems, there are additional optimizations for numbers with more than 8 digits
  after the decimal point.
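
One way to read the 19-digit limit is that every 19-digit decimal significand fits in a `u64`, while 20-digit ones may not. This is our interpretation of the "overflows" mentioned above, not a description of the crate's internals. The sketch below uses a hypothetical `accumulate_digits` helper with checked arithmetic to show where the overflow starts.

```rust
/// Hypothetical helper: accumulates ASCII digits into a `u64`, returning
/// `None` on a non-digit byte or as soon as the value no longer fits.
fn accumulate_digits(digits: &str) -> Option<u64> {
    let mut acc: u64 = 0;
    for b in digits.bytes() {
        if !b.is_ascii_digit() {
            return None;
        }
        // `checked_*` returns `None` instead of wrapping on overflow.
        acc = acc.checked_mul(10)?.checked_add(u64::from(b - b'0'))?;
    }
    Some(acc)
}

/// The largest 19-digit significand still fits; twenty nines do not.
fn check_digit_limit() {
    let nineteen = "9".repeat(19);
    let twenty = "9".repeat(20);
    assert_eq!(accumulate_digits(&nineteen), Some(9_999_999_999_999_999_999));
    assert_eq!(accumulate_digits(&twenty), None);
}
```

Some 20-digit values (up to `u64::MAX`) still fit, but a parser cannot rely on that without checking, which is why 19 digits is the safe bound.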

## Benchmarks

The table below shows average timings, in nanoseconds, for parsing a single number
into a 64-bit float (lower is better).

```
|                  | `canada` | `mesh`   | `uniform` | `iidi` | `iei`  | `rec32` |
| ---------------- | -------- | -------- | --------- | ------ | ------ | ------- |
| fast-float       | 22.08    | 11.10    | 20.04     | 40.77  | 26.33  | 29.84   |
| lexical          | 61.63    | 25.10    | 53.77     | 72.33  | 53.39  | 72.40   |
| lexical/lossy    | 61.51    | 25.24    | 54.00     | 71.30  | 52.87  | 71.71   |
| from_str         | 175.07   | 22.58    | 103.00    | 228.78 | 115.76 | 211.13  |
| fast_float (C++) | 22.78    | 10.99    | 20.05     | 41.12  | 27.51  | 30.85   |
| abseil (C++)     | 42.66    | 32.88    | 46.01     | 50.83  | 46.33  | 49.95   |
| netlib (C++)     | 57.53    | 24.86    | 64.72     | 56.63  | 36.20  | 67.29   |
| strtod (C)       | 286.10   | 31.15    | 258.73    | 295.73 | 205.72 | 315.95  |
```

Parsers:

- `fast-float` – this very crate
- `lexical` – from the `lexical_core` crate, v0.7
- `lexical/lossy` – from the `lexical_core` crate, v0.7 (lossy parser)
- `from_str` – Rust standard library, `FromStr` trait
- `fast_float (C++)` – original C++ implementation of the 'fast-float' method
- `abseil (C++)` – Abseil C++ Common Libraries
- `netlib (C++)` – C++ Network Library
- `strtod (C)` – C standard library

Datasets:

- `canada` – numbers in the `canada.txt` file
- `mesh` – numbers in the `mesh.txt` file
- `uniform` – uniform random numbers from 0 to 1
- `iidi` – random numbers of format `%d%d.%d`
- `iei` – random numbers of format `%de%d`
- `rec32` – reciprocals of random 32-bit integers

Notes:

- Test environment: macOS 10.14.6, clang 11.0, Rust 1.49, 3.5 GHz i7-4771 Haswell.
- The two test files referred to above can be found in
  [this](https://github.com/lemire/simple_fastfloat_benchmark) repository.
- The Rust part of the table (along with a few other benchmarks) can be generated via
  the benchmark tool that can be found under `extras/simple-bench` of this repo.
- The C/C++ part of the table (along with a few other benchmarks and parsers) can be
  generated via a C++ utility that can be found in [this](https://github.com/lemire/simple_fastfloat_benchmark)
  repository.

<br>