lexical-core
============

[![Build Status](https://api.travis-ci.org/Alexhuszagh/rust-lexical.svg?branch=master)](https://travis-ci.org/Alexhuszagh/rust-lexical)
[![Latest Version](https://img.shields.io/crates/v/lexical-core.svg)](https://crates.io/crates/lexical-core)
[![Rustc Version 1.37+](https://img.shields.io/badge/rustc-1.37+-lightgray.svg)](https://blog.rust-lang.org/2019/08/15/Rust-1.37.0.html)

Low-level, lexical conversion routines for use in a `no_std` context. This crate by default does not use the Rust standard library.

- [Getting Started](#getting-started)
- [Features](#features)
  - [Format](#format)
- [Configuration](#configuration)
- [Constants](#constants)
- [Documentation](#documentation)
- [Validation](#validation)
- [Implementation Details](#implementation-details)
  - [Float to String](#float-to-string)
  - [String to Float](#string-to-float)
  - [Arbitrary-Precision Arithmetic](#arbitrary-precision-arithmetic)
  - [Algorithm Background and Comparison](#algorithm-background-and-comparison)
- [Known Issues](#known-issues)
- [Versioning and Version Support](#versioning-and-version-support)
- [Changelog](#changelog)
- [License](#license)
- [Contributing](#contributing)

# Getting Started

lexical-core is a low-level API for number-to-string and string-to-number conversions, without requiring a system allocator. If you would like to use a convenient, high-level API, please look at [lexical](../lexical) instead.

Add lexical-core to your `Cargo.toml`:

```toml
[dependencies]
lexical-core = "^0.7.1"
```

And an introduction through use:

```rust
extern crate lexical_core;

// String to number using Rust slices.
// The argument is the byte string parsed.
let f: f32 = lexical_core::parse(b"3.5").unwrap();   // 3.5
let i: i32 = lexical_core::parse(b"15").unwrap();    // 15

// All lexical_core parsers are checked: they validate that the
// input data is entirely correct, and stop parsing when invalid data
// is found, or upon numerical overflow.
let r = lexical_core::parse::<u8>(b"256"); // Err(ErrorCode::Overflow.into())
let r = lexical_core::parse::<u8>(b"1a5"); // Err(ErrorCode::InvalidDigit.into())

// In order to extract and parse a number from a substring of the input
// data, use `parse_partial`. These functions return the parsed value and
// the number of processed digits, allowing you to extract and parse the
// number in a single pass.
let r = lexical_core::parse_partial::<i8>(b"3a5"); // Ok((3, 1))

// If an insufficiently long buffer is passed, the serializer will panic.
// PANICS
let mut buf = [b'0'; 1];
//let slc = lexical_core::write::<i64>(15, &mut buf);

// In order to guarantee the buffer is long enough, always ensure there
// are at least `T::FORMATTED_SIZE` bytes, which requires the
// `lexical_core::Number` trait to be in scope.
use lexical_core::Number;
let mut buf = [b'0'; f64::FORMATTED_SIZE];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");

// When the `radix` feature is enabled, for decimal floats, using
// `T::FORMATTED_SIZE` may significantly overestimate the space
// required to format the number. Therefore, the
// `T::FORMATTED_SIZE_DECIMAL` constants allow you to get a much
// tighter bound on the space required.
let mut buf = [b'0'; f64::FORMATTED_SIZE_DECIMAL];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");
```
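To make the `(value, count)` contract of partial parsing concrete, here is a minimal, std-only sketch. It does not call lexical-core: `ParseError` and `parse_partial_u8` are hypothetical names invented for this illustration, and the real parsers handle signs, other types, and more error cases.

```rust
/// Hypothetical error type for this sketch (not lexical-core's `ErrorCode`).
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    Overflow,
}

/// Parse the leading ASCII decimal digits of `bytes` as a `u8`, returning
/// the value and the number of bytes consumed. Parsing stops at the first
/// non-digit byte; overflow is reported as an error.
fn parse_partial_u8(bytes: &[u8]) -> Result<(u8, usize), ParseError> {
    let mut value: u8 = 0;
    let mut count = 0;
    for &byte in bytes {
        if !byte.is_ascii_digit() {
            break;
        }
        let digit = byte - b'0';
        // Checked arithmetic turns numerical overflow into an error.
        value = value
            .checked_mul(10)
            .and_then(|v| v.checked_add(digit))
            .ok_or(ParseError::Overflow)?;
        count += 1;
    }
    if count == 0 {
        Err(ParseError::Empty)
    } else {
        Ok((value, count))
    }
}

fn main() {
    println!("{:?}", parse_partial_u8(b"3a5"));
    println!("{:?}", parse_partial_u8(b"256"));
}
```

The caller can use the returned count to continue scanning the input right after the number.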

# Features

- **correct** Use a correct string-to-float parser.
    <blockquote>Enabled by default, and may be turned off by setting <code>default-features = false</code>. If neither <code>algorithm_m</code> nor <code>bhcomp</code> is enabled while <code>correct</code> is enabled, lexical uses the <code>bigcomp</code> algorithm.</blockquote>
- **trim_floats** Export floats without a fraction as an integer.
    <blockquote>For example, <code>0.0f64</code> will be serialized to "0" and not "0.0", and <code>-0.0</code> as "0" and not "-0.0".</blockquote>
- **radix** Allow conversions to and from non-decimal strings.
    <blockquote>With radix enabled, any radix from 2 to 36 (inclusive) is valid; otherwise, only 10 is valid.</blockquote>
- **format** Customize accepted inputs for number parsing.
    <blockquote>With format enabled, the number format is dictated through the <code>NumberFormat</code> bitflags, which allow you to toggle how to parse a string into a number. The flags include enabling digit separators, requiring integer or fraction digits, and toggling special values.</blockquote>
- **rounding** Enable custom rounding for IEEE754 floats.
    <blockquote>By default, lexical uses round-nearest, tie-even for float rounding (recommended by IEEE754).</blockquote>
- **ryu** Use dtolnay's [ryu](https://github.com/dtolnay/ryu/) library for float-to-string conversions.
    <blockquote>Enabled by default, and may be turned off by setting <code>default-features = false</code>. Ryu is ~2x as fast as other float formatters.</blockquote>
- **libm** Enable use of the [libm](https://github.com/rust-lang/libm) library for stable `no_std` support.

## Format

Every language has competing specifications for valid numerical input, meaning a number parser for Rust will incorrectly accept or reject input for different programming or data languages. For example:

```rust
extern crate lexical_core;

use lexical_core::*;

// Valid in Rust strings.
// Not valid in JSON.
let f: f64 = parse(b"3.e7").unwrap();                       // 3e7

// Let's only accept JSON floats.
let format = NumberFormat::JSON;
let f: f64 = parse_format(b"3.0e7", format).unwrap();       // 3e7
let f: f64 = parse_format(b"3.e7", format).unwrap();        // Panics!

// We can also allow digit separators, for example.
// OCaml, a programming language that inspired Rust,
// accepts digit separators pretty much anywhere.
let format = NumberFormat::OCAML_STRING;
let f: f64 = parse(b"3_4.__0_1").unwrap();                  // Panics!
let f: f64 = parse_format(b"3_4.__0_1", format).unwrap();   // 34.01
```

The parsing specification is defined by `NumberFormat`, which provides pre-defined constants for over 40 programming and data languages. However, it also allows you to create your own specification to dictate parsing.

```rust
extern crate lexical_core;

use lexical_core::*;

// Let's use the standard, Rust grammar.
let format = NumberFormat::standard().unwrap();

// Let's use a permissive grammar, one that allows anything besides
// digit separators.
let format = NumberFormat::permissive().unwrap();

// Let's ignore digit separators and have an otherwise permissive grammar.
let format = NumberFormat::ignore(b'_').unwrap();

// Create our own grammar.
// A NumberFormat is compiled from options into binary flags, each
// taking 1-bit, allowing high-performance, customizable parsing
// once they're compiled. Each flag will be explained while defining it.

// The '_' character will be used as a digit separator.
let digit_separator = b'_';

// Require digits in the integer component of a float.
// `0.1` is valid, but `.1` is not.
let required_integer_digits = false;

// Require digits in the fraction component of a float.
// `1.0` is valid, but `1.` and `1` are not.
let required_fraction_digits = false;

// Require digits in the exponent component of a float.
// `1.0` and `1.0e7` are valid, but `1.0e` is not.
let required_exponent_digits = false;

// Do not allow a positive sign before the mantissa.
// `1.0` and `-1.0` are valid, but `+1.0` is not.
let no_positive_mantissa_sign = false;

// Require a sign before the mantissa.
// `+1.0` and `-1.0` are valid, but `1.0` is not.
let required_mantissa_sign = false;

// Do not allow the use of exponents.
// `300.0` is valid, but `3.0e2` is not.
let no_exponent_notation = false;

// Do not allow a positive sign before the exponent.
// `3.0e2` and `3.0e-2` are valid, but `3.0e+2` is not.
let no_positive_exponent_sign = false;

// Require a sign before the exponent.
// `3.0e+2` and `3.0e-2` are valid, but `3.0e2` is not.
let required_exponent_sign = false;

// Do not allow an exponent without fraction digits.
// `3.0e7` is valid, but `3e7` and `3.e7` are not.
let no_exponent_without_fraction = false;

// Do not allow special values.
// `1.0` is valid, but `NaN` and `inf` are not.
let no_special = false;

// Use case-sensitive matching when parsing special values.
// `NaN` is valid, but `nan` and `NAN` are not.
let case_sensitive_special = false;

// Allow digit separators between digits in the integer component.
// `3_4.01` is valid, but `_34.01`, `34_.01` and `34.0_1` are not.
let integer_internal_digit_separator = false;

// Allow digit separators between digits in the fraction component.
// `34.0_1` is valid, but `34._01`, `34.01_` and `3_4.01` are not.
let fraction_internal_digit_separator = false;

// Allow digit separators between digits in the exponent component.
// `1.0e6_7` is valid, but `1.0e_67`, `1.0e67_` and `1_2.0e67` are not.
let exponent_internal_digit_separator = false;

// Allow digit separators before any digits in the integer component.
// These digit separators may occur before or after the sign, as long
// as they occur before any digits.
// `_34.01` is valid, but `3_4.01`, `34_.01` and `34._01` are not.
let integer_leading_digit_separator = false;

// Allow digit separators before any digits in the fraction component.
// `34._01` is valid, but `34.0_1`, `34.01_` and `_34.01` are not.
let fraction_leading_digit_separator = false;

// Allow digit separators before any digits in the exponent component.
// These digit separators may occur before or after the sign, as long
// as they occur before any digits.
// `1.0e_67` is valid, but `1.0e6_7`, `1.0e67_` and `_1.0e67` are not.
let exponent_leading_digit_separator = false;

// Allow digit separators after any digits in the integer component.
// If `required_integer_digits` is not set, `_.01` is valid.
// `34_.01` is valid, but `3_4.01`, `_34.01` and `34.01_` are not.
let integer_trailing_digit_separator = false;

// Allow digit separators after any digits in the fraction component.
// If `required_fraction_digits` is not set, `1._` is valid.
// `34.01_` is valid, but `34.0_1`, `34._01` and `34_.01` are not.
let fraction_trailing_digit_separator = false;

// Allow digit separators after any digits in the exponent component.
// If `required_exponent_digits` is not set, `1.0e_` is valid.
// `1.0e67_` is valid, but `1.0e6_7`, `1.0e_67` and `1.0_e67` are not.
let exponent_trailing_digit_separator = false;

// Allow consecutive separators in the integer component.
// This requires another integer digit separator flag to be set.
// For example, if `integer_internal_digit_separator` and this flag are set,
// `3__4.01` is valid, but `__34.01`, `34__.01` and `34.0__1` are not.
let integer_consecutive_digit_separator = false;

// Allow consecutive separators in the fraction component.
// This requires another fraction digit separator flag to be set.
// For example, if `fraction_internal_digit_separator` and this flag are set,
// `34.0__1` is valid, but `34.__01`, `34.01__` and `3__4.01` are not.
let fraction_consecutive_digit_separator = false;

// Allow consecutive separators in the exponent component.
// This requires another exponent digit separator flag to be set.
// For example, if `exponent_internal_digit_separator` and this flag are set,
// `1.0e6__7` is valid, but `1.0e__67`, `1.0e67__` and `1__2.0e67` are not.
let exponent_consecutive_digit_separator = false;

// Allow digit separators in special values.
// If set, digit separators in special values will be ignored.
// `N_a_N__` is valid, but `i_n_f_e` is not.
let special_digit_separator = false;

// Compile the grammar.
let format = NumberFormat::compile(
    digit_separator,
    required_integer_digits,
    required_fraction_digits,
    required_exponent_digits,
    no_positive_mantissa_sign,
    required_mantissa_sign,
    no_exponent_notation,
    no_positive_exponent_sign,
    required_exponent_sign,
    no_exponent_without_fraction,
    no_special,
    case_sensitive_special,
    integer_internal_digit_separator,
    fraction_internal_digit_separator,
    exponent_internal_digit_separator,
    integer_leading_digit_separator,
    fraction_leading_digit_separator,
    exponent_leading_digit_separator,
    integer_trailing_digit_separator,
    fraction_trailing_digit_separator,
    exponent_trailing_digit_separator,
    integer_consecutive_digit_separator,
    fraction_consecutive_digit_separator,
    exponent_consecutive_digit_separator,
    special_digit_separator
).unwrap();
```
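The idea of compiling boolean options into 1-bit flags can be sketched without the crate. The layout below is purely an assumption for illustration: the constant names, bit positions, and the `compile`, `has`, and `separator` helpers are hypothetical and do not reflect lexical-core's actual `NumberFormat` encoding.

```rust
// Hypothetical flag layout; these bit positions are assumptions for this
// sketch, not lexical-core's actual `NumberFormat` encoding.
const REQUIRED_INTEGER_DIGITS: u64 = 1 << 0;
const REQUIRED_FRACTION_DIGITS: u64 = 1 << 1;
const NO_SPECIAL: u64 = 1 << 2;
// The digit separator byte is stored in the top 8 bits.
const SEPARATOR_SHIFT: u32 = 56;

/// Pack a separator byte and three boolean options into one `u64`.
fn compile(separator: u8, req_int: bool, req_frac: bool, no_special: bool) -> u64 {
    let mut flags = (separator as u64) << SEPARATOR_SHIFT;
    if req_int {
        flags |= REQUIRED_INTEGER_DIGITS;
    }
    if req_frac {
        flags |= REQUIRED_FRACTION_DIGITS;
    }
    if no_special {
        flags |= NO_SPECIAL;
    }
    flags
}

/// Test a single flag with one bitwise AND.
fn has(flags: u64, flag: u64) -> bool {
    flags & flag != 0
}

/// Recover the digit separator byte.
fn separator(flags: u64) -> u8 {
    (flags >> SEPARATOR_SHIFT) as u8
}

fn main() {
    let format = compile(b'_', true, false, true);
    println!("required integer digits: {}", has(format, REQUIRED_INTEGER_DIGITS));
    println!("separator: {}", separator(format) as char);
}
```

Once compiled, every option check in the parser's hot loop is a single mask test, which is why a packed representation suits high-performance parsing.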

# Configuration

Lexical-core also includes configuration options that allow you to configure float processing and formatting. These are provided as getters and setters, so lexical-core can validate the input.

- **NaN**
    - `get_nan_string`
    - `set_nan_string`
    <blockquote>The representation of Not a Number (NaN) as a string (default <code>b"NaN"</code>). For float parsing, lexical-core uses case-insensitive comparisons. This string <b>must</b> start with an <code>'N'</code> or <code>'n'</code>.</blockquote>
- **Short Infinity**
    - `get_inf_string`
    - `set_inf_string`
    <blockquote>The short, default representation of infinity as a string (default <code>b"inf"</code>). For float parsing, lexical-core uses case-insensitive comparisons. This string <b>must</b> start with an <code>'I'</code> or <code>'i'</code>.</blockquote>
- **Long Infinity**
    - `get_infinity_string`
    - `set_infinity_string`
    <blockquote>The long, backup representation of infinity as a string (default <code>b"infinity"</code>). The long infinity must be at least as long as the short infinity, and will only be used during float parsing (and is case-insensitive). This string <b>must</b> start with an <code>'I'</code> or <code>'i'</code>.</blockquote>
- **Exponent Default Character**
    - `get_exponent_default_char`
    - `set_exponent_default_char`
    <blockquote>The default character designating the exponent component of a float (default <code>b'e'</code>) for strings with a radix less than 15 (including decimal strings). For float parsing, lexical-core uses case-insensitive comparisons. This value should not be in the character set <code>[0-9a-eA-E.+\-]</code>.</blockquote>
- **Exponent Backup Character** (radix only)
    - `get_exponent_backup_char`
    - `set_exponent_backup_char`
    <blockquote>The backup character designating the exponent component of a float (default <code>b'^'</code>) for strings with a radix greater than or equal to 15. This value should not be in the character set <code>[0-9a-zA-Z.+\-]</code>.</blockquote>
- **Float Rounding** (rounding only)
    - `get_float_rounding`
    - `set_float_rounding`
    <blockquote>The IEEE754 float-rounding scheme to be used during float parsing. In almost every case, this should be set to <code>RoundingKind::NearestTieEven</code>.</blockquote>

# Constants

Lexical-core also includes a few constants to simplify interfacing with number-to-string code. They are implemented for the `lexical_core::Number` trait, which is required by `ToLexical`.

- **FORMATTED_SIZE** The maximum number of bytes a formatter may write.
    <blockquote>For example, <code>lexical_core::write_radix::&lt;i32&gt;</code> may write up to <code>i32::FORMATTED_SIZE</code> characters. This constant may significantly overestimate the number of characters required for decimal strings when the radix feature is enabled.</blockquote>
- **FORMATTED_SIZE_DECIMAL** The maximum number of bytes a formatter may write in decimal (base 10).
    <blockquote>For example, <code>lexical_core::write::&lt;i32&gt;</code> may write up to <code>i32::FORMATTED_SIZE_DECIMAL</code> characters.</blockquote>

These are provided as Rust constants so they may be used as the size element in arrays.
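
As a rough illustration of why associated constants are convenient here, the std-only sketch below sizes a stack buffer from a trait constant. The `DecimalSize` trait, its `MAX_DECIMAL_BYTES` constant, and `write_u32` are hypothetical stand-ins written for this example; they are not lexical-core's `Number` trait or its writers.

```rust
/// Hypothetical stand-in for the idea behind `FORMATTED_SIZE_DECIMAL`.
trait DecimalSize {
    const MAX_DECIMAL_BYTES: usize;
}

impl DecimalSize for u32 {
    // u32::MAX is 4294967295, which has 10 decimal digits.
    const MAX_DECIMAL_BYTES: usize = 10;
}

/// Write `value` in base 10 at the front of `buf` and return the bytes
/// written. Panics if `buf` is too short, so callers should size it from
/// the associated constant.
fn write_u32(mut value: u32, buf: &mut [u8]) -> &[u8] {
    let mut tmp = [0u8; <u32 as DecimalSize>::MAX_DECIMAL_BYTES];
    let mut start = tmp.len();
    // Fill the scratch buffer from the least significant digit backwards.
    loop {
        start -= 1;
        tmp[start] = b'0' + (value % 10) as u8;
        value /= 10;
        if value == 0 {
            break;
        }
    }
    let len = tmp.len() - start;
    buf[..len].copy_from_slice(&tmp[start..]);
    &buf[..len]
}

fn main() {
    // The constant is usable as an array length, so no allocation is needed.
    let mut buf = [0u8; <u32 as DecimalSize>::MAX_DECIMAL_BYTES];
    let written = write_u32(15, &mut buf);
    println!("{}", String::from_utf8_lossy(written));
}
```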

# Documentation

Lexical-core's documentation can be found on [docs.rs](https://docs.rs/lexical-core).

# Validation

Float parsing is difficult to do correctly, and major bugs have been found in implementations from [glibc's strtod](https://www.exploringbinary.com/glibc-strtod-incorrectly-converts-2-to-the-negative-1075/) to [Python](https://bugs.python.org/issue7632). In order to validate the accuracy of lexical, we employ the following external tests:

1. Hrvoje Abraham's [strtod](https://github.com/ahrvoje/numerics/tree/master/strtod) test cases.
2. Rust's [test-float-parse](https://github.com/rust-lang/rust/tree/64185f205dcbd8db255ad6674e43c63423f2369a/src/etc/test-float-parse) unit tests.
3. Testbase's [stress tests](https://www.icir.org/vern/papers/testbase-report.pdf) for converting from decimal to binary.
4. [Various](https://www.exploringbinary.com/glibc-strtod-incorrectly-converts-2-to-the-negative-1075/) [difficult](https://www.exploringbinary.com/how-glibc-strtod-works/) [cases](https://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/) reported on blogs.

Although lexical may contain bugs leading to rounding error, it is tested against a comprehensive suite of random-data and near-halfway representations, and should be fast and correct for the vast majority of use-cases.

# Implementation Details

## Float to String

For more information on the Grisu2 and Grisu3 algorithms, see [Printing Floating-Point Numbers Quickly and Accurately with Integers](https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf).

For more information on the Ryu algorithm, see [Ryū: fast float-to-string conversion](https://dl.acm.org/citation.cfm?id=3192369).

## String to Float

In order to implement an efficient parser in Rust, lexical uses the following steps:

1. We ignore the sign until the end, and merely toggle the sign bit after creating a correct representation of the positive float.
2. We handle special floats, such as "NaN", "inf", "Infinity". If we do not have a special float, we continue to the next step.
3. We parse up to 64-bits from the string for the mantissa, ignoring any trailing digits, and parse the exponent (if present) as a signed 32-bit integer. If the exponent overflows or underflows, we set the value to `i32::max_value()` or `i32::min_value()`, respectively.
4. **Fast Path** We then try to create an exact representation of a native binary float from the parsed mantissa and exponent. If both can be exactly represented, we multiply the two to create an exact representation, since IEEE754 floats mandate the use of guard digits to minimize rounding error. If either component cannot be exactly represented as the native float, we continue to the next step.
5. **Moderate Path** We create an approximate, extended, 80-bit float type (64-bits for the mantissa, 16-bits for the exponent) from both components, and multiply them together. This minimizes the rounding error, through guard digits. We then estimate the error from the parsing and multiplication steps, and if the float +/- the error differs significantly from `b+h`, we return the correct representation (`b` or `b+u`). If we cannot unambiguously determine the correct floating-point representation, we continue to the next step.
6. **Fallback Moderate Path** Next, we create a 128-bit representation of the numerator and denominator for `b+h`, to disambiguate `b` from `b+u` by comparing the actual digits in the input to theoretical digits generated from `b+h`. This is accurate for ~36 significant digits from a 128-bit approximation with decimal float strings. If the input is less than or equal to 36 digits, we return the value from this step. Otherwise, we continue to the next step.
7. **Slow Path** We use arbitrary-precision arithmetic to disambiguate the correct representation without any rounding error. We create an exact representation of the input digits as a big integer, to determine how to round the top 53 bits for the mantissa. If there is a fraction or a negative exponent, we create a representation of the significant digits for `b+h` and scale the input digits by the binary exponent in `b+h`, and scale the significant digits in `b+h` by the decimal exponent, and compare the two to determine if we need to round up or down.
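
The fast path in step 4 can be sketched for `f64` as follows. This is a simplified illustration under stated assumptions, not lexical's code: `fast_path` and `POW10` are hypothetical names, the bounds (mantissa below 2^53, decimal exponent within ±22) are the classic limits for exactly representable operands, and the sketch divides by a positive power of ten for negative exponents because negative powers of ten are not exactly representable in binary.

```rust
/// Powers of ten that are exactly representable as `f64` (10^0 to 10^22).
const POW10: [f64; 23] = [
    1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11,
    1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22,
];

/// Returns `Some(value)` when `mantissa * 10^exponent` can be computed with
/// a single, correctly rounded operation on exact operands, and `None` when
/// a slower path is required.
fn fast_path(mantissa: u64, exponent: i32) -> Option<f64> {
    // The mantissa must fit in the 53-bit significand of an f64.
    if mantissa >> 53 != 0 {
        return None;
    }
    let value = mantissa as f64;
    if exponent >= 0 && exponent <= 22 {
        Some(value * POW10[exponent as usize])
    } else if exponent < 0 && exponent >= -22 {
        Some(value / POW10[(-exponent) as usize])
    } else {
        None
    }
}

fn main() {
    // "3.5" parses to mantissa 35 with decimal exponent -1.
    println!("{:?}", fast_path(35, -1));
    // Exponent outside the exact range: fall through to the next path.
    println!("{:?}", fast_path(1, 23));
}
```

Because both operands are exact, IEEE754 guarantees the single multiplication or division is correctly rounded, so no error estimate is needed on this path.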

Since arbitrary-precision arithmetic is slow and scales poorly for decimal strings with many digits or exponents of high magnitude, lexical also supports a lossy algorithm, which returns the result from the moderate path. The result from the lossy parser should be accurate to within 1 ULP.
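
The extended-float multiplication that the moderate path relies on can be sketched as below. This is an assumption-laden illustration rather than lexical's implementation: the `ExtendedFloat` struct here is hypothetical, it uses an `i32` exponent for simplicity, and it rounds half up on the discarded bits without tracking the accumulated error that the real algorithm estimates.

```rust
/// Hypothetical extended float for this sketch: value = mant * 2^exp.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ExtendedFloat {
    mant: u64,
    exp: i32,
}

impl ExtendedFloat {
    /// Shift the mantissa left until its top bit is set (no-op for zero),
    /// so that all 64 bits carry precision.
    fn normalize(self) -> ExtendedFloat {
        if self.mant == 0 {
            return self;
        }
        let shift = self.mant.leading_zeros();
        ExtendedFloat {
            mant: self.mant << shift,
            exp: self.exp - shift as i32,
        }
    }

    /// Multiply two normalized values, keeping the high 64 bits of the
    /// 128-bit product and rounding half up on the discarded low bits.
    /// The result may need one more `normalize` call.
    fn mul(self, other: ExtendedFloat) -> ExtendedFloat {
        let product = (self.mant as u128) * (other.mant as u128);
        let high = (product >> 64) as u64;
        let round_up = ((product >> 63) & 1) as u64;
        ExtendedFloat {
            mant: high + round_up,
            exp: self.exp + other.exp + 64,
        }
    }
}

fn main() {
    let three = ExtendedFloat { mant: 3, exp: 0 }.normalize();
    let five = ExtendedFloat { mant: 5, exp: 0 }.normalize();
    // The product represents 15: mant = 15 << 59, exp = -59.
    println!("{:?}", three.mul(five));
}
```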

## Arbitrary-Precision Arithmetic

Lexical uses arbitrary-precision arithmetic to exactly represent strings between two floating-point representations, and this logic is highly optimized for performance. The following section is a comparison of different algorithms to determine the correct float representation. The arbitrary-precision arithmetic logic is not dependent on memory allocation: it only uses the heap when the `radix` feature is enabled.

## Algorithm Background and Comparison

For close-to-halfway representations of a decimal string `s`, where `s` lies close to the midpoint between two adjacent representations, `b` and the next float `b+u`, arbitrary-precision arithmetic is used to determine the correct representation. This means `s` is close to `b+h`, the halfway point between `b` and `b+u` (so `h` is half of `u`).

For the following example, we will use the following values for our test case:

379* `s = 2.4703282292062327208828439643411068618252990130716238221279284125033775363510437593264991818081799618989828234772285886546332835517796989819938739800539093906315035659515570226392290858392449105184435931802849936536152500319370457678249219365623669863658480757001585769269903706311928279558551332927834338409351978015531246597263579574622766465272827220056374006485499977096599470454020828166226237857393450736339007967761930577506740176324673600968951340535537458516661134223766678604162159680461914467291840300530057530849048765391711386591646239524912623653881879636239373280423891018672348497668235089863388587925628302755995657524455507255189313690836254779186948667994968324049705821028513185451396213837722826145437693412532098591327667236328125001e-324`
* `b = 0.0`
381* `b+h = 2.4703282292062327208828439643411068618252990130716238221279284125033775363510437593264991818081799618989828234772285886546332835517796989819938739800539093906315035659515570226392290858392449105184435931802849936536152500319370457678249219365623669863658480757001585769269903706311928279558551332927834338409351978015531246597263579574622766465272827220056374006485499977096599470454020828166226237857393450736339007967761930577506740176324673600968951340535537458516661134223766678604162159680461914467291840300530057530849048765391711386591646239524912623653881879636239373280423891018672348497668235089863388587925628302755995657524455507255189313690836254779186948667994968324049705821028513185451396213837722826145437693412532098591327667236328125e-324`
* `b+u = 5e-324`
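
The relationship between `b` and `b+u` in this test case can be checked with the standard library alone. The helper name `next_up` is an assumption made for this sketch; it relies on the fact that, for finite non-negative floats, incrementing the bit pattern gives the next larger float.

```rust
/// Next representable `f64` above a finite, non-negative `b` (that is, `b+u`).
fn next_up(b: f64) -> f64 {
    // For non-negative finite floats, the IEEE754 bit patterns are ordered
    // like unsigned integers, so adding 1 steps to the adjacent float.
    f64::from_bits(b.to_bits() + 1)
}

fn main() {
    let b = 0.0_f64;
    let b_plus_u = next_up(b);
    // `b+u` is the smallest positive subnormal, 2^-1074.
    println!("b+u = {:e} (bits = {})", b_plus_u, b_plus_u.to_bits());
    // The halfway point `b+h` is 2^-1075, which no f64 can represent; this
    // is why deciding between `b` and `b+u` needs extra precision.
}
```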

**Algorithm M**

Algorithm M represents the significant digits of a float as a fraction of arbitrary-precision integers (a more in-depth description can be found [here](https://www.exploringbinary.com/correct-decimal-to-floating-point-using-big-integers/)). For example, 1.23 would be 123/100, while 314.159 would be 314159/1000. We then scale the numerator and denominator by powers of 2 until the quotient is in the range `[2^52, 2^53)`, generating the correct significant digits of the mantissa.

A naive implementation, in Python, is as follows:

```python
def algorithm_m(num, den):
    # Track the binary exponent so that the original value always
    # equals (num / den) * 2**exp.
    exp = 0

    # Pre-scale small numerators so the loop needs fewer steps.
    if num.bit_length() <= 53:
        num <<= 53
        exp -= 53

    # Track number of steps required (optional).
    steps = 0
    while True:
        steps += 1
        c = num // den
        if c < 2**52:
            # Quotient too small: double the ratio without losing bits.
            num *= 2
            exp -= 1
        elif c >= 2**53:
            # Quotient too large: halve the ratio.
            den *= 2
            exp += 1
        else:
            break

    return (num, den, exp, steps - 1)
```

**bigcomp**

Bigcomp is the canonical string-to-float parser, which creates an exact representation of `b+h` as a big integer, and compares the theoretical digits from `b+h` scaled into the range `[1, 10)` by a power of 10 to the actual digits in the input string (a more in-depth description can be found [here](https://www.exploringbinary.com/bigcomp-deciding-truncated-near-halfway-conversions/)). A maximum of 768 digits need to be compared to determine the correct representation, and the size of the big integers in the ratio does not depend on the number of digits in the input string.

Bigcomp is used as a fallback algorithm for lexical-core when the radix feature is enabled, since the radix-representation of a binary float may never terminate if the radix is not divisible by 2. Since bigcomp uses constant memory, it is used as the default algorithm if more than `2^15` digits are passed and the representation is potentially non-terminating.
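
The digit-by-digit comparison at the heart of this approach can be sketched with native integers. This is a toy under stated assumptions, not lexical's bigcomp: `compare_to_fraction` is a hypothetical helper, the ratio is a plain `u128` fraction in `[0, 1)` rather than a big-integer ratio scaled into `[1, 10)`, and the input is assumed to contain only ASCII digits.

```rust
use std::cmp::Ordering;

/// Compare the fractional decimal digits in `input_digits` against the
/// digits of `num / den` (with `num < den`), generated one at a time.
/// Assumes `input_digits` holds only ASCII digits and `10 * den` fits in u128.
fn compare_to_fraction(input_digits: &[u8], mut num: u128, den: u128) -> Ordering {
    for &byte in input_digits {
        // Generate the next theoretical digit with one division.
        num *= 10;
        let digit = (num / den) as u8;
        num %= den;
        let actual = byte - b'0';
        match actual.cmp(&digit) {
            Ordering::Equal => continue,
            other => return other,
        }
    }
    // Input exhausted: it is smaller if the fraction still has digits left.
    if num == 0 {
        Ordering::Equal
    } else {
        Ordering::Less
    }
}

fn main() {
    // Treat 1/8 = 0.125 as the halfway point.
    println!("{:?}", compare_to_fraction(b"1249", 1, 8));
    println!("{:?}", compare_to_fraction(b"125", 1, 8));
    println!("{:?}", compare_to_fraction(b"1251", 1, 8));
}
```

Only the running remainder is kept between digits, which mirrors why the memory use does not grow with the length of the input string.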

**bhcomp**

Bhcomp is a simple, performant algorithm that compares the significant digits to the theoretical significant digits for `b+h`. Simply, the significant digits from the string are parsed, creating a ratio. A ratio is generated for `b+h`, and these two ratios are scaled using the binary and radix exponents.

For example, "2.470328e-324" produces a ratio of `2470328/10^329`, while `b+h` produces a binary ratio of `1/2^1075`. We're looking to compare these ratios, so we need to scale them using common factors. Here, we convert this to `(2470328*5^329*2^1075)/10^329` and `(1*5^329*2^1075)/2^1075`, which converts to `2470328*2^746` and `1*5^329`.

Our significant digits (real_digits) and `b+h` (bh_digits) therefore start like:
```
real_digits = 91438982...
bh_digits   = 91438991...
```

Since our real digits are below the theoretical halfway point, we know we need to round down, meaning our literal value is `b`, or `0.0`. This approach allows us to calculate whether we need to round up or down with a single comparison step, without any native divisions required. This is the default algorithm lexical-core uses.
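
The division-free comparison can be sketched with native integers by cross-multiplying the two ratios. This is a toy illustration, not lexical's bhcomp: `compare_to_halfway` is a hypothetical helper, it uses `u128` where the real algorithm needs big integers, and it will overflow for large exponents. The example uses `f32`, where the halfway point between `1.0` and the next float is `(2^24 + 1) / 2^24`.

```rust
use std::cmp::Ordering;

/// Compare `digits / 10^dec_exp` against `mant / 2^bin_exp` without any
/// division, by cross-multiplying: digits * 2^bin_exp vs mant * 10^dec_exp.
/// Only valid while both products fit in a u128 (an assumption of this toy).
fn compare_to_halfway(digits: u128, dec_exp: u32, mant: u128, bin_exp: u32) -> Ordering {
    let lhs = digits << bin_exp;
    let rhs = mant * 10u128.pow(dec_exp);
    lhs.cmp(&rhs)
}

fn main() {
    // Halfway between 1.0f32 and the next float is (2^24 + 1) / 2^24.
    let halfway_mant = (1u128 << 24) + 1;
    // "1.00000006" is 100000006 / 10^8: above halfway, so round up.
    println!("{:?}", compare_to_halfway(100000006, 8, halfway_mant, 24));
    // "1.00000005" is 100000005 / 10^8: below halfway, so round down.
    println!("{:?}", compare_to_halfway(100000005, 8, halfway_mant, 24));
}
```

The outcome of the comparison directly selects `b+u` (greater) or `b` (less); a tie would be resolved by the rounding mode.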

**Other Optimizations**

1. We remove powers of 2 during exponentiation in bhcomp.
2. We limit the number of parsed digits to the theoretical max number of digits produced by `b+h` (768 for decimal strings), and merely compare any trailing digits to '0'. This provides an upper-bound on the computation cost.
3. We use fast exponentiation and multiplication algorithms to scale the significant digits for comparison.
4. For the fallback bigcomp algorithm, we use a division algorithm optimized for the generation of a single digit from a given radix, by setting the leading bit in the denominator 4 below the most-significant bit (in decimal strings). This requires only 1 native division per digit generated.
5. The individual "limbs" of the big integers are optimized to the architecture we compile on, for example, u32 on x86 and u64 on x86-64, minimizing the number of native operations required. Currently, 64-bit limbs are used on target architectures `aarch64`, `powerpc64`, `mips64`, and `x86_64`.
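
To illustrate what architecture-sized limbs mean in practice, here is a small std-only sketch of big-integer addition. It is not lexical's code: the `Limb` alias is selected by pointer width as a simplifying assumption (the list above selects by target architecture), `add_assign` is a hypothetical helper, and a `Vec` is used for brevity even though the crate avoids the heap by default.

```rust
// Assumption for this sketch: choose the limb by pointer width.
#[cfg(target_pointer_width = "64")]
type Limb = u64;
#[cfg(not(target_pointer_width = "64"))]
type Limb = u32;

/// Add `y` into the little-endian big integer `x` in place, propagating
/// the carry limb by limb and growing `x` if the sum needs another limb.
fn add_assign(x: &mut Vec<Limb>, y: &[Limb]) {
    if x.len() < y.len() {
        x.resize(y.len(), 0);
    }
    let mut carry = false;
    for (i, xi) in x.iter_mut().enumerate() {
        let yi = if i < y.len() { y[i] } else { 0 };
        // One native add per limb, plus one for the incoming carry.
        let (sum, c1) = xi.overflowing_add(yi);
        let (sum, c2) = sum.overflowing_add(carry as Limb);
        *xi = sum;
        carry = c1 || c2;
    }
    if carry {
        x.push(1);
    }
}

fn main() {
    // All bits set in two limbs, plus one, carries into a third limb.
    let mut x: Vec<Limb> = vec![!0, !0];
    add_assign(&mut x, &[1]);
    println!("{:?}", x);
}
```

Wider limbs halve the number of loop iterations for the same number of bits, which is the motivation for matching the limb to the native word size.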

# Known Issues

On the ARMv6 architecture, the stable exponentiation for the fast, incorrect float parser is not fully stable. For example, `1e-300` is correct, while `5e-324` rounds to `0`, leading to "5e-324" being incorrectly parsed as `0`. This does not affect the default, correct float parser, nor the ARMv7 or ARMv8 (aarch64) architectures. This bug can compound errors in the incorrect parser (feature-gated by disabling the `correct` feature). It is not known if this bug is an artifact of Qemu emulation of ARMv6, or is actually representative of the hardware.

Versions of lexical-core prior to 0.4.3 could round parsed floating-point numbers with an error of up to 1 ULP. This occurred for strings with 16 or more digits and a trailing 0 in the fraction, where the `b+h` comparison in the slow-path algorithm incorrectly scaled the theoretical digits due to an over-calculated real exponent. This affects a very small percentage of inputs; however, it is recommended to update immediately.

# Versioning and Version Support

**Version Support**

The currently supported versions are:
- v0.7.x
- v0.6.x (Maintenance)

**Rustc Compatibility**

- v0.7.x supports Rustc 1.37+, including stable, beta, and nightly.
- v0.6.x supports Rustc 1.24+, including stable, beta, and nightly.

Please report any errors compiling a supported lexical-core version on a compatible Rustc version.

**Versioning**

Lexical-core uses [semantic versioning](https://semver.org/). Removing support for Rustc versions newer than the latest stable Debian or Ubuntu version is considered an incompatible API change, requiring a major version change.

# Changelog

All changes since 0.4.1 are documented in [CHANGELOG](CHANGELOG).

# License

Lexical-core is dual licensed under the Apache 2.0 license as well as the MIT license. See the LICENSE-MIT and the LICENSE-APACHE files for the licenses.

Lexical-core also ports some code from [rust](https://github.com/rust-lang/rust) (for backwards compatibility), [V8](https://github.com/v8/v8), [libgo](https://golang.org/src) and [fpconv](https://github.com/night-shift/fpconv), and therefore might be subject to the terms of a 3-clause BSD license or BSD-like license.

# Contributing

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in lexical by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.