Skip to content

Finish and fix float printing #930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 29, 2018
Merged

Finish and fix float printing #930

merged 4 commits into from
Apr 29, 2018

Conversation

tiehuis
Copy link
Member

@tiehuis tiehuis commented Apr 17, 2018

WIP: See #375.

A start at finishing off the float printing once and for all. I've pushed early since I want to confirm the changed behaviour.

For now, I've only done the formatFloatDecimal function. I've adjusted the code so we now round the last digit if required so we are in line with what glibc is doing. Was there any reason behind truncating or has we just not got around to implementing rounding? I've also fixed the negative exponential cases which were not handled previously. Infinity and NaN were renamed to match C.

Here is a test program which I'd like to get passing before merging this. We'll need one for formatFloat, too, and should also randomly generate f64/double values and ensure coverage looks good on that area too.

Currently we print the first 713620 values the same as libc. I'll work through the ones which don't and add test cases as new errors are encountered.

const std = @import("std");

const c = @cImport(@cInclude("stdio.h"));

pub fn main() !void {
    var i: u32 = 0;
    while (true) : (i += 1) {
        // Need a double since c only prints double representations.
        var f = f64(@bitCast(f32, i));

        var c_buf: [64]u8 = undefined;
        var c_len = usize(c.snprintf(&c_buf[0], c_buf.len, c"%.5f", f));

        var z_buf: [64]u8 = undefined;
        var z_len = (try std.fmt.bufPrint(z_buf[0..], "{.5}", f)).len;

        if (c_len != z_len or !std.mem.eql(u8, c_buf[0..c_len], z_buf[0..z_len])) {
            _ = c.printf(c"full repr: %e\n", f);
            std.debug.warn(
                \\not equal: (i = {})
                \\
                \\   c: {}
                \\ zig: {}
                \\
                \\
                ,
                i,
                c_buf[0..c_len],
                z_buf[0..z_len],
            );
            break;
        }

        const progress = 1 << 16;
        if (i % progress == progress - 1) {
            std.debug.warn("i = {}\n", i);
        }

        if (i == @maxValue(u32)) {
            std.debug.warn("all ok!\n");
            break;
        }
    }
}

@tiehuis tiehuis changed the title Add float rounding and fix negative exponential cases Finish and fix float printing Apr 17, 2018
@tiehuis
Copy link
Member Author

tiehuis commented Apr 17, 2018

Currently tested up to i = 916,964,780 until the next error. Will finish the remainder tomorrow.

@andrewrk
Copy link
Member

which build mode are you testing in?

@tiehuis
Copy link
Member Author

tiehuis commented Apr 17, 2018

Debug mode only. Since there are outstanding issues for release modes I'll consider that in a separate PR once the base logic is known to work.

@tiehuis
Copy link
Member Author

tiehuis commented Apr 18, 2018

Up to 1518338056 now. I think there is an issue with the actual errol reference code right now on some larger inputs. Will double check and confirm tomorrow.

With the reference code:

#include <stdbool.h>
#include <stdio.h>

#include "lib/errol.h"

int main(void)
{
    double d = 18014400656965632;
    char buf[64];

    errol3u_dtoa(d, buf);

    printf("libc:  %f\n", d);
    printf("errol: %s\n", buf);
}

We get:

libc:  18014400656965632.000000
errol: 1801440065696563

In the zig port we actually seem to fill the buffer correctly (with the trailing 2) but the length of the slice is one short. I don't think I've overlooked anything obvious here. I'll try diagnose the issue and make a fix before reporting just yet.


Also, here is an updated test script to ignore the wierd y.xxx25 rounding that libc is doing.

const std = @import("std");
const c = @cImport(@cInclude("stdio.h"));

pub fn main() !void {
    var i: u32 = 1518338049;
    outer: while (true) : (i += 1) {
        // Need a double since c only prints double representations.
        var f = f64(@bitCast(f32, i));

        var c_buf: [64]u8 = undefined;
        var c_len = usize(c.snprintf(&c_buf[0], c_buf.len, c"%.5f", f));

        var z_buf: [64]u8 = undefined;
        var z_len = (try std.fmt.bufPrint(z_buf[0..], "{.5}", f)).len;

        if (c_len != z_len or !std.mem.eql(u8, c_buf[0..c_len], z_buf[0..z_len])) {
            // 0.015625 rounds to 0.01562 with libc but we round it to 0.01563.
            // Ignore these error cases.
            var ec_buf: [64]u8 = undefined;
            var ec_len = usize(c.snprintf(&ec_buf[0], ec_buf.len, c"%.6f", f));

            if (ec_buf[ec_len-2] == '2' and ec_buf[ec_len-1] == '5') {
                // ignore (these are moderately common (every few million) for small values.
            } else {
                _ = c.printf(c"full repr: %e\n", f);
                std.debug.warn(
                    \\not equal: (i = {})
                    \\
                    \\   c: {}
                    \\ zig: {}
                    \\
                    \\
                    ,
                    i,
                    c_buf[0..c_len],
                    z_buf[0..z_len],
                );
                break;
            }
        }

        const progress = 1 << 20;
        if (i % progress == progress - 1) {
            std.debug.warn("i = {}\n", i);
        }

        if (i == @maxValue(u32)) {
            std.debug.warn("all ok!\n");
            break;
        }
    }
}

Feels like floating point is one edge case after another.

@andrewrk
Copy link
Member

That's really frustrating. It might be worth giving up errol and having simpler, smaller code for parsing and printing floats. This is what I started in the floats branch. Although looking at it, I guess I didn't start the work on formatting, just parsing.

Even if we had errol in the standard library for speed, we still might want this smaller code implementation, and the default implementation would choose based on --release-fast or --release-small.

@thejoshwolfe
Copy link
Contributor

Just to clarify something: I'm aware of Zig's general stance on non-finite float values, which is that they usually shouldn't happen, but this formatting code still accepts non-finite values. Is there any kind of low level "fast math" formatting API that we want to have that does not accept non-finite values? Would we gain anything from such a function?

JSON, for example, never needs to deal with non-finite floats.

@andrewrk
Copy link
Member

I'm not convinced that optimization is worth it. I think the 2 or 3 branches that check for non-finite float values are a negligible slice of the pie chart that is formatting float values.

Also, even though the default float mode is Optimized, we still support Strict at a scope level. My stance is that formatting floats should work for all values.

@tiehuis
Copy link
Member Author

tiehuis commented Apr 19, 2018

I think errol3 is still fine. This is the first edge case encountered after 1.5 billion floats so it's not too bad. I think we should strive for an accurate implementation and compared to other methods this strikes the best compromise as far as I can see.

The code size is slightly large, but the only rationale for having a simpler float printing would be embedded in my view. 5-10Kb is worth the cost of a good implementation in all other cases. At least if someone isn't using the float formatting then it won't be compiled in.

@tiehuis
Copy link
Member Author

tiehuis commented Apr 19, 2018

I've had a bit of a closer look into this and I'm fairly certain there is no bug in errol. This looks like a non-optimal rounding accuracy in the libc implementation.

You can test Grisu3 as well, which is used by most major browser engines, by pasting 18014400656965632 into your console. You should see it rounded down to 18014400656965630. I'll instead see if I can skip these large integer examples from the remaining tests and proceed as normal.

@tiehuis
Copy link
Member Author

tiehuis commented Apr 20, 2018

I've tested the {.5} format against libc now for all 32-bit floating point numbers and we are now identical with the following exceptions. I retested these with the latest changes so the first 1.5 billion are still good:

  • 0.015625 rounding as 0.01562 in libc vs. zigs 0.01563
  • Errol3 performs different rounding of integer values above ~9e10 as mentioned above.

What remains now is to do the following:

  • Implement rounding/corrections to the 0.00e-43 formatting code. This is largely the same except we can remove the leading zero printing and print only the sf fractional digits.
  • Test some of the edge case formatting specifiers e.g. {.1}, {.20} with randomly generated doubles.
  • Test the exp code when written with some randomly generated doubles
  • Check the Windows failure on appveyor

@tiehuis
Copy link
Member Author

tiehuis commented Apr 20, 2018

Few thoughts/questions:

  • We default to scientific notation as this is most consistent for large/small numbers.
  • I think it would be useful if the default float format printed in full precision to get the property that we can always re-parse values back to the original value.
  • In saying that, we could specify {.} to print full precision by default (instead of 5)? Although that is less useful. Some way to print full precision decimals may be nice however.
  • Do we want a new float format specific (such as {e}) for scientific notation so one can specify the precision?

@andrewrk
Copy link
Member

I think it would be useful if the default float format printed in full precision to get the property that we can always re-parse values back to the original value.

I agree with this. My proposal would be {} to print a value that we could parse back into the original value. Users can use {.5} if they want smaller output.

Do we want a new float format specific (such as {e}) for scientific notation so one can specify the precision?

This makes sense to me. Perhaps we can find some inspiration from Python3's docs for formatting.

@tiehuis
Copy link
Member Author

tiehuis commented Apr 22, 2018

This is mostly complete now. The windows test failure is a bit annoying. I'll see if I can test under wine or otherwise I'll set up a vm for it. Consider the formatting notation of scientific values temporary. I would like to review this the formatting string language completely and revamp as needed.

I've tested scientific/decimal rounding to a precision of 5 for all f32 values against libc and they all match with a few minor exceptions which are noted in the tests.

const std = @import("std");

pub fn main() void {
    const a = f64(1.2345678);
    std.debug.warn("{}\n", a);
    std.debug.warn("{.}\n", a);
    std.debug.warn("{e}\n", a);
    std.debug.warn("{.0}\n", a);
    std.debug.warn("{e0}\n", a);
    std.debug.warn("{.1}\n", a);
    std.debug.warn("{e1}\n", a);
    std.debug.warn("{e9}\n", a);
    std.debug.warn("{.9}\n", a);

    std.debug.warn("\n");

    const b = f64(1.5e-20);
    std.debug.warn("{}\n", b);
    std.debug.warn("{.}\n", b);
    std.debug.warn("{e}\n", b);
    std.debug.warn("{.0}\n", b);
    std.debug.warn("{e0}\n", b);
    std.debug.warn("{.1}\n", b);
    std.debug.warn("{e1}\n", b);
    std.debug.warn("{e9}\n", b);
    std.debug.warn("{.9}\n", b);

    std.debug.warn("\n");

    const c = f64(1.52314e+29);
    std.debug.warn("{}\n", c);
    std.debug.warn("{.}\n", c);
    std.debug.warn("{e}\n", c);
    std.debug.warn("{.0}\n", c);
    std.debug.warn("{e0}\n", c);
    std.debug.warn("{.1}\n", c);
    std.debug.warn("{e1}\n", c);
    std.debug.warn("{e9}\n", c);
    std.debug.warn("{.9}\n", c);
}
1.2345678e+00
1.2345678
1.2345678e+00
1
1e+00
1.2
1.2e+00
1.234567800e+00
1.234567800

1.5e-20
0.000000000000000000015
1.5e-20
0
2e-20
0.0
1.5e-20
1.500000000e-20
0.000000000

1.52314e+29
152314000000000000000000000000
1.52314e+29
152314000000000000000000000000
2e+29
152314000000000000000000000000.0
1.5e+29
1.523140000e+29
152314000000000000000000000000.000000000

tiehuis added 2 commits April 23, 2018 17:22
 - Fix errors printing very small numbers
 - Add explicit scientific output mode
 - Add rounding based on a specific precision for both decimal/exp
 modes.
 - Test and confirm exp/decimal against libc for all f32 values. Various
 changes to better match libc.
@andrewrk
Copy link
Member

more details about the windows failure

integer division by zero

stack trace from msvc:

>	test.exe!00007ff7e995b0ad() Line 45	udivmod.zig
 	test.exe!00007ff7e9959732() Line 6	     udivmodti4.zig
 	test.exe!00007ff7e995b7aa() Line 8	    umodti3.zig
 	test.exe!errolInt(FloatDecimal * val, double buffer, []u8 *) Line 325	errol/index.zig
 	test.exe!errol3u(FloatDecimal * val, double buffer, []u8 *) Line 103	C
 	test.exe!errol3(FloatDecimal * value, double buffer, []u8 *) Line 95	C
 	test.exe!formatFloatDecimal(StackTrace * value, double maybe_precision, ?usize * context, BufPrintContext * output, error (StackTrace *, BufPrintContext *, []u8 *)) Line 421	C
 	test.exe!format(StackTrace * context, BufPrintContext * output, error (StackTrace *, BufPrintContext *, []u8 *) arg2, double) Line 167	C
 	test.exe!bufPrint(@typeOf(bufPrint).ReturnType.ErrorSet![]u8 * buf, StackTrace * arg1, []u8 *) Line 688	C
 	test.exe!fmt.format(StackTrace *) Line 859	fmt/index.zig
 	test.exe!main(StackTrace *) Line 11	C
 	test.exe!callMain() Line 96	C
 	test.exe!WinMainCRTStartup() Line 45	C
 	[External Code]	

I'm quite puzzled. Happy to try anything you suggest.

@andrewrk
Copy link
Member

Some ideas I thought of to try next time I get a chance to look into this:

  • inspect the registers and make sure the calling convention is respected
  • using gdb check what happens on linux in this test case and see if the control flow matches on windows

@andrewrk
Copy link
Member

I did not solve it but I have a bunch of clues.

One I believe is a red herring, but it is rather interesting - in this test case, we are invoking udivmodti4 with

a=152313999999999991610955792383
b=10000000000000000000

The correct answer for division is 15231400000 but udivmodti4 gives 15231399999. This matches upstream behavior with llvm compiler-rt. I sent a message to llvm-dev.

However I think this is a red herring because the answer is consistent on linux and windows. Also note that on windows udivmodti4 passes the quite comprehensive test suite, but this uses normal zig function calls and does not try to use u128 types across .o file boundaries.

Another clue is JuliaLang/julia#3081
I am not sure if this is still true, this was from LLVM 3.x.
But according to MSVC the division by zero happens in an if statement that checks if maybe_rem is not null, even though we passed null. So it seems the arguments are not being passed correctly.

This seems to point to a bug in LLVM since LLVM is the one generating calls to udivmodti4 in the first place, and we build compiler_rt.o with LLVM using the C calling convention. I will put together a small test case and send another email to llvm-dev.

@andrewrk
Copy link
Member

andrewrk commented Apr 25, 2018

Here's a small test that fails on windows but passes on linux:

const std = @import("std");

test "aoeu" {
    var a: u128 = 152313999999999991610955792383;
    var b: u128 = 10000000000000000000;
    var c = a / b;
    std.debug.assert(c == 15231399999);
}

Later I'll work on turning this into LLVM IR and trying to trigger the problem using only clang and LLVM, which should get some attention on a bug report.

@andrewrk
Copy link
Member

Here's the latest mail I sent to llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2018-April/122775.html

@andrewrk
Copy link
Member

andrewrk commented Apr 26, 2018

It seems that the problem here is that there is no ABI specification for how to handle i128 on Windows. It's not clear to me what calling convention LLVM is using to call udivmodti3, but the problem is that it is not matching the calling convention in std/special/compiler-rt/*:

export fn __udivti3(a: u128, b: u128) u128 {
    @setRuntimeSafety(builtin.is_test);
    return __udivmodti4(a, b, null);
}

So here's how to solve this problem:

  • Determine what is the calling convention that LLVM is using when it
    generates a call to __udivti3. Theoretically we should be able to inspect
    the assembly of the function call and that tells us the CC.
  • Update the definition of __udivti3 to match that CC. Ideally it would
    be one of the calling conventions we already have available. If
    necessary, use inline assembly,

@andrewrk
Copy link
Member

I believe the commit I just pushed should fix the compiler-rt ABI for windows.

@andrewrk andrewrk merged commit 0bb054e into master Apr 29, 2018
@andrewrk andrewrk deleted the float-printing branch April 29, 2018 02:43
@tiehuis
Copy link
Member Author

tiehuis commented Apr 29, 2018

Thanks for tracking that down and fixing.

@andrewrk
Copy link
Member

Thank you for doing the floating point fixes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants