Fix precision of float to string to max significant decimals #348

Merged
merged 6 commits into swiftlang:master from bugfix/float-string-precision
Dec 15, 2015

Conversation

@wokalski wokalski commented Dec 8, 2015

The default precision used for converting floating-point numbers to strings leads to many confusing results. If we take a Float32 value 1.00000000 and another value 1.00000012 of the same type, the two are obviously not equal. However, if we log them, we see the same value. A much more helpful display uses 9 decimal digits: [1.00000000 != 1.00000012], showing that the two values are in fact different.
(example taken from: http://www.boost.org/doc/libs/1_59_0/libs/test/doc/html/boost_test/test_output/log_floating_points.html)

I'm by no means a floating-point expert, but having investigated this issue I found numerous sources saying that the "magic" numbers 9 and 17, for 32-bit and 64-bit values respectively, are the correct precisions. 9 and 17 are the maximum numbers of decimal digits needed for a round trip. This means that, as far as their floating-point representations are concerned, 0.100000000000000005 and 0.1000000000000000 are the same number.


Please let me know if you want me to elaborate more on this subject.

wokalski commented Dec 8, 2015

A link to the JIRA issue. https://bugs.swift.org/browse/SR-106

wokalski commented Dec 8, 2015

I'm also wondering if I should implement tests for this specific behavior. Similar tests are in /test/1_stdlib/Print.swift, specifically test_FloatingPointPrinting().

@lattner lattner assigned gribozavr and dabrahams and unassigned gribozavr Dec 8, 2015
getaaron commented Dec 8, 2015

The issue is because of the hardcoded 32 here, right? (A Double is 64 bits.) Should/can we replace that number with the correct number based on the type? If not, this solution seems acceptable.

wokalski commented Dec 8, 2015

No, the issue is completely different: 32 bytes is the buffer size for the string. It was just a matter of the format string.

@@ -172,13 +172,13 @@ static uint64_t swift_floatingPointToString(char *Buffer, size_t BufferLength,
extern "C" uint64_t swift_float32ToString(char *Buffer, size_t BufferLength,
float Value) {
return swift_floatingPointToString<float>(Buffer, BufferLength, Value,
"%0.*g");
"%0.9g");
Contributor
By changing the format string, you are invoking undefined behavior when swift_snprintf_l is called within swift_floatingPointToString; the call now passes the wrong number of arguments*. The goal here is broadly sensible, but instead of changing the format string, the more correct fix is to replace numeric_limits::digits10 with numeric_limits::max_digits10. It's worth noting that that's a C++11-ism, but I believe that's fine (someone else please confirm).

*there should have been a compiler warning about this.

Contributor Author

I'm not a C++ guru, so I have no opinion about the latter part, but changing the format string is needed (too): %f defaults to 6 decimal places. The format string I suggested does not take two arguments.

Contributor

C++ punts to C to define format strings. The C standard says the following:

As noted above, a field width, or precision, or both, may be indicated by an asterisk. In this case, an int argument supplies the field width or precision. The arguments specifying field width, or precision, or both, shall appear (in that order) before the argument (if any) to be converted.

So "%0.9g" expects a single value argument, but "%0.*g" expects two: first an int specifying how many digits to print (Precision), and then the value to be printed (Value). This is why simply changing the definition of Precision, without modifying the format string, suffices.

The format string I suggested does not take two arguments.

Correct; that's the problem. The format string takes one argument, but you're passing two:

swift_snprintf_l(Buffer, BufferLength, nullptr, Format, Precision, Value);
                                                        ~~~~~~~~~  ~~~~~                              

Contributor Author

I get it; it was a dumb mistake. I will change it.

Contributor

An easy mistake to make. Thanks for addressing it. @dabrahams once that's resolved, I'm fine with this change.

wokalski commented Dec 8, 2015

@stephentyrone thanks a lot. I fixed the issue you pointed out. I will squash the commits and change the commit message before merging.

wokalski commented Dec 8, 2015

@stephentyrone and what about tests?

@stephentyrone
Yes, tests would be great. It would be easy to add regression tests to ensure that a few values like 1.0000000000000002 round-trip Double -> String -> Double with exact equality. If you have time to do more than that, that's even better.

wokalski commented Dec 9, 2015

@stephentyrone So, as you might have expected, the change breaks stdlib tests defined in Print.swift. The test becomes very unintuitive when you see things like printedIs(asFloat32(1.00001), "1.00001001"), which is true but a bit confusing.

I still maintain, however, that the change is needed, because getting more precise values from description is very valuable, especially when debugging.

I'm wondering how to change the tests. I can see two possible solutions:

  1. Change the numbers in tests so that the test is more intuitive
  2. Replace the existing tests, which assert on how values are printed, with tests verifying that String -> Float -> String and Float -> String -> Float round trips behave correctly for precisions of digits10 and max_digits10 respectively.

@stephentyrone
Obviously longer term we need a richer system for converting between floating-point numbers and strings. However, we don't have that yet, and that's a big feature to design[1].

The current behavior favors String -> FP -> String round trips; your change would favor FP -> String -> FP round trips instead. I think that the latter is more important, so this change makes sense, but it is a behavior change, and some people will be confused no matter what. If folks are onboard for this change (I haven't seen any objections yet), then we should update the tests to conform to the new behavior, probably by simply specifying more digits in the source values so that the tests don't look insane.

[1] for simple print statements, the most intuitive thing would be to change the print behavior to print exactly as many digits as are needed to round-trip the number being printed (this would avoid the problem you're hitting here). However, the C standard library doesn't have that mode, so we'd need to write our own converters, which is a big project. Still, in the long term, this would be a nice thing to do.

@gribozavr
@stephentyrone The current design was an explicit decision, IIRC. We thought it was more intuitive to omit decimal digits that are not guaranteed to be correct, and that if one wants to round-trip FP through serialization, they should do something different (e.g., use a different printing function, a hexadecimal representation, or just the binary representation).

We have been trying to capture the distinction between "user-presentable" and "accurate serialization" in description and debugDescription. Would it make sense to you to keep description as is, but change debugDescription to use max_digits10? Or is always using max_digits10 better in your opinion?

@stephentyrone
@gribozavr I thought that might be the case. As I hinted, it's not at all cut-and-dried. To provide some context for everyone, the trade-off boils down to:

  • If we use max_digits10, then all floating-point values round trip T -> String -> T exactly, but values like 3.2 get printed as "3.2000000000000002", which is annoying and potentially confusing.[1]
  • If we instead use digits10, then all strings of up to digits10 decimal digits round-trip String -> T -> String exactly (which as a corollary means that "3.2" prints as "3.2"), but (a) different floating-point values print as the same decimal string and (b) many floating-point values do not round trip T -> String -> T correctly.

Certainly debugDescription should use max_digits10. That much is clear: debug descriptions should accurately reflect the data (personally, as a numerical programmer, I would prefer they use the hexadecimal floating-point format, which is exact, but I recognize that most people won't know what to do with it).

As for description, I favor max_digits10 there too, because the risk (programmer / user confusion) is less than the alternative (data loss). (There is also the risk of data loss with the first option, if someone decided to try to move string data around as doubles, but that's such a comically bad idea that I'm mostly willing to discount it[2].) It's also easier to fix programmer / user confusion via education about floating-point. I can see the argument the other way too, however.

Long-term, we should make description use as many digits as are necessary to round-trip, and no more (so 3.2 prints as "3.2"), but that's a much more invasive code change, outside the scope of this pull request. We could fake it by making multiple calls to snprintf(), but that seems non-ideal too.

[1] On the other hand, it hints that maybe 3.2 isn't really "3.2", so maybe that's not so bad.
[2] insert aside about people storing numeric string data as doubles in JavaScript here.

@gribozavr
@stephentyrone I think I'm convinced! I want to know what @dabrahams thinks about this change.

I'm concerned about taking this change in Swift 2.2 (this might be a significant breaking change for those apps that use this API to display floating point numbers in UI), but it should be fine in Swift 3.0. Maybe we also need to provide a replacement API that allows one to specify the precision, so that one does not need to fall back to format strings?

@stephentyrone
Agreed on all points: I want to hear Dave's take, it makes sense to keep out of 2.2, we need an easy-to-use "format nicely for display", as well as finer-grained control, and we need to document all of it better so confused people can figure out what to do.

Post-2.2, a change along these lines seems like a good starting point.

@gribozavr
But we can probably take a change in Swift 2.2 to debugDescription only.

@@ -140,12 +140,21 @@ static int swift_snprintf_l(char *Str, size_t StrSize, locale_t Locale,
#endif

template <typename T>
static int swift_floatingPointToStringPrecision(bool Debug) {
Contributor Author

As far as I understand the convention, functions are defined before being called(?)

Contributor

Yes, we try to avoid forward declarations if we can just reorder functions.

Contributor

It's not totally obvious to me that this warrants a separate function. Do we expect this to gain more logic and become more complex in the future? If not, it seems clearer to me to just fold it into the caller.

Contributor Author

It looks to me like a good alternative to code like this:

  int Precision = std::numeric_limits<T>::digits10;
  if (Debug) {
    Precision = std::numeric_limits<T>::max_digits10;
  }

which adds complexity for the reader; it would add a fifth conditional to the function body.

Contributor

Personally, I find that much clearer, but I definitely don't feel strongly enough to argue for it. If you're happy with it as is, great.

Contributor

If you want to be extra clear:

  int Precision;
  if (Debug) {
    Precision = std::numeric_limits<T>::max_digits10;
  } else {
    Precision = std::numeric_limits<T>::digits10;
  }

@wokalski
@stephentyrone I made the change suggested in the conversation to speed up the process.

In fact, I started working on a fix for this issue because of a misleading value shown in the debugger. I'm not too opinionated on this topic, but from the user's perspective, digits10 for description and max_digits10 for debugDescription are better than the current behavior, IMO.

}
}


Contributor

Extra newline.

@gribozavr
@wczekalski This patch would probably affect tests (did you run them? any fallout?). If it does not, then our tests are not good enough, and more need to be written :)

@wokalski
@gribozavr There are only two tests failing, and I will add some for debugDescription to the Print.swift tests.

One of the existing tests fails because Array.description calls debugDescription on its contents. Is that desired behavior? I'd expect Array.description to call the respective method on its contents, and the same for Array.debugDescription.

Since both Array.debugDescription and Array.description call the same method, Array._makeDescription, which takes a Bool parameter, the behavior should not be too hard to change.

@wokalski wokalski force-pushed the bugfix/float-string-precision branch 2 times, most recently from 6d41d93 to 65408c8 on December 14, 2015 18:21
debugDescription prints numbers with greater precision
debugDescription of all floating-point numbers shows the number with greater precision, so the tests had to be changed.
They didn't fail if expected2 was nil (nil was the default value) but the actual value was different from expected1.
@wokalski wokalski force-pushed the bugfix/float-string-precision branch from 65408c8 to 575300e on December 14, 2015 18:25
@wokalski
Copy link
Contributor Author

@gribozavr I changed the formatting, fixed the tests, and added some. I also cleaned up the history.

@gribozavr
Copy link
Contributor

The tests pass on Linux.

@gribozavr
Copy link
Contributor

@wczekalski The change to debugDescription LGTM, thanks!

@dabrahams We are still interested to know what you think about making the same change for description.

gribozavr added a commit that referenced this pull request Dec 15, 2015
Fix precision of float to string to max significant decimals
@gribozavr gribozavr merged commit 10a8f4e into swiftlang:master Dec 15, 2015
@stephentyrone
Copy link
Contributor

Thanks for following through on this, @wczekalski !

frootloops added a commit to frootloops/swift that referenced this pull request Dec 24, 2015
zwang commented Feb 5, 2016

Hi guys, thanks for the awesome work. Just wondering if this is going to fix this issue too. Thank you.

// Tested in Xcode 7.2 Swift 2.1 Playground
// Tested in Xcode 7.2 Swift 2.1 Playground
let v: Double = 2.6090288509851067  // 16 significant digits; print(v) only shows up to 15
let dictionary = ["test": v]   // ["test": 2.609028850985107]  <-- Playground shows only 15 digits, rounded

@gribozavr
Copy link
Contributor

@zwang Playground display style is not controlled by the standard library, please file a radar.

zwang commented Feb 6, 2016

@gribozavr The same issue exists for print() too; I just used the Playground as an example to show it.

I will file a radar for playground issue.

Thank you.

wokalski commented Feb 6, 2016

@zwang I'm no numerical expert, but having worked on this one I think it works as expected, i.e. we print 17 digits, which is enough for a float -> text -> float round trip (google max_digits10 for more info).
Also read this comment (it's in this PR). I think the issue you encountered will be addressed in the future.

wokalski commented Feb 6, 2016

@zwang The particular issue you are seeing is that print() invokes description on the argument, not debugDescription. This issue is also addressed in the comments above. Sorry for the confusion!

tbkka commented May 9, 2017

Both description and debugDescription should be accurate (in the sense explained in Steele & White's classic 1990 paper). Having only debugDescription be accurate is a very strange state of affairs.

@wokalski wokalski deleted the bugfix/float-string-precision branch May 9, 2017 20:45
freak4pc pushed a commit to freak4pc/swift that referenced this pull request Sep 28, 2022
Re-enable SwifterSwift-watchOS since it's no longer failing