In runtime/asm_386.s's asminit, we try to set the floating-point control word so that all 387 floating-point math happens at float64 precision (a 53-bit significand), with rounding to float64 after every operation.
The Intel manual says:
The precision-control (PC) field (bits 8 and 9 of the x87 FPU control word) determines the precision
(64, 53, or 24 bits) of floating-point calculations made by the x87 FPU (see Table 8-2). The
default precision is double extended precision, which uses the full 64-bit significand available
with the double extended-precision floating-point format of the x87 FPU data registers. This
setting is best suited for most applications, because it allows applications to take full advantage
of the maximum precision available with the x87 FPU data registers.
The double precision and single precision settings reduce the size of the significand to 53 bits
and 24 bits, respectively. These settings are provided to support IEEE Standard 754 and to
provide compatibility with the specifications of certain existing programming languages. Using
these settings nullifies the advantages of the double extended-precision floating-point format's
64-bit significand length. When reduced precision is specified, the rounding of the significand
value clears the unused bits on the right to zeros.
The precision-control bits only affect the results of the following floating-point instructions:
FADD, FADDP, FIADD, FSUB, FSUBP, FISUB, FSUBR, FSUBRP, FISUBR, FMUL,
FMULP, FIMUL, FDIV, FDIVP, FIDIV, FDIVR, FDIVRP, FIDIVR, and FSQRT.
Suppose x=1, y=100, z=1e308 and we compute x/(y*z).
The value y*z is 1e310, which is too big for a float64, so it should compute as +Inf. Then 1/+Inf is 0. So we expect to get zero from x/(y*z), and we do on x86 with SSE and on non-x86 systems. But on x86 using 387 instructions, even with the FPU precision set to float64, we get 1e-310. Clearly the FPU is storing the intermediate y*z result in something more than a float64.
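For reference, here is a minimal sketch of the expected float64 arithmetic, assuming a target where every intermediate result is a true float64 (x86 with SSE, or a non-x86 system). The helper name `expected` is made up for illustration. Splitting the computation into two steps makes the overflow of the product visible before the division happens.

```go
package main

import (
	"fmt"
	"math"
)

// expected is a hypothetical helper (not part of the original program).
// It returns y*z and x/(y*z) as separate float64 values so the
// intermediate overflow can be inspected.
func expected(x, y, z float64) (product, quotient float64) {
	product = y * z
	quotient = x / product
	return product, quotient
}

func main() {
	p, q := expected(1, 100, 1e308)
	fmt.Printf("MaxFloat64=%v\n", math.MaxFloat64)
	fmt.Printf("y*z=%v (overflowed to +Inf: %t)\n", p, math.IsInf(p, 1))
	fmt.Printf("x/(y*z)=%v\n", q)
}
```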
Here is a program; suppose it's in a directory called x:
$ cat x.go
package main

import (
	"fmt"
	"runtime"
)

func fpuControlWord() uint16

func main() {
	runtime.LockOSThread()
	fmt.Printf("Control Word: %#x\n", fpuControlWord())
	x, y, z := vals()
	fmt.Printf("x=%v y=%v z=%v y*z=%v x/(y*z)=%v\n", x, y, z, y*z, x/(y*z))
	fmt.Printf("g(x, y, z)=%v\n", g(x, y, z))
	fmt.Printf("Control Word: %#x\n", fpuControlWord())
}

//go:noinline
func g(x, y, z float64) float64 {
	return x / (y * z)
}

//go:noinline
func vals() (float64, float64, float64) {
	return 1, 100, 1e308
}
$ cat fld.s
#include "go_asm.h"

TEXT ·fpuControlWord(SB),$0-0
	FSTCW ret+0(FP)
	RET
$ GOARCH=386 GO386=387 go build
$ ./x
Control Word: 0x27f
x=1 y=100 z=1e+308 y*z=+Inf x/(y*z)=1e-310
g(x, y, z)=1e-310
Control Word: 0x27f
$
The control word is set correctly.
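To see why 0x27f is the intended setting, the fields can be decoded by hand. The sketch below is illustrative only: the helper names are made up, the precision-control position (bits 8 and 9) comes from the manual excerpt above, and the rounding-control position (bits 10 and 11) and the field encodings are my reading of the Intel manual rather than anything stated in this issue.

```go
package main

import "fmt"

// precisionBits is a hypothetical helper. It decodes the precision-control
// (PC) field, bits 8 and 9, into the significand width in bits.
// It returns 0 for the reserved encoding 01.
func precisionBits(cw uint16) int {
	switch (cw >> 8) & 0x3 {
	case 0:
		return 24 // single precision
	case 2:
		return 53 // double precision
	case 3:
		return 64 // double extended precision
	}
	return 0
}

// roundingMode is a hypothetical helper. It decodes the rounding-control
// (RC) field, assumed here to be bits 10 and 11.
func roundingMode(cw uint16) string {
	switch (cw >> 10) & 0x3 {
	case 0:
		return "nearest"
	case 1:
		return "down"
	case 2:
		return "up"
	}
	return "toward zero"
}

func main() {
	const cw = 0x27f // value printed by the program above
	fmt.Printf("precision: %d-bit significand\n", precisionBits(cw))
	fmt.Printf("rounding: %s\n", roundingMode(cw))
	fmt.Printf("exception masks: %#x\n", cw&0x3f)
}
```

Under this decoding, 0x27f selects a 53-bit significand, while the x87 power-on default 0x37f selects the full 64-bit significand.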
The function g is here to make it easier to see the computation instructions. Here they are from compile -S:
I think this shows:
The multiplication result (product) is stored in a 387 register and then used in the division.
The control word is set such that the product should be rounded to float64.
If the product were rounded, you'd get +Inf.
From the division result, it's clear that the product is stored as 1e310.
As I finish writing this, I realize the problem: the product's mantissa is being rounded to double precision, but its exponent is being left alone. The rounded registers still have extended exponents until they are converted to float64 by transiting memory. Hence the discrepancy.
Lesson: even with the control word change, the 387 does not behave exactly like standard float64 hardware.
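The effect can be imitated on any platform with math/big. This is a rough model, not a description of the hardware: a big.Float with 53 bits of precision rounds the mantissa like a float64 but has a far larger exponent range, standing in for the extended exponent of a 387 register. The helper name `quoWideExponent` is made up for illustration, and the final Float64 call plays the role of the store to memory.

```go
package main

import (
	"fmt"
	"math/big"
)

// quoWideExponent is a hypothetical helper that models x/(y*z) with
// every intermediate mantissa rounded to 53 bits but with no float64
// exponent limit, so the product does not overflow to +Inf.
// It returns the quotient converted to float64 and the wide product.
func quoWideExponent(x, y, z float64) (float64, *big.Float) {
	const prec = 53 // float64 significand width
	bx := new(big.Float).SetPrec(prec).SetFloat64(x)
	by := new(big.Float).SetPrec(prec).SetFloat64(y)
	bz := new(big.Float).SetPrec(prec).SetFloat64(z)

	// Mantissa is rounded to 53 bits; the exponent is left alone.
	prod := new(big.Float).SetPrec(prec).Mul(by, bz)
	quo := new(big.Float).SetPrec(prec).Quo(bx, prod)

	// Converting to float64 models the value transiting memory.
	f, _ := quo.Float64()
	return f, prod
}

func main() {
	q, prod := quoWideExponent(1, 100, 1e308)
	fmt.Printf("y*z (53-bit mantissa, wide exponent) = %s\n", prod.Text('g', 4))
	fmt.Printf("x/(y*z) after conversion to float64 = %v\n", q)
}
```

In this model the product stays finite, so the quotient comes out as a small nonzero subnormal instead of 0, which matches the shape of the result reported above.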
I'm bothering to file this at all so that maybe I can find it the next time I get confused by this.