Floating-point precision when moving from i386 to x86_64
I have an application that was developed for Linux x86 32 bits. There are lots of floating-point operations and a lot of tests depending on the results. Now we are porting it to x86_64, but the test results are different in this architecture. We don't want to keep a separate set of results for each architecture.
According to the article An Introduction to GCC - for the GNU compilers gcc and g++ the problem is that GCC in X86_64 assumes fpmath=sse while x86 assumes fpmath=387. The 387 FPU uses 80 bit internal precision for all operations and only convert the result to a given floating-point type (float, double or long double) while SSE uses the type of the operands to determine its internal precision.
I can force -mfpmath=387 when compiling my own code and all my operations work correctly, but whenever I call some library function (sin, cos, atan2, etc.) the results are wrong again. I assume it's because libm was compiled without the fpmath override.
I tried to build libm myself (glibc) using 387 emulation, but it caused a lot of crashes all around (don't know if I did something wrong).
Is there a way to force all code in a process to use the 387 emulation in x86_64? Or maybe some library that returns the same values as libm does on both architectures? Any suggestions?
Regarding the question of "Do you need the 80 bit precision", I have to say that this is not a problem for an individual operation. In this simple case the difference is really small and makes no difference. When compounding a lot of operations, though, the error propagates and the difference in the final result is not so small any more and makes a difference. So I guess I need the 80 bit precision.
I'd say you need to fix your tests. You're generally setting yourself up for disappointment if you assume floating point math to be accurate. Instead of testing for exact equality, test whether it's close enough to the expected result. What you've found isn't a bug, after all, so if your tests report errors, the tests are wrong. ;)
As you've found out, every library you rely on is going to assume SSE floating point, so unless you plan to compile everything manually, now and forever, just so you can set the FP mode to x87, you're better off dealing with the problem now, and just accepting that FP math is not 100% accurate, and will not in general yield the same result on two different platforms. (I believe AMD CPU's yield slightly different results in x87 math as well).
Do you absolutely need 80-bit precision? (If so, there obviously aren't many alternatives, other than to compile everything yourself to use 80-bit FP.)
Otherwise, adjust your tests to perform comparisons and equality tests within some small epsilon. If the difference is smaller than that epsilon, the values are considered equal.
80 bit precision is actually dangerous. The problem is that it is actually preserved as long as the variable is stored in the CPU register. Whenever it is forced out to RAM, it is truncated to the type precision. So you can have a variable actually change its value even though nothing happened to it in the code.
If you want long double precision, use long double for all of your floating point variables, rather than expecting float or double to have extra magic precision. This is really a no-brainer.
SSE floating point and 387 floating point use entirely different instructions, and so there's no way to convince SSE fp instructions to use the 387. Probably the best way to deal with this is resign your test suite to getting slightly different results, and not depend on results being the same to the last bit.