This topic has come up many times on StackOverflow, but I believe this is a new take. Yes, I have read Bruce Dawson's articles and What Every Computer Scientist Should Know About Floating-Point Arithmetic and this nice answer.
As I understand it, on a typical system there are four basic problems when comparing floating-point numbers for equality:
- Floating point calculations are not exact
- Whether
a-b
is "small" depends on the scale ofa
andb
- Whether
a-b
is "small" depends on the type ofa
andb
(e.g. float, double, long double) - Floating point typically has +-infinity, NaN, and denormalized representations, any of which can interfere with a naïve formulation
This answer -- aka. "the Google approach" -- seems to be popular. It does handle all of the tricky cases. And it does scale the comparison very precisely, checking whether two values are within a fixed number of ULPs of each other. Thus, for example, a very large number compares "almost equal" to infinity.
However:
- It is very messy, in my opinion.
- It is not particularly portable, relying heavily on internal representations, using a union to read the bits from a float, etc.
- It only handles single-precision and double-precision IEEE 754 (in particular, no x86 long double)
I want something similar, but using standard C++ and handling long doubles. By "standard", I mean C++03 if possible and C++11 if necessary.
Here is my attempt.
#include
#include
#include
namespace {
// Local version of frexp() that handles infinities specially.
template
T my_frexp(const T num, int *exp)
{
typedef std::numeric_limits limits;
// Treat +-infinity as +-(2^max_exponent).
if (std::abs(num) > limits::max())
{
*exp = limits::max_exponent + 1;
return std::copysign(0.5, num);
}
else return std::frexp(num, exp);
}
}
template
bool almostEqual(const T a, const T b, const unsigned ulps=4)
{
// Handle NaN.
if (std::isnan(a) || std::isnan(b))
return false;
typedef std::numeric_limits limits;
// Handle very small and exactly equal values.
if (std::abs(a-b) <= ulps * limits::denorm_min())
return true;
// frexp() does the wrong thing for zero. But if we get this far
// and either number is zero, then the other is too big, so just
// handle that now.
if (a == 0 || b == 0)
return false;
// Break the numbers into significand and exponent, sorting them by
// exponent.
int min_exp, max_exp;
T min_frac = my_frexp(a, &min_exp);
T max_frac = my_frexp(b, &max_exp);
if (min_exp > max_exp)
{
std::swap(min_frac, max_frac);
std::swap(min_exp, max_exp);
}
// Convert the smaller to the scale of the larger by adjusting its
// significand.
const T scaled_min_frac = std::ldexp(min_frac, min_exp-max_exp);
// Since the significands are now in the same scale, and the larger
// is in the range [0.5, 1), 1 ulp is just epsilon/2.
return std::abs(max_frac-scaled_min_frac) <= ulps * limits::epsilon() / 2;
}
I claim that this code (a) handles all of the relevant cases, (b) does the same thing as the Google implementation for IEEE-754 single- and double-precision, and (c) is perfectly standard C++.
One or more of these claims is almost certainly wrong. I will accept any answer that demonstrates such, preferably with a fix. A good answer should include one or more of:
- Specific inputs differing by more than
ulps
Units in Last Place, but for which this function returns true (the bigger the difference, the better) - Specific inputs differing by up to
ulps
Units in Last Place, but for which this function returns false (the smaller the difference, the better) - Any case(s) I have missed
- Any way in which this code relies on undefined behavior or breaks depending on implementation-defined behavior. (If possible, please cite a relevant spec.)
- Fixes for whatever problem(s) you identify
- Any way to simplify the code without breaking it
I intend to place a non-trivial bounty on this question.
No comments:
Post a Comment