Monday 4 April 2016

c++ - Floating point comparison revisited

This topic has come up many times on StackOverflow, but I believe this is a new take. Yes, I have read Bruce Dawson's articles and What Every Computer Scientist Should Know About Floating-Point Arithmetic and this nice answer.



As I understand it, on a typical system there are four basic problems when comparing floating-point numbers for equality:




  1. Floating point calculations are not exact

  2. Whether a-b is "small" depends on the scale of a and b

  3. Whether a-b is "small" depends on the type of a and b (e.g. float, double, long double)

  4. Floating point typically has +-infinity, NaN, and denormalized representations, any of which can interfere with a naïve formulation




This answer -- aka. "the Google approach" -- seems to be popular. It does handle all of the tricky cases. And it does scale the comparison very precisely, checking whether two values are within a fixed number of ULPs of each other. Thus, for example, a very large number compares "almost equal" to infinity.



However:




  • It is very messy, in my opinion.

  • It is not particularly portable, relying heavily on internal representations, using a union to read the bits from a float, etc.

  • It only handles single-precision and double-precision IEEE 754 (in particular, no x86 long double)




I want something similar, but using standard C++ and handling long doubles. By "standard", I mean C++03 if possible and C++11 if necessary.



Here is my attempt.



#include 
#include
#include


namespace {
// Local version of frexp() that handles infinities specially.
template
T my_frexp(const T num, int *exp)
{
typedef std::numeric_limits limits;

// Treat +-infinity as +-(2^max_exponent).
if (std::abs(num) > limits::max())
{

*exp = limits::max_exponent + 1;
return std::copysign(0.5, num);
}
else return std::frexp(num, exp);
}
}

template
bool almostEqual(const T a, const T b, const unsigned ulps=4)
{

// Handle NaN.
if (std::isnan(a) || std::isnan(b))
return false;

typedef std::numeric_limits limits;

// Handle very small and exactly equal values.
if (std::abs(a-b) <= ulps * limits::denorm_min())
return true;


// frexp() does the wrong thing for zero. But if we get this far
// and either number is zero, then the other is too big, so just
// handle that now.
if (a == 0 || b == 0)
return false;

// Break the numbers into significand and exponent, sorting them by
// exponent.
int min_exp, max_exp;
T min_frac = my_frexp(a, &min_exp);

T max_frac = my_frexp(b, &max_exp);
if (min_exp > max_exp)
{
std::swap(min_frac, max_frac);
std::swap(min_exp, max_exp);
}

// Convert the smaller to the scale of the larger by adjusting its
// significand.
const T scaled_min_frac = std::ldexp(min_frac, min_exp-max_exp);


// Since the significands are now in the same scale, and the larger
// is in the range [0.5, 1), 1 ulp is just epsilon/2.
return std::abs(max_frac-scaled_min_frac) <= ulps * limits::epsilon() / 2;
}


I claim that this code (a) handles all of the relevant cases, (b) does the same thing as the Google implementation for IEEE-754 single- and double-precision, and (c) is perfectly standard C++.



One or more of these claims is almost certainly wrong. I will accept any answer that demonstrates such, preferably with a fix. A good answer should include one or more of:





  • Specific inputs differing by more than ulps Units in Last Place, but for which this function returns true (the bigger the difference, the better)

  • Specific inputs differing by up to ulps Units in Last Place, but for which this function returns false (the smaller the difference, the better)

  • Any case(s) I have missed

  • Any way in which this code relies on undefined behavior or breaks depending on implementation-defined behavior. (If possible, please cite a relevant spec.)

  • Fixes for whatever problem(s) you identify

  • Any way to simplify the code without breaking it




I intend to place a non-trivial bounty on this question.

No comments:

Post a Comment

c++ - Does curly brackets matter for empty constructor?

Those brackets declare an empty, inline constructor. In that case, with them, the constructor does exist, it merely does nothing more than t...