Re: max<float>(NaN, 42.0f)?

From:

=?ISO-8859-1?Q?Daniel_Kr=FCgler?= <daniel.kruegler@googlemail.com>

Newsgroups:

comp.lang.c++.moderated

Date:

Wed, 26 Jan 2011 12:39:54 CST

Message-ID:

<ihphsv$ads$1@news.eternal-september.org>

On 2011-01-26 09:55, Ulrich Eckhardt wrote:

I've been wondering about how to most efficiently find the maximum of a
sequence of floating point values.

[..]

What I was thinking about was to replace the body of
the loop with this instead:

   float res = numeric_limits<float>::quiet_NaN();
   for(...) {
      res = max(res, *it);
   }
   return res;

What I was wondering was if std::max() was defined in a way that it would
work without relying on undefined or implementation-defined behavior.

I think what we miss here is your definition of "would work", see also
below.

From what I understand, it requires a less-than comparison, so it will probably
look like this:

   float max(float a, float b) {
      return (a< b) ? b : a;
   }

Now, if either a or b is NaN, the comparison is always false so that the
second one is returned. This means the one that could be NaN must come
first.

IMO there is a logic error in this argumentation, or I misunderstand
your intention (the "would work" definition). I understand you that you
want to use the NaN value as a means to convert even an empty loop into
a non-empty one so to always have a defined maximum value (in the empty
loop case this value is NaN, otherwise not). Is that correct? This
means, you need to put the NaN value as the *second* argument, if you
want the first non-NAN value to be returned as part of this comparison,
because max(nan, v) returns nan.

My primary concern is whether the definition *must* work like this,
i.e. if this is guaranteed to work. In any case, I could define a max()
function that does work as I need it to, but I'm curious.

I assume that you are referring here to std::max from <algorithm>. In
this case we have the fundamental requirement that the template
parameter satisfies the LessThanComparable requirement set, which
assumes that the expression

a < b

is supported, and nothing more. The requirement for using this set is
that this operation induces a strict weak ordering among the arguments.
So, for user-defined argument types this means that an implementation is
basically *required* to use exactly this expression. In theory, an
implementation could ensure that for fundamental types ("built-in"
types) a different implementation is invoked which satisfies the
requirements of the algorithm. For std::max, the larger value is
returned and and for equivalent values the first argument as
representative. The critical question here is, whether these
requirements are strong enough to guarantee that for type float the
exact same (or equivalent) expression is used or not, especially in the
presence of NaN operands. IMO you *cannot* rely that your assumptions
holds, because the set {nan-values, non-nan-values} does not satisfy the
strict weak ordering criterion over operator <. As a simple exclusion
criterion we find that the antisymmetry property does not hold:

f(x, y) => !f(y, x)

When r is some floating point value and nan is some NaN value, the
following holds:

(r < nan) == (nan < r)

So given this contract breakage on your side providing a NaN value as
argument is technically undefined behaviour.

Ignoring this detail for a while and thinking further at least in theory
an implementation could replace the above assumed
return statement by either

return (b < a) ? a : b;

or

return (a > b) ? a : b;

for any arithmetic type, which is valid for the sub-domain of the
floating-point types that satisfies the strict weak ordering criterion,
even though I'm not aware of any such existing implementation.[1]

Let me also add that you may still reconsider your choice if you
actively add NaN's into the algorithm: In C99 there exists a much more
adequate library function, namely fmaxf for your scenario: In this case
the specification explicitly describes the behaviour in the sense you
are already assuming.

Either way can still be improved. By introducing explicitly NaN operands
into your calculation, both yet discussed forms will typically set
special floating point status flags (and the program can be configured
to raise a floating-point "exception"). Depending on your environment
requirements this behaviour is unwanted. In this case I recommend to use
the guaranteed "silent" ordering functions (e.g. isgreater or isless
depending on your usage) instead. This won't help you when using
fmax(f), though.

Now while nearly being ready, I just notice the presence of the symbolic
'*it' in your example which I associate with something like an object
'it' of a type satisfying the iterator requirements. If this assumption
is correct and you have at least a forward iterator, I would *strongly*
recommend to ignore the above discussion and to use instead
std::max_element with the implicit ordering predicate (<). This
algorithm has the advantage that the return value will be the
past-the-end value of the iterator arguments for an empty sequence and
can thus be very reasonably used to handle empty sequences as well. In
this case you don't need to rely on special floating-point semantics
with all the uncertainties in regard to buggy implementations.

HTH & Greetings from Bremen,

Daniel Kr?gler

[1]: In theory the difference of two equivalent values is detectable for
the the special floating point values -0.0 and +0.0, because either -0.0
or +0.0 would be returned, so this decision would be questionable. In
normal equality or inequality tests both values compare equal, but e.g.
via IO output or proper usage of copysign we could detect the difference
of these implementations. But negative complicate the situation anyway.
I could not find in C99 a clear statement about what value is returned
in the expression fmax(0.0, -0.0) compared to fmax(-0.0, 0.0).

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]