# Introducing `mozilla/FloatingPoint.h`: methods for floating point types and values

Jeff @ 19:00

The latest addition to the Mozilla Framework Based on Templates (mfbt) is `mozilla/FloatingPoint.h`. This header implements various floating point functionality.

## Functionality overview

`mozilla/FloatingPoint.h` currently implements the following functionality, all centered around working with double-precision floating point numbers. (There’s no single-precision support only because Mozilla seemingly doesn’t need it. The only code I can find that does this sort of thing for single-precision numbers is `nsCoord.h`, and that only barely. We can add single-precision equivalents when we need them.)

`MOZ_DOUBLE_IS_NaN(double d)`
Determines whether a value is `NaN` (not a number).
`MOZ_DOUBLE_IS_INFINITE(double d)`
Determines whether a value is positive or negative infinity.
`MOZ_DOUBLE_IS_FINITE(double d)`
Determines whether a number is finite — that is, not `NaN` or positive or negative infinity.
`MOZ_DOUBLE_IS_NEGATIVE(double d)`
Determines whether a number is negative. This is useful because `d < 0` does not answer this question! There are two zero values, `+0` and `-0`, and IEEE-754 requires that `(-0 < 0)` be false. (There are good reasons for this, but this isn’t the place to get into them. If you haven’t read it, read “What Every Computer Scientist Should Know About Floating-Point Arithmetic” right now. It probably gives the answer, and much much more knowledge as well.) This method properly distinguishes `-0` as being negative.
`MOZ_DOUBLE_IS_NEGATIVE_ZERO(double d)`
Determines whether a number is `-0`.
`MOZ_DOUBLE_EXPONENT(double d)`
Returns the exponent portion of the number. Floating point numbers are represented as a sign bit `s`, a binary fraction `b_{0..p}`, and an exponent `E`. The represented number, then, is `(-1)^s × b_{0..p} × 2^E`. This method returns the number `E` for a floating point number.
`MOZ_DOUBLE_POSITIVE_INFINITY()`
Returns positive infinity.
`MOZ_DOUBLE_NEGATIVE_INFINITY()`
Returns negative infinity.
`MOZ_DOUBLE_SPECIFIC_NaN(int signbit, uint64_t significand)`
Computes a specific `NaN` value, with a bit pattern specified by the provided parameters. The bit layout IEEE-754 specifies for floating point formats interestingly requires that multiple bit patterns be treated as `NaN` values. This method allows the user to create such custom `NaN` values if they need to. (99% of code should never, ever touch this method. Instead, most code should use…)
`MOZ_DOUBLE_NaN()`
Computes an unspecified `NaN` value. If you need a `NaN` value and you don’t know that you need a specific `NaN`, use this method instead to get one.
`MOZ_DOUBLE_MIN_VALUE()`
Returns the smallest non-zero positive double value.
`MOZ_DOUBLE_IS_INT32(double d, int32_t* i)`
Determines if the provided number is a signed 32-bit integer. (`-0` doesn’t count as such; `+0`, the “normal” zero value, does.) If it is, `*i` will be set to that value when the method returns.
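To make the classification entries above more concrete, here is a minimal sketch of how such predicates can be written against the IEEE-754 double bit layout. Every name in it (`ExampleBitsOf`, `ExampleIsNegative`, and so on) is hypothetical and invented for illustration; this is not the code in `mozilla/FloatingPoint.h`, and it assumes a 64-bit IEEE-754 `double`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical constants and helpers for illustration only; not the mfbt API.
const uint64_t kExampleSignBit = static_cast<uint64_t>(1) << 63;
const uint64_t kExampleExponentBits = static_cast<uint64_t>(0x7ff) << 52;
const uint64_t kExampleSignificandBits = (static_cast<uint64_t>(1) << 52) - 1;
const int kExampleExponentBias = 1023;

// Copy the double's bits into an integer (assumes 64-bit IEEE-754 doubles).
inline uint64_t ExampleBitsOf(double d) {
  static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");
  uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  return bits;
}

// NaN: all exponent bits set and at least one significand bit set.
inline bool ExampleIsNaN(double d) {
  uint64_t bits = ExampleBitsOf(d);
  return (bits & kExampleExponentBits) == kExampleExponentBits &&
         (bits & kExampleSignificandBits) != 0;
}

// Infinity: all exponent bits set, significand zero, either sign.
inline bool ExampleIsInfinite(double d) {
  return (ExampleBitsOf(d) & ~kExampleSignBit) == kExampleExponentBits;
}

// Finite: the exponent bits are not all set.
inline bool ExampleIsFinite(double d) {
  return (ExampleBitsOf(d) & kExampleExponentBits) != kExampleExponentBits;
}

// Negative: the sign bit is set. Unlike `d < 0`, this is true for -0.
// (For a NaN this sketch simply reports the sign bit.)
inline bool ExampleIsNegative(double d) {
  return (ExampleBitsOf(d) & kExampleSignBit) != 0;
}

// Negative zero: only the sign bit is set.
inline bool ExampleIsNegativeZero(double d) {
  return ExampleBitsOf(d) == kExampleSignBit;
}

// Unbiased exponent E; in this sketch only meaningful for normal numbers.
inline int ExampleExponent(double d) {
  uint64_t biased = (ExampleBitsOf(d) & kExampleExponentBits) >> 52;
  return static_cast<int>(biased) - kExampleExponentBias;
}
```

With these definitions, `-0.0 < 0` is false while `ExampleIsNegative(-0.0)` is true, and `ExampleExponent(8.0)` is 3 because 8 is 1.0 × 2^3.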

(There’s one more method in the header, currently, that used to have users. Sometime in the last couple months, however, that method’s users all disappeared, and I didn’t notice it when rebasing. Thus I’ll be removing it shortly, and I haven’t mentioned it here.)
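The value-producing entries in the list above (`MOZ_DOUBLE_SPECIFIC_NaN`, `MOZ_DOUBLE_MIN_VALUE`, `MOZ_DOUBLE_IS_INT32`) can be sketched the same way. Again, the names below are hypothetical and the code is an illustration under the assumption of 64-bit IEEE-754 doubles, not the header’s actual implementation.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: reinterpret a 64-bit pattern as a double.
inline double ExampleDoubleFromBits(uint64_t bits) {
  static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");
  double d;
  std::memcpy(&d, &bits, sizeof(d));
  return d;
}

// Build a NaN with a chosen sign and significand. The caller is assumed to
// pass a nonzero significand that fits in 52 bits; a zero significand would
// produce an infinity instead of a NaN.
inline double ExampleSpecificNaN(int signbit, uint64_t significand) {
  const uint64_t significandMask = (static_cast<uint64_t>(1) << 52) - 1;
  uint64_t bits = (static_cast<uint64_t>(signbit ? 1 : 0) << 63) |
                  (static_cast<uint64_t>(0x7ff) << 52) |
                  (significand & significandMask);
  return ExampleDoubleFromBits(bits);
}

// Smallest non-zero positive double: only the lowest significand bit set.
inline double ExampleMinValue() {
  return ExampleDoubleFromBits(1);
}

// True if d is exactly a signed 32-bit integer other than -0; stores it in *out.
inline bool ExampleIsInt32(double d, int32_t* out) {
  // Range check first: converting an out-of-range double (or NaN) to int32_t
  // is undefined behavior. The negated form also rejects NaN.
  if (!(d >= -2147483648.0 && d <= 2147483647.0)) {
    return false;
  }
  int32_t i = static_cast<int32_t>(d);
  if (static_cast<double>(i) != d) {
    return false;  // d had a fractional part
  }
  // -0 == 0 compares true, so detect -0 by its bit pattern.
  uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  if (bits == (static_cast<uint64_t>(1) << 63)) {
    return false;
  }
  *out = i;
  return true;
}
```

In this sketch `ExampleIsInt32(42.0, &i)` succeeds and stores 42, while `-0.0`, `1.5`, and `2147483648.0` are all rejected.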

## Why add some of these methods? Aren’t `isinf`, `isnan`, and so on good enough?

There are standard methods implementing some (but not all) of this functionality. In the best of all possible worlds we could simply use `isnan` and other such methods directly. In practice we’ve encountered a number of problems.

First, Microsoft’s compilers gratuitously Think Different and don’t expose `isnan` and friends, so on those platforms you’d have to use `_isnan` instead. (`std::isnan` isn’t usable there because some of our code still must work as both C and C++.) Obviously, we don’t want to `#ifdef` every place we need to use the method.
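As an illustration of this first point only, here is a sketch of the kind of wrapper that keeps the `#ifdef` in a single place. The name `ExamplePortableIsNaN` is hypothetical, and the sketch is C++-only for brevity (so it may use `std::isnan`). It is not what the header does; for the reasons given next, forwarding to the library functions turned out not to be reliable enough.

```cpp
#include <cmath>
#include <limits>
#if defined(_MSC_VER)
#  include <float.h>  // declares _isnan on Microsoft's compilers
#endif

// Hypothetical wrapper: callers use one name, and the platform difference
// lives here instead of at every call site.
inline bool ExamplePortableIsNaN(double d) {
#if defined(_MSC_VER)
  return _isnan(d) != 0;
#else
  return std::isnan(d);
#endif
}
```

Call sites would then write `ExamplePortableIsNaN(x)` with no conditional compilation of their own.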

Second, we’ve found various compilers have bugs when using either the standard methods or Microsoft’s bogo-named methods. Most commonly this bustage occurs with PGO builds; interestingly, both MSVC and gcc have problems here, despite their optimizers obviously not sharing any code.

Third, we’ve found even some obvious bitwise algorithms trigger PGO build failures, again on entirely different compilers. (You can’t win.)

Basically, then, we can’t use the standard methods, we can’t use some bitwise methods, and whatever we do, we have to be really careful to make sure we don’t break anything. Hopefully this header will satisfy those requirements.

The header is located at `mfbt/FloatingPoint.h` in the source tree. However, per standard mfbt practice, you should use `#include "mozilla/FloatingPoint.h"` to include it. Knock yourself out using it.

1. Why such ugly names? i.e. why make them look like macros when they are not?

Comment by Jeff Muizelaar — 19.04.12 @ 21:16

2. Probably because I didn’t think all that hard about the names, and because my immediate goal was getting it all out of SpiderMonkey (which had basically the same methods and names) and consolidating it with similar code elsewhere in Gecko. (Also I’m primarily busy with other stuff, and this only happened after some timely prodding.) If you want to file a bug to paint the bikeshed a different color, that’s fine with me.

Comment by Jeff — 19.04.12 @ 21:40

3. Just a naive question: why not just implement IS_NAN as:

`return x != x;`

?

Comment by Benoit Jacob — 20.04.12 @ 06:22

4. Various compilers miscompile it sometimes (I think PGO may again be relevant here). Arguably, too, a named method is clearer.

Comment by Jeff — 20.04.12 @ 11:21

5. “we’ve found even some obvious bitwise algorithms trigger PGO build failures”

How so? Are you saying that valid C/C++ code causes the compiler to error out or to generate incorrect code with PGO enabled?

Comment by Adam Rosenfield — 24.04.12 @ 09:10

6. Yes, valid code (at least, to the extent that you assume punning a `double`’s bits into a `uint64_t` works and produces the IEEE-754 pattern) causes the compiler to error or generate invalid code. Bug 653777 comment 12 and comment 13 say that this is essentially the algorithm that MSVC failed to compile:

```
#include <stdint.h>

#define JSDOUBLE_SIGNBIT (((uint64_t) 1) << 63)
#define JSDOUBLE_EXPMASK (((uint64_t) 0x7ff) << 52)

/* Pun a double's bits into a 64-bit integer. */
union jsdpun {
    double d;
    uint64_t u64;
};

static inline int
JSDOUBLE_IS_NaN(double d)
{
    jsdpun pun;
    pun.d = d;
    /* NaN: all exponent bits set and a nonzero significand. */
    return (pun.u64 & ~JSDOUBLE_SIGNBIT) > JSDOUBLE_EXPMASK;
}
```

That looks perfectly fine to me — mask out the sign bit, check that all the exponent bits are set (because it’s greater than the value with only the exponent bits set), check that at least one significand bit is set (greater than the value with none set).

Comment by Jeff — 24.04.12 @ 10:05