Ask HN: How is it possible to get -0.0 in a sum?

I'm looking for corner cases where the result is -0.0. As far as I know, the only way to get -0.0 in a sum is

  (-0.0) + (-0.0)

Does someone know any other case in IEEE 754?

Bonus question: What happens in subtractions? I only know

  (-0.0) - (+0.0)

Is there any other case?

10 points | by gus_massa 1 day ago

4 comments

sparkie 1 day ago
It depends on the FP rounding mode. If rounding mode is FE_TOWARDZERO/FE_UPWARD/FE_TONEAREST then the case you gave is the only one I'm aware of. If rounding mode is FE_DOWNWARD (towards negative infinity) then other calculations that result in a zero will give a -0.0.
Here's an example of -1.0f + 1.0f resulting in -0.0: https://godbolt.org/z/5qvqsdh9P
- gus_massa 14 hours ago
  Thanks! [Sorry for the delay.]
  ---
  FYI: For more context, I'm trying to send a PR to Chez Scheme (and indirectly to Racket) https://github.com/cisco/ChezScheme/pull/959 to reduce expressions like
```
  (+ 1.0 (length L))  ;  ==>  (+ 1.0 (fixnum->flonum (length L)))
```
  where the "fixnums" are small integers and "flonums" are double.
  It's fine, unless you have the case
```
  (+ -0.0 (length L))  ;  =wrong=>  (+ -0.0 (fixnum->flonum (length L)))
```
  because if the length is 0, the result becomes 0.0 instead of -0.0.
  There are a few corner cases, in particular because it's possible to have
```
   (+ 1.0 x (length L))
```
  and I really want to avoid the runtime check of (length L) == 0 if possible.
  So I took a look, asked there, and now your opinion confirms what I got so far. My C is not very good, so it's nice to have an example of how the rounding directions are used. Luckily Chez Scheme only uses the default rounding, so it's probably correct to cut a few corners. I'll keep looking for a few days in case there is some surprise.
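  A rough C model of the problem, under default rounding. The function names are hypothetical stand-ins, and `add_exact` only encodes the assumption (taken from the description above) that adding an exact 0 leaves a flonum unchanged, so -0.0 keeps its sign.
```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for fixnum->flonum: plain integer-to-double conversion. */
double fixnum_to_flonum(long n) { return (double)n; }

/* Assumed model of (+ acc n) with an exact integer n:
   an exact 0 leaves acc untouched, so a -0.0 accumulator stays -0.0. */
double add_exact(double acc, long n) {
    return n == 0 ? acc : acc + (double)n;
}

/* Model of the rewritten form (+ acc (fixnum->flonum n)). */
double add_converted(double acc, long n) {
    return acc + fixnum_to_flonum(n);
}

/* True if both forms agree in value and in the sign of zero. */
bool transformation_is_safe(double acc, long n) {
    double a = add_exact(acc, n);
    double b = add_converted(acc, n);
    return a == b && !signbit(a) == !signbit(b);
}
```
  In this model only `transformation_is_safe(-0.0, 0)` is false. With an accumulator like `1.0 + x` the check passes, because under default rounding that sum can only be -0.0 when both operands are -0.0, and 1.0 is not.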
  - sparkie 1 hour ago
    I'm not sure you can avoid the check, but you can avoid a branch.
    An AVX-512 extension has a `vfixupimm` instruction[1] which can adjust special floating point values. You could use this to adjust all zeroes to -0 but leave any non-zeroes untouched. It isn't very obvious how to use though.
```
    vfixupimmsd dst, src, fixup, flag
```
    * The `flag` is for error reporting - we can set it to zero to ignore errors.

    * `dst` and `src` are floating-point values - they can be the same register.

    * The instruction first checks `src` and turns any denormals into zero if the MXCSR.DAZ flag is set.

    * It then categorizes `src` as one of {QNAN, SNAN, ZERO, ONE, NEG_INF, POS_INF, NEG_VALUE, POS_VALUE}

    * `fixup` is an array of 8 nybbles (a 32-bit int), indexed by the category of `src` {QNAN = 0 ... POS_VALUE = 7}

    * The value of each nybble selects what to place into `dst`:
```
    0x0 : dst (unchanged)
    0x1 : src (with denormals as zero if MXCSR.DAZ is set)
    0x2 : QNaN(src)
    0x3 : QNAN_Indefinite
    0x4 : -INF
    0x5 : +INF
    0x6 : src < 0 ? -INF : +INF
    0x7 : -0
    0x8 : +0
    0x9 : -1
    0xA : +1
    0xB : 1/2
    0xC : 90.0
    0xD : PI/2
    0xE : MAX_FLOAT
    0xF : -MAX_FLOAT
```
    You want to set the nybble for categorization ZERO (bits 11..8) to 0x7 (-0) in `fixup`. This would mean you want `fixup` to be equal to `0x00000700`. So usage would be:
```
    double fixnum_to_flonum(int64_t fixnum) {
        __m128d result = { (double)fixnum };
        return _mm_cvtsd_f64(_mm_fixupimm_sd(result, result, (__m128i){ 0x700 }, 0));
    }
```
    Which compiles to just 4 instructions, with no branches:
```
    .FIXUP:
        .long   1792                            # 0x700
        .long   0                               # 0x0
        .long   0                               # 0x0
        .long   0                               # 0x0
    fixnum_to_flonum:
        vcvtsi2sd       xmm0, xmm0, rdi
        vmovq           xmm0, xmm0
        vfixupimmsd     xmm0, xmm0, qword ptr [rip + .FIXUP], 0
        ret
```
    It can be extended to operate on 8 int64->double at a time (__m512d) with little extra cost.
    You could maybe use this optimization where the instruction is available and just stick with a branch version otherwise, or figure out some other way to make it branchless - though I can't think of any other way which would be any faster than a branch.
    [1]:https://www.intel.com/content/www/us/en/docs/intrinsics-guid...
gethly 1 day ago
I would guess that because of how *** * floats are in binary computers, you have something like -0.0000000000000000000000000000000000001 and when you round it you end up with -0.0. The same goes for positive values; you're just not used to writing the + sign before every number, so seeing the minus feels strange.
- dcminter 14 hours ago
  You're answering a question that OP did not ask.
kazinator 1 day ago
What happens if we take the smallest (as in closest to zero) negative subnormal and add it to itself?
- gus_massa 1 day ago
  Copying the example by sparkie, something like this? https://godbolt.org/z/xhdnb9ax3 I get +0.0 if I comment out the round-toward-negative option.