What is "positive zero" and "negative zero" in float and double types?

28

In the answer at this link about how atan2() works, which includes Victor Stafusa's translation of its documentation, there are some excerpts that I highlight below:

(...)

  • If the first argument is positive zero and the second argument is positive, or the first argument is positive and finite and the second argument is positive infinity, then the result is positive zero.

  • If the first argument is negative zero and the second argument is positive, or the first argument is negative and finite and the second argument is positive infinity, then the result is negative zero.

(...)

This explanation of positive zero and negative zero refers to the parameters received by the method cited, which are of the type double.

I always understood that "zero" represents a neutral point in the number system, so I don't understand how something neutral can have a sign.

What do "positive zero" and "negative zero" mean, and why do floating-point types in Java keep a sign on zero?

  • 5

    I think this is because with floating points, no value is exact. So when you think you have zero, you actually have a value that tends to zero.

  • 8

    Today I went to the bank and saw that the Dow Jones had moved by NaN%. I was impressed :)

  • 1

    I'm almost sure that this refers to IEEE 754 and is not specific to Java. There are even two answers on the site about this, but I can't find them.

  • 2

    @Maniero "Hello, Undefined, congratulations on your NaNth birthday!"

2 answers

31


Where float and double are specified

float and double are implemented in accordance with the IEEE 754 standard, which virtually all modern programming languages use for 32-bit and 64-bit floating-point numbers.

The internal representation of float

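A float is represented with 32 bits laid out as follows (screenshot from Wikipedia):

[Figure: float bit layout, from the most significant bit: 1 sign bit, 8 exponent bits, 23 fraction bits]

Note the first bit: it is the sign bit. If it is 0 the number is positive; if it is 1 the number is negative. The value of a float is then as follows: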

(a) (-1)^sign × (1 + fraction/2^23) × 2^(exponent - 127), if the exponent is different from 0 and 255.

(b) (-1)^sign × fraction/2^23 × 2^-126, if the exponent is equal to 0.

(c) ±infinity (with the sign given by the sign bit), if the exponent is equal to 255 and the fraction is equal to 0.

(d) NaN, if the exponent is equal to 255 and the fraction is different from 0.

The values of equation (a) are called normal floating-point numbers, while those of equation (b) are called subnormal or denormal floating-point numbers. The values of (c) are the infinities and those of (d) are NaN (not-a-number).
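The four cases can be sketched in code as follows. This is only an illustration: the class name FloatDecoder and its methods classify and value are hypothetical helpers, not part of the Java API; they split a 32-bit pattern into its three fields and apply cases (a) to (d).

```java
class FloatDecoder {
    // Names the case (a)-(d) that a 32-bit pattern falls into.
    static String classify(int bits) {
        int sign = bits >>> 31;               // 1 bit
        int exponent = (bits >>> 23) & 0xFF;  // 8 bits
        int fraction = bits & 0x7FFFFF;       // 23 bits
        String s = (sign == 0) ? "+" : "-";
        if (exponent == 255) {
            return (fraction == 0) ? s + "infinity" : "NaN";   // cases (c) and (d)
        }
        if (exponent == 0) {
            return (fraction == 0) ? s + "zero" : s + "subnormal"; // case (b)
        }
        return s + "normal";                                    // case (a)
    }

    // Evaluates the same formulas in double arithmetic.
    static double value(int bits) {
        int sign = bits >>> 31;
        int exponent = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;
        double signFactor = (sign == 0) ? 1.0 : -1.0;
        double twoTo23 = 8388608.0; // 2^23
        if (exponent == 255) {
            return (fraction == 0) ? signFactor * Double.POSITIVE_INFINITY : Double.NaN;
        }
        if (exponent == 0) {
            return signFactor * (fraction / twoTo23) * Math.pow(2, -126);
        }
        return signFactor * (1 + fraction / twoTo23) * Math.pow(2, exponent - 127);
    }

    public static void main(String[] args) {
        System.out.println(classify(0x00000000)); // +zero
        System.out.println(classify(0x80000000)); // -zero
        System.out.println(classify(0x00000005)); // +subnormal
        System.out.println(classify(0x3F800000)); // +normal
        System.out.println(classify(0x7F800000)); // +infinity
        System.out.println(classify(0xFF800001)); // NaN
        System.out.println(value(0x3F800000));    // 1.0
    }
}
```

Both zeros fall under case (b) with a zero fraction; only the sign bit tells them apart.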

Let's look at what formula (a) multiplies. There are three different terms:

  • The sign part will be 1 or -1.

  • The 2^(exponent - 127) part will always be a power of 2.

  • In the 1 + fraction/2^23 part, fraction has 23 bits, so it is between 0 and 2^23-1. That term will therefore have a value x such that 1<=x<2.

In case (a), none of the multiplied terms can be zero, so it is not possible to get a zero value here.

Equation (b) is similar to (a). The exponent is fixed at the smallest value possible in case (a), 2^-126, which is why -126 is used rather than 0 - 127 = -127. The values differ, however, because the 1 is no longer added to the second term. The second term in case (b) is therefore always a value x such that 0<=x<1.

Thus, when the exponent is zero and the fraction is also zero, equation (b) yields zero. But the sign bit is still there, which gives two possible representations of the value zero. There are also many bit patterns that result in NaN (2^24 - 2 of them for float, to be exact: 2^23 - 1 nonzero fractions for each of the two signs).

Here is a test to show the two zeroes:

class Teste {
    public static void main(String[] args) {
        float a = Float.intBitsToFloat(0);
        float b = Float.intBitsToFloat(0b1000_0000__0000_0000__0000_0000__0000_0000);
        System.out.println(a);
        System.out.println(b);
    }
}

Here is the output:

0.0
-0.0

See here working on ideone.

The internal representation of double

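There is also double, which uses a similar concept but with more bits and different values:

[Figure: double bit layout, from the most significant bit: 1 sign bit, 11 exponent bits, 52 fraction bits]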

The equations for double are these:

(a) (-1)^sign × (1 + fraction/2^52) × 2^(exponent - 1023), if the exponent is different from 0 and 2047.

(b) (-1)^sign × fraction/2^52 × 2^-1022, if the exponent is equal to 0.

(c) ±infinity (with the sign given by the sign bit), if the exponent is equal to 2047 and the fraction is equal to 0.

(d) NaN, if the exponent is equal to 2047 and the fraction is different from 0.
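The same two zeros exist for double. As a small sketch (the class DoubleZeroBits and the helper hexBits are illustrative names, not library code), Double.doubleToRawLongBits exposes the 64-bit pattern, in which only the top bit differs:

```java
class DoubleZeroBits {
    // Formats the raw 64-bit pattern of a double as 16 hex digits.
    static String hexBits(double d) {
        return String.format("0x%016X", Double.doubleToRawLongBits(d));
    }

    public static void main(String[] args) {
        System.out.println(hexBits(0.0));   // 0x0000000000000000
        System.out.println(hexBits(-0.0));  // 0x8000000000000000
        // Building negative zero directly from its bit pattern:
        System.out.println(Double.longBitsToDouble(0x8000000000000000L)); // -0.0
    }
}
```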

Why are there +0.0 and -0.0?

You may wonder why they did such a thing. The answer is that the sign is preserved when a floating-point calculation reaches a value so small that it loses all its significant bits. This distinguishes a negative value that was rounded to zero from a positive value that was rounded to zero.

For example:

class Teste2 {
    public static void main(String[] args) {
        // A tiny positive subnormal: 5 x 2^-149.
        float a = Float.intBitsToFloat(5);
        // Infinity has all 8 exponent bits set and a zero fraction.
        float pinf = Float.intBitsToFloat(0b0111_1111__1000_0000__0000_0000__0000_0000);
        float ninf = Float.intBitsToFloat(0b1111_1111__1000_0000__0000_0000__0000_0000);
        float b = a / pinf; // positive / +infinity
        float c = a / ninf; // positive / -infinity
        System.out.println(b);
        System.out.println(c);
    }
}

Here is the output:

0.0
-0.0

See here working on ideone.
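The same effect appears with finite operands only, when a result is too small to be represented and underflows. A minimal sketch, assuming nothing beyond the standard Float constants (the class UnderflowSign and the method quarter are hypothetical names used for illustration):

```java
class UnderflowSign {
    // Divides by 4; for the smallest subnormal this underflows to a zero
    // that keeps the sign of the operand.
    static float quarter(float x) {
        return x / 4;
    }

    public static void main(String[] args) {
        float tiny = Float.MIN_VALUE;            // 2^-149, smallest positive float
        System.out.println(quarter(tiny));       // 0.0
        System.out.println(quarter(-tiny));      // -0.0
        System.out.println(1e-30f * -1e-30f);    // -0.0 (about -1e-60, far below 2^-149)
    }
}
```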

How are the values +0.0, -0.0 and NaN compared?

Finally, +0.0 and -0.0 are equal values, whereas a NaN value is never equal to anything (not even to itself):

class Teste3 {
    public static void main(String[] args) {
        float zeroPositivo = Float.intBitsToFloat(0);
        float zeroNegativo = Float.intBitsToFloat(0b1000_0000__0000_0000__0000_0000__0000_0000);
        System.out.println(zeroPositivo == zeroNegativo);
        System.out.println(zeroPositivo != zeroNegativo);
        float nan1 = Float.intBitsToFloat(0b1111_1111__1000_0000__0000_0000__0000_0001);
        float nan2 = Float.intBitsToFloat(0b1111_1111__1000_0000__0000_0000__0000_0010);
        System.out.println(nan1 == nan2);
        System.out.println(nan1 == nan1);
        System.out.println(nan1 != nan2);
        System.out.println(nan1 != nan1);
    }
}

Here is the output:

true
false
false
false
true
true

See here working on ideone.

This output means that == considers +0.0 and -0.0 to have the same value, regardless of the fact that their bits differ.

On the other hand, when one of the values compared with == is some kind of NaN, the result is always false, even if a NaN value is being compared to itself.

The == operator always returns the opposite of what != returns. This means that a NaN value is always different from itself and from all other NaNs.
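Note that this describes the == and != operators only. As a brief sketch (the class name and the helper sameBits are illustrative), the wrapper methods Float.equals and Float.compare behave differently: they tell the two zeros apart and treat NaN as equal to itself, and Float.isNaN is the usual way to test for NaN.

```java
class ZeroAndNaNComparisons {
    // True when both floats have the same bit pattern (all NaNs collapse to one).
    static boolean sameBits(float x, float y) {
        return Float.floatToIntBits(x) == Float.floatToIntBits(y);
    }

    public static void main(String[] args) {
        float pz = 0.0f, nz = -0.0f, nan = Float.NaN;
        System.out.println(pz == nz);                       // true
        System.out.println(sameBits(pz, nz));               // false
        System.out.println(Float.valueOf(pz).equals(nz));   // false
        System.out.println(Float.compare(pz, nz) > 0);      // true: +0.0 sorts after -0.0
        System.out.println(nan == nan);                     // false
        System.out.println(Float.valueOf(nan).equals(nan)); // true
        System.out.println(Float.isNaN(nan));               // true
    }
}
```

This difference matters when floats are used as keys in collections or are sorted, since those rely on equals and compare rather than on ==.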

10

The Java programming language uses the IEEE 754 standard for floating-point arithmetic, which defines -0.0 and when it is used.

The smallest representable number has no 1 bit in its subnormal significand and is called positive or negative zero, as determined by the sign. It actually represents a rounding to zero of numbers in the interval between zero and the smallest representable non-zero number of the same sign, which is why it has a sign and why its reciprocal, +Infinity or -Infinity, also has a sign.

You can work around specific problems by adding 0.0, because the sum of -0.0 and +0.0 is +0.0

for example:

Double.toString(value + 0.0);
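A small sketch of that workaround, with normalize as a hypothetical helper name: adding +0.0 turns -0.0 into +0.0 and leaves every other value unchanged.

```java
class NormalizeZero {
    // -0.0 + 0.0 is +0.0 under Java's default rounding; other values are unaffected.
    static double normalize(double value) {
        return value + 0.0;
    }

    public static void main(String[] args) {
        double negativeZero = -0.0;
        System.out.println(Double.toString(negativeZero));            // -0.0
        System.out.println(Double.toString(normalize(negativeZero))); // 0.0
        System.out.println(Double.toString(normalize(-1.5)));         // -1.5
    }
}
```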

See more about floating number complexity

Briefly

"-0.0" is produced when a floating-point operation results in a negative number so close to 0 that it cannot be represented normally. Likewise, "+0.0" is produced when a floating-point operation results in a positive number so close to 0 that it cannot be represented normally.

In applications this can be a source of errors if the developer does not take into account that, while the two zero representations behave as equal under numerical comparisons, they can produce different results in some operations.
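A few such operations, as an illustrative sketch (isNegativeZero is a hypothetical helper, not a library method). Division by zero and Math.atan2, the method from the question, both give results that depend on the sign of zero:

```java
class SignedZeroPitfalls {
    // Detects -0.0 through the sign of its reciprocal.
    static boolean isNegativeZero(double d) {
        return d == 0.0 && 1.0 / d < 0;
    }

    public static void main(String[] args) {
        double pz = 0.0, nz = -0.0;
        System.out.println(pz == nz);              // true
        System.out.println(1.0 / pz);              // Infinity
        System.out.println(1.0 / nz);              // -Infinity
        System.out.println(Math.atan2(pz, 1.0));   // 0.0
        System.out.println(Math.atan2(nz, 1.0));   // -0.0
        System.out.println(Math.atan2(pz, -1.0));  // 3.141592653589793
        System.out.println(Math.atan2(nz, -1.0));  // -3.141592653589793
        System.out.println(isNegativeZero(nz));    // true
    }
}
```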
