Why is it not possible to use comparison operators with NaN?

NaN: the global property NaN is a special value meaning Not-A-Number.

The curious thing is that comparison operations do not work with this almost mystical property. It is non-reflexive: NaN === NaN is false.

I wrote the runnable JavaScript below just to test the concept, but it could be any programming language.

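// Evaluate the expression stored in each span's data-evaluate attribute
// and display the result next to it.
var spans = document.querySelectorAll('span[data-evaluate]');
for (var i = 0; i < spans.length; i++) {
  var span = spans[i];
  var exp = span.getAttribute('data-evaluate');
  // eval is tolerable here only because the expressions are hard-coded below.
  var result = eval(exp);
  console.log(exp, result);
  span.textContent = 'Expr: ' + exp + ' ------------ :> ' + result;
}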
<span data-evaluate="true"></span><br> <!-- eval control -->
<span data-evaluate="isNaN(NaN)"></span><br><!-- NaN control -->
<span data-evaluate="NaN === NaN"></span><br>
<span data-evaluate="parseInt('A') === NaN"></span><br>
<span data-evaluate="parseInt('1') === NaN"></span><br>
<span data-evaluate="parseInt('B') === parseInt('B')"></span>

If your code performs arithmetic, sooner or later NaN will show up naturally, either from a type conversion (string to int, for example) or from adding undefined values.

What is the concept, the math behind this anomaly, and what makes NaN so confusing that NaN === NaN returns false, although intuitively it should return true?
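As a small illustration of how NaN can appear without anyone asking for it, here is a sketch in JavaScript. The variable names and the `item` object are made up for the example.

```javascript
// Common ways NaN slips into ordinary arithmetic.
const item = {};                          // hypothetical object with no "price" property

const fromParse = parseInt('A', 10);      // no digits to parse
const fromNumber = Number('12px');        // the whole string must be numeric
const fromUndefined = 1 + undefined;      // undefined converts to NaN
const fromMissingField = item.price * 2;  // item.price is undefined

const results = [fromParse, fromNumber, fromUndefined, fromMissingField];
console.log(results.map((value) => Number.isNaN(value))); // all four are true
```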

3 answers

11


NaN, +inf and -inf are attempts to model exceptions, errors, indeterminate forms and overflows in arithmetic. Floating-point representations (e.g. C's double) have conventions for representing these pseudo-numbers.

Several situations produce NaN, +inf or -inf.

Let's start with the more intuitive case, inf:

5 / 0    = +inf
-5/ 0    = -inf
n + n ...= +inf  if overflow occurred
log(0)   = -inf
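
The same cases can be checked in JavaScript, whose number type is an IEEE 754 double. This is only a sketch; the constant names are chosen for the example.

```javascript
const posInf = 5 / 0;                                   // Infinity
const negInf = -5 / 0;                                  // -Infinity
const overflow = Number.MAX_VALUE + Number.MAX_VALUE;   // Infinity: the sum no longer fits in a double
const logZero = Math.log(0);                            // -Infinity

console.log(posInf, negInf, overflow, logZero);
```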

The NaN cases usually result from operations with inf or from an indeterminate form such as 0/0:

0 / 0     = NaN
5/0+ -5/0 = +inf -inf = NaN
+inf * 0  = NaN
inf / inf = NaN
+inf -inf = NaN
sqrt(-1)  = NaN

NaN + ... = NaN
NaN / ... = NaN
NaN * ... = NaN

5 * +inf  = +inf
-2 * +inf = -inf
5  / inf  = 0
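
A quick way to confirm the table above is to evaluate the expressions in JavaScript. The `nanCases` object is just a container invented for this sketch.

```javascript
// Each expression below is expected to produce NaN.
const nanCases = {
  '0 / 0': 0 / 0,
  '5/0 + -5/0': 5 / 0 + -5 / 0,
  'Infinity * 0': Infinity * 0,
  'Infinity / Infinity': Infinity / Infinity,
  'Infinity - Infinity': Infinity - Infinity,
  'Math.sqrt(-1)': Math.sqrt(-1),
  'NaN + 1': NaN + 1,
  'NaN / 2': NaN / 2,
  'NaN * 0': NaN * 0,
};

for (const [expression, value] of Object.entries(nanCases)) {
  console.log(expression, '=>', value);
}

// Operations with infinity that remain well defined.
const stillDefined = [5 * Infinity, -2 * Infinity, 5 / Infinity];
console.log(stillDefined); // Infinity, -Infinity, 0
```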

The various NaNs are not equal to each other, hence NaN == NaN is false. Mathematically, comparisons involving NaN and inf have no meaningful answer in general (under IEEE 754, each of the following simply evaluates to false):

+inf > +inf      ???
NaN  > NaN       ???
0/0  > sqrt(-1)  ???
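
In JavaScript these expressions are not errors. As a sketch (the array name is made up), each one simply evaluates to false, while infinities of the same sign still compare equal:

```javascript
const comparisons = [
  Infinity > Infinity,
  NaN > NaN,
  0 / 0 > Math.sqrt(-1),
  NaN < NaN,
  NaN == NaN,
];
console.log(comparisons);                  // every entry is false

const sameInfinity = Infinity == Infinity; // true
console.log(sameInfinity);
```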

I suggest reading the data representation manual, where this subject is covered in more depth.

6

NaN is generated when an arithmetic operation produces an undefined or unrepresentable value. Such values do not necessarily indicate overflow conditions. NaN also results from attempting to convert a non-numeric value to a number when no primitive numeric value is available.

This conversion produces arbitrary binary values that prevent arithmetic operations. The obvious expectation would be that NaN === NaN yields a true boolean value, since we look at "NaN" as if it were a constant, but in fact it is not. For example, "A" * "A" and parseInt("blabla") both result in NaN but are completely different in their primitive numeric value.

In floating-point calculations (IEEE 754), NaN is not the same as Infinity, although both are typically treated as special cases in floating-point representations of real numbers as well as in floating-point operations. An invalid operation is also not the same as an arithmetic overflow (which can return an infinity) or an arithmetic underflow (which would return the smallest normal number, a subnormal number or zero).

A comparison with a NaN always returns an unordered result, even when it is compared with itself. Comparison predicates are either signaling or non-signaling; the signaling versions raise the invalid-operation exception for such comparisons. The equality and inequality predicates are non-signaling, so x = x returning false can be used to test whether x is a quiet NaN. The other standard comparison predicates are all signaling if they receive a NaN operand; the standard also provides non-signaling versions of these other predicates.
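
The self-comparison test described above can be sketched in JavaScript as follows. The helper `isNaNBySelfCompare` is a hypothetical name; the built-ins shown after it are standard.

```javascript
// Hypothetical helper: detects NaN using only the fact that NaN is not equal to itself.
function isNaNBySelfCompare(x) {
  return x !== x;
}

console.log(isNaNBySelfCompare(NaN));       // true
console.log(isNaNBySelfCompare(0 / 0));     // true
console.log(isNaNBySelfCompare(Infinity));  // false
console.log(isNaNBySelfCompare(42));        // false

// Built-in alternatives that do treat NaN as "the same value".
console.log(Number.isNaN(NaN));             // true
console.log(Object.is(NaN, NaN));           // true
console.log([NaN].includes(NaN));           // true
console.log([NaN].indexOf(NaN));            // -1, because indexOf compares with ===
```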

In short, NaN does not represent any member of the set of real numbers. Finite and infinite real values produce the same finite or infinite floating-point result regardless of substitutions, whereas NaN prevents arithmetic operations from being carried out on this kind of data.

  • 1

    "confusing binary values preventing arithmetic operations": this makes no sense. Every binary value is a discrete numerical value. There is no way for it to be confusing, nor for arithmetic on it to be impossible or impractical.

  • I changed the term "confusing" to "arbitrary", which makes more sense in this context, since it relates to the arbitrariness in the arithmetic precision of the conversions that generate NaN.

1

The short answer is "Because they defined it that way".

To elaborate and give a bit more grounding, below is an adaptation of an answer on SOen, whose author claims to have been a member of the IEEE 754 committee, which is responsible for the IEEE 754 standard defining how floating-point numbers work (of which NaN is a part).

Besides translating and adapting it, I also merged the text with other sources (from SOen and elsewhere) and some addenda of my own, to give a broader overview of why NaN is not equal to itself.


Adaptation of the SOen answer to the question "What is the justification for all comparisons with NaN returning false?"

First, floating-point numbers are not real numbers, and floating-point arithmetic does not satisfy the axioms of real arithmetic. The Law of Trichotomy (that every real number is either negative, positive, or zero) is not the only property of real arithmetic that fails for floating-point numbers, nor is it the most important. Other examples:

  • Addition is not associative.
  • The distributive law does not apply.
  • There are floating point numbers that do not have inverses.
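
The three failures listed above can be reproduced with a few lines of JavaScript. This is only a sketch: the constant `big` and the other names are chosen for the example, and the cases involving `big` rely on overflow to Infinity.

```javascript
const big = 1e308; // close to the largest finite double

// Addition is not associative (rounding).
const sumLeft = (0.1 + 0.2) + 0.3;
const sumRight = 0.1 + (0.2 + 0.3);
console.log(sumLeft === sumRight);            // false

// Addition is not associative (overflow).
const overflowLeft = (big + big) - big;       // Infinity
const overflowRight = big + (big - big);      // 1e308
console.log(overflowLeft === overflowRight);  // false

// The distributive law does not hold.
const distributed = big * 2 - big * 1;        // Infinity
const factored = big * (2 - 1);               // 1e308
console.log(distributed === factored);        // false

// Infinity has neither an additive nor a multiplicative inverse.
const additive = Infinity + -Infinity;        // NaN, not 0
const multiplicative = Infinity * (1 / Infinity); // NaN, not 1
console.log(additive, multiplicative);
```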

This list could go on for hours... In any case, it is not possible to specify a fixed-size arithmetic type that satisfies all the properties of real arithmetic we know. The IEEE 754 committee had to decide which rules to keep and which to bend or break. The decision was guided by the following principles:

  1. When possible, have the same behavior as real arithmetic.
  2. When not possible, try to make violations predictable and easy to diagnose (or as close as possible).

For example, the predicate (y < x) asks whether y is less than x. If y is NaN, then it is not less than any floating-point value, so the answer is necessarily false, for any value of x.

I said that the Law of Trichotomy does not apply to floating-point values. However, a similar property does apply. Clause 5.11, paragraph 2 of the 754-2008 standard reads as follows:

Four mutually exclusive relationships are possible: "less than", "equal", "greater than" and "unordered". The latter applies when at least one of the operands is NaN. Every NaN shall be considered unordered with respect to everything, including itself.

As much as handling NaN may require extra code, it is usually possible (although not always easy) to structure the code so that NaNs are handled correctly. When that is not possible, some additional code may be required, but that is a small price to pay for the convenience that algebraic closure has brought to floating-point arithmetic.
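
To make the four relations concrete, here is a minimal JavaScript sketch. The function `relation` is a hypothetical helper that assumes both arguments are numbers; it is not part of any standard API.

```javascript
// Hypothetical helper: classifies two numbers into one of the four relations.
function relation(a, b) {
  if (a < b) return 'less than';
  if (a > b) return 'greater than';
  if (a === b) return 'equal';
  return 'unordered'; // reached only when at least one operand is NaN
}

console.log(relation(1, 2));      // less than
console.log(relation(2, 1));      // greater than
console.log(relation(2, 2));      // equal
console.log(relation(NaN, 1));    // unordered
console.log(relation(NaN, NaN));  // unordered

// One practical consequence: negating "<" is not the same as ">=".
const negatedLess = !(NaN < 1);   // true
const greaterOrEqual = NaN >= 1;  // false
console.log(negatedLess, greaterOrEqual);
```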


Many may argue that it would have been more useful to keep the Law of Trichotomy and the reflexive property of equality (which says that "anything is equal to itself"), since defining NaN as different from itself does not seem to preserve any axiom we are familiar with. It is understandable that many sympathize with this idea, but I think it is worth giving a little more context.

My understanding from talking to Professor William Kahan (author of this article and considered the "Father of Floating Point") is that the definition of NaN != NaN is based on two pragmatic considerations:

  • x == y should be equivalent to x - y == 0 whenever possible. Besides being a theorem of real arithmetic, this makes the hardware implementation of comparison more space-efficient, which was of utmost importance when the standard was created. Note, however, that this rule is violated when x and y are both infinity, so it is not a strong reason on its own; it could also have been changed to, for example, (x - y == 0) or (x and y are both NaN).
  • More importantly, there was no isnan() predicate when NaN was formalized in the 8087 arithmetic processor. Programmers needed a convenient and efficient way to detect NaN that did not depend on programming languages providing an operation such as isNaN() (which could take years; remember, it was a different time). On this, Kahan wrote in the cited article:

If there were no way to get rid of NaNs, they would be as useless as Indefinites (a similar concept on Cray computers). As soon as a NaN was encountered, it would be better to stop the computation than to continue for an indefinite time towards an indefinite conclusion. That is why some operations on NaN must return results that are not NaN. Which operations?

It is inevitable that people will disagree about which operations these should be, but that does not give them license to settle the matter by making arbitrary choices. Any real (non-logical) function that produces the same floating-point result for all finite and infinite numerical values of an argument should produce the same result when that argument is NaN.

The exceptions are the predicates x == x and x != x. These are respectively 1 and 0 for every infinite or finite number x, but the reverse if x is NaN. They provide the only simple distinction between NaN and numbers in languages that have neither a constant for NaN nor a predicate such as isNaN(x).

Perhaps this pragmatism was misguided, and the standard could have mandated something like an isnan() predicate. But that would have made it almost impossible to use NaN efficiently and conveniently, as the world would still have had to wait several years for programming languages to adopt it. I don't believe that would have been a reasonable choice.

To put it bluntly: the result of NaN == NaN will not change. It is better to learn to live with it than to complain on the internet (note: the original question has a tone that can be read as a "complaint").
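
The first consideration, that x == y should agree with x - y == 0, can be checked with a short JavaScript sketch. `equalBySubtraction` is a name invented here for illustration.

```javascript
// Hypothetical helper: "equality" defined through subtraction.
function equalBySubtraction(x, y) {
  return x - y === 0;
}

const pairs = [
  [1.5, 1.5],
  [1, 2],
  [Infinity, Infinity],
  [NaN, NaN],
];

// For each pair, record whether direct equality and the subtraction test agree.
const agreement = pairs.map(([x, y]) => (x === y) === equalBySubtraction(x, y));
console.log(agreement); // true, true, false, true
```

With NaN == NaN defined as false, the two tests agree for NaN (both are false). They disagree only for infinities, where Infinity - Infinity is NaN, which is the violation mentioned in the text.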

From here on it is no longer the translation of the SOen answer, but my own conclusion from reading it.


My understanding

From what I understand, a set of decisions was made taking into account the context of the time (hardware/software, mathematical principles, design, etc.). NaN could have been implemented in several ways, for example as an error/exception that stopped the execution of the program. Instead, a "special" value with out-of-the-ordinary behavior was chosen, so that the program can keep running (and it is enough to check whether the result is NaN and then take appropriate action, such as interrupting the algorithm).
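
A minimal JavaScript sketch of that "keep running, check at the end" style follows. The `mean` function and the `readings` data are hypothetical.

```javascript
// Hypothetical example: averaging readings where one entry failed to parse.
function mean(values) {
  let total = 0;
  for (const value of values) {
    total += value; // a single NaN makes the running total NaN
  }
  return total / values.length;
}

const readings = [21.5, 22.0, Number('n/a'), 23.1];
const average = mean(readings);

// The computation never threw; the problem is detected once, at the end.
const message = Number.isNaN(average)
  ? 'At least one reading was invalid'
  : 'Average: ' + average;
console.log(message);
```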

In the end, they decided that NaN == NaN should be false for the reasons explained above. In a way it was an "arbitrary" choice, but at least there was a reasoning behind it and a whole technical basis (and of course they could also have defined that NaN == NaN is true, but the fact is they didn’t and now we have to live with it).

For me, this is the only correct answer to "Why is it so?". NaN is not equal to itself because they decided it would be so. Any other "answer" is actually talking about the reasons that, in my view, potentially led to the decision:

  • "NaN is not a number, so it makes no sense for it to be equal to itself"
  • "NaN is equivalent to an indeterminate value (it has no actual value), so it cannot be equal to anything"
  • "NaN is a special case"
  • etc....

These statements, in my view, explain the reasoning that may have led to the decision rather than the real reason it was taken. The real reason is: considering everything said above (the context of the time, the principle of staying true to real arithmetic to some extent, etc.), they eventually decided that NaN being different from itself was the "best" option.

<speculation>If they had decided that NaN is equal to itself, maybe today we would be discussing whether it should be different, using the same arguments above :-)</speculation>

Other answers and links I have researched always follow this line of explaining that NaN is a "special", "different" value and therefore behaves differently from "normal" numbers. In the end, it is not actually considered a number (hence the name "Not a Number"); it is a kind of placeholder representing an undefined value or state. This answer, for example, argues that the name is poor because it is confusing, and that it should be called "Numerical Exception" or something similar. That would make its distinctive behavior clearer and perhaps less surprising (or not; we now have no way of knowing).

NaN would, in a way, be the equivalent of the mathematical concept of Undefined, when an expression has no associated value. So much so that many mathematical operations considered undefined produce NaN in programming languages that use IEEE 754, such as dividing zero by zero or subtracting one infinity from another (in addition to the others already mentioned in one of the answers). This is also the line of reasoning many use to explain its peculiar behavior: if it has no definite value, it cannot be compared to anything, not even to itself. Again, that explains one of the reasons that may have weighed on the final decision, but it is not "the answer" itself.

Ultimately, the answer is "Because they defined it that way" (well-founded, of course, but still "arbitrary").
