In Python, is there any way other than `numpy` and `float('Nan')` to get the special constant `Nan`?


4

I’ve been reading the website of the Underhanded C Contest, in which the goal is to write subtly malicious code that looks normal at first glance. One of the common techniques mentioned is the use of the "not a number" constant, nan, which has some special properties; notably, any ordering or equality comparison with nan (<, <=, >, >=, ==) results in False.

Thinking of a proof of concept in Python, I arrived at the following:

def maior_que_10():
    entrada = input('Digite um número: ')
    try:
        entrada_float = float(entrada)
    except ValueError:
        print('Erro!')
        return
    if entrada_float > 10:
        print('Maior que 10!')
        return
    elif entrada_float <= 10:
        print('Não é maior que 10!')
        return

    print('Inesperado!')

while True:
    maior_que_10()

The function handles invalid numeric input correctly by printing an error, and to an inattentive reader it seems that print('Inesperado!') can never be reached, because the code checks both > 10 and <= 10. Entering "nan", however, executes the last line:

Digite um número: 11
Maior que 10!
Digite um número: 9
Não é maior que 10!
Digite um número: 10
Não é maior que 10!
Digite um número: foobar
Erro!
Digite um número: nan
Inesperado!

Theoretically, in less trivial code, one could hide malicious code after two such if branches. This, however, depends on user input being passed to float.

Is there any operation between variables that generates a nan otherwise?

I thought of division by zero or the square root of a negative number, but both raise exceptions instead of returning nan:

>>> import math
>>> math.sqrt(-1)
ValueError: math domain error
>>> 1/0
ZeroDivisionError: division by zero

2 answers

5

(* On rereading the whole question, I see that I wrote an extensive answer on how to validate decimal-number input, which doesn’t really answer your specific question. Sorry. I’ll keep the answer because it may help beginners who land here because of the question title.)

In newer versions of Python you can do from math import nan, which puts a variable named nan, holding a NaN value, into your namespace.

In older versions (prior to Python 3.5), the recommendation was to put this in your code:

nan = float('nan')  

(or even use the expression float('nan') directly).

Furthermore, when dealing with NaNs it is important to keep in mind that a NaN value is never equal to another one when compared with == (it is not even equal to itself). The best way to find out whether a value is a NaN is to use the isnan function from the math module:

from math import nan, isnan

isnan(nan)

which returns True.
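To make the comparison behaviour described above concrete, here is a minimal sketch using only the standard library. The helper name `is_nan_portable` is hypothetical and introduced only for illustration; it relies on NaN being the only float that is not equal to itself.

```python
import math

nan = float("nan")  # same kind of value as math.nan on Python 3.5+

# Every ordering and equality comparison involving NaN is False...
comparisons = [nan < 10, nan > 10, nan <= 10, nan >= 10, nan == nan]
print(comparisons)  # [False, False, False, False, False]

# ...except "not equal", which is True even against itself.
print(nan != nan)  # True


def is_nan_portable(x):
    """Hypothetical helper: NaN is the only float for which x != x."""
    return x != x


print(math.isnan(nan), is_nan_portable(nan))  # True True
print(math.isnan(1.0), is_nan_portable(1.0))  # False False
```

In real code math.isnan is the clearer choice; the self-comparison trick is shown only because it follows directly from the rule above.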

That said about NaNs, there is more to consider when calling float directly on a string the user types. In particular, infinite values can be expressed with float('inf') (and negative infinity with "-inf"), and numbers in scientific notation are also accepted, where a power-of-ten exponent can be appended to the number after the letter "e":

In [95]: float("1e3")                                                                                    
Out[95]: 1000.0
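As a rough illustration of how permissive float is with text, the sketch below feeds it a handful of strings. The sample lists are my own and not from the question; the underscore form assumes Python 3.6 or later.

```python
import math

# Strings that float() accepts even though they are not plain decimals.
samples = ["nan", "NaN", "-inf", "Infinity", "1e3", "  12.5  ", "1_000"]

parsed = {text: float(text) for text in samples}
for text, value in parsed.items():
    print(repr(text), "->", value)

# Strings that float() rejects with ValueError.
for text in ["foobar", "", "1,5"]:
    try:
        float(text)
    except ValueError:
        print(repr(text), "-> ValueError")
```

So a bare try/except around float filters out garbage such as "foobar", but lets special values, exponents and surrounding whitespace through.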

So, if you really want to limit the input to positive or negative numbers with a decimal point, it’s better to parse them more carefully than simply calling float(entrada).

In general, when we talk about parsing, many people first think of regular expressions. I find regular expressions hard to read and maintain, and people tend to write simplistic expressions that do not cover all the possible inputs.

Checking the typed data with regular expressions

Python is a good language for regular expressions because, luckily, nobody decided to mix them into the language syntax: you call ordinary functions and pass a string with the regular expression you want to compare against your text. The regular-expression module re has several functions, for example to find all occurrences (re.findall) or to replace (re.sub). In this case, we simply want to see whether an expression matches the user input.

In a rush, someone might think "I want to check that the user typed one or more digits, followed by an optional dot, followed by one or more digits". That expression can be written as "[0-9]+\.?[0-9]+". Just looking at it shows that it is not good: what if the user types a "-" sign? What if there is only one digit? (The second part expects at least one more digit after the dot, even though the dot itself is optional.) As a result, while this expression matches "11", "23.2" and "0.1", it does not match "1", "-1", ".23", etc.
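The shortcomings of that hasty expression can be checked directly. This is a small sketch; using re.fullmatch (so that the whole string has to match) is my own choice here, not something stated in the text.

```python
import re

# The hasty pattern from the text: digits, optional dot, digits.
naive = re.compile(r"[0-9]+\.?[0-9]+")

candidates = ["11", "23.2", "0.1", "1", "-1", ".23"]
results = {text: bool(naive.fullmatch(text)) for text in candidates}

for text, ok in results.items():
    print(repr(text), "matches" if ok else "does not match")
```

The first three strings match; "1", "-1" and ".23" do not, even though all of them are perfectly reasonable decimal numbers.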

To cut a long story short, the regular expression that checks a decimal number with an optional sign, at least one digit before the decimal point (or none, if a decimal point follows), and at least one digit after the decimal point when there is one, is:

c = r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$" 

(The Python regexps documentation is here - https://docs.python.org/3/library/re.html )

And you could use it in your code like this:

import re

def maior_que_10():
    entrada = input('Digite um número: ')
    if not re.match(r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$", entrada):
        print('Erro!')
        return
    entrada_float = float(entrada)
    ...
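To see what that pattern accepts and rejects, here is a small self-contained check. The name `DECIMAL_RE` and the sample lists are hypothetical and exist only for this sketch.

```python
import re

# Pattern copied from the answer; DECIMAL_RE is a name made up for this sketch.
DECIMAL_RE = re.compile(r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$")

accepted = ["11", "23.2", "0.1", "1", "-1", ".23", "-.5"]
rejected = ["", "-", ".", "1.", "1..2", "1e3", "nan", "inf", "foobar"]

for text in accepted:
    assert DECIMAL_RE.match(text), text
for text in rejected:
    assert not DECIMAL_RE.match(text), text

print("all sample inputs classified as expected")
```

One detail worth knowing: `$` also matches just before a trailing newline, so "1\n" would pass. That does not matter for strings returned by input, but for text from other sources re.fullmatch may be the safer call.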

Checking the input with Python code

So, in the name of readability, and of knowing what you’re doing, it might be worth using Python’s string methods (split, find, count, isdigit) to write a function that checks whether a string is a well-formed decimal before trying to convert it to float.

You can do something like:

def verifica_decimal(text):
    if not text:  # empty string
        return False
    filtered = text.replace('-', '').replace('.', '')
    if not filtered.isdigit():  # some character is neither a digit, '-' nor '.'
        return False
    if '-' in text[1:]:  # stray sign in the middle of the number
        return False
    if text.count('.') > 1 or text[-1] == '.':  # more than one '.', or '.' in the last position
        return False
    return True

def maior_que_10():
    entrada = input('Digite um número: ')
    if not verifica_decimal(entrada):
        print('Erro!')
        return
    entrada_float = float(entrada)
    ...
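As a sanity check, the string-method function can be compared against the regular expression from the previous section on a set of sample inputs. This is only a sketch: the function is repeated so that the block runs on its own, and `DECIMAL_RE` and the sample list are names and data I made up for illustration.

```python
import re

DECIMAL_RE = re.compile(r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$")


def verifica_decimal(text):
    """Same checks as the function above, repeated so this block runs alone."""
    if not text:
        return False
    filtered = text.replace('-', '').replace('.', '')
    if not filtered.isdigit():
        return False
    if '-' in text[1:]:
        return False
    if text.count('.') > 1 or text[-1] == '.':
        return False
    return True


samples = ["11", "23.2", "0.1", "1", "-1", ".23", "-.5", "", "-", ".",
           "1.", "-1.", "1..2", "1-2", "--1", "1e3", "nan", "inf"]

# Collect every sample on which the two validators give different answers.
disagreements = [t for t in samples
                 if verifica_decimal(t) != bool(DECIMAL_RE.match(t))]
print(disagreements)  # []
```

On these ASCII samples the two approaches agree. They can differ on non-ASCII input, because str.isdigit accepts more characters than the class [0-9].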

4


Is there any operation between variables that generates a nan otherwise?

At the end of the documentation for Python’s math module you can find:

A NaN will not be returned from any of the functions above unless one or more of the input arguments was a NaN;

That is, not usually. But it is possible to arrive at a NaN, for example by subtracting float("inf") from float("inf"); in that case your problem just becomes how to generate the infinity:

In [100]: a = float("inf")                                                                               

In [101]: a - a                                                                                          
Out[101]: nan
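The infinity itself does not need to come from a string. A minimal sketch, assuming the usual IEEE 754 double-precision floats that CPython uses on common platforms: a float multiplication that overflows quietly returns inf instead of raising an exception, and indeterminate forms built from that inf give NaN.

```python
import math

# Overflowing float multiplication silently yields infinity (no exception).
big = 1e308
inf = big * 10
print(inf)  # inf

# Indeterminate forms involving infinity evaluate to NaN.
results = [inf - inf, inf * 0, inf / inf]
print(results)  # [nan, nan, nan]
```

Nothing in this snippet spells out "nan" or "inf", which is the kind of property the question was looking for.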

Another way is to write down the binary representation of a NaN as a bytes object, or as an integer, and use the struct module to convert those bytes back into a floating-point number, which will then be a NaN:

import struct
struct.unpack("d", struct.pack("Q", ((2 ** 12 - 1) << 52) + 1))[0]

Or using ctypes:

In [199]: import ctypes                                                                                  

In [200]: class A(ctypes.Union): 
     ...:     _fields_ = [('i', ctypes.c_uint64), ('f', ctypes.c_double)] 
     ...:                                                                                                

In [202]: A(i=((2 ** 12 - 1) << 52) + 1).f                                                               
Out[202]: nan
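Both tricks work because of how a 64-bit IEEE 754 double is laid out: a NaN is any bit pattern whose 11 exponent bits are all ones and whose 52 fraction bits are not all zero. The sketch below makes that explicit; the helper names and the two mask constants are hypothetical, introduced only for illustration.

```python
import math
import struct

EXPONENT_MASK = 0x7FF0000000000000  # the 11 exponent bits of a double
FRACTION_MASK = 0x000FFFFFFFFFFFFF  # the 52 fraction bits


def float_from_bits(bits):
    """Reinterpret a 64-bit unsigned integer as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def bits_from_float(value):
    """Reinterpret a double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def is_nan_bits(value):
    """NaN: exponent bits all ones and a non-zero fraction."""
    bits = bits_from_float(value)
    return (bits & EXPONENT_MASK) == EXPONENT_MASK and (bits & FRACTION_MASK) != 0


quiet_nan = float_from_bits(0x7FF8000000000000)  # a common quiet-NaN pattern
print(quiet_nan, math.isnan(quiet_nan))  # nan True

# The integer used in the answer: sign bit and exponent bits set, fraction = 1.
from_answer = float_from_bits(((2 ** 12 - 1) << 52) + 1)
print(math.isnan(from_answer))  # True

print(is_nan_bits(float("nan")), is_nan_bits(float("inf")), is_nan_bits(1.5))
# True False False
```

Infinity shares the all-ones exponent but has a zero fraction, which is why is_nan_bits tells the two apart.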
