How does the NumPy prod function work?

3

I’ve always used np.prod as a product operator. For example, to compute the factorial of 5, I simply do:

import numpy as np

np.prod([5,4,3,2,1])

120

Today, while working, I noticed an inconsistency in this approach. I was counting the number of ways 12 people can have their birthdays on different days. Following my understanding of np.prod, I did:

np.prod([365,364,363,362,361,360,359,358,357,356,355,354])

4433906698518895616

This value is wrong. The correct one is given by the expression:

365*364*363*362*361*360*359*358*357*356*355*354

4657431227433109900901013888000

Why did np.prod return the wrong result? How does this function work?

  • 1

    Isn’t it because of overflow? The documentation says: "Arithmetic is modular when using integer types, and no error is raised on overflow" - see also https://stackoverflow.com/q/39089618

2 answers

2

NumPy converts the list [365,364,363,362,361,360,359,358,357,356,355,354] into an array of fixed-width integers (64-bit here). An overflow therefore occurs whenever a result exceeds the maximum value that this integer type can represent:

np.prod([365,364,363,362,361,360,359,358])
-3541793775766646656

In the multiplication above, the true result is too large for a 64-bit integer, so it wrapped around and came out negative. As the product keeps growing, the stored value keeps cycling: it overflows, wraps into the negative range, climbs back up, and overflows again.
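The wraparound can be reproduced in plain Python. This is only a sketch, assuming NumPy used signed 64-bit integers (which the outputs in this thread are consistent with); `wrap_int64` is a hypothetical helper written for illustration, not a NumPy function.

```python
# Hypothetical helper (not part of NumPy): reduce an exact Python integer
# to the value a signed 64-bit integer would hold after wrapping around.
def wrap_int64(value):
    wrapped = value % 2**64      # keep only the low 64 bits
    if wrapped >= 2**63:         # top bit set means a negative number
        wrapped -= 2**64
    return wrapped

factors = [365, 364, 363, 362, 361, 360, 359, 358]

exact = 1
for factor in factors:
    exact *= factor              # Python integers never overflow

print(exact)              # 291606111403586179200
print(wrap_int64(exact))  # -3541793775766646656
```

The wrapped value matches what np.prod returned for the same eight factors, which suggests the "wrong" result is the exact product reduced modulo 2**64.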

One way to avoid this is to declare the array as float. Add a dot at the end of each number and the values become floats, which have a much larger range (the result is then approximate rather than exact):

np.prod([365.,364.,363.,362.,361.,360.,359.,358.,357.,356.,355.,354.])
4.65743122743311e+30

That is, the limitation is not in prod itself, but in the fixed-width integer types that NumPy uses. Python's own integers do not overflow.
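One caveat with the float approach is worth illustrating: a 64-bit float carries only about 15 to 17 significant decimal digits, so the product is close to the true value but not equal to it. The sketch below uses plain Python floats, which are assumed here to behave like NumPy's default float64.

```python
factors = [365, 364, 363, 362, 361, 360, 359, 358, 357, 356, 355, 354]

exact = 1      # arbitrary-precision Python integer
approx = 1.0   # 64-bit floating-point number
for factor in factors:
    exact *= factor
    approx *= factor

print(exact)                 # 4657431227433109900901013888000
print(int(approx) == exact)  # False: the float result is rounded

# The relative error is tiny, but it is not zero.
relative_error = abs(int(approx) - exact) / exact
print(relative_error < 1e-12)  # True
```

For counting problems where every digit matters, the float result is therefore only an estimate.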

2


NumPy is mostly written in C, with a Python wrapper around it. It therefore follows C's type limitations, and the problem you ran into comes from the fixed size of C integer types.

A 64-bit integer (int64) ranges from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807, and the expected result is 4,657,431,227,433,109,900,901,013,888,000, which clearly does not fit in an int64.

You can follow the recommendation in Marcus Nunes's answer and use floats to extend the range the product can reach, but floats also have an upper limit, set by the C double type (about 1.8e308), and they only approximate large integers.
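The float limit can be checked from Python itself. This is a small sketch using the standard library's `sys.float_info`, assuming the usual IEEE 754 double-precision floats.

```python
import sys

largest = sys.float_info.max   # largest finite float, about 1.8e308
overflowed = largest * 2       # too large for a float: becomes infinity

print(largest)
print(overflowed)              # inf

# A Python integer of the same magnitude is still stored exactly,
# but it can no longer be converted to a float.
big_int = 10**400
try:
    float(big_int)
    converted = True
except OverflowError:
    converted = False
print(converted)               # False
```

So floats postpone the overflow rather than remove it, whereas Python integers keep growing as needed.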

To get around this problem, note that Python 3 integers have unlimited size. You can therefore define a function that computes the product with plain Python integers and get the expected result:

def produtorio(minha_lista):
    # Multiply with plain Python integers, which never overflow.
    p = 1
    for elemento in minha_lista:
        p *= elemento
    return p

minha_lista = [365,364,363,362,361,360,359,358,357,356,355,354]
print(produtorio(minha_lista))

The code above prints the expected result: 4657431227433109900901013888000
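As a side note not taken from the answer itself: from Python 3.8 onward the standard library provides `math.prod`, which should give the same exact result for a list of Python integers without a hand-written loop. A minimal sketch:

```python
import math  # math.prod is available from Python 3.8

minha_lista = [365, 364, 363, 362, 361, 360, 359, 358, 357, 356, 355, 354]

resultado = math.prod(minha_lista)  # exact: uses Python integers
print(resultado)                    # 4657431227433109900901013888000
```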
