If possible, how can I work with integer values of 1 million digits or more in python?

Asked

Viewed 377 times

4

I am running some experiments with primes and needed to process huge integer numbers, but when trying to process a small routine there is an error in the following line:

inserir a descrição da imagem aqui

Overflowerror: int Too large to Convert to float

How can I correct that mistake?

it is possible to work with extremely long numbers in python?

  • Try math.fsum to add.

  • You can add your code to the question?

  • 3

    But do you need whole or float? The error says that it was not possible to convert to float, but if you need integers, there is no reason to do this conversion.

  • 1

    is a very objective question, that someone knowing the lignuage can answer without ambiguities, and without examples of the code used - it’s about the ability of the language. No reason to "downvote" or want to close!

  • 2

    The question refers to a code error without putting the code. It did not seem to me a question that has to do with the language. I thought something like, "I tried to make a code with big numbers and it was wrong. I don’t know why I made a mistake and I’m inferring that it’s a language limitation".

  • 1

    "If possible, how can I work with integer values of 1 million digits or more in python?" and "is it possible to work with extremely long numbers in python?" - what is the ambiguity of that? If you don’t know, you can just upvote and wait for someone who can answer - everyone benefits. Now if we go ruminating "maybe I knew how to fix the mistake he’s having if I only knew what he tried to do - let me punish him with donvotes for not putting the code in." (which is worse than commenting asking to put code - that’s up to ok - but not downvotes)

  • 3

    Maybe you don’t understand my point in this part of the code. It has to do with that recommendation because I don’t see how answering "yes, language works" and explaining will solve the error he posted in the question.

  • 2

    A example of a 1 million digit integer (yes, you can count) working. As it can take a long time to run, you can see this GIF animated.

Show 3 more comments

1 answer

7

Yes - it is possible - Python integers have no size limit. I wouldn’t know if the internal implementation for operations with these numbers will be efficient enough to do several operations with these numbers.

The mistake you are making is that at some point in your accounts you did an operation that resulted in a floating point number. In Python 3, a split, for example, does this - Floating point numbers are limited to the standard Float 64 (IEEE 754-2008), (or other type, depending on the CPU architecture). These numbers will lose accuracy - i.e., discard digits for any larger number than 2 ** 53 - which correspond to about 16 decimal digits (16, not 16000).

To "play around with some operations", you can certainly use Python’s native integers - just with the care of using the "entire division", which uses the double-bar operator //, instead of / that converts the result to float (and, of course, the module % works well with large integers).

If you’re going to do serious searches, there might be some large integer library that uses the GPU efficiently - and has connections with Python

(google)

Here - https://pypi.org/project/gmpy2/ - does not use the GPU, but uses the state-of-the-art libraries and arbitrary precision, with Python connection, ensuring that you will be able to use the maximum performance of your CPU and still experience from the interactive mode in Python or Jupyter notebook. The only reference to such a library using the GPU I found is called "gpump", but I couldn’t find instructions on how to install it, let alone something about being able to use it from Python (and anyway, it’s only for Nvidia hardware).

Below are some examples in an interactive section of ipython, creating and manipulating a 2 million digit number. (I use the extension %time from Ipython to display the time it takes python to calculate an expression):

In [2]: %time a = int('7' * 1_000_000)
CPU times: user 4.5 s, sys: 2.21 ms, total: 4.5 s
Wall time: 4.5 s

In [3]: %time a = int('7' * 2_000_000)
CPU times: user 17.9 s, sys: 10.9 ms, total: 17.9 s
Wall time: 17.9 s

In [4]: b = a // 3

In [5]: %time b = a // 3
CPU times: user 10.7 ms, sys: 979 µs, total: 11.7 ms
Wall time: 11.9 ms

In [6]: %time c = a % 33333333333333333333333333333333333
CPU times: user 19.2 ms, sys: 1.96 ms, total: 21.2 ms
Wall time: 21.3 ms

In [7]: c
Out[7]: 11111888888888888888888888888888888

In [8]: d = a / 4
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-8-a0e9c689eb57> in <module>()
----> 1 d = a / 4

OverflowError: integer division result too large for a float

(Note also that from Python 3.7 we can use _ as digit separator in whole numbers - there it is easy to write 2 million as 2_000_000)

Also - the time to create and convert a 1 million digit string number to an integer of that size was 4.5 seconds - 2 million digits, jumped to 17.9 seconds - which indicates that this conversion is a non-linear algorithm, and something like 3 or 4 million digits may already be impossible to create in this way - but the numerical operations with these numbers, once created, are very fast.

I ended up installing gmpy2 here - on Linux there may be packages for your distribution. As I use Python with virtualenv and install each package locally, to install in the Envelope I needed to install the packages devel of the binary libraries it uses (No Ubuntu/Debian would be packages -dev, installed as command apt) - these commands in the shell:

sudo dnf install mpfr-devel mpc-devel
pip install gmpy2

If I were to install in the system’s Python, it would simply be: sudo dnf install python3-gmpy2. In the link above there are the instructions for installation in the other operating systems.

Here the operations equivalent to those I did with Python native integers and their times:

In [13]: from gmpy2 import mpz

In [14]: %time a = mpz('7' * 2_000_000)
CPU times: user 111 ms, sys: 7.77 ms, total: 118 ms
Wall time: 119 ms

In [15]: %time b = a // 4
CPU times: user 1.65 ms, sys: 0 ns, total: 1.65 ms
Wall time: 1.69 ms

In [16]: %time c = a % 33333333333333333333333333333333333333333333
CPU times: user 5.24 ms, sys: 54 µs, total: 5.29 ms
Wall time: 5.37 ms

In [17]: d = a / 3333333333333

In [18]: type(d)
Out[18]: mpfr

That is: build an integer from string dropped from 18 seconds to 120 milliseconds - about 100 X faster. And the split fell from 11.9 milliseconds to 1.7 milliseconds - about 5 times faster. In addition, gmpy2 also includes arbitrary precision decimal types (such as Python decimal.Decimal) - with automatic split conversion - see the last example, numeric type becomes "mpfr".

PS. I used google to answer the final part of the question, but it’s not such a trivial search - you have to know what to look for - by no means did I mean "you could have used google instead of asking).

Browser other questions tagged

You are not signed in. Login or sign up in order to post.