How to run a Python script at full speed?

I was playing with a method to approximate π and realized that the code runs very slowly in the Python 3.6 interpreter, with the process using only 30% of the processor.

Is there a way to run this script at full speed?

from math import sqrt
from random import random

piavg = 0
samples = 10000
scale = 1000
progress = 0
debug_rate = 5
debug_clock = 0

def pis(scale):
    pi = 0
    points = []
    for i in range(scale):
        points.append([random(),random()])
    for p in points:
        if sqrt(p[0] ** 2 + p[1] ** 2) <= 1:
            pi += 1
    pi /= scale
    pi *= 4
    return pi

for i in range(samples):
    cpi = pis(scale)
    progress += 1/samples*100
    if debug_rate > 0 and debug_clock == 0:
        print(str(int(progress)) + "%  encontrado: " + str(cpi))
        debug_clock = debug_rate
    piavg += cpi
    debug_clock -= 1
piavg /= samples

print(piavg)

3 answers

Without analyzing your code in depth, just glancing over it quickly, I already noticed a critical point that can easily be improved: the call to sqrt when computing the "distance".

Computing a square root is quite costly, and you do it very often. Avoiding it will greatly improve the performance of your algorithm. In your case that is quite simple, because you compare against the value 1, whose square is also 1! So you can simply drop the call to sqrt (if you compared against another value, you would compare against its square instead).

With that single change, the execution time already falls by half (I also used the time-measurement method suggested by our colleague @Miguel in his answer).
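The effect is easy to measure in isolation. Here is a minimal sketch (the point count and repetition count are arbitrary choices for the benchmark, not values from the answer):

```python
import timeit
from math import sqrt
from random import random

# 1,000 random points in the unit square (arbitrary size for the test)
points = [(random(), random()) for _ in range(1000)]

def with_sqrt():
    return sum(1 for x, y in points if sqrt(x * x + y * y) <= 1)

def without_sqrt():
    return sum(1 for x, y in points if x * x + y * y <= 1)

# Both count the same points: for non-negative d, sqrt(d) <= 1 iff d <= 1
assert with_sqrt() == without_sqrt()

print("with sqrt   :", timeit.timeit(with_sqrt, number=1000))
print("without sqrt:", timeit.timeit(without_sqrt, number=1000))
```

Absolute numbers will vary by machine, but the version without sqrt should come out consistently faster.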

. . .
def pis(scale):
    pi = 0
    points = []
    for i in range(scale):
        points.append([random(),random()])
    for p in points:
        if (p[0] * p[0] + p[1] * p[1]) <= 1: # <=== change here
            pi += 1
    pi /= scale
    pi *= 4
    return pi
. . .

In my tests, running your original code took approximately 14.5 seconds and produced:

. . .
99%  encontrado: 3.188
99%  encontrado: 3.108
99%  encontrado: 3.128
99%  encontrado: 3.008
3.1419439999999845

While running it with the change above took approximately 8.6 seconds and produced:

. . .
99%  encontrado: 3.032
99%  encontrado: 3.208
99%  encontrado: 3.144
99%  encontrado: 3.176
3.1421183999999878

P.S.: This is just the point I found most critical for performance. It is also worth following the guidelines in our colleague Miguel's answer.

  • Very well, Luiz; what a lapse not to have noticed that.

  • By combining our solutions we managed to cut the execution time roughly in half, to about 4 seconds, relative to the question's code, and your suggestion was the one that contributed the most

  • Wow, I had forgotten the property sqrt(x) <= y == x <= y*y, well thought out!

Running with Python 3.5, I was able to shave about 1.2 seconds off the running time. To measure the running time, I used:

from math import sqrt
from random import random
import time

start_t = time.time()

# ALL YOUR CODE HERE, except the imports

print(time.time() - start_t)

Your code takes between 8.0 and 8.3 seconds; the version I optimized now takes between 6.9 and 7.2. Here is the code:

from math import sqrt
from random import random
import time

start_t = time.time()

piavg = 0
samples = 10000
scale = 1000
progress = 0
debug_rate = 5
debug_clock = 0

def pis(scale):
    pi = 0
    points = ((random(), random()) for _ in range(scale)) # points is a generator instead of a list of lists
    for p1, p2 in points: # unpack the points, so we avoid indexing all the time
        if sqrt(p1 ** 2 + p2 ** 2) <= 1:
            pi += 1
    return pi / scale * 4 # everything in a single expression

for _ in range(samples): # by convention, when i is unused we write _ ; I don't know if it helps performance, but it is good practice
    cpi = pis(scale)
    progress += 1/samples*100
    if debug_rate > 0 and debug_clock == 0:
        print("{}%  encontrado: {}".format(int(progress), cpi)) # no need to convert to string if you format like this; two fewer function calls here
        debug_clock = debug_rate
    piavg += cpi
    debug_clock -= 1
piavg /= samples

print(piavg)

print(time.time() - start_t)

I left comments on what I did so you can understand it. I'm still studying ways to optimize further, but this is a good start.
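The generator-versus-list change in this answer can be checked in isolation: a generator does not materialize all the points up front, so its memory footprint stays constant regardless of the sample size (a small sketch; the size 100,000 is an arbitrary choice):

```python
import sys
from random import random

n = 100000

# A list of pairs materializes all n points up front...
as_list = [(random(), random()) for _ in range(n)]
# ...while a generator produces pairs lazily, one at a time.
as_gen = ((random(), random()) for _ in range(n))

print(sys.getsizeof(as_list))  # grows with n
print(sys.getsizeof(as_gen))   # small and constant
```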

  • Very interesting expression optimization; I'll study that further. But I was looking for a way to make the interpreter use all the processing power, since it is only using 30 percent of my computer's maximum capacity.

  • You would have to do it with multiple threads; this is just a single process, @Jeacom. I'm already trying to help you

  • If you want to use more of your computer's capacity, remove the print calls while it runs. The operating system "takes advantage" of IO operations to free the CPU for other tasks. So besides those calls making your program take longer, they create interruptions in the execution flow (time during which your CPU is probably idle anyway). That is the same reason an infinite while loop without a sleep (which gives the CPU some slack) quickly takes the CPU to 100%.

  • Anyway, don't worry about increasing the apparent CPU usage. In fact, the lower the apparent usage, the better your algorithm.

  • @Luizvieira, I am aware of the print calls; that is why I created the debug_rate variable, which, if set to zero, stops all print calls except the final one and, if set to another value, makes the script call print only at intervals determined by the variable. But I found that remark about while interesting: does a while loop use more CPU than a for loop?

  • @Jeacom That is not what I meant. What I meant is that a loop (any loop!) will naturally use more of the processor if it has no interruptions. If in the middle of that loop you call sleep or perform an IO operation, the operating system releases the processor during that interval (while sleeping, or while the IO operation is performed), and so you don't see the processor at 100%.

  • @Jeacom Also remember to check whether your processor has more than one core. If it does, you won't see the graph reach 100% unless you use multiple threads. : ) More here: http://superuser.com/questions/679679/how-to-increase-pythons-cpu-usage

  • @Jeacom And regarding what I said about testing the CPU at 100% (not considering multiple cores): http://raspberrypi.stackexchange.com/questions/8077/how-can-i-lower-the-usage-of-cpu-for-this-python-program
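The multiple-cores idea discussed in these comments can be sketched with the standard multiprocessing module. This is not code from the answers, just a minimal illustration: pis is a compact rewrite of the question's estimator, and the sample counts are arbitrary:

```python
from multiprocessing import Pool
from random import random

def pis(scale):
    # One Monte Carlo estimate of pi: the fraction of random points in the
    # unit square that fall inside the quarter circle, times 4.
    hits = sum(1 for _ in range(scale) if random() ** 2 + random() ** 2 <= 1)
    return 4 * hits / scale

if __name__ == "__main__":
    samples = 2000
    scale = 1000
    # Pool() starts one worker process per CPU core by default, so the work
    # is spread across all cores instead of a single, partially used one.
    with Pool() as pool:
        estimates = pool.map(pis, [scale] * samples)
    print(sum(estimates) / samples)
```

Because of the CPython GIL, threads would not help a pure-computation loop like this; separate processes are what let it use all cores.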


Thanks, with your help I managed to reach this result:

from random import random

scale = int(input("how many samples? "))
repeat = int(input("how many repetitions? "))
pi = 0

def pic(sc):
    p = 0
    points = ([random(), random()] for i in range(sc))

    def dist(p1, p2, limit):
        # compare the squared distance against the limit
        # (the limit is 1 here, so squaring it changes nothing)
        d = p1 ** 2 + p2 ** 2
        return d - limit

    for p1, p2 in points:
        if dist(p1, p2, 1) < 0:
            p += 1
    p = p * 4 / sc
    return p

print("computing...")

for _ in range(repeat):
    pi += pic(scale)
pi /= repeat
print(pi)

This algorithm is considerably faster than my previous ones!!

  • Run some tests, but it may get better if you replace the exponentiation operator (**) with a simple multiplication (i.e., try exchanging p1 ** 2 for p1 * p1). The gain tends to be minimal over a few calls, but it can become considerable in longer loops (e.g., testing with ipython's %timeit, the average difference was 304 ns with ** versus 76.9 ns with * in a loop of only 1,000 iterations).
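The comparison in this comment can be reproduced with timeit from the standard library (a sketch with arbitrary loop sizes; the exact numbers will vary by machine and Python version):

```python
import timeit

# Square 1,000 random floats with ** versus plain multiplication
setup = "from random import random; xs = [random() for _ in range(1000)]"

t_pow = timeit.timeit("for x in xs: x ** 2", setup=setup, number=1000)
t_mul = timeit.timeit("for x in xs: x * x", setup=setup, number=1000)

print("** :", t_pow)
print("*  :", t_mul)
```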
