Threads in Python 3.4

Viewed 1,745 times

8

I'm having some difficulty using threads. I need to do the following:

I have several items to compare against a variable, but each comparison takes a while, so I would like to run several comparisons at the same time and, if any of them returns True, stop all the others and move on to another part of the code.

I tried using join, but that slowed the program down, because I had to wait for comparisons that had become irrelevant the moment I found what I needed. I also tried without join, but then the other threads keep running and disrupt the flow of my program.

Something more or less like this:

For every 10 items in the list
Use each item in a separate comparison
If any of the 10 comparisons returns True, stop all the other comparisons and go to another function
Otherwise, take 10 more items from the list and repeat the comparison

More generally, I did the following:

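import _thread
import time

palavras = ['palavra1', 'palavra2', 'palavra3', 'palavra4']
nth = 2  # intended maximum number of simultaneous threads
threads = []

def execute(palavra):
    print('\ntentando palavra ' + palavra)
    time.sleep(5)  # stands in for the slow comparison
    print(palavra + ' finalizada')

for nome in palavras:
    threads.append(nome)
    if len(threads) >= nth:
        # Starts a whole batch at once, without waiting for the previous batch
        for item in threads:
            _thread.start_new_thread(execute, (item,))
        threads = []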

However, this keeps starting threads without limit. I need to keep only 10 running: that is, it takes 10 items and compares them, and as each comparison finishes, it takes one more item so that there are always 10 comparisons in progress.

  • Have you thought about doing a busy wait inside execute?

  • Do you want the threads to be created according to the number of items in the list? Couldn't it be 10 threads for all the items?

  • Do you mean another sleep, @Felipeavelar? @qmechanik, it can be 10 threads for all of them; the important thing is to run 10 comparisons at a time.

  • @x-x It can be another sleep, but in the main program: while the number of threads equals 10, wait, then start a thread...

2 answers

2

The fix closest to where you are now is to keep a record of how many threads you have already started, and only start new ones when there is room. Your program creates a maximum number of threads, but has no code to add new threads (or pass new elements to existing threads) after reaching that maximum. The logic for this uses 'while', 'if', and one or two variables to count how many threads are active, starting more whenever the number of active threads is below your limit (10 in this case).
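As a rough illustration of that counting approach, here is a minimal sketch. The names run_limited, compare and limit are hypothetical, and it uses a threading.Event to signal the match, which is an assumption of this sketch and not something from the question's code:

```python
import threading
import time

def run_limited(items, compare, limit=10):
    """Run compare(item) in threads, at most `limit` at a time.

    Stops starting new threads as soon as one comparison returns True.
    Hypothetical helper, for illustration only.
    """
    found = threading.Event()
    results = []
    active = []

    def task(item):
        if compare(item):
            results.append(item)
            found.set()

    for item in items:
        # Wait for a free slot: drop finished threads until under the limit.
        while True:
            active = [t for t in active if t.is_alive()]
            if len(active) < limit or found.is_set():
                break
            time.sleep(0.01)
        if found.is_set():
            break
        t = threading.Thread(target=task, args=(item,), daemon=True)
        t.start()
        active.append(t)

    # Wait until a match is found or every started thread has finished.
    while not found.is_set() and any(t.is_alive() for t in active):
        time.sleep(0.01)
    return results[0] if results else None
```

Comparisons that are already running are not interrupted; they are simply ignored once a match has been found.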

The "standard" solution to this type of problem is a bit more elegant, however: you create a fixed set of threads, with the desired number of threads. In the literature this set is called a "thread pool"; in practice it is a collection, which can be a list, where each element is one of your threads. In this context each thread is called a "worker".

In this case a data structure called a queue ("Queue") is used: it is fed by the main thread, and each worker thread pulls elements from it. This way, a worker thread can pull a new element to process as soon as its previous job is done, regardless of what the others are doing.

In other words: the main thread places the elements in the queue, and each worker thread runs a loop that takes the next element from the queue and processes it.

You also need some way to tell the worker threads that the processing is over and they can terminate. Typically this is done by placing a "marker" object in the queue, and threads stop when they find it. In your scenario, however, this would require extra logic to feed the elements into the queue gradually, so that the marker does not simply end up at the end of the queue (which brings back your initial problem). So, for a simpler scenario, a simpler solution: a global variable "COMPLETE" is used, set by the worker thread that finds the result.
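For reference, the marker technique mentioned above can be sketched roughly as follows. This only shows how workers stop when they meet the marker; it does not implement the early stop from the question. STOP, worker and run_with_markers are hypothetical names chosen for this sketch:

```python
import threading
from queue import Queue

STOP = object()  # hypothetical marker; any unique object works

def worker(q, out, process):
    # Pull items until the marker shows up, then exit.
    while True:
        item = q.get()
        if item is STOP:
            break
        out.put(process(item))

def run_with_markers(items, process, nworkers=4):
    items = list(items)
    q, out = Queue(), Queue()
    threads = [
        threading.Thread(target=worker, args=(q, out, process))
        for _ in range(nworkers)
    ]
    for t in threads:
        t.start()
    for item in items:
        q.put(item)
    for _ in threads:
        q.put(STOP)  # one marker per worker, placed after all the real items
    for t in threads:
        t.join()
    # Results arrive in completion order, so sort them for a stable answer.
    return sorted(out.get() for _ in range(len(items)))
```

Because the markers sit behind every real item, each worker only stops after the whole list has been processed, which is why the answer falls back on a flag for the early-stop case.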

Note that in thread theory, and in implementations in lower-level languages, this would be much more complicated: there are race conditions on the global variable that would have to be taken into account. In Python, the GIL (Global Interpreter Lock) takes care of that for us, and Queue is an existing class that internally uses the necessary locks, so it is quite simple to use them without major worries.

(The price you pay for that is precisely that if the threads are CPU-intensive in a pure Python algorithm, the GIL is not released while the algorithm runs, and the gain from using threads compared with a sequential program will be quite small. The alternatives would be: use "multiprocessing" instead of "threading", which puts each worker in a separate process and eliminates the GIL problem (but you will need a mechanism other than the global variable to synchronize the workers); or write your execute function in Cython, and use the construct available in that Python superset to release the GIL.)

Here is the example using Python 3's threading and Queue, with your scenario:

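from threading import Thread
from queue import Queue, Empty
import random
import time

COMPLETE = False  # set by whichever worker finds the result


class Worker(Thread):
    def __init__(self, queue, output):
        super().__init__()
        self.queue = queue
        self.output = output

    def run(self):
        # Keep pulling items until a result is found or the queue runs dry.
        while not COMPLETE:
            try:
                element = self.queue.get_nowait()
            except Empty:
                break
            self.execute(element)

    def execute(self, palavra):
        global COMPLETE
        print('\ntentando palavra ' + palavra)
        time.sleep(1)  # stands in for the slow comparison
        print(palavra + ' finalizada')
        if random.randint(0, 10) == 5:  # stands in for "the comparison matched"
            COMPLETE = True
            self.output.put(palavra)


def main(palavras, nworkers):
    queue = Queue()
    result_queue = Queue()
    # Fill the queue first, so that an empty queue means "no work left".
    for p in palavras:
        queue.put(p)
    threads = [Worker(queue, result_queue) for _ in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # each worker stops after its current item once COMPLETE is set
    # None means no comparison matched.
    result = None if result_queue.empty() else result_queue.get()
    print("O resultado final é:", result)


palavras = ['palavra_{}'.format(i) for i in range(50)]
main(palavras, nworkers=10)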

For more information, see the queue documentation: https://docs.python.org/3/library/queue.html (it even has an example similar to this one).

1

Wouldn't it be better to create an executor and hand it your tasks? For example, below I take the numbers from 0 to 999 and compute the factorial of each. I call math.factorial on each element of the list and put the results in a dictionary, running only 10 operations at a time.

from math import factorial
from pprint import pprint
from concurrent.futures import ThreadPoolExecutor 

lista = range(1000)

with ThreadPoolExecutor(max_workers=10) as executor:
    pprint({num: fat for num, fat in zip(lista, executor.map(factorial, lista))})

Basically the executor uses a map() to apply the function to each element of the list.

If you prefer, multiprocessing.Pool also works in place of concurrent.futures.ThreadPoolExecutor; that way you get real parallelism with multiple processes instead of threads.
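Since executor.map waits for every result, stopping at the first match, as the question asks, takes a slightly different shape. A minimal sketch, assuming a hypothetical first_match helper and a compare function that returns True or False:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def first_match(items, compare, max_workers=10):
    """Return the first item for which compare(item) is truthy, or None.

    Hypothetical helper, for illustration only.
    """
    executor = ThreadPoolExecutor(max_workers=max_workers)
    # Map each future back to the item it is checking.
    futures = {executor.submit(compare, item): item for item in items}
    try:
        for fut in as_completed(futures):
            if fut.result():
                return futures[fut]
        return None
    finally:
        # Cancel work that has not started yet; running calls cannot be interrupted.
        for f in futures:
            f.cancel()
        executor.shutdown(wait=False)
```

Usage would look something like first_match(palavras, compare), where compare is your own comparison function. Calls already in progress still run to completion in the background, but the caller does not wait for them.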
