How to calculate the factorial of a list of integers using threads?

Given a vector A of size N containing only positive integers, calculate the factorial of each element and store the results in a vector B.

To calculate the factorial, use the following function:

def fatorial(n):
  fat = n
  for i in range(n-1,1,-1):
    fat = fat * i
  return(fat)

The program should be developed in two ways:

A - Using the threading module with 4 threads;

B - Using the multiprocessing module with 4 processes.

I started doing it the way shown below, but it gives errors. I have spent 3 days on this and can’t solve it. Can someone help me, please?

import threading

vetorA = [2,3,4,5,6,7,8]
def fatorial(n):
  fat = n
  for i in range(n-1,1,-1):
    fat = fat * i    
  return(fat)

calc = [fatorial(n) for n in vetorA]
t = threading.Thread(target=fatorial, args=[calc])
t.start()
  • Siemens, welcome to Sopt. First, a few considerations. It would be good if the title of your question referred directly to your problem. The current title does not invite people to help you and says little about the problem. Also, you mention an error but do not say what the error is. Edit your question and include the error message you are getting.

1 answer

Your program is well under way. The problem, as you can see in the code above, is that you never collect the results of the function: you start the thread and don’t even try to get the return value.

Once that is understood, the next question is: how do you get return values from a thread? That is not so trivial. The function fatorial is called in another thread and returns its value there, and there the value is lost. Python’s Thread class has no way to bring that value back to the initial thread (that is, there is no such call as Thread.result()).
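A small sketch of that point, assuming the fatorial function from the question: the thread runs the function, but nothing on the Thread object gives the result back.

```python
import threading

def fatorial(n):
    fat = n
    for i in range(n - 1, 1, -1):
        fat = fat * i
    return fat

t = threading.Thread(target=fatorial, args=(5,))
t.start()

# join() only waits for the thread to finish; it always returns None
join_result = t.join()

# Thread objects have no "result" method or attribute
has_result_method = hasattr(t, "result")

print(join_result, has_result_method)  # None False
```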

If you look at the literature, you’ll find that communication between threads is usually done through objects called queues. In Python, the thread-safe queue is queue.Queue, in the standard library.

A Queue works like this: the running code inserts a value on one side of the queue (with Python’s Queue object we use the .put method), and the code in another thread or function that will use that result retrieves it with the .get method.
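As a minimal illustration of .put and .get, here is a sketch with a single thread and a single value. The wrapper function compute_and_put is a hypothetical name introduced only for this example.

```python
import queue
import threading

def fatorial(n):
    fat = n
    for i in range(n - 1, 1, -1):
        fat = fat * i
    return fat

# Hypothetical helper: runs in the other thread and puts the result in the queue
def compute_and_put(n, out_queue):
    out_queue.put(fatorial(n))

results = queue.Queue()
t = threading.Thread(target=compute_and_put, args=(5, results))
t.start()

# get() blocks until the other thread has put a value in the queue
value = results.get()
t.join()

print(value)  # 120
```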

OK, we have one piece of the puzzle. But first: you don’t want to open a thread for each number in your list (by the way, although it’s not technically incorrect, it’s strange to call Python lists "vectors"; better to call them lists). You want 4 threads, which we usually call "worker threads", since your program itself runs in a "main thread" that coordinates the distribution of the work (and that is a fifth thread).

With queues it is easy to see how to do this: we create the 4 threads, and in each one the function called is not the one that calculates the factorial but a "manager" function that receives two queues, one with input parameters and one for output data. This "manager" function takes data from the input queue, processes it with the factorial function, and puts the results in the output queue.

In the main thread, after feeding the queue that serves as input for the "managers", you collect the data from the output queue, and assemble the list with the results.

There are two more problems you have to solve with this architecture: (1) How do you keep the output data in order? A number sent at position "1" of the input queue does not necessarily produce the result at position "1" of the output, because threads finish things out of order. (2) How do you signal to the workers that the job is over and they can stop executing?

(I’m not going to code now - maybe I’ll continue later - but see if you understand, and try to do something with these explanations.)
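For comparison after you have tried it yourself, here is one possible sketch of the architecture described above. It is not the only way to do it. The names worker, factorial_list, NUM_WORKERS and STOP are assumptions made for this example. Order is kept by sending each number together with its index, and the end of the job is signaled with one sentinel value per worker. The sketch does not handle exceptions raised inside a worker.

```python
import queue
import threading

NUM_WORKERS = 4
STOP = None  # sentinel: tells a worker there is no more work

def fatorial(n):
    fat = n
    for i in range(n - 1, 1, -1):
        fat = fat * i
    return fat

def worker(in_queue, out_queue):
    # The "manager" function: takes (index, n) pairs until it sees the sentinel
    while True:
        item = in_queue.get()
        if item is STOP:
            break
        index, n = item
        out_queue.put((index, fatorial(n)))

def factorial_list(numbers):
    in_queue = queue.Queue()
    out_queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(in_queue, out_queue))
        for _ in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    # Feed the input queue, tagging each number with its position
    for index, n in enumerate(numbers):
        in_queue.put((index, n))
    # One sentinel per worker, so every worker stops
    for _ in threads:
        in_queue.put(STOP)
    # Results arrive in any order; the index puts each one in its place
    results = [None] * len(numbers)
    for _ in numbers:
        index, value = out_queue.get()
        results[index] = value
    for t in threads:
        t.join()
    return results

vetorA = [2, 3, 4, 5, 6, 7, 8]
vetorB = factorial_list(vetorA)
print(vetorB)  # [2, 6, 24, 120, 720, 5040, 40320]
```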

Continuing with one more piece: this pattern of having "workers" to work with threads is quite common. Since this code gets a little laborious, even more so when you want to handle all possible cases, such as capturing exceptions raised by the code running in the thread, Python has for some time included the package concurrent.futures. An Executor object in this package takes on the job of keeping the threads and the queues, managing the calls to the functions that do the calculations inside the threads, and returning their values to the controlling thread. Each task you add gives you an object of type Future, and you can then call the .result() method on that Future to get the function’s return value.

Since this is about learning, it is worth doing it both ways: first understand how things happen, as I described, and then use concurrent.futures, which allows the same thing, with error handling and everything, in much simpler code.
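A minimal sketch of the concurrent.futures version, assuming the same fatorial function and input list as in the question. Each call to submit returns a Future, and the results are read back in the same order the tasks were submitted.

```python
from concurrent.futures import ThreadPoolExecutor

def fatorial(n):
    fat = n
    for i in range(n - 1, 1, -1):
        fat = fat * i
    return fat

vetorA = [2, 3, 4, 5, 6, 7, 8]

# The executor keeps the 4 worker threads and the queues for us
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(fatorial, n) for n in vetorA]
    # result() waits for the task and returns the function's return value
    # (or re-raises the exception that happened inside the thread)
    vetorB = [future.result() for future in futures]

print(vetorB)  # [2, 6, 24, 120, 720, 5040, 40320]
```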

And as for using processes instead of threads: the two approaches are pretty much the same. To do it "manually", you should use multiprocessing.Process and multiprocessing.Queue instead of threading.Thread and queue.Queue. For the approach with concurrent.futures, the package has two classes for creating the Executor, ThreadPoolExecutor and ProcessPoolExecutor, and choosing between them is the only difference between using threads and using processes.
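To illustrate that the executor class is the only thing that changes, here is a sketch in which the class is passed as a parameter. The function compute_factorials is a hypothetical name for this example. It uses executor.map, which returns the results in input order. The `if __name__ == "__main__":` guard matters for the process version, because on some platforms the child processes re-import the main module.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def fatorial(n):
    fat = n
    for i in range(n - 1, 1, -1):
        fat = fat * i
    return fat

# Hypothetical helper: the same code runs on threads or on processes,
# depending only on which executor class is passed in
def compute_factorials(numbers, executor_class, workers=4):
    with executor_class(max_workers=workers) as executor:
        # map() yields results in the same order as the inputs
        return list(executor.map(fatorial, numbers))

if __name__ == "__main__":
    vetorA = [2, 3, 4, 5, 6, 7, 8]
    print(compute_factorials(vetorA, ThreadPoolExecutor))
    print(compute_factorials(vetorA, ProcessPoolExecutor))
```

Both calls should print the same list. With processes, the function and its arguments have to be picklable, which is why fatorial is defined at module level.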

A hint about the result: if you pass very small numbers, or only a few numbers, you won’t even see a difference in performance. Numerical computing, even in Python, is very fast on modern CPUs. However, once you do notice a difference, you will see that only the version with multiple processes has any advantage; the version with threads should be slower than the same calculation done in a single thread. To understand what happens, and I believe this is the challenge of this exercise, I suggest reading the answer here:
