What is parallel programming?

I have some questions about parallel programming:

  • What is parallel programming?
  • Is it a technique or a programming paradigm?
  • Is there a specific language for parallel programming?
  • Where is parallel programming best used?
  • Although this has not been asked, it is worth mentioning to help understand the answers: a parallel program may or may not share memory between processes. Characteristics such as those cited in Miguel Angelo’s answer (pure functions, immutability) matter mainly when memory is shared - otherwise, one process could not access another’s memory anyway. With separate address spaces there is no need to worry about memory access, but the overhead of transmitting data from one process to another is greater.

3 answers

15


What is it? When to use?

Whenever two processes are independent, by definition they can be parallelized. So far, nothing new... the hard part is ensuring an environment where parallelization does not cause a bigger problem than the benefit it brings, and that is precisely where languages, data structures, and techniques that make it easier to program in a parallelizable way come in.

I would not say that parallel programming is by itself a paradigm, but rather a set of elements that make it easier to achieve parallelism and gain the benefits it brings.

Languages

Functional languages

These are examples of languages that lean toward parallelization, mainly through so-called pure functions, that is, functions that cause no side effects and depend on nothing other than what is explicitly passed to them.

Parallelization usually occurs over pure functions that must be applied to a collection of data. Since the function is pure, each element of the collection can be delegated to a parallel processor, whether on the same machine or on separate machines.

Examples:

There is an endless list of languages closer to the parallelizable approach.
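As a minimal sketch of this idea (in Python here, purely as an illustration; the `square` function is hypothetical), a pure function applied to each element of a collection can be distributed across worker processes:

```python
from concurrent.futures import ProcessPoolExecutor

def square(x):
    # Pure function: no side effects, depends only on its argument.
    return x * x

if __name__ == "__main__":
    data = list(range(10))
    with ProcessPoolExecutor() as pool:
        # Each element can be handed to any worker process,
        # precisely because the function is pure.
        results = list(pool.map(square, data))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Since `square` touches no shared state, the pool is free to run the calls in any order, on any worker, and still produce the same result.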

Data structures

Immutability

There are also data structures that get along well with parallelization, and the most remarkable, in my view, are immutable structures. They allow the programmer not to worry about side effects when manipulating data, because each manipulation creates new instances instead of modifying the original object. The pure functions presented earlier benefit greatly from immutable structures, since this guarantees that there will be no side effects through the arguments passed to them.
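A small sketch of this in Python (my illustration; the `Point` type is hypothetical), where "modifying" a value actually creates a new instance:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)   # frozen=True: instances cannot be mutated
class Point:
    x: float
    y: float

p1 = Point(1.0, 2.0)
p2 = replace(p1, x=5.0)   # a NEW instance is created; p1 is untouched
print(p1)                 # Point(x=1.0, y=2.0)
print(p2)                 # Point(x=5.0, y=2.0)
```

Because nothing can change `p1` behind your back, it can be handed to any number of parallel workers without locks.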

Techniques/Paradigms

Map/Reduce

There are also techniques that use parallelization to process huge amounts of data (e.g. Big Data processing). A remarkable one is called Map/Reduce, which follows the divide-and-conquer paradigm. This paradigm (or technique) uses pure functions and also immutable structures.

For those who know JavaScript, Map and Reduce here are along the same lines as the map and reduce methods available on Arrays:

  • Map: from an input collection, an output collection is produced.
  • Reduce: from an input collection, an aggregate result of the elements of the original collection is produced.

I give this JavaScript example because it is what people are most familiar with nowadays (I think), and so I bring what MapReduce is about closer to those who do not seem to understand it.

I would not say there are great differences between a technique and a paradigm, other than the paradigm being applied as a foundation, so I am presenting both under the same title.

Examples:
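For instance, a minimal sketch in Python (my own illustration, using the standard multiprocessing and functools modules): a parallel map phase followed by a reduce phase that aggregates the partial results:

```python
from multiprocessing import Pool
from functools import reduce

def mapper(x):
    # Map phase: a pure function applied to each element.
    return x * x

def combine(a, b):
    # Reduce phase: aggregates the mapped results.
    return a + b

if __name__ == "__main__":
    data = range(1, 101)
    with Pool() as pool:
        mapped = pool.map(mapper, data)  # runs in parallel across processes
    total = reduce(combine, mapped)      # sum of squares of 1..100
    print(total)  # 338350
```

Real Map/Reduce frameworks distribute both phases across many machines, but the shape of the computation is the same.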

On the hardware

Image rendering

Parallel programming also exists at the hardware layer. For example, on your video card. Image rendering is clearly one of the main applications of parallelism, since there are millions of pixels to be rendered and they can all be processed independently of each other (speaking very roughly, just to give an idea).

9

Parallel programming consists of dividing a computational task into two or more sufficiently independent instances to be executed in parallel - for example on two different processing cores, or even on two different machines. Not every task is parallelizable; it depends a lot on the characteristics of the problem and on the dependency relations between the tasks to be performed.

An example of an easily parallelizable problem is "increment by one all the values of this huge vector" (divide the vector into N parts and have each process perform the task on one of the parts). An example of a problem that is difficult to parallelize is "make each element of this enormous vector have its value added to all the elements after it".
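A sketch of the easy case in Python (the chunking scheme here is just one illustrative choice): the vector is split into independent parts, so the workers never depend on one another. The hard case would not fit this scheme, since each element depends on the elements after it.

```python
from concurrent.futures import ProcessPoolExecutor

def increment_chunk(chunk):
    # Each chunk is independent, so workers never step on each other.
    return [v + 1 for v in chunk]

if __name__ == "__main__":
    vector = list(range(1_000_000))
    n = 4  # number of parts / worker processes
    size = (len(vector) + n - 1) // n
    chunks = [vector[i:i + size] for i in range(0, len(vector), size)]
    with ProcessPoolExecutor(max_workers=n) as pool:
        parts = pool.map(increment_chunk, chunks)
    result = [v for part in parts for v in part]
    print(result[:5])  # [1, 2, 3, 4, 5]
```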

The task of parallelizing a program and the tools and techniques used are not in themselves a paradigm. One can adopt this technique in any language that allows spawning more than one process, that supports multi-threading (as long as the threads are implemented in such a way that they can be distributed over more than one processor), or that supports network communication (for distributing the processing across more than one machine). However, one can design a whole language or framework with parallelism in mind (see "concurrent programming"), such as the Erlang language. If, in the "philosophy" of the language/platform, parallelism is the norm and not the exception, then I believe one can speak of a paradigm in these cases...

As for when to use it, I would say whenever it is possible, necessary, and advantageous. That is: if the problem is not parallelizable (or if the benefit of parallelizing is negligible), there is nothing to do; if a sequential alternative already performs more than satisfactorily, the extra complexity of parallelizing is not worth it (and may even worsen performance); and finally, if the overhead of parallelization (critical sections/locks, memory separation and inter-process communication, network communication, etc.) cancels out most of the benefit, it may not be worth parallelizing for now (on the other hand, what is not advantageous today may be in the future, when hardware and communication conditions change).

  • Concise answer, +1. I just didn’t understand the beginning of the last paragraph. The text suggests that whenever possible I should use parallelism, but I don’t think so; it seems the list is reversed. Since parallelism can result in loss rather than gain, shouldn’t I use it only if it is advantageous (1st)? Since it is complex and laborious, shouldn’t I use it only if, beyond advantageous, it is necessary (2nd)? And if it is advantageous, it is because it is possible (though perhaps expensive, e.g. changing platforms), and if it is necessary, is it possible (3rd) to bear the cost?

  • @Caffé I followed your suggestion and reworded it more simply. What I meant was that there are 3 conditions that need to be met simultaneously, and the order in which I would check them is: first the [theoretical] possibility, second the need, and third the expectation of gain in doing so.

6

What is parallel programming?

Briefly, this is when you divide a computation between various machines or processors that run at the same time.

Don’t confuse parallelism with concurrency. For example, an operating system will have several processes running concurrently, but on a single-core machine only one of them uses the CPU at any given time.
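A small Python sketch of the difference (my own illustration, not from the answer): in CPython, threads run CPU-bound work concurrently but not in parallel, because the interpreter’s global lock lets only one thread execute Python code at a time, while separate processes run truly in parallel:

```python
import time
from threading import Thread
from multiprocessing import Process

def busy():
    # CPU-bound work: just burn cycles.
    n = 50_000_000
    while n:
        n -= 1

if __name__ == "__main__":
    for worker in (Thread, Process):
        start = time.perf_counter()
        pair = [worker(target=busy) for _ in range(2)]
        for w in pair:
            w.start()
        for w in pair:
            w.join()
        print(worker.__name__, time.perf_counter() - start)
    # Threads: roughly 2x the single-run time (concurrent, not parallel here).
    # Processes: roughly 1x on a multi-core machine (truly parallel).
```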

Is it a technique or a programming paradigm?

I think it is more of a technique than a paradigm. Usually a paradigm is thought of as something you can use for any program, while parallel programming is something best applied case by case.

Is there a specific language for parallel programming?

Parallel programming is a very difficult problem, especially when parallel processes need to do some form of communication or resource sharing among themselves. There are several programming languages and frameworks that try to support parallel programming, but each has its strengths and weaknesses. Parallel programming is not like object-oriented programming, functional programming, or logic programming, where you can point and say "look, language X is the example of what it means to be a parallel programming language".

That said, an example I can give is CUDA and other languages for writing parallel programs that run on GPUs.
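CUDA itself is C-like, but the same model can be sketched from Python with the numba package (an assumption on my part, not something the answer mentions; it requires numba installed and a CUDA-capable GPU). Each GPU thread handles one element of the array:

```python
import numpy as np
from numba import cuda  # assumption: numba with CUDA support is available

@cuda.jit
def increment_kernel(vec):
    i = cuda.grid(1)      # global index of this GPU thread
    if i < vec.size:      # guard against threads past the end
        vec[i] += 1.0

if __name__ == "__main__":
    vec = np.arange(1_000_000, dtype=np.float64)
    d_vec = cuda.to_device(vec)               # copy to GPU memory
    threads = 256
    blocks = (vec.size + threads - 1) // threads
    increment_kernel[blocks, threads](d_vec)  # launch one thread per element
    print(d_vec.copy_to_host()[:5])           # [1. 2. 3. 4. 5.]
```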

Where is parallel programming best used?

Parallel programming is something you do when you want to take advantage of the fact that you have more than one processor available, so you can try to find the answer to your problem faster. In situations where performance is not important, it is not worth making a complex parallel implementation; and depending on the algorithm you have to implement, it can be more or less difficult to create a parallel version.

A good example of the kind of thing that goes well with parallelism is when you have "obviously parallelizable" computations. For example:

  • To render 3D images, each processor can be responsible for a different piece of the screen.
  • In some numerical algorithms, such as vector and matrix multiplication, each processor can be responsible for a separate section of the vector or matrix, and the partial results can be joined at the end (see the sketch after this list).
  • To compile a program with multiple modules, it is possible to use parallelism to compile more than one module at the same time (make’s "-j" option serves this purpose).
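A rough Python sketch of the matrix case (the names and the row-block split are illustrative choices of mine): each worker computes the dot products for its block of rows, and the partial results are concatenated at the end:

```python
from concurrent.futures import ProcessPoolExecutor

def rows_times_vector(args):
    # Each worker handles an independent block of rows.
    rows, vector = args
    return [sum(r * v for r, v in zip(row, vector)) for row in rows]

if __name__ == "__main__":
    matrix = [[1, 2], [3, 4], [5, 6], [7, 8]]
    vector = [10, 20]
    n = 2  # number of blocks / workers
    size = (len(matrix) + n - 1) // n
    blocks = [(matrix[i:i + size], vector) for i in range(0, len(matrix), size)]
    with ProcessPoolExecutor(max_workers=n) as pool:
        partial = pool.map(rows_times_vector, blocks)
    result = [x for part in partial for x in part]
    print(result)  # [50, 110, 170, 230]
```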
