What ways are there to measure the performance of an algorithm?


11

If I have, for example, some sorting algorithms (Merge Sort, Quick Sort, Bubble Sort...), in which way(s) can I determine the efficiency of each one?

4 answers

14


First, an algorithm by itself is not "efficient" or "inefficient"; it only has a number of steps proportional to the size of its inputs. An implementation of the algorithm is what has an efficiency, whether in memory, in time, in energy consumption, etc. What you can do is evaluate the algorithm under several metrics and then try to correlate these metrics with the corresponding computational cost (e.g., the more steps the algorithm takes, the more instructions the CPU will have to run).

The main metric is, as already mentioned, the number of steps the algorithm performs relative to the "size" of its input(s). To understand this, I recommend reading the question "What is the complexity of an algorithm?". This analysis is a first step in determining efficiency, but it is far from a complete analysis (for example, Quicksort, whose worst case is O(n^2), usually performs better in the average case than many algorithms whose worst case is O(n log n)).
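To make the worst-case versus average-case point concrete, here is a minimal sketch in Python; the pivot choice is deliberately naive (always the first element), so an already-sorted input triggers the O(n^2) behaviour, and the comparison count is an approximation:

    import random
    import sys

    def quicksort(items, counter):
        # Naive quicksort: always uses the first element as pivot,
        # so an already-sorted input degenerates into the worst case.
        if len(items) <= 1:
            return items
        pivot, rest = items[0], items[1:]
        counter[0] += len(rest)  # roughly one partitioning comparison per element
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x >= pivot]
        return quicksort(smaller, counter) + [pivot] + quicksort(larger, counter)

    sys.setrecursionlimit(5000)  # the worst case recurses about n levels deep
    n = 1000
    random_input = [random.random() for _ in range(n)]
    sorted_input = sorted(random_input)

    for label, data in (("random", random_input), ("sorted", sorted_input)):
        counter = [0]
        quicksort(data, counter)
        print(label, "input:", counter[0], "comparisons for n =", n)

On random input the count stays close to n*log2(n); on sorted input it grows to roughly n^2/2, even though it is the same algorithm.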

Another metric is Locality of Reference: especially when the volume of data is high, access to the resources the algorithm needs (e.g., memory, both primary and secondary) has a significant influence on performance. If an algorithm has an access pattern that fits well with the architecture in use (e.g., an algorithm that accesses data close together, on an architecture that makes heavy use of the cache), then its implementation should be more efficient than another with an irregular access pattern, even if the number of steps is the same.
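A rough sketch of the effect, summing the same array first sequentially and then in a shuffled order; the gap is modest in pure Python because interpreter overhead dominates, and it is much larger in lower-level languages:

    import random
    import time
    from array import array

    n = 2_000_000
    data = array("d", (random.random() for _ in range(n)))

    # Sequential access: neighbouring elements, cache-friendly
    start = time.perf_counter()
    total = 0.0
    for i in range(n):
        total += data[i]
    t_sequential = time.perf_counter() - start

    # Scattered access: the same amount of work, but jumping all over the array
    indices = list(range(n))
    random.shuffle(indices)
    start = time.perf_counter()
    total = 0.0
    for i in indices:
        total += data[i]
    t_scattered = time.perf_counter() - start

    print("sequential:", round(t_sequential, 2), "s   scattered:", round(t_scattered, 2), "s")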

(Another example of the above are algorithms that work best when the input is sorted; often sorting the input and then running the algorithm ends up performing better than running the algorithm directly: the extra cost of sorting is offset by the lower cost of the algorithm, making the overall performance better.)
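For example, a sketch assuming many membership queries against the same data: paying once for the sort and then using binary search (the standard bisect module) tends to beat repeated linear scans of the unsorted list:

    import bisect
    import random
    import time

    data = [random.randint(0, 10**9) for _ in range(100_000)]
    queries = [random.randint(0, 10**9) for _ in range(1_000)]

    # Linear search on the unsorted list: O(n) per query
    start = time.perf_counter()
    hits_linear = sum(1 for q in queries if q in data)
    t_linear = time.perf_counter() - start

    # Sort once (O(n log n)), then answer each query with binary search (O(log n))
    def contains(ordered, q):
        i = bisect.bisect_left(ordered, q)
        return i < len(ordered) and ordered[i] == q

    start = time.perf_counter()
    ordered = sorted(data)
    hits_binary = sum(1 for q in queries if contains(ordered, q))
    t_binary = time.perf_counter() - start

    print("linear:", round(t_linear, 3), "s   sort + binary search:", round(t_binary, 3), "s")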

There are several other factors to consider, these more tied to the implementation of the algorithm than to the algorithm itself: the branching patterns of the algorithm (see Branch Prediction), the possibility of parallelization (whether across multiple CPU cores or perhaps on a GPU), the presence or absence of critical sections (i.e., whether the implementation needs to be thread-safe or not), etc. Some of them can be evaluated a priori, others only a posteriori (you implemented it, it turned out slow, and now you have to find out why).
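As an illustration of parallelization, a sketch using Python's multiprocessing; the exact gain depends on the number of cores and on the cost of shipping the data to the worker processes, which is exactly the kind of thing you often only discover a posteriori:

    import random
    import time
    from multiprocessing import Pool, cpu_count

    def work(chunk):
        # CPU-bound toy task: sort an independent chunk of data
        return sorted(chunk)

    if __name__ == "__main__":
        chunks = [[random.random() for _ in range(500_000)] for _ in range(8)]

        start = time.perf_counter()
        for chunk in chunks:
            work(chunk)
        t_serial = time.perf_counter() - start

        # The same chunks distributed over worker processes; note that sending
        # the data to the workers has a cost that eats into the gain.
        start = time.perf_counter()
        with Pool(processes=min(8, cpu_count())) as pool:
            pool.map(work, chunks)
        t_parallel = time.perf_counter() - start

        print("serial:", round(t_serial, 2), "s   parallel:", round(t_parallel, 2), "s",
              "  speedup:", round(t_serial / t_parallel, 2))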

Finally, there remains actually implementing it and measuring the real efficiency of the implementation in a given environment (for example via profiling, as suggested by Shura16). Always remember to use statistics correctly in order to determine under which circumstances the algorithm behaves one way or another (e.g., "if everything else stays the same but the system has more memory, what happens?"), to find out why ("here it slows down because the OS keeps swapping"), and to determine whether the blame lies with the algorithm or with the implementation ("ah, the algorithm uses a stack, but my stack implementation has a problem").
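A sketch of measuring with repetitions and simple statistics, using Python's timeit; the function being timed here (the built-in sorted) is just a placeholder for whatever you want to measure:

    import random
    import statistics
    import timeit

    data = [random.random() for _ in range(10_000)]

    # Repeat the measurement and look at the distribution, not at a single run:
    # other processes, the garbage collector, etc. add noise to any one sample.
    samples = timeit.repeat(lambda: sorted(data), number=100, repeat=10)
    print("min   :", round(min(samples), 4), "s per 100 runs")
    print("median:", round(statistics.median(samples), 4), "s per 100 runs")
    print("stdev :", round(statistics.stdev(samples), 4), "s")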

  • 1

    +1 for Branch Prediction. The most upvoted question on Stack Overflow (in English): http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array

4

There are two main ways to quantify the efficiency of an algorithm: the empirical method and the analytical method.

These methods can vary depending on which aspect of efficiency you want to measure. You can evaluate the amount of memory the algorithms use or how heavily they use the processor, but from the way you phrased the question you probably want to compare how long the algorithms take to run (the most commonly used measure).

Empirical method

Measure the code's execution time in practice. There are several ways to take this measurement, some of which depend on the programming language you are using.
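For example, a straightforward measurement in Python with time.perf_counter, comparing a hand-written bubble sort against the built-in sort; the input size is arbitrary:

    import random
    import time

    def bubble_sort(items):
        items = list(items)
        n = len(items)
        for i in range(n):
            for j in range(n - 1 - i):
                if items[j] > items[j + 1]:
                    items[j], items[j + 1] = items[j + 1], items[j]
        return items

    data = [random.random() for _ in range(3_000)]

    start = time.perf_counter()
    bubble_sort(data)
    print("bubble sort  :", round(time.perf_counter() - start, 3), "s")

    start = time.perf_counter()
    sorted(data)  # the built-in sort (Timsort)
    print("built-in sort:", round(time.perf_counter() - start, 3), "s")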

Analytical method

Represent the execution time of the algorithm as an order of magnitude. This method is independent of the machine, the operating system and the number of programs running, because only the code itself is analyzed. This is Big-O notation.
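As an illustration, a sketch that counts comparisons instead of measuring time, so the result does not depend on the machine: a non-optimized bubble sort does about n^2/2 comparisons, while an instrumented merge sort stays close to n*log2(n):

    import math
    import random

    def bubble_comparisons(n):
        # A non-optimized bubble sort always compares every pair once: n*(n-1)/2
        return n * (n - 1) // 2

    def merge_sort_comparisons(items):
        # Merge sort instrumented to count element comparisons
        if len(items) <= 1:
            return items, 0
        mid = len(items) // 2
        left, cl = merge_sort_comparisons(items[:mid])
        right, cr = merge_sort_comparisons(items[mid:])
        merged, count, i, j = [], cl + cr, 0, 0
        while i < len(left) and j < len(right):
            count += 1
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, count

    for n in (1_000, 2_000, 4_000):
        data = [random.random() for _ in range(n)]
        _, c = merge_sort_comparisons(data)
        print("n =", n, " bubble:", bubble_comparisons(n),
              " merge:", c, " (n*log2(n) =", int(n * math.log2(n)), ")")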

1

I would also recommend the use of profiling tools. Each language has its own tools, and there are generic ones as well.

Some IDEs have a built-in profiler, but they all produce a report on the program's performance.
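For instance, a sketch using Python's built-in cProfile and pstats modules, which produce exactly that kind of report (time and call count per function); the functions profiled here are arbitrary placeholders:

    import cProfile
    import pstats
    import random

    def build_data(n):
        return [random.random() for _ in range(n)]

    def sort_data(data):
        return sorted(data)

    def search_many(data, queries):
        return sum(1 for q in queries if q in data)

    def main():
        data = build_data(50_000)
        sort_data(data)
        search_many(data, build_data(200))

    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()

    # Report: cumulative time and number of calls per function
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)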

1

Speedup calculation:

SU(p) = Ts / Tp(p)

SU = speedup
Ts = serial (sequential) execution time
Tp(p) = parallel execution time with p processes
p = number of processes

Calculating efficiency: E(p) = SU(p) / p

If SU > 1, the parallel version reduced the runtime (it is faster than the sequential one). If SU < 1, the parallel version increased the runtime (it is slower than the sequential one).
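A small worked example of these formulas in Python (the measured times are hypothetical):

    def speedup(ts, tp):
        # SU(p) = Ts / Tp(p)
        return ts / tp

    def efficiency(ts, tp, p):
        # E(p) = SU(p) / p
        return speedup(ts, tp) / p

    # Hypothetical measurements: 12 s sequentially, 4 s with 4 processes
    ts, tp, p = 12.0, 4.0, 4
    print("SU =", speedup(ts, tp))        # 3.0  -> three times faster
    print("E  =", efficiency(ts, tp, p))  # 0.75 -> 75% of the ideal linear speedup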
