How to compare relative run time of fast algorithms?

@Sorack asked the following questions about performance:

So far, there are two answers to the first question (@Maniero’s answer and @Isac’s answer, which has been deleted) and one answer to the second question (my answer).

In the answers from @Maniero and @Isac, I noticed that, in addition to the running time of the algorithm itself, they were also measuring the overhead of maintaining the loops. In my own answer, I know there is a measurement problem caused by the lambda call.

In normal situations, where the running time of the algorithm under test t_i is much longer than the loop maintenance time l_i (t_i >> l_i), the impact of l_i tends to be negligible. But in the case in question, the times are possibly comparable (l_i ~= t_i).

I suggested two measures to @Maniero to try to counteract the impact of l_i, but he said that neither of them worked, although his experience was on other platforms. I think it’s safe to extrapolate and say that they don’t work for Java either.

Out of curiosity, the suggestions were:

  1. Run the loop calling a no-op function
  2. Make the time measurement as granular as possible, taking it just before and just after executing the method
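Suggestion 1 can be sketched as follows: time the same loop calling a no-op method, then subtract that baseline so the per-iteration loop and call overhead roughly cancels out. The method names here are hypothetical placeholders, and, as the answer below explains, the JIT may inline and eliminate the no-op entirely, which is one reason the suggestion can fail in practice.

```java
// Sketch of suggestion 1: subtract a no-op baseline from the measured loop.
// noop() and real() are hypothetical stand-ins, not code from the question.
public class NoopBaseline {
    static int noop(int x) { return x; }          // hypothetical no-op
    static int real(int x) { return x * 31 + 7; } // placeholder algorithm

    public static void main(String[] args) {
        final int n = 5_000_000;
        long sink = 0; // consume results so the work is not dead code

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sink += noop(i);
        long base = System.nanoTime() - t0;       // loop + call overhead

        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) sink += real(i);
        long full = System.nanoTime() - t1;       // overhead + algorithm

        System.out.println("estimated algorithm time (ns): " + (full - base));
        System.out.println("sink=" + sink);
    }
}
```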

So my question is: in Java, how can one make statistically meaningful measurements when the running time of the algorithm is comparable to the maintenance time of the loop?

PS: I accept pointers for other languages, but the focus here is Java.

  • Just to note that I think you can measure reasonably well whether it is just the loop overhead. Of course, you have to run more than one test to understand how the progression behaves and to get a sense of how large the interference is. There must be some tricks you can use to estimate the interference, even if not precisely.

  • You can run tests that measure the actual execution with profilers and other low-level tools. Even making sure that an empty loop is not optimized away helps, but it guarantees nothing, because the actual execution ends up being different: the problem is that the combination of code affects performance in ways that are hard to analyze. It is something of a butterfly effect; it does not look like it, but small details change everything.

1 answer



The only way to eliminate the time consumed by the loop is to eliminate the loop itself and run only the desired code. The problem with this is that, since the algorithm under test is usually very fast, it is hard to measure its running time accurately.

Since a performance test usually aims to compare different algorithms, and a loop is used in both tests, its interference is negligible. There are several other things that interfere far more with the final result than the loop does.

Testing the performance of small code snippets is extremely complicated, since several factors interfere with the final result. The JVM, the OS, and the hardware can all perform optimizations that are only possible when the code snippet is tested in isolation, but not when that snippet is part of a larger system. Other software running in the same environment at the time of testing may also interfere with performance.

In the answer from one of the links you posted, for example, since the variable resultado is not used for anything (it is only being incremented), it is quite possible that the JVM treats it as dead code, performs optimizations, and never executes that part of the code.
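A common workaround, which JMH's Blackhole performs more robustly, is to make the result observable so the JIT cannot prove the work is dead. A minimal sketch, reusing the resultado variable name from the linked answer:

```java
// Sketch: keep computed values "alive" so the JIT cannot prove the
// loop body is dead code and elide it.
public class AvoidDeadCode {
    static int resultado; // variable name taken from the linked answer

    static long measured() {
        long sink = 0;
        for (int i = 0; i < 1_000; i++) {
            resultado = i + 1; // if resultado were never read, this could be elided
            sink += resultado; // reading it forces the work to actually happen
        }
        return sink;           // returning the sink makes the result observable
    }

    public static void main(String[] args) {
        System.out.println(measured()); // prints 500500
    }
}
```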

Another problem that should be taken into account is the JVM warm-up (JVM Warmup). Since the JVM loads some classes lazily, the loading of certain classes could happen inside the interval where the time is being measured, interfering with the final result.
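A crude manual mitigation is to run the code under test many times before the measured interval, so that class loading and early JIT compilation happen up front. A sketch, with an arbitrary warm-up count:

```java
// Sketch: discard a warm-up phase so class loading and early JIT
// compilation do not fall inside the measured interval.
public class WarmupSketch {
    static long work(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) s += i; // placeholder workload
        return s;
    }

    public static void main(String[] args) {
        // Warm-up: results discarded; triggers class loading and JIT tiers.
        long discard = 0;
        for (int i = 0; i < 20_000; i++) discard += work(100);

        long t0 = System.nanoTime();
        long r = work(100);
        long elapsed = System.nanoTime() - t0;

        System.out.println("result=" + r + " elapsed(ns)=" + elapsed
                + " discard=" + discard);
    }
}
```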

In addition, the code is not necessarily executed in the order in which it is written, so there is no guarantee that the algorithm actually runs between the two time measurements.

To escape these and other microbenchmarking pitfalls, thorough knowledge of the implementation of the JVM on which the code runs is required. Fortunately, there are tools that take these and other problems into account and eliminate, or at least significantly reduce, their impact on the final result.

Two of them are:

  • Caliper is very interesting. From what I understood (from the examples I have looked at so far), every method annotated with @Benchmark takes an integer parameter indicating the number of repetitions. However, it does not offer an obvious way, in my view, to discount the impact of managing the loop. As for the JMH documentation, I still need to actually read it, but from its description it seems well suited to the proposed goal, so it looks promising.

  • I had to read the JMH examples to get to this link: http://hg.openjdk.java.net/code-tools/jmh/file/1ddf31f810a3/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_11_Loops.java#L48
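For reference, a minimal JMH benchmark might look like the sketch below. This is an assumption about typical JMH usage, not code from the question; it requires the org.openjdk.jmh dependency and runs under the JMH harness rather than through a plain main method.

```java
// Sketch of a JMH benchmark. The class name and workload are
// illustrative placeholders; JMH supplies the measurement loop.
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.AverageTime)       // report average time per operation
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5)                // JMH handles JVM warm-up for you
@Measurement(iterations = 5)
@Fork(1)                               // fresh JVM to isolate the measurement
@State(Scope.Thread)
public class MyBenchmark {

    @Benchmark
    public int somethingFast() {
        // placeholder for the fast algorithm under test; returning the
        // result keeps it from being eliminated as dead code
        return 31 * 42 + 7;
    }
}
```

JMH times each invocation of the @Benchmark method itself, so there is no hand-written loop whose maintenance cost needs to be discounted; the JMHSample_11_Loops example linked above explains why manual loops inside benchmarks are discouraged.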
