Performance in Java repeat loops

Asked

Viewed 1,830 times

2

What is the difference in performance between the three types of ties highlighted below.

List<Foo> list = new ArrayList<Foo>();

for (int i = 0; i < list.size(); i++) {
    //CODE
}

for (Foo foo : list) {
    //CODE
}

Iterator<Foo> iterator = list.iterator();
while (iterator.hasNext()) {
    //CODE
}

2 answers

5


First of all we must consider that many performance tests are wrong. Make a micribenchmark is not so simple when executing the three codes by several internal and external factors:

  • Other programs running in the background may interfere differently in testing.
  • With multiple tests in one run, one can affect the others. For example, the first test may have instantiated elements and the Garbage Collector runs only in the second test.
  • The test may not reflect a real use context. In your example, you do not include access to the element. As Ricardo’s answer has already mentioned, the neck of a noose for or while used to browse a list (List) may be in the method get(), that checks the accessed index and then needs to recover the element of the internal vector. However, I cannot agree with his reply, because the Iterator also uses internally the method get().
  • JVM performs optimizations both in bytecode and during execution, and may vary depending on the version.
  • The result varies greatly for different types of Collections. This answer in the SOEN says there are no significant differences.

Your test is much like this benchmark. Without access to the elements, the result obtained by the author was:

For each loop :: 110 ms

Using Collection.size() :: 37 ms

Using [int size = list.size(); int j = 0; j < size ; j++] :: 4 ms

Using [int j = list.size(); j > size ; j--] :: 1 ms

In summary, a loop decreasing the variable is the fastest in this scenario and this is probably due to the fact that comparing a variable to zero is more efficient.

The website Mkiong also produced a benchmark. The result:

Performance Test Result

However, this was much questioned. View comments to see that different people got different results on different platforms and with minor code modifications.

When considering the amount of code executed, the Iterator really seems to be the least efficient. See the methods code next() and hasNext() of Iterator used in the ArrayList, extracted from JDK 6:

public E next() {
        checkForComodification();
    try {
    E next = get(cursor);
    lastRet = cursor++;
    return next;
    } catch (IndexOutOfBoundsException e) {
    checkForComodification();
    throw new NoSuchElementException();
    }
}

public boolean hasNext() {
        return cursor != size();
}

Note that the methods size() and get() from the list are called anyway. There is no magic to retrieve the next element. In addition, a check is made whether the list has been modified by another method. It is good that this is done only by comparing two integers, but the call to the method and the comparison end up having some impact.

Finally, through a static analysis, we can affirm that a bond for using an auxiliary variable (i) and a fixed limit value, that is, without calling size() with each iteration is more efficient.

In practice, however, the scenario can change dramatically if the programmer gets "lazy" to assign the element to a variable. Who has never seen something like the example below?

for (int i = 0; i < list.size(); i++) {
    if (list.get(i) != null) {
        list.get(i).metodo();
    }
}

Another factor that can change a lot is if we don’t use the ArrayList. The LinkedList, for example, it has complexity O(n) to find any element, as it needs to go through the elements one by one to find the desired element.

And still thinking about the quality of a software as a whole and not just in the perspective of performance, links for..each has the great advantage of being more "clean" (less typing, easier understanding) and less prone to errors (limit overflow, start counter at 1 instead of zero and so on).

Another important observation, thinking about the performance issue, is that the biggest bottlenecks are in random access to data and in the definition of the data structure itself. Although some consider the concern with these details to be exaggerated, the truth is that it makes a lot of difference to choose the appropriate structures to store the data. The concept that memory and CPU are "cheap" falls apart as we begin to deal with more than a few thousand records, for example. The choice of commands to traverse the defined structure comes as a consequence of decisions already made.

To conclusion is that there is no definitely more efficient bond for all cases. The most important emphasis is on defining an appropriate data structure for each case, considering memory consumption and the complexity of access to a specific element of that structure.

2

Use the Iterator and the "foreach" will not cause any performance problems, since they have similar speeds while walking Collections. The same cannot be said for the for normal. The problem is in index access get(i) for for which degrades the loop speed.

In the ONLY in English has a good and thorough answer on this. In addition, the different types of implementations of the List and their respective uses can cause performance improvement (or worsening), as is the case of ArrayList or LinkedList.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.