What defines a stable sorting algorithm?

23

There are several well-known ways to sort the data in a collection, for example Bubble Sort, Insertion Sort and Selection Sort.

I have heard that some algorithms are stable and others are not. What defines a stable sorting algorithm? Are any of the three in the example stable?

If possible, I would like an example of a stable sorting algorithm, with code that is easy to read (pseudocode, Python or C#).

  • 5

    Now that is a very complicated topic: data ordering.

2 answers

18

A sorting algorithm is considered stable when it preserves the relative order of records with equal keys. In other words, records that share a key appear in the sorted sequence in the same order as in the initial sequence. As an example, take this sequence of numbers (keys) with attached letters (records):

3[a], 2[b], 2[c], 1[d]

A stable algorithm must produce:

1[d], 2[b], 2[c], 3[a]

An unstable algorithm may change the order of the records attached to equal keys, for example: 1[d], 2[c], 2[b], 3[a]
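As a minimal sketch of the example above in Python: the built-in sorted() is documented as stable, so sorting by the number alone keeps 2[b] ahead of 2[c]. The (key, record) tuple layout and the variable names are only illustrative choices.

```python
# Records from the example above, stored as (key, record) pairs.
records = [(3, "a"), (2, "b"), (2, "c"), (1, "d")]

# sorted() is guaranteed to be stable, so comparing only the key
# leaves records with equal keys in their input order.
stable_result = sorted(records, key=lambda pair: pair[0])

print(stable_result)  # [(1, 'd'), (2, 'b'), (2, 'c'), (3, 'a')]
```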

An algorithm is stable when elements with the same value appear in the output array in the same order as in the input array. This property matters when satellite data accompanies each element and must be carried along with it. Counting sort is stable because it walks the input array backwards while using the cumulative counts to place each element in the resulting array. Maintaining this stability is what requires the algorithm to use an auxiliary output array. If stability were not required, the algorithm could work on the initial array itself, using less memory.
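A rough Python sketch of a stable counting sort may make the role of the auxiliary array clearer. The function name counting_sort, the max_key parameter and the (key, record) tuples are assumptions made for this illustration, and keys are assumed to be integers from 0 to max_key.

```python
def counting_sort(records, max_key):
    """Stable counting sort of (key, record) pairs with keys in 0..max_key."""
    counts = [0] * (max_key + 1)
    for key, _ in records:
        counts[key] += 1

    # Prefix sums: counts[k] becomes the number of elements with key <= k,
    # i.e. one past the last output slot reserved for key k.
    for k in range(1, max_key + 1):
        counts[k] += counts[k - 1]

    # Auxiliary output array; walking the input backwards fills each key's
    # slots from last to first, so equal keys keep their input order.
    output = [None] * len(records)
    for key, record in reversed(records):
        counts[key] -= 1
        output[counts[key]] = (key, record)
    return output


result = counting_sort([(3, "a"), (2, "b"), (2, "c"), (1, "d")], max_key=3)
print(result)  # [(1, 'd'), (2, 'b'), (2, 'c'), (3, 'a')]
```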

Examples:

Bubble Sort (stable)

for(i=n-1;i>0;i--)
      for(j=0;j<i;j++)
            // strict '>' never swaps equal keys, which keeps the sort stable
            if( v[j] > v[j+1])
                  swap(v[j], v[j+1]);

Insertion Sort (stable)

for(j=1; j<n; j++){
      chave = v[j];
      i = j-1;
      // strict '>' stops at the first equal key, so equal keys keep their order
      while(i >= 0 && v[i] > chave){
            v[i+1] = v[i];
            i--;
      }          
      v[i+1] = chave;
}

Quicksort (not stable)

#include <utility>  // std::swap

using namespace std;

int partition(int vec[], int left, int right) {
  int i, j;

  i = left;
  for (j = left + 1; j <= right; ++j) {
    if (vec[j] < vec[left]) {
      ++i;
      swap(vec[i], vec[j]);
    }
  }
  swap(vec[left], vec[i]);

  return i;
}

void quickSort(int vec[], int left, int right) {
  int r;

  if (right > left) {
    r = partition(vec, left, right);
    quickSort(vec, left, r - 1);
    quickSort(vec, r + 1, right);
  }
}

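Quicksort from stdlib.h (qsort; the C standard does not guarantee that it is stable)

#include <stdlib.h>
int compara(const void *pa , const void *pb){
                               int a = *(const int *)pa;
                               int b = *(const int *)pb;
                               return (a > b) - (a < b);  // avoids the overflow that a-b can cause
}
// call from inside a function, where v is an int array with n elements
qsort(v, n, sizeof(v[0]), compara);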

I have a Java implementation of Bubble Sort, Insertion Sort and Selection Sort.

import java.util.Arrays;

public class Main {

    public static void main(String[] args) {
        int[] vetor = { 3, 2, 2, 1 };
        // each method sorts in place, so give each one its own unsorted copy
        System.out.println(Arrays.toString(bubbleSort(vetor.clone())));
        System.out.println(Arrays.toString(insertionSort(vetor.clone())));
        System.out.println(Arrays.toString(selectionSort(vetor.clone())));
    }

    public static int[] bubbleSort(int vetor[]) {
        for (int i = vetor.length; i >= 1; i--) {
            for (int j = 1; j < i; j++) {
                if (vetor[j - 1] > vetor[j]) {
                    int aux = vetor[j];
                    vetor[j] = vetor[j - 1];
                    vetor[j - 1] = aux;
                }
            }
        }
        return vetor;
    }

    public static int[] insertionSort(int[] vetor) {
        for (int i = 0; i < vetor.length; i++) {
            int valor = vetor[i];
            int j = i - 1;
            // strict '>' is required for stability: '>=' would shift equal values past each other
            while (j >= 0 && vetor[j] > valor) {
                vetor[j + 1] = vetor[j];
                j--;
            }
            vetor[j + 1] = valor;
        }
        return vetor;
    }

    // Selection Sort is not stable: the swap below can move an element past an equal one
    public static int[] selectionSort(int[] vetor) {
        for (int i = 0; i < vetor.length; i++) {
            int indiceMinimo = i;
            for (int j = i + 1; j < vetor.length; j++) {
                if (vetor[j] < vetor[indiceMinimo]) {
                    indiceMinimo = j;
                }
            }

            int valor = vetor[indiceMinimo];
            vetor[indiceMinimo] = vetor[i];
            vetor[i] = valor;
        }
        return vetor;
    }

}

There are more examples at this link.

14


Stability matters when the collection as a whole contains repeated keys. Sorting is based on some key, which is the basic element for deciding the order. If two elements have the same key, a stable algorithm places the element that appeared first before the repeated element that appeared later, so the absolute position of the element effectively ends up being part of the key. An unstable algorithm does not guarantee the order of the repeated elements.

But you don’t need to use the position as the tiebreaker; it can be a second key, or a third key, and so on.

The exact mechanism that guarantees this property is not defined. In theory, any algorithm can be made to produce a stable result with a secondary support mechanism, but if you are going to do that, it is usually better to pick another algorithm that does what you want directly, because the extra mechanism adds processing cost. An example is Quicksort, one of the most popular algorithms: it is usually unstable, but there are versions of it that achieve stability at an extra cost.
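One possible "secondary support mechanism" is to attach each element's original position and use it as a tiebreaker. The sketch below is only an illustration of that idea in Python, using Selection Sort as the unstable algorithm; the function names selection_sort and stable_selection_sort are made up for this example.

```python
def selection_sort(items, key):
    """Plain selection sort; its long-distance swaps make it unstable."""
    items = list(items)
    for i in range(len(items)):
        smallest = i
        for j in range(i + 1, len(items)):
            if key(items[j]) < key(items[smallest]):
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items


def stable_selection_sort(items, key):
    """Same algorithm, with the original index added to the key as a tiebreaker."""
    decorated = list(enumerate(items))  # (original_index, item)
    ordered = selection_sort(decorated, key=lambda pair: (key(pair[1]), pair[0]))
    return [item for _, item in ordered]


records = [(2, "a"), (2, "b"), (1, "c")]
by_number = lambda record: record[0]

plain = selection_sort(records, by_number)
stabilized = stable_selection_sort(records, by_number)

print(plain)       # [(1, 'c'), (2, 'b'), (2, 'a')]  'b' and 'a' changed places
print(stabilized)  # [(1, 'c'), (2, 'a'), (2, 'b')]  input order preserved
```

The extra cost mentioned above shows up here as the additional memory for the indices and the longer comparisons on (key, index) pairs.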

It is common for an unstable algorithm to perform a little better than stable ones, at least in certain situations, so it may be a good choice when there is no requirement on the order of repeated elements. The difference can be critical depending on the key distribution. Think of a collection whose keys are all equal (and which is therefore already sorted): some algorithms can sort it in O(N), while others take the same time as if no key were repeated, or something in between. In the same way, some algorithms benefit from an already sorted collection while others do not. If it is guaranteed that there are no repeated keys, stability makes no difference to the choice either.

Wikipedia has a good example showing how order by position is preserved in a stable algorithm and lost in an unstable one:

Stable/unstable algorithm

Note that there was a tie-breaker by suit.

There is also a demonstration where the tiebreaker is a second key, so here the cards are ordered first by suit and then by number:

Algorithm with a composite key
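The composite-key idea can be sketched in Python with a few made-up cards (the card values below are not taken from the Wikipedia image). Because sorted() is stable, sorting by suit first and then by number gives the same result as a single sort on the composite key (number, suit).

```python
# Hypothetical cards as (number, suit) pairs.
cards = [(7, "spades"), (5, "hearts"), (2, "hearts"), (5, "spades")]

# Two passes with a stable sort: tiebreaker key first, main key last.
by_suit = sorted(cards, key=lambda card: card[1])
by_number_then_suit = sorted(by_suit, key=lambda card: card[0])

# One pass with a composite key.
composite = sorted(cards, key=lambda card: (card[0], card[1]))

print(by_number_then_suit)
# [(2, 'hearts'), (5, 'hearts'), (5, 'spades'), (7, 'spades')]
print(by_number_then_suit == composite)  # True
```

The two-pass version only works because the second pass is stable; with an unstable sort, the suit order established by the first pass could be lost among cards with equal numbers.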

The article also has a list of algorithms, classifying each one as stable or unstable.

An example of Quicksort, which is unstable, in C# can be seen on Rosetta Code.

An example of Merge sort, which is stable, in C# can be seen on Rosetta Code.

There you can also find examples in other languages and of other algorithms with the same property.

  • I love your answers @bigown. "An unstable algorithm does not guarantee the order of the repeated elements". Does this mean that the algorithm can be wrong? Can it give the wrong result?

  • 3

    @Thank you. I don’t remember the whole text, but I think it says something about that there. It is not a wrong result, it is an expected one; it would only be an error if it were unexpected. For this type of algorithm it is specified that, when repeated elements are found, the order of those repeats is not guaranteed. In a stable algorithm it is guaranteed that they will appear in the same order they originally had in the collection. This only applies to repeated elements. Some problems allow the repeated elements to appear in any order; others need them to keep the order they had before sorting.

  • @mustache is a Big Data, hold my +1
