Stable vs unstable sorting

What defines a stable sort algorithm?

The question above already discusses what stable and unstable sorting are, but I still do not understand the advantage of using an unstable sort.

In what cases can we use an unstable sort? Is it always preferable to use a stable one?

Depending on the data structure we use, do we need to be careful with unstable sorting?


A sorting algorithm is considered stable when it preserves the order of records with equal keys; in other words, when those records appear in the sorted sequence in the same order as in the initial sequence.


I did not understand this explanation: what would the records be? For example:

int vec[5]={4,2,5,1,7};

If I wanted to sort this array, what would it mean to preserve the order of the records?

  • 1

    In a stable sort, in case of a "tie" (two equal elements), the relative order between them is not changed. In your array {4,2,5,1,7} it makes no difference because there are no repeated elements. But suppose I have users with name and age, and I want to sort only by age: {(João,30),(Maria,25),(José,20),(Ana,25)}. With a stable algorithm, in case of a tie (Maria and Ana are the same age), the order between them is maintained, so the result would be {(José,20),(Maria,25),(Ana,25),(João,30)}: Maria was before Ana in the original array, and the stable algorithm maintained this order.

  • 1

    And I believe that in this case "records" means "elements". In your array, 4 is a record, 2 is another, and so on. In my example, each user is a record. Therefore, in a stable algorithm "the records (the elements/numbers/users) appear in the sorted sequence in the same order as in the initial sequence", but only in the case of a tie, that is, when there are "records with equal keys". The key is the value being considered in the sort: in your example it is the number itself, in my example it is the age. So "records with equal keys" are equal numbers or users of the same age.

  • I know that definition, but what would be wrong with swapping Maria and Ana? That's what I don't understand; I don't think it would be anything serious.

  • I believe this is already explained in Maniero's answer below and in the question you linked. In short, if keeping the original order is a requirement, use a stable algorithm. If it is not, choose whichever you think is best. It's not a matter of being "bad".

  • @hkotsubo Yes, I noticed afterwards that it was explained there. Thanks again.

1 answer

4


Here I go by Occam's razor: if all solutions give the same result, I choose the simplest, which may also be the most efficient.

In general, if you have two "José" and nothing else that differentiates them, it makes no difference which one comes first after sorting, and therefore it makes no difference which algorithm you use. But if the sorted list needs to respect the order in which the items entered the original list, then a stable algorithm is mandatory. Basically, the question to ask is whether the position in the original list is part of the tiebreaker or not. If that position is important information, you need a stable algorithm.
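
As an illustration of a stable sort, here is a minimal sketch in C using insertion sort on the users example from the comments. The `User` type and the function name `sort_users_by_age_stable` are hypothetical, chosen only for this example. Insertion sort is stable because an element is only moved past neighbors whose key is strictly greater.

```c
#include <stddef.h>

/* Hypothetical record type for illustration: a name and an age. */
typedef struct {
    const char *name;
    int age;
} User;

/* Insertion sort by age. It is stable because an element is shifted
   only past neighbors whose age is strictly greater, so records with
   equal ages never change their relative order. */
void sort_users_by_age_stable(User *users, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        User current = users[i];
        size_t j = i;
        while (j > 0 && users[j - 1].age > current.age) {
            users[j] = users[j - 1];
            j--;
        }
        users[j] = current;
    }
}
```

Called on {(João,30),(Maria,25),(José,20),(Ana,25)}, this keeps Maria ahead of Ana, since both are 25 and Maria came first in the input.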

Any unstable sorting algorithm can be made stable by modifying the sort key to include the original position of each element, provided that position is available.
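
A minimal sketch of that idea in C, assuming the standard library's `qsort`, which gives no stability guarantee. The `IndexedUser` type, its `position` field, and the function names are hypothetical, introduced only for illustration. The original index is stored in each record and used as the tiebreaker, so no two records ever compare as equal.

```c
#include <stdlib.h>

/* Hypothetical record: the sort key (age) plus the original position. */
typedef struct {
    const char *name;
    int age;
    size_t position; /* index in the original array, filled in before sorting */
} IndexedUser;

/* Compare by age first; on a tie, fall back to the original position. */
static int compare_age_then_position(const void *a, const void *b)
{
    const IndexedUser *x = a;
    const IndexedUser *y = b;
    if (x->age != y->age)
        return (x->age > y->age) - (x->age < y->age);
    return (x->position > y->position) - (x->position < y->position);
}

/* Records the original positions, then sorts with qsort. Because the
   position is part of the key, ties on age keep their input order. */
void sort_users_by_age_keep_order(IndexedUser *users, size_t n)
{
    for (size_t i = 0; i < n; i++)
        users[i].position = i;
    qsort(users, n, sizeof users[0], compare_age_then_position);
}
```

The cost of this approach is the extra field per record and one extra comparison on each tie.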

As always, it's a question of what guarantees you need and what trade-offs you accept. Some algorithms give up some efficiency to provide stability. But that is not a rule: depending on which algorithms you compare, the stable one may be more efficient than the unstable one.

It's not that you need to be careful with a certain data structure; you need to be careful with the requirement you actually have.

For example, a hash table has no defined element order, so it makes no sense to require stability if that is the source of the data.

Remember that instability can only show up in the event of a tie. If the structure guarantees unique keys, stability is guaranteed for any algorithm. So for the example cited it does not matter whether the algorithm is stable or not; after all, no two elements in it are tied. Read the linked question again, because you still don't understand what stability is.

I personally prefer a stable algorithm whenever the choice makes no other difference, although it is common for it to make some difference.

It may be that in the future you need the original order, even if you don't need it now; you would then have to redo the sort from the original data, if it is still available.

In addition, stable algorithms tend to have more predictable resource consumption.

  • Thank you very much. This sorting subject seems to have a lot to it. I usually choose the algorithm that is easiest to implement and performs well; as I said, I don't pay much attention to whether it is stable or not, so I will probably have to read more about this. Since all the sorting algorithms are already written, I copy and paste one and change only what is necessary, so implementing a QuickSort, for example, is quite easy.
