What are the differences between the implementations of the Set class in Dart?

The documentation says that a Set is "a collection of objects in which each object can occur only once."

In the documentation of this class there is a small summary of the different classes that implement Set:

Iterating over the elements of a set may be either unordered or ordered in some way. Examples:

A HashSet is unordered, which means that its iteration order is unspecified; LinkedHashSet iterates in the insertion order of its elements; and a sorted set like SplayTreeSet iterates the elements in sorted order.

Roughly speaking, the difference seems to come down to the order in which the elements are iterated.

What is the practical difference between them, and when is each one the best choice?

Based on this question

1 answer

I don't know Dart's specific implementations, but I know what these structures are in general. The linked question from the English Stack Overflow (SOen) has little to do with this one, because it is about maps and this one is about sets. It still helps to explain the reasons for the differences, because I imagine that what changes from there to here is the uniqueness of the value.

They are all sets, that is, collections of data in which no value is repeated. The difference between them is the internal implementation, so each has different trade-offs in the performance of each kind of operation and even in memory consumption, and each can be more appropriate than the others in a given scenario. All of them store the same things and can do essentially the same things (with some exceptions).
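To make the "same things, different internals" point concrete, here is a minimal sketch in which one function works against any of the three implementations through the common `Set` interface. The helper name `countUnique` is made up for this illustration.

```dart
import 'dart:collection';

/// Hypothetical helper: adds the same values to any Set implementation
/// and returns how many distinct values it ended up holding.
int countUnique(Set<int> target) {
  for (final value in [3, 1, 3, 2, 1]) {
    // add() returns false (and changes nothing) when the value is already present.
    target.add(value);
  }
  return target.length;
}

void main() {
  print(countUnique(HashSet<int>())); // 3
  print(countUnique(LinkedHashSet<int>())); // 3
  print(countUnique(SplayTreeSet<int>())); // 3
}
```

Whichever class is passed in, the duplicates are dropped; only iteration order and cost differ.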

Because I don't know the internal implementation, I'm going to speak on assumptions; but if a structure is named after something universally known and actually implements something else, the technology is very bad.

HashSet

It's basically a hash table that ensures values do not repeat. No hash table can by itself guarantee any order of the data, whether by key or by value. Its operations (adding, removing, checking membership) are O(1) on average, that is, constant: they take the same time no matter the size of the structure. Just remember that having no order means you cannot access the data directly by position.
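A typical use that follows from this is membership checking where order does not matter. The sketch below uses a `HashSet` only to answer "have I seen this before?" and keeps the order in a separate list; the function name `firstOccurrences` is an assumption for the example, not anything from the Dart libraries.

```dart
import 'dart:collection';

/// Hypothetical helper: returns each word once, in the order it first appeared.
/// The HashSet is used only for fast membership checks; it is never iterated,
/// so its unspecified iteration order does not matter here.
List<String> firstOccurrences(Iterable<String> words) {
  final seen = HashSet<String>();
  final result = <String>[];
  for (final word in words) {
    // add() returns true only the first time a value is inserted.
    if (seen.add(word)) {
      result.add(word);
    }
  }
  return result;
}

void main() {
  print(firstOccurrences(['a', 'b', 'a', 'c', 'b'])); // [a, b, c]
}
```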

LinkedHashSet

The name of this one is questionable (and so is the "hash" in the previous one, because it exposes the implementation detail used to guarantee a certain feature). This is bad because afterwards either the implementation cannot change, or it becomes strange to call it one thing when it is another. In fact, I find the team that takes care of Dart very weak; in every answer I write I see a problem. It is not as bad as PHP, but something that started recently should not make the same mistakes.

From what I can understand, it's a structure that maintains two internal structures: a hash table, so it keeps the same characteristics as the structure above, and a linked list, which lets it provide an order (the insertion order). The downside is that it occupies more memory by having two internal structures.
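The insertion-order behavior can be seen in a small sketch like the one below. The function name `visitOrder` and the sample values are invented for the illustration.

```dart
import 'dart:collection';

/// Hypothetical example: shows how insertion order is kept and updated.
List<String> visitOrder() {
  final pages = LinkedHashSet<String>();
  pages.addAll(['home', 'docs', 'blog']);

  // Adding a value that is already present does not move it.
  pages.add('home');

  // Removing a value and adding it again places it at the end.
  pages.remove('docs');
  pages.add('docs');

  return pages.toList();
}

void main() {
  print(visitOrder()); // [home, blog, docs]
}
```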

The algorithmic complexity depends on some implementation details. Under normal, simple conditions it is the same as for the previous structure, i.e., O(1), but there can be cases that are O(N), since some operations can only be done correctly on the linked list (e.g., insertion in the middle of the list).

I have my doubts about whether it should use an array instead of a linked list. I can't say for sure, because I haven't seen the arguments for why they adopted this, but it looks like another mistake.

SplayTreeSet

As you can imagine, it implements a structure called a splay tree, which is a tree arranged in a specific way to give better performance on the items that are accessed most frequently. The documentation is terrible, and the linked answer just copied what is on Wikipedia without understanding what it is. I imagine it gives you sorted data when you want it (read what I linked earlier to understand the differences); if it did not provide this, the structure would be very wrong. They could have opted for any other tree, and all of them offer virtually every operation in logarithmic complexity, O(log N). This one was chosen because it allows an optimization that makes the most frequently accessed items faster to reach. As far as I know the gain is negligible, and many people question whether it is worth the additional effort, not least because it has some trade-offs of its own.
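As a rough illustration of the sorted iteration described in the documentation quoted in the question, the sketch below builds one `SplayTreeSet` with the default ordering and one with a custom comparison function. The helper names `sortedUnique` and `byLength` are assumptions made for this example.

```dart
import 'dart:collection';

/// Hypothetical helper: distinct values in ascending order.
List<int> sortedUnique(Iterable<int> values) =>
    SplayTreeSet<int>.of(values).toList();

/// Hypothetical helper: distinct words ordered by length, then alphabetically.
/// The alphabetical tie-break matters: if the comparison returned 0 for two
/// different words of the same length, the set would treat them as duplicates.
List<String> byLength(Iterable<String> words) {
  final ordered = SplayTreeSet<String>((a, b) {
    final lengthOrder = a.length.compareTo(b.length);
    return lengthOrder != 0 ? lengthOrder : a.compareTo(b);
  });
  ordered.addAll(words);
  return ordered.toList();
}

void main() {
  print(sortedUnique([5, 1, 4, 1, 3])); // [1, 3, 4, 5]
  print(byLength(['pear', 'fig', 'apple', 'kiwi'])); // [fig, kiwi, pear, apple]
}
```

This is the case where the extra cost of a tree pays off: the data comes out sorted without a separate sort step.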

Again it’s a shame the name says how it’s implemented internally.

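From what I read, the defaults, that is, what Map and Set give you when you don't choose an implementation (the linked versions, LinkedHashMap and LinkedHashSet), use the data structure that is most expensive in memory, and the difference is not small. For me, that is another mistake.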
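A quick way to check the default, as a sketch: a set literal keeps insertion order, and the type test below is expected to report it as a `LinkedHashSet`. The function names are invented for the example; choosing `HashSet` explicitly is how you would opt out of the ordered default.

```dart
import 'dart:collection';

/// Hypothetical check: is a plain set literal a LinkedHashSet?
bool literalIsLinked() => <int>{} is LinkedHashSet<int>;

/// Hypothetical check: a set literal iterates in insertion order.
List<int> literalOrder() => <int>{3, 1, 2}.toList();

void main() {
  print(literalIsLinked()); // true
  print(literalOrder()); // [3, 1, 2]

  // Opting out of the default: ask for the unordered implementation by name.
  final unordered = HashSet<int>.of([3, 1, 2]);
  print(unordered.length); // 3
}
```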

I imagine null values are not accepted here, or at most a single null element is accepted. The documentation says nothing about it, so one would have to investigate the internals.
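One way to investigate without reading the internals is to try it. The sketch below assumes a null-safe Dart version and a nullable element type (`int?`), and only covers the two hash-based sets; SplayTreeSet is left out, since a sorted set would need a comparison function that can handle null. The helper name `nullCount` is made up.

```dart
import 'dart:collection';

/// Hypothetical helper: tries to add null twice and reports how many
/// null elements the set actually holds afterwards.
int nullCount(Set<int?> target) {
  target.add(null);
  target.add(null); // second add is a duplicate and is ignored
  target.add(1);
  return target.where((element) => element == null).length;
}

void main() {
  print(nullCount(HashSet<int?>())); // 1
  print(nullCount(LinkedHashSet<int?>())); // 1
}
```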

So really the difference has to do with ordering and sorting, in addition to the performance trade-offs and memory consumption, which were not mentioned in the SOen answer. The map has the same trade-offs but accepts repeated values.
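To spell out what "accepts repeated values" means for a map, here is a small sketch: keys are still unique, exactly like the elements of a set, but different keys may hold the same value. The function name `buildStock` and the data are invented for the illustration.

```dart
import 'dart:collection';

/// Hypothetical example: unique keys, repeated values.
Map<String, int> buildStock() {
  final stock = HashMap<String, int>();
  stock['apple'] = 1;
  stock['pear'] = 2;
  stock['fig'] = 2; // same value under a different key: kept
  stock['apple'] = 2; // same key again: replaces the old value
  return stock;
}

void main() {
  final stock = buildStock();
  print(stock.length); // 3
  print(stock.values.toSet().length); // 1
}
```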
