6
I use a lot of Python dictionaries:
dic = dict() #ou
dic = {}
dic['113654'] = {'nome': 'João', 'idade': 21}
dic['853872'] = {'nome': 'Maria', 'idade': 27}
dic['427845'] = {'nome': 'Fernando', 'idade': 33}
dic['720720'] = {'nome': 'Lucas', 'idade': 16}
They make it very easy to check if a key exists and take its value:
cod = '853872'
if cod in dic:
return dic[cod]
My question is: is using dictionaries that way efficient? Python dictionaries implement an efficient way to find keys (hash, tree, etc.)? Or when my dictionary gets too big you should use a more traditionally efficient data structure like a tree or a hash for example?
Sources that support this answer (and a reference to those who want to go deeper in this subject) are: https://wiki.python.org/moin/TimeComplexity (dictionary search complexity and other structures) and https://wiki.python.org/moin/DictionaryKeys (shows how the dictionary works and how it uses
hash
).– AlexCiuffa
If it is basically a hash already solves my problem. Thank you very much!
– Jônatas Trabuco Belotti
@Alexciuffa excellent links!
– Jônatas Trabuco Belotti
About not being ordered - the Python dictionaries code was refactored in version 3.6, and from 3.7 it was made official that dictionaries respect the key insertion order - (like Collections.Ordereddict, which now goes to the museum)
– jsbueno
@jsbueno so that means he is no longer a hash and yes a tree or something similar, and therefore no longer has to be O(1), can be O(logN) or O(N), which would be terrible and I doubt that it is, or even to be O(1) would have two structures, one to ensure order and one to give to O(1), which would occupy a lot of space.
– Maniero
is a hash, but auxiliary structures, parallel to hash records, allow the insertion order to be retrieved. I didn’t see the code in C as it was, but it should be very interesting to have a look. The insertion and lookup remain O(1). There’s some extra logic in removing keys - so I don’t know if you’ve changed the time.
– jsbueno
@jsbueno So you’re confirming that there’s an extra structure to maintain order and take up a lot more space? You have references to all this?
– Maniero
"Too much space" in the second structure can be as compact as a link to the next hash in each input - 8 bytes per record more - is not so much, compared to the typical size of a Python object that will be in "value" (from the order of tens of bytes anyway).
– jsbueno
It’s really hard to find the reference - the interesting thing is that even with this extra data, as it was in a complete refactoring that created a new layout in memory for the data , even so the new dictionaries are 20% smaller than the old ones, on average - I think the Issue presenting the new code centralizes the discussions: https://bugs.python.org/issue27350 here is the source at "high level" https://morepy.blogspot.com/2015/01/faster-more-memory-efficient-and-morehtml.
– jsbueno
Yeah, Python already has a big space waste of course so it won’t make so much difference and if there was already waste for another reason that now solved it can get comparatively interesting, although it’s not as efficient as possible if you disregard the specific comparison. Must be what in C# is
KeydCollection
https://answall.com/q/56001/101.– Maniero