What is Hashcode and what is its purpose?

Asked

Viewed 14,872 times

13

I have observed that the use of hashcode is used in comparisons, but what does the term mean? Its use in programming is specific in comparison?

  • Specifically for functional programming I do not know, but there is a possible duplicate on this subject (in Java): http://answall.com/questions/11108/qual-a-import%C3%A2ncia-de-implement-o%C3%A9todo-hashcode-in-java

1 answer

14


The term hash "usually" is translated by scattering. Some people may say it’s shredding (the die until it’s a little bit).

Usages

Table hash

The main purpose is to use in key tables hash that need to organize the data in a simple numerical form.

A table hash can locate an element almost always in complexity O(1), that is, no matter the size of the table it always finds at the same time. This is the same as with a array.

How can he put several data that are not numbers, or at least are not simple and sequential numbers and remain so fast?

It takes the real die by applying a formula to find out which number that data corresponds to. This calculation is done in what we call hash Function. A good function tries to prevent two different data from generating the same result. Although it is likely that it does not generate repeats, it happens, especially in large volumes of data. When this occurs, the complexity goes to O(m), where m is the amount of repetitions. As there is not much repetition, we say that the complexity of a table hash is almost O(1). Only it is not 1 when it has repetition. In theory in an extreme case O(1) could turn O(n), ie, having to analyze everything. In practice that doesn’t happen, unless you make this joke:

function getHashCode() { return 42; }

Then all keys will generate the same result and only have repetition in the table.

The hash code can only be calculated at the time you need it, or can be stored on the object by swapping processing time for occupied space.

If an object cannot have a way to calculate its hash reliably, it cannot be used in a table of hash. Of course every object can be hasheable, but whether this will be well organized or not depends on the language and the programmer.

Languages like Java and C# force every object to have a function that calculates the hash. Whether or not to store (cache) is each object’s problem. Some people say that this is an error and that it should have an interface Hashable to use on objects that can make good use of the code. Others say that since virtually every object can be used as a key, then it is better to oblige at all.

An object can use the default implementation or make its own. Often it has to do itself using the parts of the object that are relevant to identify the object.

The comparison is not so good, but let’s say that it is the RG of the object (in Brazil, the identity document of the person), which may even have repeated because the states control it. It is true that the complete identity of the RG object would be its number and issuing body, there would be unique.

switch

Some languages use this to use more complex structures in case of switch.

The case needs to be fast, can not make complex comparisons, needs to have an easy localization pattern. It closely resembles a table hash. Using the hash code of string can make the switch more powerful and accept a type that is very useful in a multiple selection.

Integrity

In some cases the code can be recorded or transmitted together to verify the integrity of the data. Then he figures it out again and see if it matches what was transmitted to see if there was any corruption. It’s not a very reliable way, but it can be used if you know what you’re doing and can handle some possible error.

Comparisons

Other uses can be done perhaps to optimize some algorithm. The code hash can serve to identify that something is different more quickly than looking at a huge object. It together with a piece of the object can almost ensure oneness. It alone is not so useful for this. Not that this is usually used in practice.

In general it is used for comparisons, but nothing prevents it from being used for other things. It can be creative.

Other types

Not to be confused with other types of hash code, usually huge ones that are used to sign or authenticate objects. This one is very small, usually 4 bytes.

Extra

I don’t see why it would be different in functional programming, unless there may be optimizations of the Pattern matching as quoted on the switch.

What I can say is that in functional programming objects tend to be more conducive to table use hash since they are immutable. This is an important feature of the object to use in scattering table.

Example of implementation

public override int GetHashCode() {
    unchecked {
        int hash = 17;
        hash *= 23 + field1.GetHashCode(); //cada um tem sua própria fórmula
        hash *= 23 + field2.GetHashCode();
        hash *= 23 + field3.GetHashCode();
        return hash;
    }
}

I put in the Github for future reference.

A int can have your code as the object itself. Indeed it is so it’s done (no . NET Core).

Some people prefer the fact 31 others prefer 16777619 and base 2166136261.

Documentation of the C#.

  • Why is this multiplication done? Why can’t we just return the sum of the gethashcode of the fields?

  • @Danover I think it is best explained in https://answall.com/q/353963/101. I need to see if I write a better answer in https://answall.com/q/356435/101.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.