How does the hash table (called a dictionary in Python) relate to the cryptographic hash function?

8

I would like to understand how the cryptographic 'hash' function (used to protect passwords, for example) relates to the key-value 'hash' in programming (known as a 'dictionary' in Python, for example).

  • 2

    From what I understood of the answers, a hash is the result of a calculation or algorithm applied to a value, and it must always produce the same result for the same value. In a dictionary, it generates an identifier that determines the address of an item. For a password, it generates a value that makes it unnecessary to know or store the original password and makes that password hard to determine.

3 answers

10


A hash function, in general, is a function that takes data of arbitrary size and transforms it into an alphanumeric value.

As you noticed, hash functions are used in different contexts within computing. Each context requires the hash function to obey (or not) certain properties.

Among these properties are determinism, a defined output range, uniformity, non-invertibility and collision handling.

  1. Determinism

A hash function should always generate the same value for an input. Thus, the hash function gets very close to the mathematical function model.

Python does not fully obey this property across executions. Since Python 3.3, the interpreter by default generates a random seed at startup that is used when hashing strings and bytes (it can be fixed with the PYTHONHASHSEED environment variable), so the built-in hash is deterministic only within a single execution. Keep this in mind if you want to work with persistence (writing to disk): the values saved in one execution for a given input will differ from the values generated in a new execution.
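A minimal sketch of that difference, assuming CPython 3.3 or later with hash randomization left at its default. The helper name `stable_hash` is hypothetical, and `hashlib.sha256` is used only because its output depends on nothing but the input bytes.

```python
import hashlib

def stable_hash(text: str) -> str:
    """Digest that is identical across runs, unlike the built-in hash() for str."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

key = "example"

# Within a single process the built-in hash is deterministic...
assert hash(key) == hash(key)
# ...but its value may differ the next time the interpreter starts,
# so it should not be written to disk and compared later.
print("built-in hash (this run only):", hash(key))

# A hashlib digest depends only on the input bytes.
print("sha256 digest (every run):", stable_hash(key))
```

Running the script twice should print a different first line each time (unless PYTHONHASHSEED is set) and the same second line.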

  2. Intervals

Some applications require the hash function to generate values within a fixed numerical range. An example is the cryptographic hash algorithm SHA-1, which always generates a 160-bit value.

Others require the range to be dynamic. The Python dictionary, which uses the value generated by the hash function to compute an index into an array, expands that array as new key-value pairs are inserted.
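Both cases can be sketched briefly. The helper `bucket_index` is hypothetical and deliberately simplified: CPython's real dictionary uses a bit mask and a probing sequence rather than a plain modulo, so treat this only as an illustration of the idea.

```python
import hashlib

# Fixed range: SHA-1 always yields 160 bits (20 bytes), whatever the input size.
short = hashlib.sha1(b"a").digest()
long_ = hashlib.sha1(b"a" * 10_000).digest()
print(len(short) * 8, len(long_) * 8)

def bucket_index(key, capacity: int) -> int:
    """Reduce a hash value to an index into an array of the given capacity."""
    return hash(key) % capacity

# Dynamic range: the same key is mapped into a larger index space
# as the underlying array grows.
for capacity in (8, 16, 32):
    print(capacity, bucket_index(42, capacity))
```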

  3. Uniformity

Hash functions with a defined range should ensure that each value in the range has roughly equal probability of being generated. The reason is that two different inputs may generate the same value (a collision), and a uniform distribution keeps collisions as rare as possible. Collisions are costly to handle. Depending on the application, they don’t even need to be handled.
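To make the effect of uniformity concrete, here is a small experiment with made-up keys. Both hash functions (`poor_hash`, `better_hash`) are hypothetical helpers written for this sketch, and the choice of 16 buckets is arbitrary.

```python
import hashlib
from collections import Counter

BUCKETS = 16
keys = [f"key{i}" for i in range(1000)]

def poor_hash(key: str) -> int:
    """Deliberately bad: only looks at the first character."""
    return ord(key[0])

def better_hash(key: str) -> int:
    """Uses the first 8 bytes of a SHA-256 digest as an integer."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def max_load(hash_fn) -> int:
    """Size of the fullest bucket once every key has been placed."""
    counts = Counter(hash_fn(k) % BUCKETS for k in keys)
    return max(counts.values())

print("poor hash, fullest bucket:", max_load(poor_hash))
print("better hash, fullest bucket:", max_load(better_hash))
```

With the poor function every key collides in a single bucket, while the better one spreads the keys across the buckets far more evenly.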

  4. Invertibility

Cryptographic applications require that it be difficult to recover the original data from the value generated by the hash function.

The implementation of a hash function varies greatly depending on the problem it should solve.

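A simple example of a hash code is the one Java uses to generate the hash value of a string (useful in maps/dictionaries). Written here as a standalone method that receives the string s:

public static int hashCode(String s) {
   int hash = 0;
   // Polynomial hash: multiply the running value by 31, then add the next character.
   for (int i = 0; i < s.length(); i++)
      hash = (hash * 31) + s.charAt(i);
   return hash;  // int overflow simply wraps around
}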

The value 31 was chosen because multiplying by it is cheap to implement with low-level operations (31 * h equals (h << 5) - h) and because it is a prime number (in practice, prime multipliers tend to produce fewer collisions, although the justification is largely empirical).
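For readers who prefer Python, the same calculation can be sketched as below. The function name `java_string_hash` is hypothetical, and the port assumes characters in the Basic Multilingual Plane, where Python code points match Java's UTF-16 code units. The masking emulates Java's 32-bit signed integer overflow.

```python
def java_string_hash(s: str) -> int:
    """Python port of the 31-based polynomial string hash shown above."""
    h = 0
    for ch in s:
        # Keep only the low 32 bits, as a Java int would.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed Java int.
    return h - 0x1_0000_0000 if h >= 0x8000_0000 else h

print(java_string_hash("hello"))

# The shift trick mentioned above: 31 * h == (h << 5) - h.
assert all(31 * h == (h << 5) - h for h in range(1000))

# Two different inputs can share a hash value (a collision).
print(java_string_hash("Aa"), java_string_hash("BB"))
```

The last line prints the same number twice, a small reminder that collisions exist even for very short strings.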

You can also take a look at the Rabin-Karp algorithm to see hashing applied to searching for a pattern in a text.
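A compact sketch of Rabin-Karp with a rolling hash follows. The parameter values (`base=256`, `mod=1_000_000_007`) are common illustrative choices rather than anything prescribed by the answer, and returning -1 for an empty pattern is a simplification of this sketch.

```python
def rabin_karp(text: str, pattern: str, base: int = 256, mod: int = 1_000_000_007) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the leading character in a window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for start in range(n - m + 1):
        # Equal hashes only suggest a match; compare the text to rule out collisions.
        if p_hash == t_hash and text[start:start + m] == pattern:
            return start
        if start < n - m:
            # Roll the window: drop text[start], append text[start + m].
            t_hash = ((t_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return -1

print(rabin_karp("the quick brown fox", "brown"))
```

The hash lets most windows be rejected with a single integer comparison, and the explicit string comparison covers the case where two different windows happen to share a hash.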

In your question you mention the hash function used in encryption to "encrypt passwords". Note, however, that encryption is a different process from hashing. Hashing receives some data and generates an alphanumeric value for it, and the same value can be generated by different data, so the original cannot in general be recovered from the hash. Encryption modifies the data to make it unreadable to anyone who does not know the key (password) used to encrypt it. That is, encryption is always guaranteed to be reversible for those who know the key.

  • 1

    Great answer, plus one. Just one detail: the output of a hash is not necessarily alphanumeric. That is only one possible representation of the output (another widely used one is Base64, and there are also hash functions whose output includes the salt used and perhaps other important information). As for the use of hashes in password protection, the question "How to hash passwords securely?" has more useful information.

5

A hash is a mathematical algorithm that takes one string and transforms it into another in such a way that the transformation cannot be reversed. Hashes are commonly used in cryptography to store passwords, as you may have noticed.

The relationship between the cryptographic hash and the Python dictionary is that the dictionary uses hashing in its internal structure.

A dictionary is a dynamic structure that lets you store values under a key (usually a string). To store these values it hashes the key: it takes the key string, applies a hash algorithm (the same idea as in cryptography, though not the same function), and from the result finds the position where the value is stored.

A very simple example of dictionary storage:

function buscaNoDicionario(string chave) {
  /* in this example we assume chaveReal is a numeric value indicating a memory position */
  int chaveReal = algoritmoHash(chave);

  return arrayInterno[chaveReal];
}

It may seem strange to call a function that does these calculations, but this is usually managed by the language itself. In everyday code you would write something like:

meuDicionario["nome"] = "Jean"
meuDicionario["idade"] = 5000

When the program runs, the compiler or interpreter translates this into something like:

adicionaNoDicionario(meuDicionario, "nome", "Jean")
adicionaNoDicionario(meuDicionario, "idade", 5000)
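In Python specifically, the subscript syntax is dispatched to the special methods `__setitem__` and `__getitem__`. The `LoggingDict` class below is a hypothetical helper written only to make those hidden calls visible.

```python
class LoggingDict(dict):
    """dict subclass that records the method calls behind the d[key] syntax."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def __setitem__(self, key, value):
        self.calls.append(("set", key))
        super().__setitem__(key, value)

    def __getitem__(self, key):
        self.calls.append(("get", key))
        return super().__getitem__(key)

meuDicionario = LoggingDict()
meuDicionario["nome"] = "Jean"   # dispatched to __setitem__("nome", "Jean")
meuDicionario["idade"] = 5000    # dispatched to __setitem__("idade", 5000)
nome = meuDicionario["nome"]     # dispatched to __getitem__("nome")
print(meuDicionario.calls)
```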

Note that the hash used in a dictionary has very different requirements from the hash used in cryptography. I can highlight some points:

  • It will usually generate a numerical value as output, indicating a memory position;
  • It has to do as much as possible to avoid collisions, that is, two different inputs should rarely generate the same output;
  • It should be fast, because the idea is not to keep information safe, but to improve the way to access the data.

4

The dictionary is a hash table. For each key, a function calculates a hash that determines which bucket the entry is inserted into. The hash is just a way to make it quick to locate what you want. The actual value used in the calculation, the key, needs to be stored too if you need to know it.
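A toy version of such a table, using a list of entries per bucket (separate chaining), might look like this. It is a teaching sketch with invented names and data, not a description of how CPython's dict is implemented.

```python
class ToyHashTable:
    """Minimal hash table: the hash picks a bucket, the stored key confirms the entry."""

    def __init__(self, bucket_count: int = 8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:        # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # the key is stored next to the value

    def get(self, key):
        for stored_key, value in self._bucket(key):
            if stored_key == key:
                return value
        raise KeyError(key)

table = ToyHashTable(bucket_count=2)     # tiny table, so collisions are certain
for name, age in [("Ana", 31), ("Bruno", 27), ("Carla", 45)]:
    table.put(name, age)
print(table.get("Carla"))
```

Because three keys share two buckets, at least two of them collide. Lookups still succeed only because each bucket keeps the original keys for comparison.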

In password protection, the hash is calculated and only the hash is stored, without the real content, since the goal is to hide the real data.
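As a rough illustration of storing only the hash, the sketch below uses `hashlib.pbkdf2_hmac` from the Python standard library. The function names, the 16-byte salt and the iteration count are assumptions of this example, not recommendations taken from the answer.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor, an assumption of this sketch

def hash_password(password: str):
    """Return (salt, digest); only these are stored, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash for the attempt and compare it with the stored one."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Verification never needs the original password: the attempt is hashed with the same salt and the two digests are compared.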

The size of the hash varies with each need, and so, among other things, does the formula used to calculate it.
