How do the SHA family hash algorithms work?

Asked

Viewed 16,707 times

13

I would like to understand how the hash algorithms of the SHA family work (Secure Hash Algorithm), such as the SHA-1, SHA-2, SHA-3, and understand the differences between them.

I seek didactic answers that make me understand the processes used by these algorithms step-by-step, emphasizing the logical and mathematical part, and if possible, bit operations.

Note that I do not ask for code examples, although they are valid for the explanation.

  • 1

    To my knowledge, each hash algorithm works differently (maybe there is some similarity between them, but I’m not sure). Unless you’re looking for a very high-level answer, I think it would be better to ask separately by algorithm, or the question should be very broad.

  • 1

    I did what you suggested. Actually, the algorithms are different. I divided it into two questions, the other question is available here: http://answall.com/questions/43495/comorty-function-algorithmof-hash-md5 The division is even better for future references.

  • 1

    Blz, but just in advance: SHA-1 does have a relationship with SHA-2 (and SHA-1 is similar to MD5), but SHA-3 as I understand it is radically different from the first two. The point is that the name "SHA" (as well as the name "AES") was given to the winner of competitions promoted by American government agencies, who select the best hash/cipher based on a number of criteria, for standardization. That is, they do not necessarily come from the same authors, nor do they follow the same logic (including the original name of SHA-3 is "Keccak" and the AES is "Rijndael").

  • I thought the difference between them was not so great, only some "magical" improvements between one and the other. Well, I need to understand the step-by-step approach to basing them on my graduate work. I have used these algorithms a few times, but I confess that the operations they use to perform the "magic" confuses me a little. Mainly the low level operations.

  • hehe "a little"? I get a headache just reading "hash" and "algorithm" in the same sentence... : P The most I could understand was the RSA and look there.

  • They confuse a lot, hahaha... I tried to understand the AES (quoted in another question), but I have not been very happy in my studies so far. For this reason I asked for a "didactic" answer. My advisor said that just basing these algorithms on TCC is not enough, I need to understand how they work. And in fact, I agree.

  • "Understand" at what level? Cryptography is an extremely specialized domain, much more pure mathematics than computing. It seems to me excessive for a simple TCC, I don’t know. Maybe someone can break the algorithms in stages and give the general idea of each, but say why Each stage exists and what it represents, I find it quite unlikely... I hope I’m wrong, though. Then I will try to write something down in the AES question, and you evaluate whether it is enough for you or not. In the hashes, I won’t even risk.

  • 2

    A good product is Daniel Balparda de Carvalho’s book - Cryptography Methods and Algorithms.But in the face of the nature of cryptography do not expect to find abundant material on the Web.

  • @mgibsonbr need to understand the logical and mathematical part. It is not necessary to detail why they were developed in one way or another. That level of detail would give a master’s dissertation... I need to understand the steps and how they work. The operations that happen, especially those involving the jobs with bits. It is not necessary to explain the causes that make the algorithm safe, the importance of the algorithm, applications, or things like that. I will edit my questions soon, at the moment I can not do it.

  • @Motta thank you for the indication. I will consult the material you indicated =)

Show 5 more comments

2 answers

7


SHA1 (Source):

SHA1 implements a keyless hash algorithm, which takes a up to 264 bit message and produces a summary of the 160-bit message and is used for checking the integrity of the message. It is based on the design principles of the MD4 and MD5 hashing algorithms (Memory Digest 4 and 5). SHA1 is considered the successor to the MD5, one of the first and most used hash algorithms, processing 512-bit blocks sequentially. Each processing is done by block with about 80 steps. SHA-1 and its working Its logic can be divided into five steps:

1) Add fill bits: every message will have bits added, the final size being 512 congruent to 448. Thus, a message can have 1 to 512 fill bits, which consists of a bit with value 1 followed by how many bits 0 are necessary.

2) Add message size: a 64 bit block is added to the message, being treated as an integer with no 64-bit signal. This block contains the original message size (before filling).

3) Initialization of the MD buffer: a 160-bit buffer is used to keep the intermediate and final results of the hash function. This buffer can be represented as five 32-bit registers, which are initialized with the following value (in hexadecimal): A = 67 45 23 01 B = EF CD AB 89 C = 98 BA DC FE D = 10 32 54 76 E = C3 D2 E1 F0

4) Message processing in 512-bit blocks: the center of the algorithm is a compression function consisting of four iterations, which have 20 steps each. These iterations are similar, but use a different logical structure.

5) Output: after processing all 512-bit blocks, the output of the last iteration provides the 160 bit hash.

SHA2 (Source or Cached source):

The SHA-2 algorithm model follows the same structure adopted in the SHA-1 algorithm. Similarly, they implement iterated summary functions, again following the Merkle Damgard paradigm. The values of 224 and 384 correspond to the values of 256 and 512, being only a truncation of the output values.

For example, SHA-256 processes, as in SHA-1, messages with a maximum value equivalent to 264 bits and also uses 32-bit words.

The construction of the SHA-2 algorithms is very similar to the construction of the SHA-1 algorithm. The differences consist of the number of blocks, the number of rounds of the compression functions, the use of left and right bit shifts and the size of the constant that defines the input and output messages of the algorithm. The SHA-256 processes 264-bit messages and works with 32-bit words. The compression function input has 512 bits. Its state variable contains 256 bits, which in turn generates another variable with 256 bits. These features are identical in the SHA-512 algorithm, with the size peculiarities such as: 2128-bit message, 64-bit words, and function input consisting of a 1024-bit block.

The message-filling in the SHA-2 algorithm occurs analogously to what happens in the SHA-1 algorithm. That is, briefly, add a bit 1 at the end of the message, add 0 until the word has the size proportional to the number of blocks, and finally concatenate with bits to signal the size of the message to the last block. The summary computation starts with the message being divided into 512-bit blocks and the initialization of the state variables. Then a series of logical operations is performed described by functions. These operations are displacements, XOR(or exclusive), AND and NOT. Some of these SHA-2 family algorithms make use of the same functions previously present in the SHA-1 algorithm structure. The 512-bit message is divided into 16 words, each with 32 bits is expanded to 64 32-bit words. Subsequently, for each word an algorithm round is held, but one particularity of SHA-2 is that each round is composed of a different constant. Finally, the initially generated state variables are summed module 2 (XOR) with the resulting values of the rounds.In all algorithms of the SHA-2 family, these procedures are the same, Vari-ando as only 32-bit sizes for 64-bit for the message blocks.

In the SHA-2 algorithm family, operations are simple and perform well. However, the algorithm is strongly sequential, not allowing easy parallelization. Each round of the algorithm can only be computed before each word with its then 32 bits having been calculated previously. This requires a sequence to perform computation. The last block that contains the fill to ensure that all are 512-bit multiples uses, as said, the fill functions that today are implemented in software that require minimum processing cost.

SHA3 (Source or Cached source):

This algorithm makes use of the sponge paradigm, composed of two phases of processing. The first one divides the message into blocks and absorbs it into internal states.These states are originated from a state being initialized with zeros and then iterated with rounds that have five mappings, that disseminate and distribute the elements in the states. Once finished, the function moves to the crushing phase, which in turn inserts the application of the mapping functions until you have the number of bits that meets the output size at the chosen security level.

More information on the links I left in each algorithm. I hope I helped.

-5

Secure Hash Algoritm (SHA) Secure Hash Algorithms (NIST) are a family of cryptographic hash functions published by the National Institute of Standards and Technology (NIST) as a Federal Information Processing Standard (FIPS). The SHA includes some versions, the SHA-1 takes a message of up to 264 bits and produces a summary of the 160-bit message and is used for checking the integrity of the message. SHA-2 is very similar to the construction of the SHA-1 algorithm. The differences consist of the number of blocks, the number of rounds of the compression functions, the use of left and right bit shifts and the size of the constant that defines the input and output messages of the algorithm. SHA-3 This algorithm makes use of the sponge paradigm, composed of two processing phases. The first one divides the message into blocks and absorbs it into internal states.These states are originated from a state being initialized with zeros and then iterated with rounds that have five mappings, that disseminate and distribute the elements in the states. Once finished, the function moves to the crushing phase, which in turn inserts the application of the mapping functions until you have the number of bits that meets the output size at the chosen security level.

  • 2

    A copy of Wikipedia without formatting and source citation is not a good answer, besides being plagiarism...

Browser other questions tagged

You are not signed in. Login or sign up in order to post.