MessageDigest class and MD5 hashing in Java

I’m testing hash creation with this class, and on several websites, including SOen, I’ve seen snippets similar to the following used to create MD5 hashes, but without much explanation of how they work (in English at least):

    String s = "teste1234";
    MessageDigest m = MessageDigest.getInstance("MD5");
    m.update(s.getBytes("UTF-8"), 0, s.length());
    System.out.println("MD5: " + new BigInteger(1, m.digest()).toString(16));

My question is: what is happening in this code, step by step, up to the point where the hash is produced?

1 answer

The MessageDigest class provides hashing functionality.

The term "digest" refers to a kind of "summary" of the data, which is exactly what a hash function produces: a relatively small byte sequence whose size does not depend on the size of the original data.
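To make the "fixed-size summary" idea concrete, here is a small sketch. The class name DigestSizeDemo and the helper md5Length are made up for this illustration; only MessageDigest itself comes from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, not part of the original question or answer.
class DigestSizeDemo {
    // Returns the number of bytes in the MD5 digest of the given input.
    static int md5Length(byte[] input) {
        try {
            return MessageDigest.getInstance("MD5").digest(input).length;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this runtime", e);
        }
    }

    public static void main(String[] args) {
        byte[] small = "a".getBytes(StandardCharsets.UTF_8);
        byte[] large = new byte[1_000_000]; // one million zero bytes

        System.out.println(md5Length(small)); // 16
        System.out.println(md5Length(large)); // 16
    }
}
```

Both inputs produce a 16-byte (128-bit) digest, even though one is a single byte and the other is a million bytes long.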

The line:

    MessageDigest m = MessageDigest.getInstance("MD5");

This retrieves an instance that will use the MD5 algorithm, through the factory method getInstance. It is analogous to other APIs such as Calendar.getInstance(), where different types of calendar can be returned.

The algorithms that are supported by Java on all platforms are:

  • MD5
  • SHA-1
  • SHA-256
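As a rough illustration of how getInstance selects an algorithm by name, the sketch below asks for each of the names listed above plus one deliberately invalid name. AlgorithmLookupDemo and digestLength are names invented for this example, and the lengths shown are what a typical JDK reports.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, not part of the original answer.
class AlgorithmLookupDemo {
    // Returns the digest length in bytes for the named algorithm,
    // or -1 if this Java runtime has no provider for that name.
    static int digestLength(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).getDigestLength();
        } catch (NoSuchAlgorithmException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        String[] names = {"MD5", "SHA-1", "SHA-256", "NOT-A-REAL-ALGORITHM"};
        for (String name : names) {
            System.out.println(name + " -> " + digestLength(name));
        }
    }
}
```

An unknown name does not return null: getInstance throws the checked NoSuchAlgorithmException, which is why code that calls it has to catch or declare that exception.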

Now that we have the algorithm set, let’s go to the next line:

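    m.update(s.getBytes("UTF-8"), 0, s.length());

For this particular string, that is the same as:

    m.update(s.getBytes("UTF-8"));

Here, the update method defines the message that will be summarized, i.e. the content to which the hash will be applied.

The two calls are only equivalent because "teste1234" contains nothing but ASCII characters, so the number of characters (s.length()) equals the number of UTF-8 bytes. For text with accented or other non-ASCII characters, the first form would hash only part of the byte array, so the second form is the safer one.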
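One detail worth checking is the length argument: s.length() counts characters, while update expects a count of bytes. The sketch below compares the two forms of the call for an ASCII string and for a string with accented characters. The class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class UpdateLengthDemo {
    static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hashes only the first s.length() bytes, as in the question's snippet.
    static byte[] hashWithCharCount(String s) {
        MessageDigest m = md5();
        m.update(s.getBytes(StandardCharsets.UTF_8), 0, s.length());
        return m.digest();
    }

    // Hashes every byte of the UTF-8 encoding.
    static byte[] hashAllBytes(String s) {
        MessageDigest m = md5();
        m.update(s.getBytes(StandardCharsets.UTF_8));
        return m.digest();
    }

    public static void main(String[] args) {
        String ascii = "teste1234";            // 9 chars, 9 UTF-8 bytes
        String accented = "a\u00e7\u00e3o";    // "ação": 4 chars, 6 UTF-8 bytes

        System.out.println(Arrays.equals(hashWithCharCount(ascii), hashAllBytes(ascii)));       // true
        System.out.println(Arrays.equals(hashWithCharCount(accented), hashAllBytes(accented))); // false
    }
}
```

For the accented string, the three-argument call silently hashes only the first four of the six bytes, so passing the whole array is the safer habit.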

You can call this method several times to build up a longer message piece by piece, which makes it possible to hash content larger than the available memory.
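A minimal sketch of that incremental use, assuming the data arrives through an InputStream: each chunk read is passed to update, and the result is the same whatever the chunk size. ChunkedUpdateDemo and md5Of are names invented for this example, and a ByteArrayInputStream stands in for a real file or network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class ChunkedUpdateDemo {
    // Feeds the stream to the digest one buffer at a time, so only
    // bufferSize bytes are held in memory at once. bufferSize must be > 0.
    static byte[] md5Of(InputStream in, int bufferSize) {
        try {
            MessageDigest m = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[bufferSize];
            int read;
            while ((read = in.read(buffer)) != -1) {
                m.update(buffer, 0, read); // only the bytes actually read
            }
            return m.digest();
        } catch (NoSuchAlgorithmException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "a message that is read in small pieces".getBytes(StandardCharsets.UTF_8);

        byte[] inTinyChunks = md5Of(new ByteArrayInputStream(data), 4);
        byte[] inOneChunk = md5Of(new ByteArrayInputStream(data), 1024);

        System.out.println(Arrays.equals(inTinyChunks, inOneChunk)); // true
    }
}
```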

Now that we have the algorithm and the content to be processed, let’s look at part of the last line:

    m.digest()

The digest method completes the processing and returns, in this case, the MD5 hash of the message. The instance then goes back to its initial state, ready to receive new content and generate a new hash.
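The reset behaviour can be checked with a short sketch: an instance that has already produced one hash should give, for the next message, the same result as a brand-new instance. DigestResetDemo and reusedMatchesFresh are names made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class DigestResetDemo {
    static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if an instance that has already produced a digest behaves
    // like a brand-new one when hashing the second message.
    static boolean reusedMatchesFresh(String first, String second) {
        MessageDigest reused = md5();
        reused.update(first.getBytes(StandardCharsets.UTF_8));
        reused.digest(); // finishes the first hash and resets the instance

        reused.update(second.getBytes(StandardCharsets.UTF_8));
        byte[] fromReused = reused.digest();

        byte[] fromFresh = md5().digest(second.getBytes(StandardCharsets.UTF_8));
        return Arrays.equals(fromReused, fromFresh);
    }

    public static void main(String[] args) {
        System.out.println(reusedMatchesFresh("first message", "second message")); // true
    }
}
```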

The value returned by digest is a sequence of bytes that can be read as one large number. So the implementation above uses a BigInteger constructor to convert the bytes to a number. This constructor takes two parameters:

  1. signum: the sign of the number, that is, whether it is positive or negative. The value 1 treats the number as positive.
  2. magnitude: the number itself. When we talk about bytes, it’s easy to forget that everything in computing is represented numerically in binary. This constructor reads the byte sequence as a plain unsigned binary value (most significant byte first), without interpreting any bit as a sign, which is why the sign has to be passed in a separate parameter.
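The effect of the signum parameter is easiest to see with a single byte whose highest bit is set. The sketch below compares the one-argument constructor, which reads the bytes as a signed two's-complement value, with the two-argument one used in the answer. SignumDemo and its method names are made up for this illustration.

```java
import java.math.BigInteger;

// Hypothetical demo class, not part of the original answer.
class SignumDemo {
    // Reads the bytes as a signed two's-complement number.
    static BigInteger asSigned(byte[] bytes) {
        return new BigInteger(bytes);
    }

    // Reads the bytes as an unsigned magnitude with a positive sign.
    static BigInteger asPositive(byte[] bytes) {
        return new BigInteger(1, bytes);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xFF}; // bit pattern 11111111

        System.out.println(asSigned(bytes));                // -1
        System.out.println(asPositive(bytes));              // 255
        System.out.println(asSigned(bytes).toString(16));   // -1
        System.out.println(asPositive(bytes).toString(16)); // ff
    }
}
```

Without signum, any hash whose first byte is 0x80 or higher would be printed as a negative number, which is not what you want for a hash string.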

Next, the method toString(radix) is called with the value 16, which converts the number to text in hexadecimal format (base 16).
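To show what the radix argument changes, this small sketch prints the same number in several bases. RadixDemo and render are names invented for the example.

```java
import java.math.BigInteger;

// Hypothetical demo class, not part of the original answer.
class RadixDemo {
    // Renders the unsigned value of the bytes as text in the given base.
    static String render(byte[] magnitude, int radix) {
        return new BigInteger(1, magnitude).toString(radix);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xFF}; // the number 255

        System.out.println(render(bytes, 2));  // 11111111
        System.out.println(render(bytes, 8));  // 377
        System.out.println(render(bytes, 10)); // 255
        System.out.println(render(bytes, 16)); // ff
    }
}
```

The underlying number is identical in every case; only its textual representation changes, and base 16 is the conventional one for printing hashes.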

The whole snippet could be rewritten in a way that, in my opinion, is clearer:

    String message = "teste1234";
    byte[] hash = MessageDigest.getInstance("MD5").digest(message.getBytes("UTF-8"));
    System.out.println("MD5: " + new BigInteger(1, hash).toString(16));
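For anyone who wants to compile this directly, here is one possible complete version with its imports. It is a sketch, not the only way to do it, and it makes two changes that I should point out. It uses StandardCharsets.UTF_8 instead of the "UTF-8" string, which avoids the checked UnsupportedEncodingException. It also pads the result to 32 digits, because BigInteger.toString(16) drops leading zeros (the MD5 of "a", for example, starts with a 0). Md5Hex and md5Hex are names chosen for this example.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of the original answer.
class Md5Hex {
    // Returns the MD5 hash of the message as 32 lowercase hex digits.
    static String md5Hex(String message) {
        try {
            byte[] hash = MessageDigest.getInstance("MD5")
                    .digest(message.getBytes(StandardCharsets.UTF_8));
            // %032x left-pads with zeros up to 32 hex digits (16 bytes).
            return String.format("%032x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this runtime", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("MD5: " + md5Hex("teste1234"));
    }
}
```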

  • Just a question: you are passing two arguments to the BigInteger constructor. I checked the documentation, which says the first one is a signum, but I didn’t quite understand from it what this parameter is for.

  • I tested it here and got an error on the byte[] array: unknown type, with no import suggestion. The digest method does not accept a String as a parameter either.

  • @Diegof I ended up leaving some typos because I didn’t have an environment to test in, but it’s fixed now.

  • I also added more explanation about the constructor argument of BigInteger.

  • Interesting, this toString() parameter. I tested it with other bases and it returns very different values depending on the base.
