Function Encode() and hash creation

2

I’m using Python 3.6 to write a program in which the user types an MD5 hash. The program stores the hash in a variable, reads a txt file, and puts its contents into a list, where each name separated by ", " becomes an item in that list.

After that, the program enters a loop in which it hashes each list item (from the txt) and compares the result with the hash that was typed. If the comparison is True, it has found the word behind the hash.

Here is the code:

passmd5 = input("Enter the MD5 hash: ")  #get the desired hash

lista = open('worldlist.txt', "r") #open the txt file
worldlist = lista.read() #read the whole content as a single string
worldlist = worldlist.split(", ") #split the string into words separated by ', '
descripto = hashlib.md5() #variable that will be used to hash each item of the list

for item in worldlist: #loop over each item of the list
    descripto.update(item.encode('utf-8')) #without encode, Python raises the following error: Unicode-objects must be encoded before hashing
    if descripto.hexdigest() == passmd5: #check whether the hashed item equals the given hash; if so, the word has been found
        print ("-----------------------------------")
        print ("Your MD5 hash: ", passmd5)
        print ("Decrypted hash: ", item)
    print (descripto.hexdigest())
    print (item)

I use the two prints at the end to see what the output looks like, because the comparison in the if is not working.

I noticed that print(item) outputs the worldlist item correctly, but print(item.encode("utf-8")) adds a b in front of the item, like this: b'fulano'. So I guess that’s why the comparison never succeeds: it compares fulano with b'fulano' (hashed, of course!).

I’d like to know if someone can help me make it work and also suggest a few improvements to the code, because I’m still learning.

  • A tip: use with when working with files; it closes the file automatically: https://www.pythonforbeginners.com/files/with-statement-in-python
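
The tip above could look roughly like the sketch below. The helper name read_wordlist and the temporary demo file are assumptions made for illustration; they are not part of the original program. The sketch also strips whitespace from each word, since a trailing newline in the file would otherwise end up inside the last word and change its hash.

```python
import os
import tempfile

def read_wordlist(path):
    # The with block closes the file when it ends, even if an error occurs.
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    # Split on commas and strip surrounding spaces/newlines from each word.
    return [word.strip() for word in content.split(",") if word.strip()]

# Demo: a temporary file stands in for worldlist.txt (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "worldlist.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("fulano, beltrano, sicrano\n")
    words = read_wordlist(demo_path)

print(words)
```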

2 answers

5


Judging by your code, I’m assuming you’re using the hashlib module.

The problem is that the update method, according to the documentation, is cumulative: calling update(a) and then update(b) is equivalent to calling update(a+b). For example:

import hashlib

md5 = hashlib.md5()
md5.update(b'a')
print(md5.hexdigest()) # hash of 'a'
md5.update(b'b')
print(md5.hexdigest()) # hash of 'ab'

First I call update with a, and then with b. Since the calls to update are cumulative, the final result is the hash of ab. The output of this code is:

0cc175b9c0f1b6a831c399e269772661
187ef4436122d1cc2f40dc2b92f0eba0

The first is the hash of a, and the second is the hash of ab. Calling update with a and then with b is the same as making a single call with ab:

md5 = hashlib.md5()
md5.update(b'ab')
print(md5.hexdigest())

This code also prints 187ef4436122d1cc2f40dc2b92f0eba0.

Just for comparison, here is the hash of b alone:

md5 = hashlib.md5()
md5.update(b'b')
print(md5.hexdigest())

The result is 92eb5ffee6ae2fec3ad71c777531578f.


Since you created your variable descripto outside the loop, the calls to update accumulate, so the hash being calculated is not the hash of each word, but the hash of all the words seen so far, concatenated.

In the first iteration of the loop, update is called with the first word. In the second iteration, update is called with the second word, but since this method is cumulative, the resulting hash is that of the first word concatenated with the second. And so on.

The solution is to build the object again at each iteration:

for item in worldlist:
    descripto = hashlib.md5() # create a new md5 object
    descripto.update(item.encode('utf-8'))
    ...

You can see the difference in this example:

words = ['teste', 'teste', 'teste']
# create the md5 object outside the loop
md5 = hashlib.md5()
for item in words:
    md5.update(item.encode('utf-8'))
    print(md5.hexdigest())

I created a list that contains the same word 3 times. So the result should be the same hash printed 3 times, right? Wrong:

698dc19d489c4e4db73e28a713eab07b
f6fd1939bdf31481d27ac4344a2aab58
1ceae7af21732ab80f454144a414f2fa

The first hash corresponds to teste. The second hash corresponds to testeteste, since the calls to update are cumulative. And the third hash corresponds to testetesteteste.

Creating a new md5 on each loop iteration gives the correct result:

for item in words:
    md5 = hashlib.md5() # create a new md5 object on each iteration
    md5.update(item.encode('utf-8'))
    print(md5.hexdigest())

Since I’m creating a new md5 on every iteration of the for loop, the calls to update do not accumulate, and the result is the hash of teste printed 3 times:

698dc19d489c4e4db73e28a713eab07b
698dc19d489c4e4db73e28a713eab07b
698dc19d489c4e4db73e28a713eab07b
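
Putting this together, the whole lookup could be sketched as below. This is only an illustration under some assumptions: the function name find_word and the candidate list are hypothetical, and the typed hash is normalized with strip() and lower() in case it arrives with extra spaces or uppercase letters. The target used in the demo is the hash of a shown earlier in this answer.

```python
import hashlib

def find_word(target_hash, words):
    # Normalize the typed hash: hexdigest() always returns lowercase hex.
    target = target_hash.strip().lower()
    for item in words:
        # A fresh md5 object per word, so nothing accumulates between items.
        digest = hashlib.md5(item.encode("utf-8")).hexdigest()
        if digest == target:
            return item
    return None  # no word in the list matches the hash

# Hypothetical candidate list; the target is the MD5 of 'a'.
candidates = ["ab", "b", "a"]
found = find_word("0cc175b9c0f1b6a831c399e269772661", candidates)
print(found)
```

Returning the word (or None) instead of printing inside the loop keeps the search separate from the output, which makes it easier to test.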

As for the b'etc' syntax, @Sidon’s answer already explains well what it is.

It is also worth remembering that hashing is not the same as encryption, and that MD5 is already considered an "obsolete" algorithm.
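
As a small illustration of that last point, hashlib exposes other algorithms through the same interface. The sketch below uses SHA-256 purely as an example of a newer algorithm; that choice is an assumption for illustration, not a recommendation taken from the question.

```python
import hashlib

word = "teste"
data = word.encode("utf-8")

# Same calling pattern, different algorithm.
md5_digest = hashlib.md5(data).hexdigest()
sha256_digest = hashlib.sha256(data).hexdigest()

print(len(md5_digest))     # 32 hex characters (128 bits)
print(len(sha256_digest))  # 64 hex characters (256 bits)
```

Neither digest can be "decrypted" back into the word; the only way to find the word is to hash candidates and compare, which is what the program in the question does.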

  • 1

    Dude, perfect! Thank you very much, I’ve already tested it and it works. Thanks also for the tips at the end of your answer, I honestly didn’t know that.

3

"Give a few touches on the code" is rather vague, so I will try to address your specific problem:

The "b" in front of the object indicates that the object is of type bytes; see the example below:

my_obj = b"abc123"
print(my_obj)
b'abc123'

print(type(my_obj))
<class 'bytes'>

To "convert" it to a string, you need to decode it, like this:

my_str = my_obj.decode("utf-8")
print(my_str)
abc123
print(type(my_str))
<class 'str'>

Since I don’t know your full context (you would need to show the imports in your example), I believe you would have to decode after decryption.

Perhaps the examples below can clarify this further:

item = "item1234"
print(item)
item1234

item = "item1234".encode("utf-8")
print(item)
b'item1234'

item = "item1234".encode("utf-8").decode("utf-8")
print(item)
item1234
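
To connect this with hashing, here is a minimal sketch assuming the hashlib module is the one in use (the question does not show its imports). The hash function takes bytes as input, but hexdigest() returns an ordinary str, so the value compared with the typed hash is never a bytes object. The variable typed_hash is a hypothetical stand-in for what the user would type.

```python
import hashlib

word = "fulano"
encoded = word.encode("utf-8")  # bytes: b'fulano'

# md5 needs bytes as input, but hexdigest() gives back a str.
digest = hashlib.md5(encoded).hexdigest()

print(type(encoded))  # <class 'bytes'>
print(type(digest))   # <class 'str'>

# Stand-in for the hash typed by the user (computed here for the demo).
typed_hash = hashlib.md5(b"fulano").hexdigest()
print(digest == typed_hash)
```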

Suggestion: always try to ask your question with code that can be copied and reproduced as simply as possible by those who will try to help. For example, try copying the code you posted and running it in a Python terminal.

  • 1

    Thanks! I managed to solve it thanks to your help.
