False corruption error when opening a document decrypted with the Python Crypto library (Python 2.7.9)

I tested the following script with Python 2.7.9. It is adapted, with some changes of my own, from https://stackoverflow.com/questions/23900280/how-can-i-encrypt-docx-files-with-aes-pycrypto-without-corrupting-the-files:

# Encrypts and decrypts documents in .pdf, .docx and .rtf formats
# Adapted by Marcelo Ferreira Zochio

from Crypto import Random
from Crypto.Cipher import AES
import hashlib

def pad(s):
    padding_size = AES.block_size - len(s) % AES.block_size
    return s + b"\0" * padding_size, padding_size

def encrypt(message, key, key_size=256):
    message, padding_size = pad(message)
    iv = Random.new().read(AES.block_size)
    cipher = AES.new(key, AES.MODE_CFB, iv)
    enc_bytes = iv + cipher.encrypt(message) + bytes([padding_size])    
    return enc_bytes

def decrypt(ciphertext, key):
    iv = ciphertext[:AES.block_size]
    cipher = AES.new(key, AES.MODE_CFB, iv)
    plaintext = cipher.decrypt(ciphertext)
    return plaintext

def encrypt_file(file_name, key):
    with open(file_name, 'rb') as fo:
        plaintext = fo.read()    
    enc = encrypt(plaintext, key)
    with open(file_name + ".crp", 'wb') as fo:
        fo.write(enc)

def decrypt_file(file_name, key):
    with open(file_name, 'rb') as fo:
        ciphertext = fo.read()
    dec = decrypt(ciphertext, key)
    with open('decifrado_' + file_name, 'wb') as fo:
        fo.write(dec)

key = 'chave'
hash_object = hashlib.md5(key.encode())

while True:
    filename = raw_input("File to process: ")
    en_de = raw_input("En (encrypt) or De (decrypt)?")
    if en_de.upper() == 'EN':
        encrypt_file(filename, hash_object.hexdigest())
    elif en_de.upper() == 'DE':
        decrypt_file(filename, hash_object.hexdigest())
    else:
        print("Choose en or de!")

    cont = raw_input("Continue?")
    if cont.upper() == 'N':
        break

It works, but when I open decrypted .docx and .odt documents (after deleting the .crp extension and leaving the original one), Windows warns that the document is corrupted and asks whether I want to recover it. If I choose yes, it recovers the document normally and I only need to save it.

This does not happen with .pdf or .txt files. Does it have something to do with Word or OpenOffice character formatting?

  • To make sure your program is encrypting and decrypting files correctly, I suggest calculating the hash of the file before encryption and after decryption. If the hashes differ, the file has certainly been corrupted. On Windows, you can use WinMD5 to obtain the hashes.

1 answer

AES is a block cipher algorithm and works on fixed-size blocks of 16 bytes (128 bits), no more, no less.

This means that your implementation should take some important factors into consideration:

  1. AES only works on input data whose size is a multiple of 16 bytes;

  2. Data smaller than the block size must be "completed" (padded) until it reaches the block size;

  3. Data larger than the block size must be "fragmented" into pieces of the block size and, of course, "completed" when necessary;

  4. Smaller blocks that were "completed" (padded) during encryption must be "truncated" (unpadded) during decryption in order to recover the data at its original size;

  5. When encrypting files, the original size must be stored along with the encrypted data so that the last block can be truncated (unpadded) during decryption;

  6. File encryption and decryption should happen in parts (chunks), to avoid holding the entire file in memory at once.
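
Items 2 and 4 can be sketched without any crypto library. The snippet below is only an illustration of one common padding scheme (PKCS#7 style, where each padding byte stores the padding length); the names `pad` and `unpad` and the validation in `unpad` are assumptions made for this example, and no encryption is performed here.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data, block_size=BLOCK_SIZE):
    """Append n bytes of value n so the result is a multiple of block_size."""
    n = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes(bytearray([n]) * n)


def unpad(padded, block_size=BLOCK_SIZE):
    """Remove the padding added by pad(), checking that it is well formed."""
    if not padded:
        raise ValueError("cannot unpad empty data")
    n = bytearray(padded[-1:])[0]  # last byte holds the padding length
    if not 1 <= n <= block_size or padded[-n:] != bytes(bytearray([n]) * n):
        raise ValueError("invalid padding")
    return padded[:-n]
```

Because the padding length is recorded in the data itself, `unpad(pad(x))` returns `x` even when `x` ends in zero bytes, which plain zero padding cannot guarantee.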

Your implementation violates item 5, which is certainly what corrupts the original data.
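
As a rough illustration of item 5, the sketch below stores the original length in an 8-byte header (the same `struct.pack('<Q', ...)` layout the class further down uses) and relies on it to cut the padding off again. The cipher step is deliberately left out, and `wrap` and `unwrap` are hypothetical names used only for this example.

```python
import struct

BLOCK_SIZE = 16  # AES block size in bytes


def wrap(data):
    """Prefix data with its length and zero-pad it to whole blocks."""
    header = struct.pack("<Q", len(data))        # 8-byte little-endian size
    padding = b"\0" * (-len(data) % BLOCK_SIZE)  # 0 to 15 filler bytes
    # A real implementation would encrypt data + padding at this point.
    return header + data + padding


def unwrap(blob):
    """Read the stored length and return exactly that many bytes."""
    header_size = struct.calcsize("<Q")
    (original_size,) = struct.unpack("<Q", blob[:header_size])
    # A real implementation would decrypt the body before slicing it.
    return blob[header_size:header_size + original_size]
```

Without the stored size, the reader has no reliable way to tell filler bytes from real trailing bytes of the document.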

Another point is that your implementation saves the encrypted file in base64 representation. This is not necessary: encrypted data can be written in binary format.

Your implementation loads the file to be encrypted/decrypted completely into memory! This becomes a limitation when the available memory is smaller than the file size.

Based on this and this reference, here is a class capable of encrypting and decrypting files and data correctly, without corrupting the content:

import os
import hashlib
import base64
import struct
from Crypto.Cipher import AES
from Crypto import Random

chunksize = 64 * 1024
BS = 16
pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)
unpad = lambda s : s[:-ord(s[len(s)-1:])]

class AESCipher:

    def __init__( self, key ):
        keydigest = hashlib.sha1(key).digest()
        self.key = keydigest[:16]

    def encrypt( self, raw ):
        raw = pad(raw)
        iv = Random.new().read( AES.block_size )
        cipher = AES.new( self.key, AES.MODE_CBC, iv )
        return base64.b64encode( iv + cipher.encrypt( raw ) )

    def decrypt( self, enc ):
        enc = base64.b64decode(enc)
        iv = enc[:16]
        cipher = AES.new(self.key, AES.MODE_CBC, iv )
        return unpad(cipher.decrypt( enc[16:] ))

    def encrypt_file( self, in_filename, out_filename ):
        iv = Random.new().read( AES.block_size )
        encryptor = AES.new(self.key, AES.MODE_CBC, iv)
        filesize = os.path.getsize(in_filename)
        with open( in_filename, 'rb' ) as infile:
            with open( out_filename, 'wb' ) as outfile:
                outfile.write( struct.pack('<Q', filesize) )
                outfile.write(iv)
                while True:
                    chunk = infile.read(chunksize)
                    if len(chunk) == 0:
                        break
                    elif len(chunk) % BS != 0:
                        chunk += ' ' * (BS - len(chunk) % BS)
                    outfile.write(encryptor.encrypt(chunk))

    def decrypt_file( self, in_filename, out_filename ):
        with open(in_filename, 'rb') as infile:
            origsize = struct.unpack('<Q', infile.read(struct.calcsize('Q')))[0]
            iv = infile.read(16)
            decryptor = AES.new(self.key, AES.MODE_CBC, iv)
            with open(out_filename, 'wb') as outfile:
                while True:
                    chunk = infile.read(chunksize)
                    if len(chunk) == 0:
                        break
                    outfile.write(decryptor.decrypt(chunk))
                outfile.truncate(origsize)


# Define a key
chave = "Oi! Eu sou uma chave de tamanho indefinido!"

# Create an instance of the AES encryptor/decryptor object
aes = AESCipher( chave )

# Test encryption of data/text
cifrado = aes.encrypt( "Eu sou uma mensagem super secreta." )
decifrado = aes.decrypt( cifrado )

print cifrado
print decifrado

# Test file encryption
aes.encrypt_file( "secreto.txt", "cifrado.bin" )

# Test file decryption
aes.decrypt_file( "cifrado.bin", "decifrado.txt" )

Possible output:

$ python AESCipher.py
zceFuiV9RTqFsBSY2AYcWMUXqYqI5+3yR08DsH/GeofcSFsg1KpjN4KKL+MaUq4Qmfa9uMFjXL4Ng41giNMGUQ==
Eu sou uma mensagem super secreta.

Checking the MD5 hashes of the files:

$ md5sum secreto.txt cifrado.bin decifrado.txt
77daefe247686325a5da08e556aba4f0  secreto.txt
e2645f7b2d0af5f79b5108707bb3a13d  cifrado.bin
77daefe247686325a5da08e556aba4f0  decifrado.txt
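
Where `md5sum` is not available (on Windows, for instance), the same comparison can be done from Python with the standard `hashlib` module. This is a minimal sketch; `file_md5` and `same_content` are hypothetical helper names, and the 64 KiB chunk size is an arbitrary choice.

```python
import hashlib


def file_md5(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty string
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)
```

With the files from the example above, `same_content("secreto.txt", "decifrado.txt")` is expected to return `True` when decryption restored the original bytes.
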
  • But can you point out what is wrong in the question's code? As it stands, you may solve the O.P.'s specific problem, but anyone who comes later can only copy and paste your entire code, or spend a lot of time comparing the two versions to find out what was wrong. In my opinion, this site is much more useful when we explain the problems in the code (which in no way prevents posting the complete working code along with the explanation).

  • @jsbueno: done!

  • Thank you very much!! :-)
