How to allocate an immense amount of memory?

I created an on-the-fly encryption algorithm that encrypts bytes in place, without copying them to a second array. This applies to both encrypting and decrypting these byte arrays. As the diagram below shows, the string that goes in is the same string that comes out after decryption. I designed it this way with the memory cost of large files in mind.

[image: enter image description here]

For example, suppose fileBytes holds a 4 MB (4,194,304-byte) PNG image. My algorithm adds only 1 byte to the original size and encrypts the remaining bytes of the file in place, otherwise keeping its size and layout.

Likewise, decryption removes that 1 byte from the file.

My question is this: if I encrypt an arbitrarily large file, then besides the heavy CPU usage, the application's memory usage grows with the file size. If I use a 10 GB file on a computer with 4 GB of RAM, an OutOfMemoryException will be thrown.

See in the example below how this would occur:

// keys in Cryraz are also byte arrays.
byte[] key = new byte[] { 12, 51, 63 };
// create a new instance of Cryraz.
Cryraz clientCriptografia = new Cryraz(key);

// the message that will be encrypted
string Mensagem = "Olá, mundo!";
// bytes of the message
byte[] Mensagem_bytes = System.Text.Encoding.Default.GetBytes(Mensagem);

// Mensagem_bytes is our target. Right now it holds the original, unencrypted data; let's encrypt it.
// EncryptData is an instance method, so it is called on the instance created above.
clientCriptografia.EncryptData(ref Mensagem_bytes);

// Done, Mensagem_bytes is now encrypted.

This is how Cryraz works: other encryption methods copy the bytes, so both the original and the encrypted version exist in the same code, whereas Cryraz encrypts the original directly.

This is the body of the encryption method (I cut out the validation checks and the algorithm's security options, which are not needed here):

public void EncryptData(ref byte[] entryByteData) {
    // ...
    for (int i = 0; i < entryByteData.Length; i++) {
        // runs over every input byte
        // this is where the encryption happens
        // at the end, the byte is assigned back to the array:
        entryByteData[i] = ((byte)a);
        // note: "a" is the new encrypted byte
    }
    // grow the original array by one to make room for the security hash
    Array.Resize(ref entryByteData, entryByteData.Length + 1);

    // create the security hash and insert it into the array; without this hash
    // there is no decryption, it is part of the algorithm
    byte hash_x = performKeyHash(key);
    entryByteData[entryByteData.Length - 1] = hash_x;

    // collect the unused variables
    GC.Collect();
}

Complete code: https://pastebin.com/wKQn0S5N

Even with GC.Collect() at the end of the method, the application still uses a huge amount of memory, because all the bytes of the files being processed are read into memory.

For the algorithm to work, it is necessary to know the byte index relative to the array being encrypted and to have the exact input and output size.

In short, how do I get the application to "cache" all the bytes of a very large file without running into memory problems? Is there a way to get around this?

1 answer

Obviously, you cannot read your entire file at once. The only alternative left is to process the file in blocks:

public static void Encripta(string src, string dest){
    // make sure the destination folder (not the file itself) exists
    Directory.CreateDirectory(Path.GetDirectoryName(dest));
    var buffer = new byte[4096];
    using(var reader = File.OpenRead(src))
    using(var writer = File.Create(dest))
    {
        int bytes;
        while((bytes = reader.Read(buffer, 0, buffer.Length)) > 0){
            // call your EncryptData with buffer here
            writer.Write(buffer, 0, bytes);
        }
    }
}
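To make the role of the `bytes` variable in that loop concrete, here is a minimal sketch that records what each `Read` call returns. `ReadCountDemo` and `ReadCounts` are hypothetical names used only for illustration, and a `MemoryStream` stands in for the file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReadCountDemo
{
    // Collects the value returned by each Read call until the stream ends.
    // Read returns how many bytes it placed in the buffer, and 0 at the end.
    public static List<int> ReadCounts(Stream stream, int bufferSize)
    {
        var counts = new List<int>();
        var buffer = new byte[bufferSize];
        int bytes;
        while ((bytes = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            counts.Add(bytes);
        }
        return counts;
    }
}
```

For a 10,000-byte `MemoryStream` and a 4096-byte buffer, `ReadCounts` yields 4096, 4096 and 1808. Only the first `bytes` positions of the buffer are valid after each call, which is why the write uses `bytes` rather than `buffer.Length`.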

You will also have to adapt your current algorithm, both the encryption part and the decryption part. From what I can tell, every time it encrypts an array, the algorithm appends a byte at the end (a kind of checksum). The problem now is that you only want that done for the last block.

public static void Encripta(string src, string dest){
    Directory.CreateDirectory(Path.GetDirectoryName(dest));
    var buffer = new byte[4096];
    var cipher = new Cryraz(new byte[]{10, 11, 12});

    using(var reader = File.OpenRead(src))
    using(var writer = File.Create(dest))
    {
        int bytes;
        while((bytes = reader.Read(buffer, 0, buffer.Length)) > 0){
            // trim the buffer to the number of bytes actually read
            Array.Resize(ref buffer, bytes);
            // the encryption algorithm needs an additional parameter that says whether this is the last block
            cipher.EncryptData(ref buffer, reader.Position == reader.Length);
            writer.Write(buffer, 0, buffer.Length);
        }
        writer.Flush();
    }
}

Likewise, the decryption algorithm always reads the last byte of the array to get the checksum. The problem is that the checksum is only present in the last block, which means you have to read that value beforehand.

public static void Desencripta(string src, string dest){
    Directory.CreateDirectory(Path.GetDirectoryName(dest));
    var buffer = new byte[4096];
    var cipher = new Cryraz(new byte[]{10, 11, 12});

    using(var reader = File.OpenRead(src))
    using(var writer = File.Create(dest))
    {
        reader.Position = reader.Length - 1;
        // read the checksum from the end of the file, to decrypt all the blocks
        var checkSum = reader.ReadByte();
        reader.Position = 0;
        int bytes;
        while((bytes = reader.Read(buffer, 0, buffer.Length)) > 0){
            // trim the buffer to the number of bytes actually read
            Array.Resize(ref buffer, bytes);
            cipher.DecryptData(ref buffer, (byte)checkSum);
            writer.Write(buffer, 0, buffer.Length);
        }
        writer.Flush();
    }
}

Complete code
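The question says the algorithm needs to know each byte's index relative to the whole array. When processing in blocks, one way to preserve that is to carry a running offset across the blocks. The sketch below illustrates the idea only: `ChunkedSketch`, `Process` and `Transform` are hypothetical names, and the XOR in `Transform` is a stand-in, not the Cryraz algorithm.

```csharp
using System;
using System.IO;

public static class ChunkedSketch
{
    // Hypothetical stand-in for the real per-byte cipher: XOR with a key byte
    // selected by the byte's position in the whole input, not in the block.
    static byte Transform(byte value, long globalIndex, byte[] key)
    {
        return (byte)(value ^ key[globalIndex % key.Length]);
    }

    // Transforms input into output block by block, using constant memory.
    public static void Process(Stream input, Stream output, byte[] key)
    {
        var buffer = new byte[4096];
        long offset = 0; // position of buffer[0] within the whole input
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                // offset + i is the index the byte would have had
                // if the whole file were loaded into a single array
                buffer[i] = Transform(buffer[i], offset + i, key);
            }
            output.Write(buffer, 0, read);
            offset += read;
        }
    }
}
```

Because the transform only sees the global index, the result does not depend on the block size, and memory use stays at one 4096-byte buffer regardless of the file size.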

  • Would int bytes be the current byte position that was processed?

  • @Cypherpotato No. It is the number of bytes that were read into the buffer. If the file has, for example, only 1000 bytes, then the while loop will run only once and bytes will have the value 1000.

  • I implemented this method in the class, but it didn’t work: the output file came out larger than the original, and decryption produced a file, but it was corrupted. I don’t know what happened.

  • @Cypherpotato I can only help you if you publish the full code to encrypt and decrypt.

  • I updated the question with a link.

  • @Cypherpotato I updated the answer

  • It broke. I applied the changes to the Cryraz class and put the Sample class in the test class. I used a 25 MB video, then decrypted it using native code, and the file was corrupted. Code: https://pastebin.com/5bCt4XRf

  • I like the idea of streams, but I don’t know how to work with them. My idea is to read all the items in the stream normally, ignoring the last one, which would be the hash, and to validate the checksum at the end of the read. Decryption would work the same way: process byte by byte and, on reaching the last byte, discard it, because by then the algorithm would already know what to do with it.
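
The idea in the last comment, reading the stream normally and holding back the final byte, can be sketched with a one-byte lookahead. This is only an illustration under assumptions: `TrailingByteSketch` and `CopyAllButLast` are hypothetical names, and the per-byte decryption step is left out. It also only fits if the hash is not needed before the data is processed; the answer above suggests Cryraz does need it up front, in which case seeking to the end first is still required.

```csharp
using System;
using System.IO;

public static class TrailingByteSketch
{
    // Copies every byte of input to output except the final one, which is
    // returned to the caller (-1 if the input is empty). A real implementation
    // would decrypt the bytes before writing them.
    public static int CopyAllButLast(Stream input, Stream output)
    {
        var buffer = new byte[4096];
        int held = -1; // last byte seen so far, not yet written
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // the byte held back from the previous block was not the last one
            if (held >= 0) output.WriteByte((byte)held);
            // write all but the last byte of this block and hold that one back
            output.Write(buffer, 0, read - 1);
            held = buffer[read - 1];
        }
        return held; // the trailing hash byte
    }
}
```

Since it never seeks, this approach also works on streams that cannot be repositioned, such as network streams.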
