Why does the algorithm only work with one Encoding?

Here’s the deal: I have an encryption module that takes a byte[] and outputs another, encrypted byte[], with a checksum placed at the end of the output. The checksum is a single byte the application generates from the asymmetric key, so you can verify whether decryption used the right key for a given input and output.

The problem is: byte[] -> byte[] works perfectly, encrypting and decrypting. But if I convert these byte[] to strings, it only works with one particular Encoding, and it throws an invalid-checksum error if I use any other.

string TextoParaEncriptar = "Olá, mundo!";
string encriptado = cipher.EncryptString(TextoParaEncriptar); // ok, encrypts normally
string decriptado = cipher.DecryptString(encriptado); // fine as well

The code above works, and the field decriptado ends up with the value "Olá, mundo!". But both calls used Encoding.Default, which varies with the machine it runs on. If I specify the Encoding explicitly, it fails:

string TextoParaEncriptar = "Olá, mundo!";
string encriptado = cipher.EncryptString(TextoParaEncriptar, Encoding.ASCII); // ok, encrypts normally
string decriptado = cipher.DecryptString(encriptado, Encoding.ASCII); // invalid checksum

Here are the methods that encrypt/decrypt strings:

    public string EncryptString(string inputString) => EncryptString(inputString, Encoding.Default);
    public string EncryptString(string inputString, Encoding byteEncoder)
    {
        byte[] strBytes = byteEncoder.GetBytes(inputString);
        EncryptByteArray(ref strBytes);
        return byteEncoder.GetString(strBytes);
    }
    public string DecryptString(string inputString) => DecryptString(inputString, Encoding.Default);
    public string DecryptString(string inputString, Encoding byteEncoder)
    {
        byte[] strBytes = byteEncoder.GetBytes(inputString);
        DecryptByteArray(ref strBytes);
        return byteEncoder.GetString(strBytes);
    }

Encryption and decryption codes:

    public void EncryptByteArray(ref byte[] inputData)
    {
        if (k == null || k.Length == 0) throw new NullReferenceException("Key cannot be empty.");
        if (inputData == null || inputData.Length == 0) return;
        CryrazCore processor = new CryrazCore() { Positions = k };
        {
            processor.ComputeByteArray(ref inputData, false);
            Array.Resize(ref inputData, inputData.Length + 1);
            byte checksum = processor.PushChecksun();
            {
                inputData[inputData.Length - 1] = checksum;
            }
        }
    }
    public void DecryptByteArray(ref byte[] inputData)
    {
        if (k == null || k.Length == 0) throw new NullReferenceException("Key cannot be empty.");
        if (inputData == null || inputData.Length == 0) return;
        CryrazCore processor = new CryrazCore() { Positions = k };
        byte dataChecksum = inputData[inputData.Length - 1];
        byte processorChecksum = processor.PushChecksun();
        if(dataChecksum != processorChecksum) throw new NullReferenceException("Invalid key for this data. Checksum check failed.");
        {
            inputData[inputData.Length - 1] = 0;
            Array.Resize(ref inputData, inputData.Length - 1);
            processor.ComputeByteArray(ref inputData, true);
        }
    }
  • processor.ComputeByteArray(ref byte[], bool): the method that processes the received byte[] byte by byte.
  • EncryptByteArray appends the checksum byte to the end of the array; DecryptByteArray removes it before running decryption.

Why does it fail, even using the same Encoding to encrypt and decrypt, whenever byteEncoder is anything other than Encoding.Default? How do I fix this?

Update

If I use the Western European (ISO) encoding ISO-8859-1, which is an SBCS (Single Byte Character Set), that is, one byte per character, the algorithm works normally. But I still don’t understand why.

The algorithm runs through all the bytes returned by GetBytes(), places a checksum at the end of that byte sequence, and then converts the result to a string with GetString(byte[]). After decrypting that same encrypted string, it reports that the last byte was changed.
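
For reference, a minimal sketch (my own test, not part of the module) demonstrating that observation: ISO-8859-1 maps every byte value 0..255 to exactly one character and back, so bytes -> string -> bytes is lossless:

using System;
using System.Linq;
using System.Text;

class Latin1RoundTrip
{
    static void Main()
    {
        var latin1 = Encoding.GetEncoding("ISO-8859-1");

        // Every possible byte value, standing in for an encrypted payload.
        byte[] allBytes = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();

        // bytes -> string -> bytes survives because each byte 0..255 maps
        // to exactly one character and back.
        byte[] roundTrip = latin1.GetBytes(latin1.GetString(allBytes));

        Console.WriteLine(allBytes.SequenceEqual(roundTrip)); // True
    }
}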

  • Can you provide the encryption and decryption algorithms?

  • @Brunocosta I updated the question.

2 answers

The job of a character encoding is to turn a given piece of text into bytes and back into text. For an encoding to be complete, every possible text string must have a byte representation (an incomplete encoding supports only a subset of the possible characters, ASCII for example). However, not every possible byte string needs to correspond to valid text in a given encoding. If you take an arbitrary sequence of bytes and try to convert it to text, it is quite possible that nothing coherent comes out.

Hence, if you try to represent the encrypted byte sequence (which is indistinguishable from a random byte sequence) in just any encoding, there is a good chance those bytes do not represent any valid text. Especially in UTF-8, which has very strict rules about what the leading bits of each byte may be.
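
To make this concrete, a small sketch (my illustration; the byte values are arbitrary) showing an invalid UTF-8 sequence being silently replaced during GetString, so GetBytes no longer returns the original bytes:

using System;
using System.Text;

class LossyRoundTrip
{
    static void Main()
    {
        // 0xC3 starts a two-byte UTF-8 sequence, but 0x28 is not a valid
        // continuation byte, and a lone 0xFF is never valid UTF-8.
        byte[] encrypted = { 0xC3, 0x28, 0xFF };

        string asText = Encoding.UTF8.GetString(encrypted); // invalid bytes become U+FFFD '�'
        byte[] back = Encoding.UTF8.GetBytes(asText);

        // back is EF-BF-BD-28-EF-BF-BD, not the original three bytes:
        // the data (and its trailing checksum byte) is corrupted.
        Console.WriteLine(BitConverter.ToString(back));
    }
}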

So I suggest using a different representation for your encrypted string, say Base64 or perhaps hex. Example:

public string EncryptString(string inputString) => EncryptString(inputString, GetCryrazStringEncoder());
internal string EncryptString(string inputString, Encoding byteEncoder)
{
    byte[] strBytes = byteEncoder.GetBytes(inputString); // text uses the encoding
    EncryptByteArray(ref strBytes);
    return Convert.ToBase64String(strBytes); // random-looking bytes use Base64
}

public string DecryptString(string inputString) => DecryptString(inputString, GetCryrazStringEncoder());
internal string DecryptString(string inputString, Encoding byteEncoder)
{
    byte[] strBytes = Convert.FromBase64String(inputString); // random-looking bytes use Base64
    DecryptByteArray(ref strBytes);
    return byteEncoder.GetString(strBytes); // text uses the encoding
}
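
With this change the encrypted string is always valid Base64 text, so it survives storage and transport in any encoding. A quick round-trip check (assuming the cipher object from the question):

string original = "Olá, mundo!";
string encriptado = cipher.EncryptString(original);   // Base64 text, safe in any transport
string decriptado = cipher.DecryptString(encriptado); // back to "Olá, mundo!", checksum intact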
  • Fantastic! Thank you very much for the clarification.

Putting it very simply: ASCII does not support accented characters.

If you expect to receive accented characters, or any text beyond plain ASCII, use Encoding.UTF8.

See how the encodings behave in simple round-trip conversions.

var original = "Olá, Hello World";
Console.WriteLine("Original: " + original);

var ascBytes = Encoding.ASCII.GetBytes(original);
var backFromASCII = Encoding.ASCII.GetString(ascBytes);
Console.WriteLine("ASCII: " + backFromASCII);

var utfBytes = Encoding.UTF8.GetBytes(original);
var backFromUTF8 = Encoding.UTF8.GetString(utfBytes);
Console.WriteLine("UTF8: " + backFromUTF8);

var iso8859 = Encoding.GetEncoding("ISO-8859-1");
var isoBytes = iso8859.GetBytes(original);
var backFromISO = iso8859.GetString(isoBytes);
Console.WriteLine("ISO-8859: " + backFromISO);

The output will be:

// Original: Olá, Hello World
// ASCII: Ol?, Hello World
// UTF8: Olá, Hello World
// ISO-8859: Olá, Hello World

See it working on .NET Fiddle.
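
At the byte level (my own illustration, beyond the fiddle above), .NET’s ASCII encoding replaces every byte above 0x7F with '?' (0x3F), so encrypted bytes that pass through an ASCII string come back altered and the trailing checksum byte no longer matches:

using System;
using System.Text;

class AsciiMangling
{
    static void Main()
    {
        byte[] encrypted = { 0x4F, 0xE1, 0x9C };             // pretend ciphertext + checksum

        string asText = Encoding.ASCII.GetString(encrypted); // "O??": bytes > 0x7F become '?'
        byte[] back = Encoding.ASCII.GetBytes(asText);       // { 0x4F, 0x3F, 0x3F }

        Console.WriteLine(BitConverter.ToString(back));      // 4F-3F-3F, not 4F-E1-9C
    }
}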

  • Are the bytes changed by the encoding? I tried UTF-8 to UTF-8 and it also fails: it says the checksum is invalid just the same.

  • I updated my question with some clues for tracking down the problem.

  • But the problem is ASCII. That’s what I showed in the answer. Note that with ISO-8859-1 a true lossless round trip is also possible.

  • Strange that it also gives the error when I use UTF-8.

  • Put it on .NET Fiddle and post it here and I’ll take a look.

  • I updated the question with the .NET Fiddle.
