How does the computer "know" the alphabetical order when comparing two chars?

8

I have a question about this Caesar cipher program:

#include<iostream>
#include<string.h>
#include<stdio.h> //needed for printf

using namespace std;

int main() {
   cout<<"Enter the message:\n";
   char msg[100];
   cin.getline(msg,100); //take the message as input
   int i, j, length,choice,key;
   cout << "Enter key: ";
   cin >> key; //take the key as input
   length = strlen(msg);
   cout<<"Enter your choice \n1. Encryption \n2. Decryption \n";
   cin>>choice;
   if (choice==1) { //for encryption
      char ch;
      for(int i = 0; msg[i] != '\0'; ++i) {
         ch = msg[i];
         //encrypt for lowercase letter
         if (ch >= 'a' && ch <= 'z'){
            ch = ch + key;
            if (ch > 'z') {
               ch = ch - 'z' + 'a' - 1;
            }  
            msg[i] = ch;
         }
         //encrypt for uppercase letter
         else if (ch >= 'A' && ch <= 'Z'){
            ch = ch + key;
            if (ch > 'Z'){
               ch = ch - 'Z' + 'A' - 1;
            }
            msg[i] = ch;
         }
      }
      printf("Encrypted message: %s", msg);
   }

I have a question about this part:

if (ch >= 'a' && ch <= 'z'){
    ch = ch + key;
    if (ch > 'z') {
        ch = ch - 'z' + 'a' - 1;

I did not understand the comparison with the letter 'a'. How does the program know that the alphabet goes from "a" to "z" without any definition? Is this standard C/C++? I would also appreciate an explanation of the char comparisons (ch >= 'a'). Thank you in advance.

  • This is called the ASCII table, where each character (in this case, the letters from "a" to "z") has a numeric code: https://pt.cppreference.com/w/cpp/language/ascii

  • At first glance, I notice that If should be written as if (lowercase).

  • @Ricardopunctual Put that as an answer, no matter how simple.

2 answers

9

How does the program know that the alphabet goes from "a" to "z" without a definition?

Of course there is a definition. As a college professor of mine used to say, computers are "dumb" machines because they only do what we tell them to. If the computer "knows" the alphabetical order, it is because someone put that rule there.


Despite its name, char is a numeric type. It really should have been called byte, because deep down that is what it is.

What happens is that this number can be interpreted as a character, using its corresponding entry in the ASCII table:

char c = 'a';
printf("%d\n", c); // 97
printf("%c\n", c); // a

In the ASCII table, the letters have consecutive values that follow alphabetical order: the letter a has the value 97, b is 98, and so on. Note that uppercase letters have different values: A is 65, B is 66, etc.

Since these values are really just numbers, I can operate on them normally, for example by adding to them or comparing them with other values:

char c = 'a';
printf("%d\n", c); // 97
printf("%c\n", c); // a

c += 3; // add 3 to the char's value
printf("%d\n", c); // 100
printf("%c\n", c); // d
// the code enters both ifs below
if (c > 97) {
    printf("greater than 97\n");
}
if (c < 'f') { // the character "f" has the value 102
    printf("before f\n");
}

Note that this is not restricted to letters:

char c = '!';
printf("%d\n", c); // 33
printf("%c\n", c); // !
if (c < '}') { // enters the if, because the character "}" has the value 125
    printf("ok\n");
}

Therefore, in this passage:

if (ch >= 'a' && ch <= 'z') {
    ch = ch + key;
    if (ch > 'z') {
        ch = ch - 'z' + 'a' - 1;

The first if checks whether the char is between 'a' and 'z' (which is the same as checking whether it is between 97 and 122, but using the character literals makes the code easier to understand, in my opinion).

The second if checks whether the result went past z (for example, if the result is 123, it corresponds to the character {) and then corrects it so that it maps back to a letter (since the idea of the Caesar cipher is to wrap around to a when the letter goes past z).
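
To make that arithmetic concrete, here is a minimal sketch of the same wrap-around rule applied to a single lowercase letter. The function name shift_lowercase is hypothetical (it is not part of the question's program), and the sketch assumes an ASCII-compatible character set and a key between 0 and 26. It keeps the intermediate sum in an int rather than a char.

```cpp
#include <cassert>

// Hypothetical helper, not from the original program: shifts one lowercase
// letter using the same wrap-around rule as the question's code.
// Assumes ASCII-compatible character codes and 0 <= key <= 26.
char shift_lowercase(char ch, int key) {
    assert(ch >= 'a' && ch <= 'z');
    assert(key >= 0 && key <= 26);

    int code = ch + key;              // e.g. 'y' (121) + 3 = 124, which is '|'
    if (code > 'z') {                 // 124 > 122, so the sum went past 'z'
        code = code - 'z' + 'a' - 1;  // 124 - 122 + 97 - 1 = 98, which is 'b'
    }
    return static_cast<char>(code);
}
```

For example, shift_lowercase('y', 3) gives 'b', and shift_lowercase('a', 3) gives 'd' without entering the second if.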

But there is actually a problem here: depending on the compiler, char may be signed by default, in which case its values range from -128 to 127. If the result of ch + key exceeds 127, it no longer fits in the char and typically wraps around to a negative number. When that happens, the check ch > 'z' fails and the letter is never corrected.
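
A small sketch of how to check this on a given platform. The helper names shifted_code and fits_in_char are hypothetical, and the numeric values in the comments assume ASCII and an 8-bit char. The sum ch + key is computed as an int (the char is promoted), so the sum itself is fine; the trouble only starts when it is stored back into a char.

```cpp
#include <limits>

// Hypothetical helper: the sum is computed as an int because the char
// operand is promoted, so 'z' + 10 is 132 under ASCII.
int shifted_code(char ch, int key) {
    return ch + key;
}

// Hypothetical helper: reports whether a code can be stored in a plain
// char on this platform without changing its value. Whether plain char
// is signed is implementation-defined, so the limits are queried
// instead of being hard-coded.
bool fits_in_char(int code) {
    return code >= std::numeric_limits<char>::min()
        && code <= std::numeric_limits<char>::max();
}
```

Where char is signed and 8 bits wide, fits_in_char(shifted_code('z', 10)) is false, because 132 is greater than 127.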

One way to solve it is to ensure that the value does not exceed 127:

char msg[5] = "azAZ";
int key = 10;
for (int i = 0; msg[i] != '\0'; i++) {
    char ch = msg[i];
    if (ch >= 'a' && ch <= 'z') {
        ch = 'a' + (ch - 'a' + key) % 26;
    } else if (ch >= 'A' && ch <= 'Z') {
        ch = 'A' + (ch - 'A' + key) % 26;
    }
    msg[i] = ch;
}
printf("%s\n", msg); // kjKJ

By computing ch - 'a', I "normalize" the character's value to its position relative to the letter a (that is, a becomes 0, b becomes 1, and so on). Next I add the key and take the remainder of the division by 26 (so if the sum is 26 or more, which means it would go past z, I am guaranteed to wrap back to the beginning of the alphabet). Finally, I add the result to 'a' to obtain the corresponding letter. Since the remainder is always between 0 and 25 (for a non-negative key), the final value never goes past 'z' and always fits in a char. The same applies to uppercase letters.
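
The question's menu also offers decryption, which is just a shift in the opposite direction. The sketch below is one possible way to cover both cases with the same formula. The function name caesar_shift is hypothetical, and the code assumes an ASCII-compatible character set. In C++, the % operator can return a negative result when the left operand is negative, so the key is first normalized to the range 0 to 25.

```cpp
#include <string>

// Hypothetical helper: shifts every ASCII letter in text by key positions
// and leaves all other characters unchanged. A negative key shifts
// backwards, so caesar_shift(s, -k) undoes caesar_shift(s, k).
std::string caesar_shift(std::string text, int key) {
    // -10 % 26 is -10 in C++, so add 26 and reduce again to get 16.
    const int shift = ((key % 26) + 26) % 26;

    for (char &ch : text) {
        if (ch >= 'a' && ch <= 'z') {
            ch = static_cast<char>('a' + (ch - 'a' + shift) % 26);
        } else if (ch >= 'A' && ch <= 'Z') {
            ch = static_cast<char>('A' + (ch - 'A' + shift) % 26);
        }
    }
    return text;
}
```

With this sketch, caesar_shift("azAZ", 10) returns "kjKJ", and caesar_shift("kjKJ", -10) returns "azAZ" again.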


Note: the idea of mapping characters to numeric values is still used today and is not limited to ASCII.
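
On the question of whether this is standard C/C++: the language standards only guarantee consecutive codes for the digits '0' to '9'. Letters are consecutive in ASCII, but not in every encoding (EBCDIC, for instance, has gaps between some letters). If that matters, one option is to look up each letter's position in an explicit alphabet string instead of relying on ch - 'a'. The sketch below illustrates this; the function name shift_with_table is hypothetical.

```cpp
#include <string>

// Hypothetical helper: Caesar shift that does not assume the letters have
// consecutive codes. Each letter's position is found by searching an
// explicit alphabet string.
std::string shift_with_table(std::string text, int key) {
    const std::string lower = "abcdefghijklmnopqrstuvwxyz";
    const std::string upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const int shift = ((key % 26) + 26) % 26;  // normalize to 0..25

    for (char &ch : text) {
        std::string::size_type pos = lower.find(ch);
        if (pos != std::string::npos) {
            ch = lower[(pos + shift) % 26];
            continue;
        }
        pos = upper.find(ch);
        if (pos != std::string::npos) {
            ch = upper[(pos + shift) % 26];
        }
        // Any character found in neither table is left unchanged.
    }
    return text;
}
```

On an ASCII system this produces the same results as the arithmetic version, at the cost of a linear search per character.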

0

I don't know anything about C or C++, but I know a little C#, and in C# you can increment a char, like this:

char caractere = 'A';
caractere++;

When you do this, the compiler treats the variable as its character code and increments it, so the variable caractere becomes B. I believe it works the same way in C/C++, since it is also based on the ASCII table. So if the character's code is at least 97, which represents a in decimal, and at most 122, which represents z, the code proceeds.
