Can scanf store a string through a pointer that has no defined size?


I'm trying to understand how strings work in C. I noticed that, even though no size limit is defined anywhere in the code for the array TextoUm[] (shown in the code below), scanf apparently stores a string in it correctly. The same does not happen if I replace the array with a pointer to char in the scanf call (whether or not the pointer is initialized with a string).

The code below shows exactly the array and pointer cases I mentioned (although I have read that an "array" is a "pointer").

#include <stdio.h>

int main ()
{
    char TextoUm[]="";
    char *TextoDois="";

    printf ("\n\n\tDigite um nome para \"TextoUm\":\n\n\t");

    scanf ("%s", TextoUm);

    printf ("\n\tO texto digitado foi: %s", TextoUm);

    printf ("\n\n\tDigite um nome para \"TextoDois\":\n\n\t");

    scanf ("%s", TextoDois); /* The error happens here */

    printf ("\n\tO texto digitado foi: %s", TextoDois);

return 0;
}

When I ran this code in the debugger, it reported the following error:

Program received signal SIGSEGV, Segmentation fault.

Wikipedia says this error is an attempt to access an invalid or nonexistent address, but that did not help me understand why it occurs only with the pointer and not with the array.

So my question is the one in the title: can scanf store a string through a pointer to char even if no size limit is set for that pointer?

1 answer

Working vs. being right

Understand that C lets you do everything and requires you to take responsibility for everything. This is both a strength and a weakness of the language. It is close to assembly. So the only things that fail to "work" are those that have no way of working at all.

But working and being right are very different things. I keep saying this and almost nobody listens. In C it matters even more. You have to know the language in detail before you can use it correctly. "I heard somewhere that..." does not go far in C; you have to learn it properly.

scanf()

First of all, understand that scanf() is useful and can be used without trouble in some applications, but it is not what more serious applications that need input validation rely on. There it is used, at most, as one part of a larger algorithm.

One of the problems is that the user can type anything and scanf() accepts it. It has some means of control (such as a maximum field width), but they are not always sufficient and are almost never used in simple examples. So it is easy to corrupt memory: the function writes past the location reserved for the variable because, after all, C allows "everything".

Using it heavily in simple examples can give the impression that it is what real applications use.
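As a rough illustration of the control mentioned above, the sketch below uses a maximum field width in the format string. The helper name read_word9 is hypothetical, and sscanf on a fixed string stands in for scanf so the result is reproducible; the same "%9s" format applies to scanf.

```c
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited word from `input`
   into `out`, which must have room for at least 10 bytes.
   The width 9 leaves one byte for the terminating '\0'.
   Returns 1 if a word was stored, 0 otherwise. */
int read_word9(const char *input, char out[10]) {
    return sscanf(input, "%9s", out) == 1;
}
```

Called with the input "Bartholomew-Longname", it stores only "Bartholom" and never writes past the tenth byte of the buffer.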

Declared variables

Both variable declarations compile, but both are wrong: neither reserves enough space for the string, that is, there is no memory available for the character sequence you intend to store. You only have an address that points somewhere, and that place has not been set aside for your data. To understand this better, read What are and where are the "stack" and "heap"?.

The first declaration does make room for the string on the stack, but the initializer "" reserves only 1 byte, just enough for the '\0' terminator. Either put the size to reserve inside the brackets, or initialize the array with a string literal of the desired size (the compiler counts the characters).

The second declaration should make room for the string on the heap (probably), but that was never even attempted. The correct approach is to call malloc(), which requests memory from the operating system (or from the runtime's allocator) and returns the address of the reserved block.

The size you give only reserves memory; it does not impose any limit on anything in C. If your code writes outside the reserved area, it may appear to work. But it causes serious problems: lost data, a crash (if you are lucky), or a security hole, because the behavior is undefined.
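To make the array sizes concrete, here is a small sketch that asks the compiler how many bytes each form of declaration reserves. The function name show_reserved_sizes and the variable names are made up for this illustration.

```c
#include <stdio.h>

/* Fills sizes[0..2] with the bytes reserved by three array
   declarations and prints them. */
void show_reserved_sizes(size_t sizes[3]) {
    char empty[] = "";            /* only the '\0' terminator */
    char fixed[10];               /* size given in the brackets */
    char counted[] = "123456789"; /* 9 characters plus '\0' */

    sizes[0] = sizeof empty;
    sizes[1] = sizeof fixed;
    sizes[2] = sizeof counted;
    printf("empty=%zu fixed=%zu counted=%zu\n",
           sizes[0], sizes[1], sizes[2]);
}
```

It prints empty=1 fixed=10 counted=10, so an array declared like TextoUm[] = "" has room for the terminator and nothing else.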
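A minimal sketch of the malloc() approach, with the checks that simple examples usually skip. The helper name read_word_alloc is hypothetical, the 10-byte capacity is an arbitrary choice, and sscanf on a fixed string again stands in for scanf.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reserves 10 bytes on the heap and stores the
   first word of `input` there (at most 9 characters plus '\0').
   Returns the buffer, or NULL if allocation or parsing failed.
   The caller must release the buffer with free(). */
char *read_word_alloc(const char *input) {
    char *buf = malloc(10);
    if (buf == NULL) {
        return NULL;              /* no memory available */
    }
    if (sscanf(input, "%9s", buf) != 1) {
        free(buf);                /* nothing was read */
        return NULL;
    }
    return buf;
}
```

The pointer now holds the address of memory that really belongs to the program, which is what the original char *TextoDois = "" was missing.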

Something like this works and is almost correct (it is not yet 100% safe code):

#include <stdio.h>
#include <stdlib.h>

int main() {
    char TextoUm[10]; //note the change here
    char *TextoDois = malloc(10); //note the change here
    printf("Digite um nome para \"TextoUm\":\n");
    scanf("%s", TextoUm);
    printf("O texto digitado foi: %s", TextoUm);
    printf("\nDigite um nome para \"TextoDois\":\n");
    scanf("%s", TextoDois);
    printf("O texto digitado foi: %s", TextoDois);
}

See it running on ideone and on repl.it. I also put it on GitHub for future reference.

Since nothing had been reserved on the heap in your code, the pointer held an unsuitable address (that of the "" literal), and writing to that address caused a memory error because it is a protected memory area.

So you can use a pointer to char, but it has to be initialized correctly. The error is not in scanf(); it is only the symptom of something done wrong earlier.

With the char array it seemed to work correctly, but only by coincidence and because this is an exercise; in a real application it would be far more problematic. It is a worse mistake precisely because it went undetected.
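Because the reserved size does not limit anything by itself, the capacity has to travel with the buffer. One common way to do that is fgets(), which takes the size as an argument. This is only a sketch: read_line is a hypothetical helper, and it simply leaves the rest of an overlong line in the stream.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf`, which has
   room for `size` bytes. fgets stores at most size - 1 characters plus
   the '\0' terminator, so it cannot write past the buffer.
   Returns 1 on success, 0 on end of input or error. */
int read_line(FILE *in, char *buf, int size) {
    if (size <= 0 || fgets(buf, size, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if present */
    return 1;
}
```

With a 10-byte buffer, calling read_line(stdin, TextoUm, sizeof TextoUm) keeps at most 9 characters of whatever the user types.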

Conclusion

I take this opportunity to point out that an array looks like a pointer, but they are not the same thing.

There are other points on the subject, but this cannot turn into a full tutorial. In fact, I have already answered questions here on the site about everything you are learning. Just search or look at my profile. Other people have answered as well.
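A short sketch of that difference, with made-up names: the array owns its bytes, while the pointer only stores an address of bytes that live somewhere else.

```c
#include <stdio.h>

/* Fills sizes[0] with the size of an array and sizes[1] with the
   size of a pointer, then prints both. */
void show_array_vs_pointer(size_t sizes[2]) {
    char as_array[10] = "abc";      /* 10 writable bytes owned by the array */
    const char *as_pointer = "abc"; /* just an address; the literal is read-only */

    as_array[0] = 'X';              /* fine: this storage belongs to the array */
    /* Writing through as_pointer is rejected by the compiler because of
       const; without const it would be undefined behavior. */

    sizes[0] = sizeof as_array;     /* always 10 */
    sizes[1] = sizeof as_pointer;   /* size of an address, platform dependent */
    printf("%s %s\n", as_array, as_pointer);
    printf("array: %zu bytes, pointer: %zu bytes\n", sizes[0], sizes[1]);
}
```

The array reports 10 bytes on every platform, while the pointer reports whatever an address occupies there (commonly 4 or 8), no matter how long the string it points to is.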

  • 1

    Thank you very much for the excellent answer. I had come to think that, since the first scanf saved the string in the array, scanf automatically set or adjusted the allocation size for the string based on the number of characters it received from the user. Now I see that it is not that easy :). Nor did I suspect that hackers could exploit this buffer overflow, or that it was related to my problem with scanf... thank you very much again.
