strcpy is merging numeric format with other chars

Asked

Viewed 198 times

6

I don’t know if I made myself clear in the title, but when using strcpy() to copy a char* to another when I put a format like this "teste" it works normally, but when I put a string 3 letter format (digits in case), for example "2000" it ends up merging this value for the destination with the next value the next time I use strcpy(), follows the code:

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    char nome[80];
    char ano[4];
    char diretor[80];
} Filme;

void analise(Filme *filme, const char *arg1, const char *arg2, const char *arg3)
{
    Filme _filme = {
        .nome    = malloc(2),
        .ano     = malloc(2),
        .diretor = malloc(2)
    };

    strcpy(_filme.nome, arg1);
    strcpy(_filme.ano, arg2);
    strcpy(_filme.diretor, arg3);

    memcpy(filme, &_filme, sizeof _filme);
}


void carregar(Filme filmes[])
{
    analise(&filmes[0], "E o Vento Levou", "1939", "Victor");
    analise(&filmes[1], "teste", "998", "bar");
    analise(&filmes[2], "Os Passaros", "1963", "Alfred Hitchcock");
}

int main()
{
    Filme filmes[1000];

    carregar(filmes);

    printf("\nmain:\n");

    printf("- Nome:    %s\n", filmes[0].nome);
    printf("- Ano:     %s\n", filmes[0].ano);
    printf("- Diretor: %s\n", filmes[0].diretor);

    printf("----------------\n");

    printf("- Nome:    %s\n", filmes[1].nome);
    printf("- Ano:     %s\n", filmes[1].ano);
    printf("- Diretor: %s\n", filmes[1].diretor);

    printf("----------------\n");

    printf("- Nome:    %s\n", filmes[2].nome);
    printf("- Ano:     %s\n", filmes[2].ano);
    printf("- Diretor: %s\n", filmes[2].diretor);

    return 0;
}

Notice I ran this:

analise(&filmes[0], "E o Vento Levou", "1939", "Victor");
analise(&filmes[1], "teste", "998", "bar");
analise(&filmes[2], "Os Passaros", "1963", "Alfred Hitchcock");

When running the problem the output is this:

main:
- Nome:    E o Vento Levou
- Ano:     1939Victor
- Diretor: Victor
----------------
- Nome:    teste
- Ano:     998
- Diretor: bar
----------------
- Nome:    Os Passaros
- Ano:     1963Alfred Hitchcock
- Diretor: Alfred Hitchcock

See that in "Gone with the Wind" and "Gone with the Birds" the years were merged with the director’s name, 1939Victor and 1963Alfred Hitchcock, already in the case of:

analise(&filmes[1], "teste", "998", "bar");

It has the correct output. I understand that I should make the year with int, but I’m learning C and I’d like to better understand this part of the memory, I assume it was some typo of mine.

  • 1

    not just declare char ano[5]; within the type Filme?

  • @Why Marcelouchimura? Is there reason, has explanation for this? And if the values are dynamic, how to ensure that do not pass these 4 digits, for example for some error the user type 99800, as I would?

  • 1

    I think what @Marcelouchimura was trying to say is that the terminator lacked the space, which Maneiro already focused on his answer.

  • 1

    @Guilhermenascimento although more common in royal code, let the strings dynamic size becomes more difficult to manage. To learn I recommend not to use this at first. Anyway the year seems to me a clear case of being fixed size and should not be allocated out. The biggest reason to have a separate allocation is when you don’t know the size of the object, along with the reason when you don’t know the lifespan (of course when you know the size and it’s too big or the lifespan is longer than you can control locally).

  • @Isac was the missing explanation, and personally I think it’s nice for people to come here and say, "Do XYZ," but without explaining the motivations of this, it’s hard to determine why and learn. After all, we are not seeking to "just make a code work", we are seeking to truly learn. There’s a lot of code that works, but that doesn’t mean they’re right.

  • 1

    I agree with you 100% and that’s what I always try to do. Avoid promoting the blind copy of "things that work". So much so that there have been questions that I’ve answered with two or three code responses working, and I’ve only answered to explain the problem, so that the person can actually figure out where they went wrong and improve in the future

  • @Guillhermenascimento, have you learned the reason for [5] now??

  • @Marcelouchimura I think it’s already pretty obvious that yes. And the problem of releasing comments "do XYZ" without explaining the reasons, already learned why this is not good for the community

  • @Guilhermenascimento, is not good for the community or is not good for those who have no interest (or time) to search and want to hand kissed?

  • @Marcelouchimura has nothing to do with kissing, you’re not getting it, see my score, see how much I answer, see how I try to be judicious, I’m not asking for code ready ... this site is not a forum for technical support, I asked for guidance on how to work with strcpy, which is a totally valid question about the C string API, you just came in and made a vague, normal, and acceptable comment, but I asked why this is also fully valid, after all do Xyz without the least complicates explanation, even more for those who are studying.

  • @Guilhermenascimento, your score shows you have plenty of time to stay here.

  • @Marcelouchimura a very prejudiced view without analysis and superficial to yours, Missing you look at the content of the posts and know how to receive a constructive criticism instead of getting personal and playing the victim, I made a fully valid inquiry into the purpose of the site, "Why 5?", simply could you have taken as something constructive and formulated a brief explanation, we’d be fine, but it’s clear that you’re trying to turn this into some kind of inquisition based on a trial of your own, totally superficial.

Show 8 more comments

2 answers

5


This code has some problems.

  • There’s nothing to wear malloc() if the area that the string should be already reserved within the structure. It could even allocate if desired, and maybe makes sense for the names, but then you need to declare as const char * and not [tamanho]. You have to be much more careful when you do that. What’s even leaking memory. At high volume that would be tragic. And allocating only 2 characters where you need several also does not work very well, is that in this case coincidentally works.
  • Not that it’s a problem, but I see no reason to create a local structure, initialize its members and then copy to the array. Write directly to array, without creating anything intermediate, initialize or copy.
  • As strings have a terminator, so you need to make room for it. The specific problem you’re encountering is that it has 4 bytes reserved for the year, so the 4 characters of the year are placed there, and a 5th. is placed next. When you name the director, its first character goes on top of the terminator of the string of the year. Then when it will read the year it has no end, only finalize at the end of the name of the director, so it is all together. If it had 5 bytes, the terminator would be preserved and everything would work ok.

That’s how it works:

#include <string.h>
#include <stdio.h>

typedef struct {
    char nome[81];
    char ano[5];
    char diretor[81];
} Filme;

void analise(Filme *filme, const char *arg1, const char *arg2, const char *arg3) {
    strcpy(filme->nome, arg1);
    strcpy(filme->ano, arg2);
    strcpy(filme->diretor, arg3);
}

void carregar(Filme filmes[]) {
    analise(&filmes[0], "E o Vento Levou", "1939", "Victor");
    analise(&filmes[1], "teste", "998", "bar");
    analise(&filmes[2], "Os Passaros", "1963", "Alfred Hitchcock");
}

int main() {
    Filme filmes[1000];
    carregar(filmes);
    printf("\nmain:\n");
    for (int i = 0; i < 3; i++) {
        printf("- Nome:    %s\n", filmes[i].nome);
        printf("- Ano:     %s\n", filmes[i].ano);
        printf("- Diretor: %s\n", filmes[i].diretor);
        printf("----------------\n");
    }
}

Behold working in the ideone. And in the repl it.. Also put on the Github for future reference.

  • But assuming the value is dynamical, then end up passing the "80" in ". name" for example, maybe then I should use strcpy_s to try to make sure it’s "80"?

  • 1

    This is much safer, I didn’t get into this question. Making code ready for production is always much more difficult than these examples we see in books, blogs, tutorials and responses in the :) Validations are always necessary in everything, especially from outside data and will be used in C.

  • Note that the strcpy_s() is not mandatory. There are cases you can trust that the string is well formed. Obviously (almost) never when it comes from external source.

3

What happens is that your struct occupies 164 bytes in memory:

  • 80 bytes for the name, including the null terminator;
  • 4 bytes for the year, you are not considering the null terminator here;
  • 80 bytes for director, including null terminator.

The compiler will assign the following shifts for each of these fields:

  • nome: 0 bytes offset since the beginning of the struct in memory.
  • ano: 80 bytes offset from the beginning of the struct in memory.
  • diretor: 84 bytes offset since the beginning of the struct in memory.

So, if we ask the first case for example, the layout looks like this (where is the null terminator and is garbage, which can have any value):

E o Vento Levou∅□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□1939Victor∅□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□

When you do that:

printf("- Ano:     %s\n", filmes[0].ano);

At runtime, the generated code will load the memory address of filmes, add 164 0 to get filmes[0] (164 = sizeof Filme, 0 is the index) and then add 80 to get the position of filmes[0].ano (80 is the position of ano within the struct). From this position it will print a string (due to the %s) starting at that calculated position and ends at the first null terminator found. It turns out that the year does not have a null terminator, and therefore it will end up invading the subsequent memory region of the field diretor.

Besides, that code makes no sense:

Filme _filme = {
    .nome    = malloc(2),
    .ano     = malloc(2),
    .diretor = malloc(2)
};

What you probably wanted was this:

void analise(Filme *filme, const char *arg1, const char *arg2, const char *arg3) {
    strcpy(filme->nome, arg1);
    strcpy(filme->ano, arg2);
    strcpy(filme->diretor, arg3);
}

To solve your problem without using int for the year, one possibility is to change the size of the field ano to 5. If the other fields should have a maximum of 80 characters instead of 79, then change their size to 81 in order to make sure that the null terminator is there.

Another alternative that avoids having to resize the size of the fields is to specify a maximum size for the strings in the printf:

printf("- Nome:    %.80s\n", filmes[0].nome);
printf("- Ano:     %.4s\n", filmes[0].ano);
printf("- Diretor: %.80s\n", filmes[0].diretor);
  • Thank you for the explanation about the "garbage", I already had a notion about this, but your visual example was very useful even, still yes the solution with printf escapes a little to what I wish, it is not that it does not work, but it is that it is a little impracticable since I was seeking control by the very values of the struct, that looking so now it seems to me unnecessary to fix the size, still I believe that strcpy_s would be more feasible to solve all this. Anyway +1 by visual explanation.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.