Program error
The error in your program is that it treats the data inside the struct as strings without leaving room for the \0 character.
Remember that in the C language a string is a one-dimensional array terminated by the \0 character. Example:
This is a string\0
This is an array of characters
(a character array without the \0 is not a string)
The \0 character indicates where a string ends. If an array is treated as a string but does not contain the \0, the language cannot guess where it ends.
Observe:
char Codg[10];
Here we have an array that stores exactly 10 characters, and in your .csv file the code has exactly 10 characters, so there is no room for the \0. Without the \0 you cannot use this array as a string (as you do in the printf), because the program has no way of knowing where it ends.
Another problem is when you use fscanf with the conversion %10[^;]; to read the code: here you are using this array as if it were a string. Since you are using it as a string, fscanf will put the \0 right after the last character it reads, at position Codg[10]. Notice:
Codg[0] = 'R'
Codg[1] = 'O'
Codg[2] = '2'
Codg[3] = '1'
Codg[4] = '/'
Codg[5] = '0'
Codg[6] = '1'
Codg[7] = '/'
Codg[8] = '0'
Codg[9] = '3'
Codg[10] = '\0'
Here we have a problem: position Codg[10] is not part of your array (remember that 10 positions go from 0 to 9). The \0 therefore ends up in a memory position that is not reserved for this array, so that area can be overwritten at any time (which would be bad).
Right after it we have:
char Regiao[10];
The interesting thing is that this array is declared right after the code array, so the two sit side by side in memory; that is, position Codg[10] is the same as Regiao[0] (Codg[10] is not part of the code array, but it is whatever comes right after Codg[9], the array's last element). So when you store something in Regiao[0], the code's \0 character is lost. Notice how these arrays look after content is read into Regiao:
Codg[0] = 'R'
Codg[1] = 'O'
Codg[2] = '2'
Codg[3] = '1'
Codg[4] = '/'
Codg[5] = '0'
Codg[6] = '1'
Codg[7] = '/'
Codg[8] = '0'
Codg[9] = '3'
Regiao[0] = 'N' // Codg[10] = '\0' was overwritten
Regiao[1] = 'o'
Regiao[2] = 'r'
Regiao[3] = 't'
Regiao[4] = 'e'
Regiao[5] = '\0' // Indicates the end of the string
Regiao[6] = '' // I left it empty, but in reality there will be some garbage in this place
Regiao[7] = '' // I left it empty, but in reality there will be some garbage in this place
Regiao[8] = '' // I left it empty, but in reality there will be some garbage in this place
Regiao[9] = '' // I left it empty, but in reality there will be some garbage in this place
Note that the \0 that marked the end of the code was lost. Now if you use printf to print the code as if it were a string, the program prints everything until it finds a \0; since the next \0 comes after the region name, the region's content is printed as well.
Notice that the fundamental difference between these two arrays is that in one the \0 falls outside the array and so can be overwritten, while in the other the \0 stays inside the array and is not overwritten (it could be overwritten manually, but that is not the case here).
To solve your problem, the \0 must always stay inside the array. To do this, just increase the array's size by 1; that is, if a string will have at most 65 letters, the array needs 66 positions (one more for the \0). In your code it would be something like this:
typedef struct {
    char Codg[10 + 1];   // +1 for the \0
    char Regiao[10 + 1]; // +1 for the \0
    char UF[2 + 1];      // +1 for the \0
    char Data[10 + 1];   // +1 for the \0
} dados_cov;
Now, this is the part I don't understand:
for(int i = 0; i < 3; i++) // For some reason, when I read the .csv file it comes with 3 random characters
    fgetc(file);
I ran your code without it and it worked normally.
Your final code would look something like this:
#include <stdio.h>

typedef struct {
    char Codg[10 + 1];
    char Regiao[10 + 1];
    char UF[2 + 1];
    char Data[10 + 1];
} dados_cov;

int main(void) {
    FILE *file;
    dados_cov D[10];

    file = fopen("COV.csv", "r");
    if (!file) {  // do not read from or close a file that failed to open
        perror("COV.csv");
        return 1;
    }
    /*
    for(int i = 0; i < 3; i++) // For some reason, when I read the .csv file it comes with 3 random characters
        fgetc(file);
    */
    for (int i = 0; i < 10; i++) {
        // Stop at the first line that does not yield all 4 fields
        if (fscanf(file, "%10[^;];%10[^;];%2[^;];%10[^\n]\n", D[i].Codg, D[i].Regiao, D[i].UF, D[i].Data) != 4)
            break;
        printf("%s - %s - %s - %s\n", D[i].Codg, D[i].Regiao, D[i].UF, D[i].Data);
    }
    fclose(file);
    return 0;
}
Just noting that the ideal here would be a program that parses the CSV by its separator and builds the strings of the structure with strcat. There is no way to guarantee the correct sizes from the CSV if any of the fields changes size, so the ideal would be not to use char[] to store the strings, but char* or even your own dynamic string library. Parsing is relatively simple because CSV has a field separator and may or may not use "" around strings.
– Andre Cavalcante