Help understanding an error when using a hash

I am writing a program that reads the words of a text file and generates a hash code for each one. The hash is used as the index at which the word is stored in a vector that holds up to 1000 words.

The function that generates the hash:

int geraHash(char *str){
    int a, i, j, soma = 0;
    i = strlen(str);
    j = 1;
    for(a = 0; a < i; a++){
        soma += (int)str[a] * pow(7, j);  // pow needs <math.h>
        j++;
    }
    soma = soma%1000;
    printf("Hash: %d\n", soma);
    return soma;
}
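For reference, the same sum can be computed with integer arithmetic only, which avoids the floating-point pow call and keeps the intermediate values small. This is a minimal sketch, not the code from the program above: the name gera_hash_int is hypothetical, and it assumes plain ASCII words. It should agree with the function above as long as the original sum does not overflow an int.

```c
#include <stddef.h>

/* Hypothetical integer-only variant of the hash above:
   sum of str[a] * 7^(a+1), reduced modulo 1000 at every step. */
int gera_hash_int(const char *str) {
    int soma = 0;
    int potencia = 7;                       /* 7^1 for the first character */
    for (size_t a = 0; str[a] != '\0'; a++) {
        soma = (soma + (unsigned char)str[a] * potencia) % 1000;
        potencia = (potencia * 7) % 1000;   /* keep the power small too */
    }
    return soma;
}
```

With the upper-cased words from the test file, this gives 670 for "ABC", 867 for "DEF", 846 for "ASD" and 801 for "CD", which are the four indices printed at the end of the program.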

The vector struct:

typedef struct vetor{  
    char * palavra;  
}Vetor;

And the code:

char word[1000];
Vetor * vet = (Vetor *)malloc(1000*sizeof(Vetor));

FILE * file;
file = fopen("teste.txt", "r");

while(fgets(word, 1000, file) != NULL){
    char * p;
    p = strtok(word, " \n\r\t");
    while(p != NULL){
        strupr(p);
        int hashPos;
        hashPos = geraHash(p);

        printf("Palavra: %s  Hash: %d\n",p, hashPos);
        printf("\n");

        vet[hashPos].palavra = p;

        p = strtok(NULL, " \n\r\t");
    }
}

printf(" %s " , vet[670].palavra);
printf(" %s " , vet[801].palavra);
printf(" %s " , vet[867].palavra);
printf(" %s " , vet[846].palavra);

fclose(file);

The teste.txt file contains only two lines: the first is "abc def" and the second is "Asd cd".
Output: [image: the program's output]

I don’t understand why the same word appears in two different positions generated by the hash function. Could anyone tell me where I’m going wrong?

1 answer

The problem is related to the way strtok works and to the fact that you are saving the original pointer returned by strtok.

Note the caveat in the documentation about the string passed to strtok:

Notice that this string is modified by being broken into smaller strings (tokens).

That is, the string passed is modified, and so when you save the word:

vet[hashPos].palavra = p;

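You are storing a pointer into the word buffer itself, not a copy of the token. strtok modifies that buffer, and the next fgets call overwrites it, so entries stored for an earlier line end up showing text from a later line.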
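The effect can be reproduced without a file. The sketch below is only an illustration: the function name saved_token_changes is hypothetical, and a second strcpy stands in for the next fgets call that refills the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a token pointer saved from strtok shows different text
   after the buffer it points into is reused for the next line. */
int saved_token_changes(void) {
    char word[32];
    char first[32];

    strcpy(word, "abc def");
    char *saved = strtok(word, " \n\r\t");  /* points inside word, not at a copy */
    strcpy(first, saved);                   /* private copy of the first token */

    strcpy(word, "Asd cd");                 /* stands in for the next fgets call */
    strtok(word, " \n\r\t");

    printf("saved token was \"%s\", now reads \"%s\"\n", first, saved);
    return strcmp(saved, first) != 0;
}
```

The saved pointer never moved, but the text behind it did, which is why two different hash positions can end up showing the same word.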

The solution is to store a duplicate of the string instead of the pointer you were given. There is already a function for this, strdup, which allocates the space needed for the copy with malloc. This means that once you no longer need these strings you must call free on them to avoid a memory leak.

Then you just need to change the instruction that saves the word to:

vet[hashPos].palavra = strdup(p);

Of course, you could also do the duplication by hand using malloc, strlen and strcpy:

vet[hashPos].palavra = malloc(strlen(p) + 1); // +1 for the terminator
strcpy(vet[hashPos].palavra, p);
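Putting the manual duplication together with the cleanup might look like the sketch below. The helper names duplica, guarda_palavra and libera_vetor are hypothetical, and the code assumes the vector was zero-initialized (for example with calloc instead of malloc), so that calling free on an empty slot is safe.

```c
#include <stdlib.h>
#include <string.h>

typedef struct vetor {
    char *palavra;
} Vetor;

/* Hypothetical helper: returns a heap copy of s, or NULL if malloc fails. */
char *duplica(const char *s) {
    size_t n = strlen(s) + 1;          /* +1 for the terminator */
    char *copia = malloc(n);
    if (copia != NULL)
        memcpy(copia, s, n);
    return copia;
}

/* Hypothetical helper: stores a private copy of token in slot pos.
   Assumes vet was zero-initialized, so free() on an empty slot is a no-op.
   Returns 1 on success, 0 if the copy could not be allocated. */
int guarda_palavra(Vetor *vet, int pos, const char *token) {
    char *copia = duplica(token);
    if (copia == NULL)
        return 0;
    free(vet[pos].palavra);            /* release an older word on collision */
    vet[pos].palavra = copia;
    return 1;
}

/* Hypothetical helper: frees all n stored copies. */
void libera_vetor(Vetor *vet, size_t n) {
    for (size_t i = 0; i < n; i++) {
        free(vet[i].palavra);
        vet[i].palavra = NULL;
    }
}
```

In the reading loop, guarda_palavra(vet, hashPos, p) would take the place of the assignment, and libera_vetor(vet, 1000) followed by free(vet) would run once the words are no longer needed. Note that this sketch simply replaces the older word when two words hash to the same position.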

Example execution on my machine:

[image: screenshot of the program's output]

  • Got it. I did as suggested and it really worked. Thank you very much for the clarification!
