Separating small strings from a giant string

Viewed 1,092 times

Hello. I need to write a function that reads a long string and splits it into smaller strings, one per field. Fields are separated by ';', for example:

BRUNNY;PR;MG;T;Câmara dos Deputados, Edifício Anexo;4;, gabinete nº;260;Brasília - DF - CEP 70160-900;3215-5260;3215-2260;08;21;[email protected];BRUNNY;Exma. Senhora Deputada;BRUNIELE FERREIRA GOMES

What I have come up with so far is:

int i,a = 0; char str[1000];
scanf("%[^\n]s", str);
for(i = 0; i < strlen(str); i++)
{
   if (str[i] == ';')
   {
    /** This is the part where I could not figure out how to copy the word found into its own variable. */
    a = i + 1;
   }
}

I couldn't find a way to store each word in its own variable, since the fields would be something like: name, party, UF, address, and so on.

2 answers

2

The simplest solution is to use the strtok function from the C standard library, which lets you read a string word by word based on a delimiter.

My answer is essentially the same as the example in the documentation, except that I created an array of strings to store the values found. Storing them in separate standalone variables would clearly be impractical.

Code:

int i = 0;
char str[1000] = ""; //stays empty if nothing is read
scanf("%999[^\n]", str); //read one line, at most 999 characters

//first count the separators so the array can be created with the right size
char *letra = str;
int separadores = 0;

while (*letra != '\0'){
    if (*(letra++) == ';') separadores++;
}

char* palavras[separadores + 1]; //n separators delimit at most n + 1 words
char *palavra = strtok(str, ";"); //find the first word with strtok

while (palavra != NULL){ //strtok returns NULL when there are no more words
    palavras[i++] = palavra; //store the current word and advance
    palavra = strtok(NULL, ";"); //find the next word
}

Note the particular call used to find the second and subsequent words:

palavra = strtok(NULL, ";");

It receives NULL as its first argument. This makes strtok continue from the point where the previous call stopped, as the documentation states:

Alternatively, a null pointer may be specified, in which case the function continues scanning where a previous successful call to the function ended.
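To make the first-call versus later-call distinction concrete, here is a minimal sketch. The helper name collect_tokens and its parameters are my own, chosen for illustration; only strtok comes from the standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line in place and stores up to max token
   pointers in out. Returns the number of tokens stored. */
size_t collect_tokens(char *line, const char *delim, char **out, size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, delim);   /* first call: pass the buffer itself */

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, delim);     /* NULL: resume after the previous token */
    }
    return n;
}
```

Calling it on a writable buffer holding "BRUNNY;PR;MG" stores three pointers. Keep in mind that strtok treats a run of consecutive delimiters as a single one, so "a;;b" yields two tokens, not three.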

It is also worth noting that strtok modifies the original string, so if you need it later in the code, make a copy before extracting the words. The most suitable function for this is strcpy:

char str[1000];
char strOriginal[1000];
//... read the line into str first ...
strcpy(strOriginal, str); //copy from str to strOriginal
//rest of the code
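As a small illustration of why the copy matters, the sketch below backs up the string and then starts tokenizing. The function name tokenize_keeping_copy and its parameters are assumptions made for this example, not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into backup (capacity backup_size) and then
   starts tokenizing src in place. Returns the first token, or NULL if there
   is no token or backup is too small to hold the copy. */
char *tokenize_keeping_copy(char *src, char *backup, size_t backup_size,
                            const char *delim)
{
    if (strlen(src) + 1 > backup_size)
        return NULL;            /* refuse to overflow the backup buffer */

    strcpy(backup, src);        /* backup keeps the untouched line */
    return strtok(src, delim);  /* strtok overwrites the delimiter with '\0' */
}
```

After a call with src holding "Alpha;Beta", printing src shows only "Alpha", because the first ';' was replaced by a terminator, while backup still holds the full line.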

Working example on Ideone

  • If you want to keep the original contents of str, just make a copy with strcpy into another variable and use str for tokenizing.

  • @Jeffersonquesado Yes, I had not mentioned it, but it is indeed relevant that strtok changes the original string. Thanks for the reminder.

  • Some college trauma that I wish other people wouldn't go through :) After all, not everyone remembers to RTFM.

  • @Jeffersonquesado I fully agree. C is full of little details that cause headaches and waste time when you don't know about them.

0

I suggest using strtok() to extract the tokens contained in the line, and malloc(), realloc(), strdup() and free() to build a fully dynamic list of strings.

Here is a (tested) example of the proposed idea:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


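char ** strsplit( const char * src, const char * delim )
{
    char * pbuf = NULL;
    char * ptok = NULL;
    char ** pparr = NULL;
    char ** ptmp = NULL;
    size_t count = 0;

    /* strtok() modifies its input, so tokenize a private copy of src */
    pbuf = (char*) malloc( strlen( src ) + 1 );

    if( !pbuf )
        return NULL;

    strcpy( pbuf, src );

    /* the list always ends with a NULL pointer, so start with that one slot */
    pparr = (char**) malloc( sizeof(char*) );

    if( !pparr )
    {
        free( pbuf );
        return NULL;
    }

    pparr[0] = NULL;

    ptok = strtok( pbuf, delim );

    while( ptok )
    {
        /* grow the list: existing tokens, the new one, and the terminator */
        ptmp = (char**) realloc( pparr, (count+2) * sizeof(char*) );

        if( ptmp )
        {
            pparr = ptmp;
            pparr[count] = strdup( ptok );
            pparr[count+1] = NULL;
        }

        if( !ptmp || !pparr[count] )
        {
            /* out of memory: release everything built so far */
            while( count > 0 )
                free( pparr[--count] );

            free( pparr );
            free( pbuf );
            return NULL;
        }

        count++;
        ptok = strtok( NULL, delim );
    }

    free( pbuf );

    return pparr;
}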


void strsplitfree( char ** strlist )
{
    int i = 0;

    while( strlist[i])
        free( strlist[i++] );

    free( strlist );
}


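int main( int argc, char * argv[] )
{
    int i = 0;
    char ** pp = NULL;

    /* the string to split is expected as the first command-line argument */
    if( argc < 2 )
    {
        fprintf( stderr, "usage: %s \"field1;field2;...\"\n", argv[0] );
        return 1;
    }

    pp = strsplit( argv[1], ";" );

    if( !pp )
        return 1;

    while( pp[i] )
    {
        printf("[%d] %s\n", i + 1, pp[i] );
        i++;
    }

    strsplitfree( pp );

    return 0;
}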

Test #1:

$ ./split "Alpha;Beta;Gamma;Delta;Epsilon"
[1] Alpha
[2] Beta
[3] Gamma
[4] Delta
[5] Epsilon

Test #2:

$ ./split "Nome;Sexo;Data de Nascimento;Cidade de Nascimento;Estado Civil"
[1] Nome
[2] Sexo
[3] Data de Nascimento
[4] Cidade de Nascimento
[5] Estado Civil

Test #3:

$ ./split "BRUNNY;PR;MG;T;Câmara dos Deputados, Edifício Anexo;4;, gabinete nº;260;Brasília - DF - CEP 70160-900;3215-5260;3215-2260;08;21;[email protected];BRUNNY;Exma. Senhora Deputada;BRUNIELE FERREIRA GOMES"
[1] BRUNNY
[2] PR
[3] MG
[4] T
[5] Câmara dos Deputados, Edifício Anexo
[6] 4
[7] , gabinete nº
[8] 260
[9] Brasília - DF - CEP 70160-900
[10] 3215-5260
[11] 3215-2260
[12] 08
[13] 21
[14] [email protected]
[15] BRUNNY
[16] Exma. Senhora Deputada
[17] BRUNIELE FERREIRA GOMES
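One caveat for records with fixed field positions (name, party, UF, address, ...): strtok() skips empty fields, so a line containing ";;" would shift every later field by one position. If that can happen in your data, a splitter based on strchr() keeps the positions stable. The sketch below is only an illustration; the name split_keep_empty and its parameters are my own assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line in place on the single character sep,
   keeping empty fields. Stores up to max field pointers in fields and
   returns how many were stored. */
size_t split_keep_empty(char *line, char sep, char **fields, size_t max)
{
    size_t n = 0;
    char *start = line;

    while (n < max) {
        char *end = strchr(start, sep);  /* next separator, or NULL if none */

        fields[n++] = start;             /* may point at an empty string */

        if (end == NULL)
            break;                       /* that was the last field */

        *end = '\0';                     /* terminate the current field */
        start = end + 1;                 /* next field begins after sep */
    }
    return n;
}
```

With the line "BRUNNY;;MG;" this stores four fields: "BRUNNY", an empty string, "MG", and a trailing empty string, so index 2 is still the third field of the record.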
