Find strings within *.txt files in C


I’m a beginner and I’m eager to write a program in C++ that needs to do two things:

1) Searching a *.txt file for a specific string;

2) Through some type of index indicate the position of this string within the file.

To explain better with an example: I have the text "Nome: Luiz Fernando Oliveira". What I need is a program that finds the string "Nome:" and reports its position in the file, so that I can read "Luiz Fernando Oliveira" and save it in another file.

I know there is plenty of material on this, but I can’t get the pieces to fit. Any help is welcome.

Thanks in advance.

    Why not read the entire contents of the file into a string in memory and then search for Nome: within that string, returning the position where it was found?

1 answer


What you need is:

  1. Open the input file.

  2. Determine the file size.

  3. Put all file contents in a string in memory.

  4. Look for Nome: in that string, finding the appropriate position.

  5. Copy the name into another string.

  6. Open the output file.

  7. Write to output file.

  8. Close both files.

To open a file, use the function fopen. In step 1, open the input in binary read mode ("rb"). In step 6, open the output in binary write mode ("wb", or "ab" to append). Look at my other answer for more details.

For step 2, as described in that earlier answer of mine, use this:

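fseek(fp, 0L, SEEK_END);   /* move to the end of the file */
long sz = ftell(fp);       /* ftell returns a long: the size in bytes */
fseek(fp, 0L, SEEK_SET);   /* go back to the beginning before reading */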

In step 3, use malloc to allocate enough memory for the string and fread to read the contents of the file into it.
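Steps 1 to 3 could be put together roughly as in the sketch below. The function name read_whole_file and its out_size parameter are made up for this illustration, and the extra terminating byte is just a convenience I am assuming you would want; treat it as one possible arrangement, not the only one.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a malloc'd buffer.
   On success returns the buffer and stores the byte count in *out_size.
   On failure returns NULL. The caller must free() the result. */
char *read_whole_file(const char *path, long *out_size)
{
    FILE *fp = fopen(path, "rb");              /* step 1: binary read mode */
    if (fp == NULL)
        return NULL;

    /* step 2: determine the file size */
    if (fseek(fp, 0L, SEEK_END) != 0) { fclose(fp); return NULL; }
    long sz = ftell(fp);
    if (sz < 0 || fseek(fp, 0L, SEEK_SET) != 0) { fclose(fp); return NULL; }

    /* step 3: allocate (one extra byte for a '\0') and read everything */
    char *buf = malloc((size_t)sz + 1);
    if (buf == NULL) { fclose(fp); return NULL; }

    size_t got = fread(buf, 1, (size_t)sz, fp);
    fclose(fp);
    if (got != (size_t)sz) { free(buf); return NULL; }

    buf[sz] = '\0';                            /* convenience terminator */
    *out_size = sz;
    return buf;
}
```

Every failure path closes the file and frees the buffer, so the caller only has to check for NULL and call free once on success.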

A possible way to do step 4 would be:

  • Write a function that searches for one string inside another: int busca_string(char *agulha, char *palheiro, int tamanho_agulha, int tamanho_palheiro). The analogy is finding a needle in a haystack, where the haystack (palheiro) is the contents of the file and the needle (agulha) is what you are looking for.

  • In this function, you can use two for loops, one nested inside the other. The outer loop walks through each character of the string read from the file (the palheiro). The inner loop checks, starting from the outer loop's current position, whether the characters in the palheiro match the sequence of characters you are looking for (Nome:, which is the agulha).

  • Use break in the inner loop when what you find in the palheiro does not match what you are examining in agulha.

  • After the end of the inner loop, but still inside the outer loop, check whether the inner loop ran to completion, that is, it did not exit through break. If it did, the agulha was found, so return 1; (or return the outer loop's current index instead, if the caller needs the position).

  • If the outer loop finishes without a match, return 0; (or -1 if you are returning positions, since 0 is a valid index).

  • Be careful not to access memory beyond the limit of either of the two strings.
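The bullets above might translate into something like the following. Note two deviations that are my own assumptions: this variant returns the index where the agulha starts (or -1 when it is absent) rather than 1 or 0, because step 5 needs the position, and the pointers are declared const since the function only reads them.

```c
/* Sketch: returns the index in palheiro where agulha first occurs,
   or -1 if it does not occur. Sizes are given in characters. */
int busca_string(const char *agulha, const char *palheiro,
                 int tamanho_agulha, int tamanho_palheiro)
{
    if (tamanho_agulha <= 0 || tamanho_agulha > tamanho_palheiro)
        return -1;

    /* Outer loop: stop while the needle still fits in what remains,
       so palheiro[i + j] never goes past the end of the haystack. */
    for (int i = 0; i <= tamanho_palheiro - tamanho_agulha; i++) {
        int j;
        /* Inner loop: compare the needle against the haystack at i. */
        for (j = 0; j < tamanho_agulha; j++) {
            if (palheiro[i + j] != agulha[j])
                break;                  /* mismatch: try the next i */
        }
        if (j == tamanho_agulha)        /* inner loop ran to completion */
            return i;
    }
    return -1;                          /* outer loop ended: not found */
}
```

Limiting the outer loop to tamanho_palheiro - tamanho_agulha is what takes care of the "do not read past the end" warning without needing an extra bounds check inside the inner loop.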

As for step 5, I can’t say: you did not explain how you will know where the name ends, so that it can be separated from whatever comes after it. What is certain is that the name starts at the position found in step 4 plus the size of the agulha.
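Purely as an illustration of step 5, here is a sketch that assumes the name runs until the end of the line (a '\r' or '\n') or the end of the buffer, and that spaces right after the agulha should be skipped. Both rules are assumptions on my part, not something stated in the question, and the function name extract_until_eol is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper. `inicio` is the position from step 4 plus the size
   of the agulha. ASSUMPTION: the value ends at '\r', '\n' or end of buffer.
   Returns a malloc'd, '\0'-terminated copy, or NULL on error. */
char *extract_until_eol(const char *palheiro, int tamanho_palheiro, int inicio)
{
    if (inicio < 0 || inicio > tamanho_palheiro)
        return NULL;

    /* Skip the spaces that separate "Nome:" from the name itself. */
    while (inicio < tamanho_palheiro && palheiro[inicio] == ' ')
        inicio++;

    /* Advance to the end of the line or of the buffer. */
    int fim = inicio;
    while (fim < tamanho_palheiro &&
           palheiro[fim] != '\n' && palheiro[fim] != '\r')
        fim++;

    size_t len = (size_t)(fim - inicio);
    char *nome = malloc(len + 1);
    if (nome == NULL)
        return NULL;

    memcpy(nome, palheiro + inicio, len);
    nome[len] = '\0';
    return nome;
}
```

If your file marks the end of the name differently (a fixed width, a separator such as ';', the next label), only the second while condition needs to change.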

For step 7, use fwrite.

For step 8, use fclose. Don’t forget to call free for each malloc.
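Steps 6 to 8 for the output side could look roughly like this. The helper name write_bytes is invented for the example, and it opens with "wb", which truncates an existing file; use "ab" instead if you would rather append.

```c
#include <stdio.h>

/* Hypothetical helper: writes `len` bytes from `data` to the file at `path`.
   Returns 0 on success and -1 on any failure. */
int write_bytes(const char *path, const char *data, size_t len)
{
    FILE *out = fopen(path, "wb");           /* step 6: binary write mode */
    if (out == NULL)
        return -1;

    size_t written = fwrite(data, 1, len, out);   /* step 7 */

    /* step 8: fclose flushes buffered data, so its result matters too */
    if (fclose(out) != 0 || written != len)
        return -1;

    return 0;
}
```

The buffers obtained with malloc in the earlier steps still have to be released with free by whoever allocated them; this helper only owns the FILE it opens.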

If you prefer to use text mode instead of binary mode ("r", "w" and "a" instead of "rb", "wb" and "ab"), swap fwrite for fprintf and fread for fgets. In that case, however, the size computed in step 2 may not match the number of characters actually read, because line-break conversions such as \r\n -> \n or \n -> \r\n can occur, so the allocated memory area may end up being the wrong size. Therefore, I recommend using binary mode for reading. For writing, use whichever mode you think best (but to avoid surprises, you might want to stick to binary).

  • Thank you very much, Victor! That cleared things up a lot.
