How to loop with OpenMP to count the lines of a text file?

Viewed 435 times

0

How can I write a loop with the OpenMP library to count the lines of a file?

#pragma omp parallel for
for (string line; getline(file, line); ) {
    count++;
}

Written this way it doesn't work; it seems OpenMP only accepts an ordinary for loop, one that runs from one number to another.

  • Do you really need the library? You can do this with pure C++. See my answer.

  • @Lucashenrique, yes, I do need it.

3 answers

2

The problem is that the number of iterations cannot be determined when the loop starts. Knowing where each line begins requires having read the previous line, and knowing the total number of lines requires having read all of them. Inside the loop you wrote count++;, so each iteration needs the previous one to have finished before it can read and increment count. In short, nothing in this code is parallelizable.

Some possible solutions:

  1. Read all the lines of the file into an array beforehand, then iterate over that array in parallel, much as you intended to.

  2. Map the file into memory pages (the operating system has functions for this), identify the start and size of each line, and store each pair of integers in an array. Then iterate over that array in the same way.

  3. Create a producer thread that reads the file line by line and one or more consumer threads that process the lines read. I don't think OpenMP will help here, but the standard library has threading and synchronization primitives that can.
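A minimal sketch of option 1, assuming the per-line work is something like "is this line non-empty?". The function names readLines and countNonEmptyLines are hypothetical and stand in for whatever processing you actually need. The read stays serial; only the loop over the in-memory array is handed to OpenMP, because there the iteration count is known up front.

```cpp
#include <istream>
#include <string>
#include <vector>

// Phase 1 (serial): read every line into memory. This cannot be
// parallelized, since each getline depends on where the previous one ended.
// readLines is a hypothetical helper name, not part of any library.
std::vector<std::string> readLines(std::istream& in) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); ) {
        lines.push_back(line);
    }
    return lines;
}

// Phase 2 (parallel): the loop runs from 0 to n, which is the form
// OpenMP's "parallel for" accepts. The reduction clause gives each thread
// a private counter and sums them at the end.
// The "non-empty" check is only a placeholder for real per-line work.
long countNonEmptyLines(const std::vector<std::string>& lines) {
    long count = 0;
    const long n = static_cast<long>(lines.size());
    #pragma omp parallel for reduction(+:count)
    for (long i = 0; i < n; ++i) {
        if (!lines[i].empty()) {
            ++count;
        }
    }
    return count;
}
```

Typical use would be countNonEmptyLines(readLines(file)) on an open std::ifstream. Build with OpenMP enabled (for example -fopenmp on GCC or Clang); without that flag the pragma is ignored and the loop simply runs serially.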

I am assuming, of course, that processing each line is much more costly than reading it from the file. That is unlikely, however: disk reads and writes are the bottleneck in most cases, and you cannot parallelize the disk.

  • I managed it with this code:

    int countLines(ifstream &file) {
        int count = 0;

        #pragma omp parallel reduction(+:count)
        for (string line; getline(file, line); ) {
            count++;
        }

        return count;
    }

  • 1

    @Macario1983 but as I said, the code will not actually run in parallel.

0

Hello. Try the following: load the entire file contents into an array and then parallelize the operations on that array, or else try parallelizing the reads with MPI-IO. I hope this helps.
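A rough sketch of the first suggestion, assuming the file fits in memory. The names readAll and countLinesInBuffer are made up for illustration. The file is read serially into one buffer, and OpenMP then counts the newline characters in that buffer in parallel.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Reads the whole file into one contiguous buffer (serial I/O).
// readAll is a hypothetical helper; it returns an empty string if the
// file cannot be opened.
std::string readAll(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(file),
                       std::istreambuf_iterator<char>());
}

// Counts lines by counting '\n' bytes. Each index can be checked
// independently, so the loop suits "parallel for" with a reduction.
long countLinesInBuffer(const std::string& buffer) {
    long count = 0;
    const long n = static_cast<long>(buffer.size());
    #pragma omp parallel for reduction(+:count)
    for (long i = 0; i < n; ++i) {
        if (buffer[i] == '\n') {
            ++count;
        }
    }
    // A last line without a trailing newline still counts as a line,
    // which matches what a getline loop would report.
    if (n > 0 && buffer[n - 1] != '\n') {
        ++count;
    }
    return count;
}
```

For example, countLinesInBuffer(readAll("exemplo.txt")). Whether this beats a plain getline loop depends on the file and the disk; the read itself is still serial.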

  • 3

    It would be great if you could add examples for the steps you suggested. (And welcome to Stack Overflow!)

0

You can do this with "pure" C++:

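#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream exemplo("exemplo.txt");
    std::string linha;
    int count = 0;  // declared outside the loop so the total survives it
    // getline reads from exemplo (source) into linha (destination)
    while (std::getline(exemplo, linha)) {
        ++count;
    }
    std::cout << count << '\n';
}
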
  • 1

    The asker's example is already "pure C++". He wants to parallelize it with OpenMP.
