Mini search engine

I am making a mini search system for URLs that are stored in a .txt file, one per line. My problem is performance: I have only 5000 URLs stored and searching is already slow. On the server side I have:

if(isset($_GET['search'])) {
    $urls = array();
    if(trim($_GET['search']) != '') {
        $urls = search_in_urls($_GET['search']);
    }
    echo json_encode($urls);
}
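function search_in_urls($search) {
    $urls = array();
    $file = fopen("urls.txt", "r");
    if ($file === false) {
        return $urls; // file missing or unreadable
    }
    // fgets() returns false at end of file, which ends the loop
    while (($url = fgets($file)) !== false) {
        if (strpos($url, $search) !== false) {
            $urls[] = rtrim($url, "\r\n"); // drop the trailing line break
        }
    }
    fclose($file);
    return $urls;
}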

I saw this accepted answer on Stack Overflow in English (SOen) and would like to implement it, but how do I return all the lines that match my $search?

The search solution I saw and would like to implement (unless there are better options) was:

if(exec('grep '.escapeshellarg($search).' ./urls.txt')) {
    // do stuff
}

That is, is there a way to avoid going through every line looking for a match (a sequence of characters equal to what I typed) and still get each matching line back in full? Or at least a way to make this search much faster?

  • You will have to use some indexing mechanism. Is this only for study purposes or is it for a real system? If it’s for a real system, I suggest using a ready-made tool like Elasticsearch.

  • It’s just for fun/curiosity. I also know that performance would be better with the URLs stored in a database, which is what you’d do in a real project. Thanks, @Viniciuszaramella

  • Good luck: http://homepages.dcc.ufmg.br/~Nivio/cursos/ri16/transp/Indexing.pdf

  • Isn’t there a shell command that returns every line where there is a match, @taiar? That way I could pass the command to exec(...

  • The grep you posted does exactly that, though I can’t say what result you’ll get in terms of performance. According to the documentation (http://php.net/manual/en/function.exec.php), you must pass a second "output" argument, which will be filled with the lines where a match was found.

  • @Viniciuszaramella We had the same doubt; I think all he was missing was passing that output argument. See my answer.
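
The indexing mechanism suggested in the comments above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names build_trigram_index and search_with_index are hypothetical, the URLs are assumed to be already loaded into an array, and a trigram (3-character) index is just one possible choice, not something the commenters specified. The index only narrows down the candidate lines; strpos still confirms each match.

```php
<?php
// Hypothetical helper: map every 3-character substring to the lines containing it.
function build_trigram_index(array $urls): array {
    $index = [];
    foreach ($urls as $i => $url) {
        $len = strlen($url);
        for ($p = 0; $p + 3 <= $len; $p++) {
            $index[substr($url, $p, 3)][$i] = true;
        }
    }
    return $index;
}

// Hypothetical helper: use the index to pick candidates, then verify with strpos().
function search_with_index(array $urls, array $index, string $search): array {
    if ($search === '') {
        return [];
    }
    $len = strlen($search);
    if ($len < 3) {
        // Too short for the trigram index: fall back to checking every line.
        $candidates = array_keys($urls);
    } else {
        $set = null;
        for ($p = 0; $p + 3 <= $len; $p++) {
            $gram = substr($search, $p, 3);
            if (!isset($index[$gram])) {
                return []; // a trigram of the search appears in no line
            }
            $set = ($set === null) ? $index[$gram] : array_intersect_key($set, $index[$gram]);
        }
        $candidates = array_keys($set);
    }
    $matches = [];
    foreach ($candidates as $i) {
        if (strpos($urls[$i], $search) !== false) {
            $matches[] = $urls[$i];
        }
    }
    return $matches;
}
```

In this sketch the index would be built once (for example when urls.txt changes) and reused for every search; whether that pays off for a few thousand lines would have to be measured.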

1 answer

If I understand the question correctly, this is what you want:

$matches = array();
if(exec('grep '.escapeshellarg($search).' ./urls.txt', $matches)) {
    // $matches contains the matching lines
}

Just pass exec an array variable as its second argument; it is taken by reference and filled with the output. The output of grep is exactly the lines of the file that matched the regular expression.
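
As a slightly fuller sketch of the same idea, the call can be wrapped in a function that also reads grep's exit status. The function name grep_urls and the $path parameter are illustrative, and the sketch assumes a Unix-like server with grep on the PATH. The -F flag is an addition here: it makes grep treat the search text as a literal string, as strpos does, instead of as a regular expression.

```php
<?php
// Sketch: return every line of $path that contains $search as a literal substring.
// grep_urls and $path are illustrative names, not part of the original code.
function grep_urls(string $search, string $path = './urls.txt'): array {
    if ($search === '') {
        return []; // an empty pattern would match every line
    }
    $matches = []; // exec() appends to an existing array, so start empty
    $status = 0;
    // -F: fixed-string match; -e: the next argument is the pattern,
    // even if it starts with "-"; --: end of options before the file name
    $cmd = 'grep -F -e ' . escapeshellarg($search) . ' -- ' . escapeshellarg($path);
    exec($cmd, $matches, $status);
    // grep exits with 0 when at least one line matched, 1 when none did,
    // and a higher value on error
    return $status === 0 ? $matches : [];
}
```

Checking the exit status rather than the return value of exec avoids relying on the last output line being truthy. The result could then be sent back with echo json_encode(grep_urls($_GET['search']));.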

  • But does this return the whole line(s)? Thanks, I'll test it now.

  • Yes, the whole line. That’s how grep works.

  • The real problem was where the results were going. Thanks once again.
