Convert file to txt


I searched the web but found nothing about uploading an arbitrary file, converting it to txt, and then searching the text inside it to get results.

Does anyone know how to do this, or can you point me to links where I can study the topic?

2 answers

2


Converting binary files to .txt will not make a search work on its own, because each file type has its own format.

For each file type you will need a method that extracts the data and saves it to a .txt file. Some examples:

  1. XML: use DOMDocument, http://php.net/manual/en/class.domdocument.php

    Example:

    // Path where your XML file was saved
    $xml = file_get_contents('arquivo.xml');
    $frases = array();

    $dom = new DOMDocument;
    $dom->loadXML($xml);
    $books = $dom->getElementsByTagName('*');
    foreach ($books as $book) {
        $frases[] = $book->nodeValue;
    }

    // Save the $frases array to a txt file, like this:
    file_put_contents('arquivo.xml.txt', implode(' ', $frases));
    
  2. CSV: use fgetcsv, http://php.net/manual/en/function.fgetcsv.php

    // Path where your CSV file was saved
    $handle = fopen ("arquivo.csv", "r");
    $frases = array();
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $frases = array_merge($frases, $data);
    }
    fclose ($handle);
    
    file_put_contents('arquivo.csv.txt', implode(' ', $frases));
    
  3. XLS: you will probably need a library to make this easier, such as https://code.google.com/p/php-excel-reader/ or http://sourceforge.net/projects/phpexcelreader/

These are just a few examples; for each format you support in your application you will need a separate script.

I believe there is no ready-made "magic" solution for this; you will have to take what exists and build an application on top of it.

To run the search, let's assume you saved all the .txt files in one folder. You can then do a search similar to this:
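One way to organize "a separate script per format" is a small dispatcher that picks an extractor from the file extension. The following is only a sketch under assumptions: the function names `extractXml`, `extractCsv` and `extractText` are hypothetical, the extension is trusted to reflect the real format, and XLS, PDF and images are deliberately left out because they need external libraries.

```php
<?php
// Hypothetical helper: collect every non-empty text node of an XML file.
function extractXml(string $path): string
{
    $dom = new DOMDocument;
    $dom->load($path);
    $xpath = new DOMXPath($dom);
    $parts = [];
    foreach ($xpath->query('//text()') as $node) {
        $text = trim($node->nodeValue);
        if ($text !== '') {
            $parts[] = $text;
        }
    }
    return implode(' ', $parts);
}

// Hypothetical helper: flatten all cells of a comma-separated file.
function extractCsv(string $path): string
{
    $cells = [];
    $handle = fopen($path, 'r');
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $cells = array_merge($cells, $row);
    }
    fclose($handle);
    return implode(' ', $cells);
}

// Hypothetical dispatcher: returns the extracted text,
// or null when the format is unsupported or the file cannot be read.
function extractText(string $path): ?string
{
    switch (strtolower(pathinfo($path, PATHINFO_EXTENSION))) {
        case 'xml':
            return extractXml($path);
        case 'csv':
            return extractCsv($path);
        case 'txt':
            $text = file_get_contents($path);
            return $text === false ? null : $text;
        default:
            return null;
    }
}

// Usage (file names are placeholders):
// $text = extractText('arquivo.csv');
// if ($text !== null) {
//     file_put_contents('arquivo.csv.txt', $text);
// }
```

Supporting a new format then means adding one function and one `case`, while the upload and search code stay unchanged.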

$consulta = 'Palavra';
$arquivos = array();
// Assumed folder holding the .txt files; it must end with a slash.
$dir = './txt/';

if ($dh = opendir($dir)) {
    while (($file = readdir($dh)) !== false) {
        if (is_file($dir . $file)) {
            $data = file_get_contents($dir . $file);
            // Case-insensitive match anywhere in the file contents
            if (stripos($data, $consulta) !== false) {
                $arquivos[] = $file;
            }
        }
    }
    closedir($dh);
}

echo 'The query "', $consulta, '" matched ', count($arquivos), ' file(s): ', implode(', ', $arquivos);
  • 3

    And if it is an image you will need to do OCR. And if it is PDF you will need (...). And if it is (...) you will need OCR (...).

0

I'm not sure I understand your question, but if you want the files in a directory listed in a txt file, use the following command (it relies on the Windows dir command):

exec( "dir $diretorio /s > $NomeArquivo.txt" );
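If shelling out is not an option, or the code must also run outside Windows, the same listing can be sketched in plain PHP with the SPL directory iterators. The function name `listFiles` is hypothetical, and `$diretorio` and `$NomeArquivo` are assumed to be set as in the command above.

```php
<?php
// Hypothetical helper: return the paths of all files under $dir, recursively.
function listFiles(string $dir): array
{
    $paths = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile()) {
            $paths[] = $file->getPathname();
        }
    }
    sort($paths); // directory order is not guaranteed, so sort for stable output
    return $paths;
}

// Usage:
// file_put_contents($NomeArquivo . '.txt', implode(PHP_EOL, listFiles($diretorio)));
```

Unlike `dir /s`, this writes only one path per line, without sizes, dates or per-folder summaries.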

I hope I’ve helped.

Hug
