Read a portion of the file contents

Asked 7 years, 1 month ago

I have a large .txt file that looks basically like this:

1000#
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.#

1001#
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.#

I want to read only the record for a requested ID, and only the text between the # delimiters.

For example, if I fetch the text for ID 1001, this should be returned:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.

Here is the code I tried:

$file = "id.txt";
$f = fopen($file, 'rb');
$found = false;
while ($line = fgets($f, 1000)) {
    if ($found) {
       echo $line;
       continue;
    }
    if (strpos($line, "1000") !== FALSE) {
      $found = true;
    }
}

With it I can find the ID, but it then prints everything from that point down to the end of the file! I want it to stop at the closing #, i.e., read only the text between the delimiters: #TEXT#.

Add to your question the attempts you have already made.

– Pagotti

2018/03/19 at 10:43
Excuse my question, I totally forgot my attempts.

– Marcelo Cordeiro

2018/03/19 at 14:40

2 answers

There are several ways to do this, for example:

With regular expressions:

Search for a single record

$data = '1000#
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.#

1001#
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.#';

function buscarTexto($id, $data) {
    // A capture group is used instead of a lookbehind: \r?\n has variable
    // length, which older PCRE versions reject inside (?<=...).
    $re = '/^' . preg_quote((string) $id, '/') . '#\r?\n(.+?)#\r?$/sm';

    return preg_match($re, $data, $match) ? $match[1] : null;
}

Brief explanation of the regular expression

/
^$id\#   #Matches the record's id line (the id followed by #)
\r?\n    #Matches the line break after the id (LF or CRLF)
(.+?)    #Matches the text and captures it in group 1
\#\r?$   #Matches the closing delimiter at the end of the line, without capturing it
/smx
#the s flag makes the dot match line breaks
#the m flag makes ^ and $ match the start and end of each line
#the x flag is not used in the code; it makes whitespace in the expression be ignored, and under it a literal # must be written as \#

Load all records into an array where the key is id.

function buscarTextos($data) {
    // \r? before $ lets the closing # also match in files with CRLF line endings
    $re = '/^(?<id>\d+)#\r?\n(?<text>.+?)#\r?$/sm';
    $result = [];

    if (preg_match_all($re, $data, $matches, PREG_SET_ORDER)) {
        // Convert the match list into an id => text array
        foreach ($matches as $m) {
            $result[$m['id']] = $m['text'];
        }
    }

    return $result;
}

Brief explanation of the regular expression

/
^(?<id>\d+)     #Stores the record id in a named group
\#\r?\n         #Matches the id delimiter and the line break (LF or CRLF)
(?<text>.+?)    #Stores the text in another named group
\#\r?$          #Matches the closing delimiter at the end of the line
/smx
#there is no g flag in PHP; preg_match_all is what makes every record be captured
#the s flag makes the dot match line breaks
#the m flag makes ^ and $ match the start and end of each line
#the x flag is not used in the code; it makes whitespace in the expression be ignored, and under it a literal # must be written as \#

Without regular expressions

Read line by line and return the desired record

function buscarTexto2($id, $data) {
    $rows = explode("\n", $data);
    $id = $id.'#';
    $text = '';
    $found = false;

    foreach($rows as $r) {
        //Remove surrounding whitespace (including a trailing \r from CRLF files)
        $r = trim($r);

        //Check whether the line is the id line of the requested record
        if($r === $id) 
            $found = true;
        //Otherwise, if the record has already been found
        elseif($found) {
            //The text is assumed to possibly span several lines
            $text .= $text == '' ? $r : PHP_EOL.$r;

            //So once the line just read ends with #
            if(substr($text, -1, 1) == '#')
                //Return the text without the closing #
                return substr($text, 0, -1);
        }
    }

    //The id was not found, or its record was never closed
    return null;
}

Read line by line and store all records in an array where the key is the id.

function buscarTextos2($data) {
    $rows = explode("\n", $data);
    $id = null;
    $text = '';
    $result = [];

    foreach($rows as $r) {
        //Remove surrounding whitespace (including a trailing \r from CRLF files)
        $r = trim($r);

        //Check whether a record is currently being processed
        if($id === null) {

            //Skip blank lines while no record is being processed
            if(!$r)
                continue;

            //Store the id, dropping the last character, which is the #
            $id = substr($r, 0, -1);
        } else {
            //The text is assumed to possibly span several lines
            $text .= $text == '' ? $r : PHP_EOL.$r;

            //So once the line just read ends with #
            if(substr($text, -1, 1) == '#') {
                //Add the record to the array
                $result[$id] = substr($text, 0, -1);

                //And get ready to process a new record
                $id = null;
                $text = '';
            }
        }
    }

    return $result;
}

All of the functions assume that # (hash) is the last character of the line.

You can test the code at the following link: http://phpfiddle.org/main/code/y1kq-jg7w

If you want to use them with the contents of the file, do the following:

$dados = file_get_contents('file path');

$texto = buscarTexto(1000, $dados);
// or
$texto = buscarTexto2(1000, $dados);
// or
$textos = buscarTextos($dados);
// or
$textos = buscarTextos2($dados);
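
Since the question mentions that the file is large, the line-by-line approach can also be applied directly to a file handle, so the whole file never has to be loaded with file_get_contents. The following is only a sketch under the same assumptions as the functions above (# is the last character of the id line and of the text); the name buscarTextoNoArquivo is hypothetical and is not part of the original answer.

```php
<?php
// Hypothetical helper: streams the file with fgets() and stops as soon as
// the requested record is closed by a trailing #.
function buscarTextoNoArquivo(string $path, string $id): ?string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return null; // the file could not be opened
    }

    $idLine = $id . '#';
    $found = false;
    $text = '';

    try {
        while (($line = fgets($handle)) !== false) {
            // Strip the line break (LF or CRLF) and surrounding spaces
            $line = trim($line);

            if (!$found) {
                // Keep skipping lines until the id line shows up
                $found = ($line === $idLine);
                continue;
            }

            // The text may span several lines
            $text .= ($text === '') ? $line : PHP_EOL . $line;

            // A trailing # closes the record
            if (substr($line, -1) === '#') {
                return substr($text, 0, -1);
            }
        }
    } finally {
        fclose($handle);
    }

    return null; // id not found, or its record was never closed
}
```

It would be called as $texto = buscarTextoNoArquivo('id.txt', '1001'); and returns null when the id is not present.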

Hello HwapX, thanks for the help. I think I was not very clear in my question. I noticed that using the $ (dollar) character can cause conflicts, since it can be interpreted in another way. I preferred to change from $ to #, like this: 1000# Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt. # 1001# Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt. # I also forgot to mention that this text is inside a .txt file, and I am looking for a way to read only the text, without the ID.

– Marcelo Cordeiro

2018/03/19 at 15:06
I changed the functions to use #. The functions buscarTexto and buscarTexto2 return only the text of the record; you can test them and see their results at http://phpfiddle.org/main/code/y1kq-jg7w. I wrote them to receive the text directly, which you can get from the file with $conteudo = file_get_contents('caminho do arquivo').

– HwapX

2018/03/19 at 15:17
I don't see a line with the code $conteudo = file_get_contents('caminho do arquivo') here. Could you send it again?

– Marcelo Cordeiro

2018/03/19 at 15:23
I added an example of reading the file in the reply.

– HwapX

2018/03/19 at 15:30
It still isn't being read; nothing appears. Remember that it should read what is between the two #: 1001#TEXT HERE# 1002#TEXT HERE 2#

– Marcelo Cordeiro

2018/03/19 at 15:37
Did you set the file path? Based on your question it should look like $file = 'id.txt'; $dados = file_get_contents($file); To check whether the file is being read, put a var_dump($dados); on the line right after the file_get_contents call.

– HwapX

2018/03/19 at 15:46
When I add var_dump($data); it prints multiple IDs, but not the requested one.

– Marcelo Cordeiro

2018/03/19 at 15:53
The strange thing is that if I put all the text from the id.txt file into $data = ' ';, it works without problems, but it does not when the file itself is read.

– Marcelo Cordeiro

2018/03/19 at 16:00
It may be due to the line-ending format of the file. I updated the functions in the answer; please test all of them again.

– HwapX

2018/03/19 at 16:03
My dear friend HwapX, I have just realized the reason for the errors: the text probably contains special characters that are causing trouble. Here is the file I want to work on: http://brasilro.com/id.txt

– Marcelo Cordeiro

2018/03/19 at 16:04
There were indeed problems with the functions that use regular expressions, but I was successful with the ones that do not use them (buscarTexto2 and buscarTextos2).

– HwapX

2018/03/19 at 16:25
All sorted! Thank you very much! Can I add you as a friend?

– Marcelo Cordeiro

2018/03/19 at 16:40
@Marcelocordeiro Sure

– HwapX

2018/03/20 at 03:05

by Israel Merljak • **943** points · Answer 1 · 2018-03-19T12:15:17+00:00

To solve this problem you can use regular expressions.

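Since you apparently already have the id you want to read, and the pattern the text file follows is quite clear (assuming all of the content is formatted that way), you can write a regular expression like the one below. It uses $ as the delimiter, as in the original version of the question; for the # version, replace \$ with #.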

/[valor_do_id]\$\n(.*)\$/

An example using PHP's preg_match function would look roughly like this:


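  $conteudo_do_arquivo = '....';
  $id = 1000;

  $matches = []; // will hold the regex results
  // Single-quoted pieces keep \$ as an escaped, literal dollar sign for the regex;
  // inside double quotes PHP turns "\$" into a bare $, which the regex reads as an end anchor.
  preg_match('/' . $id . '\$\r?\n(.*)\$/', $conteudo_do_arquivo, $matches);

   // [0] holds the whole string matched by the regex
   // [1] holds only the value of the 1st capture group, everything between '()'.
  print_r($matches);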

You can read more about the preg_match function on php.net

You can also use Regexr to test your expressions beforehand.
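
Because the delimiter changed from $ to # during the discussion, the pattern above can be written so that the delimiter is a parameter. This is only a sketch: the function name extrairTexto and its $delimitador parameter are hypothetical, and it assumes the delimiter is the last character of both the id line and the text. preg_quote escapes the id and the delimiter, so a character such as $ is matched literally instead of acting as an anchor.

```php
<?php
// Hypothetical helper: returns the text of one record, or null if not found.
function extrairTexto(string $conteudo, string $id, string $delimitador = '#'): ?string
{
    // Escape anything in the delimiter that has a special meaning in a regex
    $d = preg_quote($delimitador, '/');

    // m: ^ and $ match at line boundaries; s: the dot also matches line breaks.
    // \r? tolerates files saved with CRLF line endings.
    $re = '/^' . preg_quote($id, '/') . $d . '\r?\n(.*?)' . $d . '\r?$/ms';

    return preg_match($re, $conteudo, $matches) === 1 ? $matches[1] : null;
}
```

For example, extrairTexto($conteudo_do_arquivo, '1001') looks up a #-delimited record, and passing '$' as the third argument handles the original format.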