How to get the values within multiple tags?

I have the following HTML page:

<!DOCTYPE html>
<html>

    <head>
        <title>Exemplo</title>
    </head>
    <body>
        <div id="text">Valor 1</div>
        <div id="text">Valor 2</div>
        <div id="text">Valor 3</div>
    </body>

</html>

I’m using the following PHP function to extract the text between an opening and a closing tag:

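function capturar($string, $start, $end) {
    // Split on the opening delimiter; $str[1] is the chunk that follows its first occurrence.
    $str = explode($start, $string);
    // Cut that chunk at the closing delimiter and keep only the part before it.
    $str = explode($end, $str[1]);
    return $str[0];
}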

Example of use:

<?php
$url = file_get_contents('http://localhost/exemplo.html');
$valor = capturar($url, '<div id="text">', '</div>');
echo $valor;

However, when the page contains several identical tags with different text inside each one, the function only returns the text from the first.

How can I get the text of every occurrence of that tag (<div id="text"> and </div>)?

  • Using regex you could achieve something much more precise.

  • Please do not change a question that already has an answer. If you have a different question, ask it separately.

  • Pedro, if you have other questions, open a new one; do not edit an existing question into a completely different one, especially when it already has answers. Incidentally, you’ve already asked about that here. Take the tour to understand how the site works.

1 answer

PHP already has native classes for handling HTML (DOMDocument and DOMXPath), so I would not recommend using regex for this.

First, fetch the HTML with file_get_contents or cURL. Since you are already using file_get_contents, keep it as it is:

$html = file_get_contents('http://localhost/exemplo.html');

Then, assuming there was no error fetching the content, create a DOMDocument and a DOMXPath from it so that we can query it:

$DOM = new DOMDocument;
$DOM->loadHTML($html);
$XPath = new DOMXPath($DOM);
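
The "no error" assumption above can be made explicit. What follows is only a minimal sketch, not part of the original answer: buildXPath is a hypothetical helper name, and the inline HTML string stands in for the fetched page so the snippet runs without a server. It relies on file_get_contents returning false on failure and uses libxml_use_internal_errors to keep parser warnings from being printed.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns a DOMXPath for the given HTML, or null when there is nothing to parse.
function buildXPath($html)
{
    // file_get_contents() returns false on failure; an empty string cannot be parsed.
    if ($html === false || $html === '') {
        return null;
    }

    // Collect libxml parser warnings internally instead of emitting them as PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $loaded ? new DOMXPath($dom) : null;
}

// Inline sample standing in for the fetched page (an assumption for this sketch).
$html = '<html><body><div id="text">Valor 1</div><div id="text">Valor 2</div></body></html>';
$xpath = buildXPath($html);

if ($xpath === null) {
    echo "Could not load the HTML\n";
} else {
    echo "HTML loaded\n";
}
```

With the page from the question, the call would presumably look like buildXPath(file_get_contents('http://localhost/exemplo.html')), followed by a null check before querying.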

Now just query for what we want using XPath:

$divs = $XPath->query('//div[@id="text"]');

If any nodes are found, we can loop over them. To display the text of each one, use nodeValue:

foreach($divs as $div){
    echo $div->nodeValue;
    echo '<br>';
}

Putting it all together:

$html = file_get_contents('http://localhost/exemplo.html');

$DOM = new DOMDocument;
$DOM->loadHTML($html);
$XPath = new DOMXPath($DOM);

$divs = $XPath->query('//div[@id="text"]');

foreach($divs as $div){
    echo $div->nodeValue;
    echo '<br>';
}

Result:

Valor 1
Valor 2
Valor 3
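
If you would rather get the values back as an array, closer to how capturar is used in the question, the same steps can be wrapped in a function. This is only a sketch under a few assumptions: capturarTodos is a name I made up, the HTML is passed in as a string (inline here so the example runs on its own), and parser warnings are silenced with libxml_use_internal_errors.

```php
<?php
// Hypothetical helper (the name is an assumption, it is not in the question or the answer).
// Returns the text of every node matched by the XPath expression.
function capturarTodos(string $html, string $query): array
{
    if ($html === '') {
        return [];
    }

    // Keep libxml parser warnings from being printed.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($query);
    if ($nodes === false) {
        // query() returns false when the expression is invalid.
        return [];
    }

    $values = [];
    foreach ($nodes as $node) {
        $values[] = trim($node->nodeValue);
    }
    return $values;
}

// Inline copy of the page body from the question, standing in for file_get_contents().
$html = '<html><body>'
      . '<div id="text">Valor 1</div>'
      . '<div id="text">Valor 2</div>'
      . '<div id="text">Valor 3</div>'
      . '</body></html>';

$valores = capturarTodos($html, '//div[@id="text"]');
print_r($valores);
```

Returning an array leaves the formatting (the <br> between values, for instance) to the caller.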

Also, you should not repeat the same id. An id must be unique within a page, so having more than one element with id="text" is invalid HTML.
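
If you control the page, one way to fix this is to use a class instead of an id and match on the class. The snippet below is a sketch under that assumption: the markup is a hypothetical variant of the page from the question, and "destaque" is an invented extra class that only shows the match still works when an element has several classes.

```php
<?php
// Hypothetical variant of the page from the question, using class instead of id.
$html = '<html><body>'
      . '<div class="text">Valor 1</div>'
      . '<div class="text">Valor 2</div>'
      . '<div class="text destaque">Valor 3</div>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// XPath 1.0 has no class selector, so pad the attribute with spaces and
// look for " text " to match the whole class name, not just a substring.
$divs = $xpath->query('//div[contains(concat(" ", normalize-space(@class), " "), " text ")]');

$valores = [];
foreach ($divs as $div) {
    $valores[] = $div->nodeValue;
}
print_r($valores);
```

A plain //div[@class="text"] would also work, but only while "text" is the element's sole class.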

  • No, that’s right. I confused myself with another question he asked. A lot of confusion.
