Getting the contents of a div inside an HTML string

Question

Asked 5 years, 9 months ago

How do I get all the values inside <div class='conteudo'></div>?

I’ve tried it like this:

$links = "<ul><li>CONTEUDO
<div class='conteudo'>CORPO 1</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'>CORPO 2</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'>CORPO 3</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'>CORPO 4</div>
</li></ul>";

$conteudo2 = explode('</li>', $links);

foreach($conteudo2 as $key) {    
    echo ''.$key.'';    
}

But this returns everything between the tags, not just the value inside each div.

1 answer

by hkotsubo • **55,826** points · Answer 1 · 2019-11-15T13:33:35+00:00

The problem with using explode is that it splits the string without taking the semantics of HTML into account (that is, the meaning of each tag, the distinction between a tag and its content, and so on).
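To make the point concrete, here is a minimal sketch that runs explode on a single item from the question's string and inspects the pieces. The variable name $pieces is only for illustration.

```php
<?php
// One item from the question's markup.
$links = "<ul><li>CONTEUDO
<div class='conteudo'>CORPO 1</div>
</li></ul>";

// explode() only cuts the string at each "</li>"; it knows nothing about tags.
$pieces = explode('</li>', $links);

// The first piece still carries the <ul>, <li> and <div> markup,
// and the second piece is the leftover closing tag.
var_dump($pieces);
```

Each piece is just a raw slice of the original string, so the div's value would still have to be dug out of the surrounding markup by hand.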

To manipulate HTML the way you need, you can use DOMDocument:

$links = "<ul><li>CONTEUDO
<div class='conteudo'>CORPO 1</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'>CORPO 2</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'>CORPO 3</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'>CORPO 4</div>
</li></ul>";

$dom = new DOMDocument();
$dom->loadHTML($links);
$xpath = new DOMXPath($dom);
// find div elements whose class attribute is "conteudo"
foreach ($xpath->query('//div[@class="conteudo"]') as $div) {
    echo $div->textContent . "<br>";
}

This looks for all div elements that have the class "conteudo" (using XPath syntax) and prints the text of each one. The output of the above code is:

CORPO 1
CORPO 2
CORPO 3
CORPO 4
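One caveat with the query above: @class="conteudo" compares the whole attribute value, so a div with several classes would not match. A common XPath 1.0 idiom for matching a single class token is sketched below. The extra class names (destaque, outro-conteudo) and the $texts array are assumptions made up for this example, not part of the question.

```php
<?php
// Hypothetical markup: the first div has two classes, the third has a
// class name that merely ends in "conteudo".
$html = "<div class='conteudo destaque'>CORPO 1</div>
<div class='conteudo'>CORPO 2</div>
<div class='outro-conteudo'>CORPO 3</div>";

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Pad the normalized class list with spaces and look for " conteudo "
// as a whole token, so "outro-conteudo" is not matched by accident.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " conteudo ")]';

// Collect the text of each match instead of echoing it.
$texts = [];
foreach ($xpath->query($query) as $div) {
    $texts[] = $div->textContent;
}

print_r($texts);
```

If every div in your HTML has exactly class='conteudo', as in the question, the simpler query is enough.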

The above code works if the div contains only plain text. But if the div contains other tags and you want all of that content, you need an auxiliary function to get the HTML of the inner content (the function below was taken from another source):

$links = "<ul><li>CONTEUDO
<div class='conteudo'>CORPO 1</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'>CORPO 2</div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'><p>CORPO 3 <span>teste com <strong>outras tags</strong></span> dentro do div</p></div>
</li></ul>
<ul><li>CONTEUDO
<div class='conteudo'><span>CORPO 4</span></div>
</li></ul>";

// Returns the HTML of everything inside $element (child tags included)
function innerHTML(DOMNode $element) {
    $innerHTML = "";
    $children  = $element->childNodes;
    foreach ($children as $child) {
        // serialize each child node back to an HTML string
        $innerHTML .= $element->ownerDocument->saveHTML($child);
    }
    return $innerHTML;
}

$dom = new DOMDocument();
$dom->loadHTML($links);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[@class="conteudo"]') as $div) {
    echo innerHTML($div) . "<br>";
}

The output is:

CORPO 1
CORPO 2
<p>CORPO 3 <span>teste com <strong>outras tags</strong></span> dentro do div</p>
<span>CORPO 4</span>
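For comparison, here is a small sketch of what textContent alone returns for the nested "CORPO 3" div: the tags are dropped and only the concatenated text remains. The variable names $div and $text are only for illustration.

```php
<?php
// The "CORPO 3" div from the example, on its own.
$html = "<div class='conteudo'><p>CORPO 3 <span>teste com <strong>outras tags</strong></span> dentro do div</p></div>";

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Take the first (and only) matching div.
$div = $xpath->query('//div[@class="conteudo"]')->item(0);

// textContent joins the text of all descendants and discards the markup.
$text = $div->textContent;

echo $text, PHP_EOL; // CORPO 3 teste com outras tags dentro do div
```

So textContent is the right choice when you only need the text, and a helper like innerHTML is needed when the inner tags must be preserved.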