I'm using the DOM extension in PHP to get the link of an 'a' tag; with getAttribute I can read that link from the href attribute.
Crawler script:
<?php
// Load the URL
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile("http://www.linkdosite.com.br");
// Get only the links
$links = $dom->getElementsByTagName('a');
// Array that stores the values collected by the crawler
$getLink = array();
$nlinks = 0;
foreach ($links as $pegalink) {
    // Get each link here
    $link = $pegalink->getAttribute('href');
    $termo = 'detalhe'; // Term that sets the wanted links apart from the others, so only links containing it are kept
    $pattern = '/' . $termo . '/'; // Pattern to look for in the string $link
    if (preg_match($pattern, $link)) {
        $getLink[$nlinks] = $link; // Store the link in the $getLink array
        echo $getLink[$nlinks] . "<br>"; // Print the link on screen
        $nlinks++;
    }
}
Now I also need to get the string that is inside the 'a' tag, but I couldn't find any example that helps me solve this.
Block captured by the crawler:
<a href="link">
<font style="font-size: 14px;" color="black" face="arial"><b>String que eu quero pegar</b></font>
</a>
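For reference, here is a minimal sketch of one way the inner text could be read with the same DOM extension: every DOMNode exposes a textContent property that concatenates the text of all its descendants. The inline $html string, the '/detalhe/1' and '/contato' hrefs, and the 'texto' array key are assumptions made for illustration; the real page would still be loaded with loadHTMLFile.

```php
<?php
// Sample markup standing in for the crawled page (assumption for illustration).
$html = '<a href="/detalhe/1">
<font style="font-size: 14px;" color="black" face="arial"><b>String que eu quero pegar</b></font>
</a>
<a href="/contato"><b>Contato</b></a>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$termo = 'detalhe';
$getLink = array();
foreach ($dom->getElementsByTagName('a') as $pegalink) {
    $link = $pegalink->getAttribute('href');
    if (strpos($link, $termo) !== false) {
        // textContent joins the text of all descendant nodes (font, b, ...);
        // trim() drops the line breaks around the nested tags.
        $getLink[] = array(
            'href'  => $link,
            'texto' => trim($pegalink->textContent),
        );
    }
}
print_r($getLink);
```

Here strpos replaces preg_match only because the term is a plain substring; the original pattern check would work the same way.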
Which class are you using? It's probably something like jQuery('a font b')->html();
– AnthraxisBR
There is no class; this is a crawler for another site... everything needs to happen on the server side.
– Charles Fay
Yes, but aren't you using a PHP class to access the DOM? For example, I use this class: https://github.com/punkave/phpQuery
– AnthraxisBR
No... I am using PHP's own DOM extension: http://php.net/manual/en/book.dom.php
– Charles Fay
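With only the native DOM extension, a selector such as 'a font b' can be approximated with DOMXPath, which ships with that extension. This is just a sketch under assumptions: the inline $html string, the '/detalhe/1' and '/contato' hrefs, and the $textos variable are illustrative and not taken from the real site.

```php
<?php
// Sample markup standing in for the crawled page (assumption for illustration).
$html = '<a href="/detalhe/1">
<font style="font-size: 14px;" color="black" face="arial"><b>String que eu quero pegar</b></font>
</a>
<a href="/contato"><b>Contato</b></a>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
// Rough XPath counterpart of the CSS selector "a font b",
// limited to anchors whose href contains "detalhe".
$nodes = $xpath->query('//a[contains(@href, "detalhe")]//font//b');

$textos = array();
foreach ($nodes as $b) {
    $textos[] = trim($b->textContent);
}
print_r($textos);
```

The XPath route keeps the filter and the element path in one expression, at the cost of being more tightly coupled to the page's markup.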