I use this code to capture links from a particular page:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$resultado = curl_exec($ch);
preg_match_all('/<a href="(.*)"/i', $resultado, $outros);
However, this regular expression leaves out links such as:
<a name="exemplo" href="link.php"></a>
And if I remove the <a and keep only the href, for example:
preg_match_all('/href="(.*)"/i', $resultado, $outros);
it also picks up unwanted things such as CSS links, for example:
<link href="link.css">
What is the ideal regular expression to capture the href of every a element, without the risk of capturing the href of elements that are not a, such as the link element used for CSS?
Thank you very much friend. PERFECT!
– user7438004