Assuming you are using cURL to fetch the HTML from a URL and then PHP to extract specific data from it:
You can use the DOMDocument class to parse the HTML, find the <a> tags, and read the value of each href attribute.
The parse_url() function can then extract the query string from that value, which is what you are after:
Example
// The HTML you fetched
$html = '<html>
<head></head>
<body>
<a href="https://www.site.com/user.asp?ref=fvFCF9D8N4Ak">bubu</a>
</body>
</html>';
// Instantiate DOMDocument
$dom = new DOMDocument;
// Load the fetched HTML into the DOMDocument (@ suppresses parser warnings)
@$dom->loadHTML($html);
// Walk the DOM and, for each 'a' tag found
foreach ($dom->getElementsByTagName('a') as $tag) {
    // read the value of the 'href' attribute
    $href = $tag->getAttribute('href');
    // if it is not empty
    if (!empty($href)) {
        // store the query string in a variable
        $queryString = parse_url($href, PHP_URL_QUERY); // Result: ref=fvFCF9D8N4Ak
    }
}
See a working example on Ideone.
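If what you need is the value of ref itself rather than the whole query string, parse_str() can split the query string into an array. A minimal sketch, assuming the parameter is named ref as in the example link:

```php
<?php
// Query string as returned by parse_url($href, PHP_URL_QUERY) for the example link
$queryString = 'ref=fvFCF9D8N4Ak';

// parse_str() fills the array passed as the second argument with the parameters
parse_str($queryString, $params);

// 'ref' is the parameter name taken from the example link;
// fall back to null if the link has no such parameter
$ref = $params['ref'] ?? null;

echo $ref; // fvFCF9D8N4Ak
```

The ?? fallback avoids an undefined-index notice for links that do not carry the parameter.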
If all you have is the HTML fragment shown in the question, the method is exactly the same:
$html = '<a href="https://www.site.com/user.asp?ref=fvFCF9D8N4Ak">';
$dom = new DOMDocument;
@$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('a') as $tag) {
    $href = $tag->getAttribute('href');
    if (!empty($href)) {
        $queryString = parse_url($href, PHP_URL_QUERY); // Result: ref=fvFCF9D8N4Ak
    }
}
See a working example on Ideone.
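Note that both snippets overwrite $queryString on every iteration, so with several links only the last query string survives. A minimal sketch that collects them all, assuming a hypothetical helper named extractQueryStrings() (not part of PHP) and using libxml_use_internal_errors() in place of @ to silence parser warnings:

```php
<?php
// Hypothetical helper: returns the query string of every <a href="...">
// found in the given HTML, in document order.
function extractQueryStrings(string $html): array
{
    // Collect libxml warnings internally instead of suppressing them with @
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $queryStrings = [];
    foreach ($dom->getElementsByTagName('a') as $tag) {
        $query = parse_url($tag->getAttribute('href'), PHP_URL_QUERY);
        // parse_url() returns null when the URL has no query string
        // and false when the URL is seriously malformed
        if (is_string($query) && $query !== '') {
            $queryStrings[] = $query;
        }
    }

    return $queryStrings;
}

// Sample input: the link from the question plus one link without a query string
$html = '<a href="https://www.site.com/user.asp?ref=fvFCF9D8N4Ak">bubu</a>'
      . '<a href="https://www.site.com/about.asp">about</a>';

print_r(extractQueryStrings($html));
```

Links without a query string are skipped, so the returned array holds only the values you would otherwise have to filter afterwards.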