php regex to get 2 groups of href link


Viewed 238 times

1

Hello, how can I build a regex that captures two groups from every href link, such as this one?

<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>

Group 1 would be the whole link:

/page/page/categoria/page?page=2&amp;publica=1

And group 2 would be the page number (the value of page=):

2

My regex so far is:

href=["][^"]+page=(\d+).+["]
// GROUP 1: href="/page/page/categoria/page?page=2&amp;publica=1" rel="next"
// GROUP 2: 2
  • If I understand correctly, the only problem with your regex is that it also returns the rel="next" part?

  • Good evening. Did my answer help you? If not, let me know; you may have had some doubt about how to use it.

2 answers

1

Instead of a regex you can use DOMDocument, a PHP API that works with XML and HTML. An example would look like this:

$conteudoDoHtml = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

$dom = new DOMDocument;
$dom->loadHTML($conteudoDoHtml);
$ancoras = $dom->getElementsByTagName("a");
foreach($ancoras as $elementos) {
   echo $elementos->getAttribute('href'), '<hr>';
}

Then you would only need a regex to extract the page number:

$conteudoDoHtml = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

$dom = new DOMDocument;
$dom->loadHTML($conteudoDoHtml);
$ancoras = $dom->getElementsByTagName("a");
foreach($ancoras as $elementos) {
   $data = $elementos->getAttribute('href');

   echo 'Content of href: ', $data, '<br>';

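   // getAttribute() returns the decoded value, so "&amp;" arrives as "&"
   if (preg_match('#(&amp;|&|\?)page=(\d+)#', $data, $match)) {
      echo 'page=', $match[2], '<br>'; // group 2 holds the page number
   }

   var_dump($match); // to inspect the full result of preg_match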
   echo '<hr>';
}
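
As a variation on the idea above, the regex can be dropped entirely by letting PHP parse the query string. The following is only a minimal sketch, assuming the dom extension is enabled and that the links look like the sample in the question. The names $html, $links and $params are illustrative and are not part of the original answer.

```php
<?php
// Sample HTML taken from the question.
$html = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    // getAttribute() returns the decoded value ("&amp;" becomes "&").
    $href = $anchor->getAttribute('href');

    // parse_url() with PHP_URL_QUERY returns the query string without the "?",
    // or null when the URL has no query string.
    $query = parse_url($href, PHP_URL_QUERY);

    $params = [];
    if (is_string($query)) {
        // parse_str() fills $params with the decoded key/value pairs.
        parse_str($query, $params);
    }

    // "page" is null when the link has no such parameter.
    $links[] = ['href' => $href, 'page' => $params['page'] ?? null];
}

print_r($links);
```

Each entry in $links then carries the full href and the page value as a string, so links without a page parameter can be filtered out by checking for null.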

1

href="([^"]+\?(page=([^&]*))[^"]+)"

See it working on Regex101.

Basically, it captures any href whose query string starts with page, and splits it up the way you want:

match[1] = the whole URL
match[2] = page=value
match[3] = value
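
For reference, here is a rough sketch of how this pattern might be used from PHP with preg_match_all. The variable names and the ~ delimiter are my own choices, not part of the answer. It also assumes, as in the sample, that at least one more parameter follows page, because the trailing [^"]+ needs at least one character after the page value.

```php
<?php
// Sample HTML taken from the question.
$html = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

// Same pattern as above; "~" is used as the delimiter so that
// neither '"' nor "/" has to be escaped.
$pattern = '~href="([^"]+\?(page=([^&]*))[^"]+)"~';

// PREG_SET_ORDER returns one array of groups per matched link.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    echo 'url: ', $match[1], PHP_EOL;   // the whole URL
    echo 'pair: ', $match[2], PHP_EOL;  // page=value
    echo 'page: ', $match[3], PHP_EOL;  // value only
}
```

Note that the URL is captured exactly as written in the HTML, so the &amp; entity is not decoded here. html_entity_decode() can be applied to $match[1] if the decoded form is needed.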
