"- [...] a page with several types of links [...]".
As you said it is a page with several links, I assume(imos) that these links are an element a
attribute-ridden href
and such...
As I mentioned in the comment, to use regular expression the ideal is that the links follow a pattern. In case those links are on a page, I made this regular expression by delimiting the link by quotation marks that normally delimit the attribute href
, which is where the link address is (and bla bla bla):
\"((http|https)\:\/\/([^\"]+)\.key)\"
Test in Regex101.com:
Now just capture the group $1
in your script:
<?php
$conteudo_da_pagina = '<a href="http://site.com/palavra.key"></a>
<a href="https://site.com/palavra.key"></a>
<a href="http://subdominio.site.com/palavra.key"></a>
<a href="https://subdominio.site.com/palavra.chave"></a>
<a href="http://subdominio.site.com/uma_coisa_qualquer.keykey"></a>
<a href="https://subdominio.site.com/outra_coisa_qualquer.key"></a>
<a href="https://subdominio.site.com/outra_coisa_qualquer.keys"></a>';
preg_match_all('/\"((http|https)\:\/\/([^\"]+)\.key)\"/', $conteudo_da_pagina, $ocorrencias);
print_r($ocorrencias[1]);
/*
Array
(
[0] => http://site.com/palavra.key
[1] => https://site.com/palavra.key
[2] => http://subdominio.site.com/palavra.key
[3] => https://subdominio.site.com/outra_coisa_qualquer.key
)
*********************************************************************/
@Edit (according to the question):
<?php
// $texto_da_busca = file_get_contents('...');
$texto_da_busca = '#EXTINF:-1 tvg-logo="http://www.brandemia.org/sites/default/files/sites/default/files/axn_logo_antiguo.jpg" group-title="Cine",AXN (MX)
http://live.izzitv.mx/Content/HLS/Live/Channel(AXN)/index.key
#EXTINF:-1 tvg-logo="https://i.imgur.com/Wrgs4X2.png" group-title="Cine",AMC (MX)
http://live.izzitv.mx/Content/HLS/Live/Channel(AMC_HD)/index.key';
preg_match_all('/((http|https)\:\/\/(.*?)\.key)/', $texto_da_busca, $matches);
print_r($matches[1]);
/* Retorna:
Array
(
[0] => http://live.izzitv.mx/Content/HLS/Live/Channel(AXN)/index.key
[1] => http://live.izzitv.mx/Content/HLS/Live/Channel(AMC_HD)/index.key
)
***************************************************************************/
Have you tried in any way? Only regular expression is enough to cure your doubt?
– LipESprY
look I researched yes on regular expression, but I could not at all do something that worked
– Paulo Vitor
So. Regular expression is very useful when such page content follows a pattern... Besides, it is not difficult to do... An answer based on the question link solves you?
– LipESprY
Yes, all help is welcome
– Paulo Vitor
It is necessary to use
file_get_contents
+preg_match_all
?? If so, give more details: Links are in the elements<a>
? Do you have any identification classes? Do you have any HTML excerpts from the site you are using for extraction? What is your code? If it is not necessary to use the above mentioned functions, I recommend usingDOMDocument
+DOMXPath
.– Valdeir Psr