You can use PHP's native DOMDocument API, combined with curl or file_get_contents to fetch the page, and then use preg_match to extract the value. A simple example to illustrate:
<?php
$meuhtml = '
<script type="text/javascript">
var src = "https:www.site.com";
</script>
<script type="text/javascript">
var src = \'https:www.site2.com\';
</script>
';
$doc = new DOMDocument;
$doc->loadHTML($meuhtml);
$tags = $doc->getElementsByTagName('script');
$urls = array();
foreach ($tags as $tag) {
    if (preg_match('#var\s+src(\s+|)=(\s+|)(".*";|\'.*\';)#', $tag->nodeValue, $match)) {
        // Strip the opening quote and the closing quote plus semicolon
        $result = preg_replace('#^["\']|["\'];$#', '', $match[3]);
        $urls[] = $result; // Add to the array
    }
}
// Show all URLs
print_r($urls);
The regex used, #var\s+src(\s+|)=(\s+|)(".*";|\'.*\';)#, is what extracts the URL from the text returned by $tag->nodeValue. See it working at https://repl.it/Hwt4 (click the Run button when the page loads).
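To make the pattern easier to follow, here is a minimal sketch of a more compact variant. It is an alternative, not the pattern used in the answer: it uses \s* in place of (\s+|), a backreference \1 so the closing quote must match the opening one, and a lazy .*? so the captured group holds only the URL and no preg_replace step is needed. The function name extractSrcUrl is hypothetical, and the nullable return type assumes PHP 7.1 or newer.

```php
<?php
// Hypothetical helper: returns the string assigned to "var src", or null if not found.
function extractSrcUrl(string $js): ?string
{
    // (["\'])  opening quote, captured as group 1
    // (.*?)    the URL itself, as few characters as possible (group 2)
    // \1       the same quote character that opened the string
    $pattern = '#var\s+src\s*=\s*(["\'])(.*?)\1\s*;#';

    if (preg_match($pattern, $js, $match)) {
        return $match[2];
    }
    return null;
}

var_dump(extractSrcUrl('var src = "https://www.site.com";'));
var_dump(extractSrcUrl("var src = 'https://www.site2.com';"));
var_dump(extractSrcUrl('var other = 1;'));
```

Because the quantifier is lazy, this variant also stops at the first closing quote when several statements share one line, which the greedy .* in the original pattern would not.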
Of course, this was only an example to illustrate the code. To download the data from another site you can use curl, or file_get_contents if allow_url_fopen is set to On in your php.ini. An example with curl:
<?php
$url = 'http://site.com';
$ch = curl_init($url);
// Return the response as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
if (!$data) {
    die('Error');
}
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ($httpcode !== 200) {
    die('Request error');
}
curl_close($ch);
$doc = new DOMDocument;
$doc->loadHTML($data);
$tags = $doc->getElementsByTagName('script');
$urls = array();
foreach ($tags as $tag) {
    if (preg_match('#var\s+src(\s+|)=(\s+|)(".*";|\'.*\';)#', $tag->nodeValue, $match)) {
        $result = preg_replace('#^["\']|["\'];$#', '', $match[3]);
        $urls[] = $result; // Add to the array
    }
}
// Show all URLs
print_r($urls);
Or, if you only want the first URL, change the loop to:
$url = '';
foreach ($tags as $tag) {
    if (preg_match('#var\s+src(\s+|)=(\s+|)(".*";|\'.*\';)#', $tag->nodeValue, $match)) {
        $result = preg_replace('#^["\']|["\'];$#', '', $match[3]);
        $url = $result;
        break; // Stop the foreach as soon as the URL is found
    }
}
echo $url;
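For the file_get_contents route mentioned earlier, a minimal sketch could look like the following. It assumes allow_url_fopen is enabled. The function names fetchHtml and extractScriptUrls and the 10-second timeout are hypothetical choices made for illustration. The call to libxml_use_internal_errors(true) is an addition, since real-world pages are often not valid HTML and loadHTML would otherwise emit warnings.

```php
<?php
// Hypothetical helper: downloads a page with file_get_contents.
// Requires allow_url_fopen = On in php.ini.
function fetchHtml(string $url): string
{
    $context = stream_context_create(array(
        'http' => array(
            'user_agent'      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)',
            'follow_location' => 1,
            'timeout'         => 10, // seconds; arbitrary value for this sketch
        ),
    ));

    $data = @file_get_contents($url, false, $context);
    if ($data === false) {
        throw new RuntimeException('Request failed: ' . $url);
    }
    return $data;
}

// Hypothetical helper: same DOMDocument + preg_match logic as in the answer.
function extractScriptUrls(string $html): array
{
    $previous = libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $doc = new DOMDocument;
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = array();
    foreach ($doc->getElementsByTagName('script') as $tag) {
        if (preg_match('#var\s+src(\s+|)=(\s+|)(".*";|\'.*\';)#', $tag->nodeValue, $match)) {
            $urls[] = preg_replace('#^["\']|["\'];$#', '', $match[3]);
        }
    }
    return $urls;
}

// Usage (performs a real HTTP request):
// print_r(extractScriptUrls(fetchHtml('http://site.com')));
```

Keeping the download and the parsing in separate functions means the extraction can be tried on a local HTML string without any network access.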
Explain better what you’re trying to do.
– RFL
Explain in more detail, so we understand your problem
– Matheus Miranda