Capture the last characters of each URL in an XML file with PHP

I have a sitemap file from my site.

...
<loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
<loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
<loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
...

I need to capture the last 14 digits of every URL.

I started as follows and got stuck:

<?php

function esquerda($str, $length) {
return substr($str, 0, $length);
}

$url = file_get_contents('https://site.com.br/sitemap.xml');

while (strpos($url,'/aluno/') > 0) {
    $url = substr($url,strpos($url,'/aluno/')+14);
    $numero = esquerda($url, 14);
    echo $numero;
    echo "<br>";
}

?>

But I can't get it to work. Does anyone have a solution?

2 answers

2


You have to take the length of the string and subtract $length from it to get the starting offset:

function esquerda($str, $length) 
{
    return substr($str, (strlen($str) - $length), $length);
}

Online example (Ideone):

<?php

    function esquerda($str, $length) 
    {
        return substr($str, (strlen($str) - $length), $length);
    }

    $n = '1400001';

    echo esquerda($n, 5); // output: 00001
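
As a side note, substr() also accepts a negative offset, which is counted from the end of the string and avoids the strlen() arithmetic. A minimal sketch, where ultimos() is a hypothetical helper name that does not appear in the answer above:

```php
<?php

// Hypothetical helper: returns the last $length characters of $str.
// Assumes $length is greater than zero.
function ultimos(string $str, int $length): string
{
    // A negative offset makes substr() count from the end of the string.
    return substr($str, -$length);
}

echo ultimos('1400001', 5); // 00001
```

With no third argument, substr() returns everything from the offset to the end of the string.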

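That is the idea. Your XML snippet does not show the parent elements, so I will make up an example:

function esquerda($str, $length) 
{
    // Count one extra character to account for the trailing slash of the URL.
    $length++;
    // Start $length characters from the end and leave out the final slash.
    return substr($str, (strlen($str) - $length), $length - 1);
}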

$xml = '<a>
    <loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
    <loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
    <loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
    </a>';

$simpleXml = simplexml_load_string($xml);

foreach($simpleXml->loc as $loc) 
{
    echo esquerda($loc, 14);
    echo '<br />';
}

If the XML is in a file or at a URL, you can use simplexml_load_file($url) instead, which also works.
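
A minimal sketch of that variant, assuming the XML has the same shape as the example above. To stay self-contained it writes the XML to a temporary file first; the temporary path and the $locs array are illustrative names, and a real sitemap URL could be passed to simplexml_load_file() in the same way. The function returns false on failure, so it is worth checking the result before looping:

```php
<?php

// Write a small XML document to a temporary file (stand-in for a real file or URL).
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents(
    $path,
    '<a><loc>https://www.site.com.br/aluno/jose/11111111111111/</loc></a>'
);

// Collect parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$simpleXml = simplexml_load_file($path);

$locs = [];
if ($simpleXml === false) {
    echo 'Could not load the XML';
} else {
    foreach ($simpleXml->loc as $loc) {
        // Cast each SimpleXMLElement to get its text content as a plain string.
        $locs[] = (string) $loc;
    }
}

// Remove the temporary file.
unlink($path);

print_r($locs);
```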

  • I couldn't get it to work. What would it look like in the code? Remember that I need to get the last numbers from all the sitemap URLs. I changed the function and the output is: url>

  • @Marcelo can you put the XML in your question? What are you really using?

  • It worked the way you did it! Thank you! I only have one problem: is it possible to use the sitemap URL instead of the content in $xml = ?

  • Yes, of course, @Marcelo. I added a note about using simplexml_load_file($url).

0

First you have to parse the XML, for which you can use DOMDocument or SimpleXML. Your case looks simple, so the latter is enough.

For example:

<?php

$xmlstring = '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
   </url>
   <url>
      <loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
   </url>
   <url>
      <loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
   </url>
</urlset>';

$xml = simplexml_load_string($xmlstring);

foreach ($xml as $tag) {
    var_dump($tag->loc);
}

I wrote the example assuming it is a sitemap.

Now, if the XML is something like:

<foo>
<loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
<loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
<loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
</foo>

That would be enough:

<?php

$xmlstring = '<?xml version="1.0" encoding="UTF-8"?>
<foo>
<loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
<loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
<loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
</foo>';

$xml = simplexml_load_string($xmlstring);

foreach ($xml as $tagFoo) {
    var_dump($tagFoo);
}

If it comes from a URL, you can use $xml = simplexml_load_file('http://site/sitemap.xml'); instead of $xml = simplexml_load_string($xmlstring);

Whatever the format is, just adjust how you read it. Now to the point: getting the last part of the string.

You could write a function like one of these two:

  • With substr

    function final_str($str, $delimitador = '/') {
        $str = trim($str, $delimitador); // Strip the delimiter from both ends
    
        $posicao = strrpos($str, $delimitador) + 1; // Position just after the last delimiter (in your case /)
    
        return substr($str, $posicao); // Drop everything up to and including the delimiter
    }
    
  • With explode

    function final_str($str, $delimitador = '/') {
        $str = trim($str, $delimitador); // Strip the delimiter from both ends
    
        $partes = explode($delimitador, $str); // Split the string into parts
    
        return $partes[count($partes) - 1]; // Take the last part of the string
    }
    

And use it like this:

$xml = simplexml_load_string($xmlstring);

foreach ($xml as $tag) {
    var_dump( final_str($tag->loc) );
}

Or so if it comes from a URL:

$xml = simplexml_load_file('http://site.com/sitemap.xml');

foreach ($xml as $tag) {
    var_dump( final_str($tag->loc) );
}
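
Putting the pieces together, here is a rough end-to-end sketch using the substr variant of final_str and the sitemap layout assumed earlier in this answer. The $numeros array and the explicit (string) cast are additions for illustration: the cast hands the function a plain string instead of a SimpleXMLElement.

```php
<?php

// Same logic as the substr variant of final_str.
function final_str($str, $delimitador = '/') {
    $str = trim($str, $delimitador);             // Strip the delimiter from both ends
    $posicao = strrpos($str, $delimitador) + 1;  // Position just after the last delimiter
    return substr($str, $posicao);               // Keep only what follows it
}

$xmlstring = '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
   </url>
   <url>
      <loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
   </url>
   <url>
      <loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
   </url>
</urlset>';

$xml = simplexml_load_string($xmlstring);

// Collect the final path segment of every <loc> element.
$numeros = [];
foreach ($xml->url as $tag) {
    $numeros[] = final_str((string) $tag->loc);
}

echo implode("\n", $numeros), "\n";
```

Collecting the values in an array rather than dumping them inside the loop makes it easier to reuse them afterwards.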
