How to create a script to automate replacing link text in HTML?

0

I am maintaining a system in which one page has hundreds of links like this:

<li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>

Note that there is text outside the a tag. I want to move that text inside the a tag, keeping its href, so that the example above would look like this:

<li> <a href="http://exemplo.com"> Revista alvo </a> </li>

I am making this change by hand, but there are hundreds and hundreds of links, which makes the task tiring. Does anyone have an idea of how I could write a script to do it? It can be in any language: PHP, JS, etc. I have tried to be as clear as possible; if something is unclear, I will try to explain again. I need help because this is urgent, please!

  • 1

    With Sublime you could do this easily; it also uses regular expressions to select regions. I have used this approach several times.

  • Can you give me an example of how I would do it, @Weessmith? I am not good with regex.

  • 1

    Search for: <li>([a-zA-Z0-9 ]+)<a(.*?)>(.*?)+<\/li>$ and replace with: <li><a$2>$1</a></li>, or something like this. See this example: https://regexr.com/3pedc

2 answers

4

Update: If you do not want to open the files one by one, you can write a PHP script that scans a directory for HTML files (or whatever other files you need). See:

This script is run from the terminal / PowerShell and receives one parameter: the directory to scan. It uses the glob function with the pattern "{$Dir}/*.html" to scan the directory; glob returns an array of the matching files, an empty array if nothing matches, and false on error.

Before using the script below, make a backup!

<?php
// Count how many arguments were passed.
// The first argument is always the script name.
$CountArgs = count($argv);

// Check that a directory argument was given
if ($CountArgs < 2) {
  echo "Please provide a directory!\n\n";
  exit(1);
}
// Check that the argument is a directory.
else if ( !is_dir($argv[1]) ) {
  echo "The given parameter is not a directory!\n\n";
  exit(1);
}
// Store the argument in a variable.
$Dir = $argv[1];

// Scan the directory for HTML files (glob returns false on error,
// so fall back to an empty array), then run the function on each file.
foreach (glob("{$Dir}/*.html") ?: [] as $arquivo) {
  alterar_links($arquivo);
}

function alterar_links($Arquivo) {
  // Read the file and keep its contents in a variable
  $Conteudo = file_get_contents($Arquivo);
  // Search with the regular expression
  // and rewrite each match in a callback
  $Alteracoes = preg_replace_callback("|<li>([\w\s]+)<a(.*?)>(.*?)<\/li>|",
    function($retorno) {
      return "<li><a{$retorno[2]}>{$retorno[1]}</a></li>";
    },
    $Conteudo);
  // preg_replace_callback returns null on a regex error;
  // in that case leave the file untouched
  if ($Alteracoes === null) {
    return;
  }
  // Write the changes back to the file
  file_put_contents($Arquivo, $Alteracoes);
}

Important: Note that the replacement leaves no space between the li tag and the a tag: <li><a{$retorno[2]}>{$retorno[1]}</a></li>. Because the pattern requires at least one character between <li> and <a, the script makes no further changes if it reads the file again.
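
For readers who would rather run the same idea with Node.js, here is a minimal sketch under stated assumptions. It is not part of the original answer: the function names rewriteLinks and processDirectory are hypothetical, and it copies each file to a .bak file before overwriting it (an extra step, standing in for the manual backup recommended above). Like the glob call in the PHP script, it only looks at .html files directly inside the given directory.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Same pattern as the PHP script, with the global flag so every <li> is handled.
function rewriteLinks(html) {
  return html.replace(/<li>([\w\s]+)<a(.*?)>(.*?)<\/li>/g, "<li><a$2>$1</a></li>");
}

// Hypothetical helper: rewrites every .html file directly inside `dir`
// and returns the names of the files that were actually changed.
function processDirectory(dir) {
  const changed = [];
  for (const name of fs.readdirSync(dir)) {
    if (path.extname(name) !== ".html") continue;
    const file = path.join(dir, name);
    if (!fs.statSync(file).isFile()) continue;
    const original = fs.readFileSync(file, "utf8");
    const updated = rewriteLinks(original);
    if (updated === original) continue; // nothing to change, leave the file alone
    fs.copyFileSync(file, file + ".bak"); // assumption: keep a backup next to the file
    fs.writeFileSync(file, updated, "utf8");
    changed.push(name);
  }
  return changed;
}

// Only runs when a directory is passed on the command line.
const target = process.argv[2];
if (target) {
  console.log(processDirectory(target));
}
```

Saved under any file name, it would be run as node followed by the file name and the directory. A second run over the same directory should report no changed files, for the reason given in the note above.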


Sublime Text

You can use Regular Expressions to speed up the process, see:

<li>([\w\s]+)<a(.*?)>(.*?)<\/li>$

Explanation:

  • ([\w\s]+): Captures the text before the a tag
  • <a(.*?)>: Captures the attributes of the a tag
  • (.*?): Captures the text inside the a tag, including the closing </a>

To use it in Sublime Text, press CTRL+H and then ALT+R to enable regular expression search. Put the expression above in the Find field and the following in the Replace field:

<li><a$2>$1</a></li>

Explanation:

  • $1: Inserts the text captured before the a tag
  • $2: Inserts the captured attributes of the a tag

Note that I used [\w\s]+ instead of [a-zA-Z0-9 ]+ because it is broader: it also matches underscores and any whitespace (tabs, line breaks), whereas [a-zA-Z0-9 ]+ captures only unaccented letters, digits and spaces.
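
The same find-and-replace can be tried outside the editor. A minimal JavaScript sketch, assuming the list items look like the example in the question; the function name moveLabelIntoLink is made up for illustration, and the g and m flags are added so that every line of a multi-line string is processed:

```javascript
// Hypothetical helper: applies the Find/Replace pair above to a string.
function moveLabelIntoLink(html) {
  // Group 1: text before the <a>; group 2: the <a> attributes;
  // group 3: everything from the link text up to </li> (discarded).
  const pattern = /<li>([\w\s]+)<a(.*?)>(.*?)<\/li>$/gm;
  return html.replace(pattern, "<li><a$2>$1</a></li>");
}

const before = '<li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>';
console.log(moveLabelIntoLink(before));
// prints: <li><a href="http://exemplo.com"> Revista alvo </a></li>
```

In JavaScript, \w without the u flag covers only unaccented letters, digits and the underscore, so labels with accented characters would need a wider character class there.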


0

Thanks to everyone who helped. I managed to solve it with the script below, which comes from an answer on the international Stack Overflow.

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width">
  <title>JS Bin</title>
<script src="https://code.jquery.com/jquery-1.12.4.js"></script>


</head>
<body>
<ul id="linksList">
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo10.com"> http://exemplo.com </a> </li>

  </ul>
  <a href="#" id="changeIt">Change</a>
  <script>
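    $(document).ready(function () {
      $("#changeIt").click(function (event) {
        // Keep the "#" link from jumping to the top of the page
        event.preventDefault();
        $("#linksList li").each(function () {
          // Text of the <li> up to the URL, i.e. the label before the link
          var txt = $(this).text().split(' http://')[0].trim();
          // Put the label inside the <a>, then make the <a> the only content of the <li>
          var lnk = $(this).children('a').text(txt);
          $(this).html(lnk);
        });
      });
    });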
  </script>
</body>
</html>
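
The key step in the script above is taking the part of the item's text that comes before the URL. A minimal sketch of that step on plain strings, assuming every link text starts with http:// or https://; the function name labelBeforeUrl is hypothetical, and accepting https:// is an addition to the original split(' http://'):

```javascript
// Hypothetical helper: returns the label that precedes the URL in an item's text.
function labelBeforeUrl(itemText) {
  // Split at the first run of whitespace followed by a URL scheme and keep the left part.
  return itemText.split(/\s+https?:\/\//)[0].trim();
}

console.log(labelBeforeUrl(" Revista alvo  http://exemplo.com  "));  // prints: Revista alvo
console.log(labelBeforeUrl(" Revista alvo https://exemplo.com "));   // prints: Revista alvo
```

If a label itself contained a URL, the split would cut it short, so it is worth checking a few items before applying the change to the whole list.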
