Add .html extension to the end of all links in multiple files (with Shell Script)


Hello.

I have a lot of HTML files that contain links in the following format: http://localhost:8080/tag:alguma_coisa and I need to add the extension .html to these links so that they look like this: http://localhost:8080/tag:alguma_coisa.html.

I’ve tried several combinations of find and sed, but none of them gave the expected result.

Does anyone have any idea how to do this with Shell Script?

  • Are the contents of these files just links? Or are there other things (comments, blank lines, etc.)?

  • It’s HTML source code. What I need is to find every HTML tag attribute in the format nome_atributo="tag:algumacoisa" and replace it with nome_atributo="tag:algumacoisa.html". For example, I have <a href="tag:city"> and I need to change it to <a href="tag:city.html">

2 answers

1

Assuming the files are in the current directory, have the .html extension, and contain nothing but links (one per line), it would look like this:

find . -type f -name "*.html" -exec sed 's/$/.html/g' {} \;

  • I used the command find . -type f -name "*.htm" -exec sed '/tag:.*"/ s/tag:.*"/&.html/g' {} \; and it was as close as I could get. The original pattern was <a href="tag:city">city</a> and I managed to put the .html at the end, like this: <a href="tag:city".html>city</a>. The problem is that it ended up outside the quotation marks.
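
The .html in that comment lands outside the quotes because tag:.*" matches through the closing quote, so &.html is appended after it. One possible way around this is to capture the value without its closing quote and write the quote back after .html. The following is only a minimal sketch: the function name add_html_inside_quotes is made up for illustration, it assumes the value sits between double quotes on a single line, and it does not handle # fragments.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reads HTML on stdin and appends
# .html to attribute values that contain "tag:", keeping it inside the quotes.
add_html_inside_quotes() {
    # ="            opening of an attribute value
    # [^"]*         cannot cross a quote, so the match stays inside the value
    # \( ... \)"    captures everything up to, but not including, the closing quote
    # \1.html"      re-emits the captured text, then .html, then the quote
    sed 's/\(="[^"]*tag:[^"]*\)"/\1.html"/g'
}

# Example with the line from the comment above
printf '%s\n' '<a href="tag:city">city</a>' | add_html_inside_quotes
# prints: <a href="tag:city.html">city</a>
```

Running it a second time on the same input would append .html again, so it is meant to be applied once.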

0


With the help of Marcelodez in a thread on Viva o Linux (VOL):

# Adjusts the links to tag:* files by appending .html
# With the important help of Marcelodez in the thread https://www.vivaolinux.com.br/topico/Sed-Awk-ER-Manipulacao-de-Textos-Strings/Acrescentar-extensao-html-ao-final-de-todos-os-links-em-varios-arquivos-com-Shell-Script/
find "$OUTPUTDIR" -type f -name "*.html" | xargs sed -i 's/href/\nhref/g' # breaks every A tag so that href starts a new line
find "$OUTPUTDIR" -type f -name "*.html" | xargs sed -i '/^href="[^"#]*tag:[^"#]*"/ s/"/.html"/2' # appends .html to the href value for tag:* links that have no #
find "$OUTPUTDIR" -type f -name "*.html" | xargs sed -i '/^href="[^"#]*tag:[^"#]*#/ s/#/.html#/1' # inserts .html before the # for tag:* links that have a #
find "$OUTPUTDIR" -type f -name "*.html" | xargs sed -i '/<a/ N;s/\n//' # joins href back onto the same line as its tag
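
For comparison, the same two cases (with and without #) could probably be handled in a single sed pass, without splitting and rejoining lines. This is only a sketch under a few assumptions: sed supports -E (GNU and BSD sed do), each href value is double-quoted and sits on one line, and the function name add_html_to_tag_links is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reads HTML on stdin and inserts
# .html into href values containing "tag:", before any # fragment.
add_html_to_tag_links() {
    # \1 = href=" plus the part of the value up to the fragment or closing quote
    # \2 = optional fragment (#...), empty when the link has none
    sed -E 's/(href="[^"#]*tag:[^"#]*)(#[^"]*)?"/\1.html\2"/g'
}

printf '%s\n' '<a href="tag:city">city</a>'   | add_html_to_tag_links
# prints: <a href="tag:city.html">city</a>
printf '%s\n' '<a href="tag:city#top">top</a>' | add_html_to_tag_links
# prints: <a href="tag:city.html#top">top</a>

# To edit files in place, the same expression could be passed to sed -i
# (GNU sed syntax assumed; OUTPUTDIR is the variable used in the answer above):
# find "$OUTPUTDIR" -type f -name '*.html' -exec sed -i -E 's/(href="[^"#]*tag:[^"#]*)(#[^"]*)?"/\1.html\2"/g' {} +
```

Using -exec ... {} + instead of a pipe to xargs also avoids problems with file names that contain spaces.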
