Find and Replace in bash

I need to find outdated lines in a csv file and replace them with new lines.

These are the commands that find the lines to be replaced (old) and the lines that will replace them (new).

linhas_antigas=$(diff -y arquivo_com_linhas_antigas.csv arquivo_com_linhas_novas.csv | grep -e "|" | awk -F"|" '{ print $1 }')
linhas_novas=$(diff -y arquivo_com_linhas_antigas.csv arquivo_com_linhas_novas.csv | grep -e "|" | awk -F"|" '{ print $2 }' | sed 's/\t *//')

Then I run the following chunk to replace:

while read -r arquivo_antigo 
do
    echo ${arquivo_antigo//"$linhas_antigas"/"$linhas_novas"} 
done < arquivo_com_linhas_antigas.csv

Now the problem: when the diff returns only one differing line between the two files, the replacement works fine. But if there are two or more lines to update, none of them is replaced.

I imagine it would be easier if the variables $linhas_antigas and $linhas_novas were arrays.

But how do I do this? Is there any other solution?

  • Post some example lines. I've done something similar, so I'll wait for your examples.

  • Old: Gabriel Hardoim;10;52;3 New: Gabriel Hardoim;12;55;3

  • Will the .csv files always have the same number of rows/records? If the corresponding row doesn't exist, should it be removed? Does the order of the records matter?

2 answers

2


From what I understood, this is like updating a backup file. Contents of file 1 (l1.csv):

A1 B1;10;52;3
A2 B2;12;52;3
A3 B3;10;52;3
A4 B4;10;34;3
A5 B5;10;52;3
A6 B6;10;33;3
A7 B7;08;52;4

Contents of file 2 (l2.csv):

A1 B1;10;52;1
A2 B2;12;52;2
A3 B3;10;52;3
A4 B4;10;34;3
A5 B5;10;52;5
A6 B6;10;33;6
A7 B7;08;52;4

Script:

#!/bin/bash
# Number of differing lines determines how many times the loop runs.
# It could also be a "while" loop: "while the files differ from each other, do".
qt=$(diff -y --suppress-common-lines l1.csv l2.csv | wc -l)
for (( i = 0; i < qt; i++ )); do
    # Always take the first remaining pair of differing lines
    linha=$(diff -y --suppress-common-lines l1.csv l2.csv | head -n1)
    # Old line (fields 1-2; assumes a two-word name, as in the sample data)
    la=$(awk '{print $1,$2}' <<< "$linha")
    # New line (fields 4-5, after the "|" separator)
    ln=$(awk '{print $4,$5}' <<< "$linha")
    # Load the file contents into a variable
    arq=$(cat l1.csv)
    # Replace the old line with the new one (quoted so both are taken literally)
    arq=${arq//"$la"/"$ln"}
    # Write the change back to the original file
    echo "$arq" > l1.csv
done

Output:

A1 B1;10;52;1
A2 B2;12;52;2
A3 B3;10;52;3
A4 B4;10;34;3
A5 B5;10;52;5
A6 B6;10;33;6
A7 B7;08;52;4

The method above makes the change line by line. If I understood correctly and you simply want to update one file to match another, you could do the following:

#!/bin/bash
if [[ -n $(diff -q l1.csv l2.csv) ]]; then
    cat l2.csv > l1.csv
fi

Your script wasn't working because you put all the differing lines into a single variable. The replacement has to be done line by line, which is why it worked when there was only one differing line.
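To make the line-by-line idea concrete, here is a minimal sketch using arrays, as the question suggested. It assumes bash 4 or later (for mapfile) and that both files have the same number of lines in the same order. The function name update_lines is made up for illustration. The last three lines show why the multi-line variable never matches.

```bash
#!/bin/bash
# Hypothetical helper (not from the question): print old_file, replacing each
# line that differs from the line at the same position in new_file.
# Assumes both files have the same number of lines, in the same order.
update_lines() {
    local old_file=$1 new_file=$2 i
    local -a old_lines new_lines
    mapfile -t old_lines < "$old_file"   # one array element per line
    mapfile -t new_lines < "$new_file"
    for i in "${!old_lines[@]}"; do
        if [[ "${old_lines[i]}" != "${new_lines[i]}" ]]; then
            printf '%s\n' "${new_lines[i]}"   # outdated line: emit the new one
        else
            printf '%s\n' "${old_lines[i]}"   # unchanged line: keep it
        fi
    done
}

# Why the original attempt fails: the pattern holds two lines joined by a
# newline, and no single line read from the file contains a newline.
both=$'A1 B1;10;52;3\nA2 B2;12;52;3'
line='A1 B1;10;52;3'
echo "${line//"$both"/replaced}"   # the line is printed unchanged
```

It could be called as update_lines arquivo_com_linhas_antigas.csv arquivo_com_linhas_novas.csv > atualizado.csv, where atualizado.csv is a hypothetical output name. Redirect to a separate file: redirecting onto the input file would truncate it before it is read.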

1

You can implement a script in gawk to process your files, for example:

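BEGIN{
}

{
    # First file: store each line, indexed by its line number
    if( FNR == NR )
    {
        a[FNR] = $0;
        next;
    }

    # Second file: print the line at the same position
    # (the stored one if unchanged, otherwise the new one)
    print (a[FNR] == $0) ? a[FNR] : $0;
}

END{
}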

Or, as a one-liner:

$ awk 'FNR==NR{a[FNR]=$0;next}{print a[FNR]==$0?a[FNR]:$0}' antigas.csv novas.csv

Assuming the input .csv files look something like this:

antigas.csv:

JESUS DE NAZARE;15;21;1
MARIA MAGDALENA;12;52;3
JOAO DE DEUS;33;52;5
MATUZALEM DA COSTA;10;34;7
MICHAEL JACKSON;10;28;2
DINO DA SILVA SAURO;16;32;4
FULANO DE TAL;84;25;6

novas.csv:

JESUS DE NAZARE;15;21;8
MARIA MAGDALENA;12;52;3
JOAO DE DEUS;33;52;5
MATUZALEM DA COSTA;15;34;7
MICHAEL JACKSON;10;28;2
DINO DA SILVA SAURO;14;32;9
FULANO DE TAL;84;25;6

Output:

JESUS DE NAZARE;15;21;8
MARIA MAGDALENA;12;52;3
JOAO DE DEUS;33;52;5
MATUZALEM DA COSTA;15;34;7
MICHAEL JACKSON;10;28;2
DINO DA SILVA SAURO;14;32;9
FULANO DE TAL;84;25;6
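
The one-liner pairs lines by position, so it relies on both files listing the records in the same order. If that cannot be guaranteed, one possible variant is to match records on the first field (the name). This is only a sketch under that assumption: the function name update_by_key is hypothetical, names are assumed to be unique within each file, and records that exist only in the new file are ignored.

```bash
#!/bin/bash
# Hypothetical helper: print the old file, replacing each record with the
# record from the new file that has the same first field (the name).
# Assumption: the name is unique within each file.
update_by_key() {
    local old_file=$1 new_file=$2
    # First pass (NR==FNR) loads the new file into an array keyed by name;
    # second pass walks the old file and prints the replacement when one exists.
    awk -F';' 'NR==FNR { repl[$1] = $0; next }
               { print (($1 in repl) ? repl[$1] : $0) }' "$new_file" "$old_file"
}
```

For example, update_by_key antigas.csv novas.csv prints the records in the order of antigas.csv, each taken from novas.csv when a record with the same name exists there.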
