Delete all lines that have a specific string

Viewed 2,307 times

1

I would like to delete all the lines of a block that contains the string Excluído, where the block runs from one specific string to another. Example:

<p class="FolderPath">
    <table class="DiffTable">
        <tr>
                <td class="DiffName">
                    <a name="I36676">
                    </a>
                </td>
                <td>
                    Excluído no destino
                </td>
           </tr>
         </table>
</p>

That is, when the string Excluído appears in the block, delete every line from <p class="FolderPath"> up to </p>. I know how to delete just the line containing the string:

sed '/Excluído/d' arquivo.txt

How would I delete the N lines starting at <p class="FolderPath"> and ending at </p>?

2 answers

2

Here is a quick solution; I'll add a more polished one later.

sed -n '/DiffTable/,/<\/p>/p' arquivo.html |grep -q Excluído && sed '/"DiffTable"/,/<\/p>/{//!d}' arquivo.html || echo "Palavra buscada nao existe"

Explanation of each part:

sed -n '/DiffTable/,/<\/p>/p' arquivo.html |grep -q Excluído &&

sed extracts everything between the DiffTable line and the </p> line, and grep checks whether the string Excluído occurs in it.

sed '/"DiffTable"/,/<\/p>/{//!d}' arquivo.html

This is where the deletion happens: every line between the DiffTable line and the </p> line is deleted. The {//!d} block keeps the two delimiter lines themselves, because the empty pattern // reuses the last regular expression that was tested. If you want to delete the delimiter lines as well, replace it with just d.

|| echo "Palavra buscada nao existe"

This only reports that grep did not find the word.
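To see the whole pipeline in action, here is a minimal sketch run against a throwaway file. The sample content and the temporary file are assumptions made for illustration, not the asker's real data, and `{//!d;}` is written with a semicolon before the closing brace only so that it also runs on non-GNU sed.

```shell
# Hypothetical sample file, modelled on the snippet in the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<p class="FolderPath">
    <table class="DiffTable">
        <tr>
            <td>Excluído no destino</td>
        </tr>
    </table>
</p>
EOF

# Step 1: print only the DiffTable..</p> range and test it for the word.
# Step 2: if found, delete the inside of the range but keep both delimiter lines.
# Step 3: otherwise print the "not found" message.
result=$(sed -n '/DiffTable/,/<\/p>/p' "$tmp" | grep -q 'Excluído' \
  && sed '/"DiffTable"/,/<\/p>/{//!d;}' "$tmp" \
  || echo "Palavra buscada nao existe")

printf '%s\n' "$result"
rm -f "$tmp"
```

With this input only three lines should remain: the opening <p> line (it is outside the range), the DiffTable line and the closing </p> line. Note that the range is applied to every DiffTable block in the file, whether or not that particular block contains the word.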

  • I used the command sed -n '/DiffTable/,/<\/p>/p' arquivo.html |grep -q Excluído && sed '/"DiffTable"/,/<\/p>/{//d}' arquivo.html || echo "Palavra buscada nao existe" without the exclamation mark before d. The output came back without the <table class="DiffTable"> line and without the </p> line, but the other lines were not deleted.

  • Oops, did you write the deletion part like this? sed '/"DiffTable"/,/<\/p>/d' arquivo.html That works here the way I understood you wanted: it deletes all lines between these tags if the word Excluído is present. Let me know if it works.

  • Ah yes, the problem was in the d; it works now!

  • @Mrpaper, if the file has two paragraphs with tables, one containing "Excluído" and the other not, your program will delete both! It seems to me that what is being asked is to delete only the paragraphs that contain "Excluído".

  • 1

    Oops, Jjoão, that's true. If there are two identical tags, one with the searched string and the other without, it will delete both. If his file repeats the searched tags, it will not work.

  • I ran the test with more than one tag, as Jjoão said, and it really does delete both. How should I proceed?

1


Using Perl:

perl -n0E 'say grep !/Excluído/ , split(/(<p class="FolderPath">.*?<\/p>)/s)'

where:

  • split(/(<p class="FolderPath">.*?<\/p>)/s) splits the input into the matching paragraphs and the text between them
  • say grep !/Excluído/ prints the chunks that do not contain "Excluído"
  • perl -n0E processes the whole file at once
  • I ran the command like this: perl -n0E 'say grep !/Excluído/ , split(/(<p class="FolderPath">.*?<\/p>)/s)' arquivo.html and it returned the following: /Excluído/: Event not found.

  • @Rakiz, you're on Linux, right? It works well here. Are you sure you included the ' ... ' quotes?

  • Yes, I'm on Ubuntu. It keeps giving the same message: /Excluído/: Event not found.

  • @Rakiz, make sure you have typed the ' ' that surround the command (literally copy and paste it).

  • My mistake, Jjoão: I was testing your command on the terminal, not in a script. Now it worked correctly, without deleting the tags when they repeat. How do I use this perl command to modify the file itself, instead of just printing the result in the terminal?
