Deleting lines only if any cell in the worksheet is empty (using a script)

Asked

Viewed 480 times

1

I have a spreadsheet from Calc (libreoffice)/Excel (actually it’s a text file that I opened as a spreadsheet for easy viewing) that contains some blank cells (not the entire line):

planilha

I would like it if any cell on the right is empty, the entire line is erased. It is a few thousand lines, so it is impossible to do manually.

  • Is this a csv file? If so, I think I have a regular expression for you

  • 1

    @Jefferson Quesado: It is not csv but it is easy to save it as . csv!

  • Now that I realized, in the title may be that any blank cell excludes the line, but in the text only the cell on the right that excludes

2 answers

4

We want to select only lines that have two filled cells in a csv. For this we can use the following regular expression :

^..*,..*$

I am using simple regular expressions, so that even processors that do not implement the metacharacter + can understand. Compatible with grep and sed.

Use ..* has the same result as .+: it is necessary to have at least one character, it cannot be the empty string. Thus, we guarantee that both columns have value.

If a column separator other than , (as in other DSVs), only change the separation character. This solution does not guarantee generalized cells, where the separator can be inside quotes or escapes.

3


I ran a test like this to simulate your blank cells:

>>> x = [1491828000,1491828180,1491828360,1491828540,1491828720,1491828900,1491829080,1491829260,1491829440,1491829620,1491829800,1491829980,1491830160]
>>> y = ["1", "2", "3", "", "5", "", "", "", "9", "10", "", "12", ""]

And I ran a CSV like this:

with open('teste.csv', 'a+') as f:
    for i in range(0, len(x)):
            if y[i]: f.write(str(x[i]) + "," + str(y[i]) + "\n")

Stayed like this:

Resultado

Browser other questions tagged

You are not signed in. Login or sign up in order to post.