Ignore the first line

3

I am using the following command to sort a file:

sort -u arquivo.csv -o arquivo.csv

But I need it to ignore the first line, which in this case is the header.

How can I do that?

  • 1

    Reading from and writing to the same file at the same time does not usually work (although I have never read the sort documentation to be sure about this command). In this case, do you want to separate the header and sort the rest of the file?

  • Exactly that!!

2 answers

3

You can use the tail utility to skip the first line of the input file before piping its output to sort, like this:

$ tail -n +2 arquivo.csv | sort -u -o arquivo.csv

According to the sort documentation, sort reads all of its input before opening and writing the output file (specified with the -o option), which makes it safe to sort the file in place.

Note that the first line of the file is lost when this command runs.
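To see that effect without touching real data, here is a minimal sketch that runs the same pipeline on a scratch file. The sample rows and the variable names are made up for illustration, and it assumes a sort that reads all of its input before opening the -o file, as described in the linked reference.

```shell
# Hypothetical sample data in a scratch file (not the asker's arquivo.csv).
scratch=$(mktemp)
printf 'id,name\n3,carol\n1,alice\n3,carol\n2,bob\n' > "$scratch"

# Remember the header so it can be compared afterwards.
header_before=$(head -n 1 "$scratch")

# Same pipeline as above: tail skips line 1, sort writes back to the same file.
tail -n +2 "$scratch" | sort -u -o "$scratch"

# The first line is now the smallest data row, not the header.
first_after=$(head -n 1 "$scratch")
line_count=$(wc -l < "$scratch" | tr -d ' ')

printf 'first line before: %s\n' "$header_before"
printf 'first line after:  %s\n' "$first_after"
printf 'lines remaining:   %s\n' "$line_count"

rm -f "$scratch"
```

If the header matters, save it somewhere before running the pipeline, since nothing in this command puts it back.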

Reference: https://unix.stackexchange.com/questions/29697/does-sort-support-sorting-a-file-in-place-like-sed-in-place

  • 2

    Good to know that -o outfile is lazy. On second thought, every sort requires this, because the first element can only be known after the whole dataset has been processed.

2


Use the tail command with the -n option:

tail -n +2 arquivo.csv

Here, -n +2 prints the file from the second line onwards (you can change the number to any value you need).
The + sign is important: if you write only -n 2, tail prints the last two lines instead.
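The difference between the two forms is easy to check on a small file. This is only a sketch: the sample content and the variable names are invented for the example.

```shell
# Hypothetical four-line sample: one header plus three data rows.
sample=$(mktemp)
printf 'id,name\n3,carol\n1,alice\n2,bob\n' > "$sample"

# With the plus sign: everything from line 2 onwards (header dropped).
from_second=$(tail -n +2 "$sample")

# Without the plus sign: only the last two lines of the file.
last_two=$(tail -n 2 "$sample")

printf 'tail -n +2:\n%s\n' "$from_second"
printf 'tail -n 2:\n%s\n' "$last_two"

rm -f "$sample"
```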

Then you can pipe the result directly to sort:

tail -n +2 arquivo.csv | sort -u > arquivo_ordenado.csv

Also, I recommend saving the output to another file (as I did above), since reading from and writing to the same file does not usually work well. In my experience, depending on the command, the file is overwritten or truncated and you end up losing the data. I do not know how sort handles this, but I would rather not risk it.

Lacobus's answer addresses this specific point about sort.


Keep the header

As pointed out in the comments, the command above loses the first line (the header). If you need it in the final file, one way is to split the work into two commands.

First save the first line to the final file, using the head command:

head -1 arquivo.csv > arquivo_ordenado.csv

The -1 option tells head to print only the first line. Then run tail and sort, and append the result to the file:

tail -n +2 arquivo.csv | sort -u >> arquivo_ordenado.csv

Notice that I am now using >>, which appends the output to the end of the file. If I used only >, the contents of the file would be overwritten and the header would be lost.
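If the sorted result should end up back in the original file, the two steps can be grouped and written to a temporary file that then replaces the original. This is a sketch under assumptions, not part of the original answer: the sample rows, the scratch directory and the .tmp name are hypothetical, and head -n 1 is the POSIX spelling of head -1.

```shell
# Work in a scratch directory with hypothetical sample data.
workdir=$(mktemp -d)
csv="$workdir/arquivo.csv"
printf 'id,name\n3,carol\n1,alice\n3,carol\n2,bob\n' > "$csv"

# Write header + sorted body to a temporary file, then replace the original.
# The original is only replaced if the whole group succeeded (&&).
tmp="$workdir/arquivo.csv.tmp"
{
    head -n 1 "$csv"             # header line, unchanged
    tail -n +2 "$csv" | sort -u  # remaining lines, sorted and de-duplicated
} > "$tmp" && mv "$tmp" "$csv"

result=$(cat "$csv")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Writing to a separate file first keeps the original intact if something fails midway, which follows the same caution about not reading and writing one file at once.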

  • All that is left is to save the header so it can be put in the sorted file.

  • @Jeffersonquesado Indeed, when I read "I need to skip the first line", I understood that it did not need to be in the final file, but it makes sense that it should be. I updated the answer, thank you!
