I need a command that replaces a specific pattern on each line of a file as many times as necessary, until the pattern is no longer found.
For example, in a CSV file, fields are separated by a semicolon (;).
Null fields have no character, as in the following file representing a contact list with 3 records:
Nome;Sobrenome;Telefone1;Telefone2;Email
Joao;Silva;9999-8888;9292-9292;[email protected]
Maria;Souza;8899-0011;;[email protected]
Carlos;Oliveira;;;
The first line is the file header. The contact Maria Souza has a null Telefone2 field, and the contact Carlos Oliveira has null Telefone1, Telefone2 and Email fields.
I want to add \N where the field is null.
On Linux, I use the command:
$ sed -e 's/;;/;\\N;/g' -e 's/;$/;\\N/' arquivo.csv > novo-arquivo.csv
The result is satisfactory for the Maria Souza record, but not for Carlos Oliveira. After finding the first ;; and performing the substitution (Carlos;Oliveira;\N;;), sed does not take the replacement text into account when it continues searching. It moves on to the next pattern, ;$, leaving the result like this:
Carlos;Oliveira;\N;;\N
One null field still remains.
I would like a solution for both Unix and Windows.
I don’t think it’s a good idea to process a CSV file with regular expressions, but since you probably already know this and are simply preparing data to feed to another program, I’ll let it go ;)
– motobói
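One possible approach to the problem described in the question, offered as a minimal sketch rather than a definitive answer, is to make sed repeat the ;; substitution on each line until it no longer matches, using a label and the t (branch if substituted) command. The sample file below is modeled on the one in the question; the e-mail addresses are placeholders, since the originals are not readable here.

```shell
# Sample input modeled on the question; the e-mail addresses are placeholders.
cat > arquivo.csv <<'EOF'
Nome;Sobrenome;Telefone1;Telefone2;Email
Joao;Silva;9999-8888;9292-9292;joao@example.com
Maria;Souza;8899-0011;;maria@example.com
Carlos;Oliveira;;;
EOF

# ":a" defines a label. "ta" jumps back to it whenever the preceding s///
# made a substitution, so ";;" is replaced again and again until none is left.
# The last expression then fills an empty final field.
sed -e ':a' -e 's/;;/;\\N;/' -e 'ta' -e 's/;$/;\\N/' arquivo.csv > novo-arquivo.csv

cat novo-arquivo.csv
```

With this loop the Carlos Oliveira record becomes Carlos;Oliveira;\N;\N;\N. Like the original command, the sketch does not cover an empty first field (a line that starts with ;). On Windows the same sed script should work wherever a sed port is installed, but that is an assumption, and the quoting will likely need adjusting because cmd.exe does not treat single quotes the way a POSIX shell does.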