Command to replace characters recursively




I need a command that replaces a specific pattern on each line of a file as many times as necessary, until the pattern is no longer found.

For example, in a CSV file, fields are separated by a semicolon ;.
Null fields are empty (no characters between the separators), as in the following file representing a contact list with 3 records:

Joao;Silva;9999-8888;9292-9292;[email protected]
Maria;Souza;8899-0011;;[email protected]
Carlos;Oliveira;;;

The first line is the file header. The contact Maria Souza has a null Telefone2 field, and the contact Carlos Oliveira has null Telefone1, Telefone2 and Email fields.

I want to add \N where the field is null.

On Linux, I use the command:

$ sed -e 's/;;/;\\N;/g' -e 's/;$/;\\N/' arquivo.csv > novo-arquivo.csv

The result is satisfactory for the Maria Souza record, but not for Carlos Oliveira: after finding the first ;; and performing the substitution (Carlos;Oliveira;\N;;), sed does not take the replacement text into account when it continues searching, and moves on to the next pattern, ;$, leaving the result like this:

Carlos;Oliveira;\N;;\N

One null field still remains.
I would like a solution for both Unix and Windows.
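
The behaviour described above can be reproduced one expression at a time. The sketch below splits the command from the question into two hypothetical helper functions (step_double and step_trailing are names made up for illustration) and assumes a POSIX shell with sed available:

```shell
#!/bin/sh
# Hypothetical helpers: the question's command split into its two expressions.

# Step 1: global replacement of ";;". Matches never overlap, so in ";;;"
# only the first pair is replaced and the scan resumes at the third ";".
step_double() { sed -e 's/;;/;\\N;/g'; }

# Step 2: replacement of a single trailing ";".
step_trailing() { sed -e 's/;$/;\\N/'; }

record='Carlos;Oliveira;;;'

printf '%s\n' "$record" | step_double
# prints: Carlos;Oliveira;\N;;

printf '%s\n' "$record" | step_double | step_trailing
# prints: Carlos;Oliveira;\N;;\N
```

The ;; left over after step 1 is made of one semicolon from the replacement text and one that was not yet scanned, which is why a single pass with the g flag never sees it.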

  • I don’t think it’s a good idea to process a CSV file with regular expressions, but since you probably already know that and are just preparing data to feed to another program, I’ll let it go ;)

3 answers


Use perl, which supports lookahead:

 perl -p -e 's/;(?=;|$)/;\\N/g' arquivo.csv > novo-arquivo.csv

Incidentally, if you want to make the change within the same file (without having to redirect to another), simply pass the -i option (in-place):

 perl -p -i -e 's/;(?=;|$)/;\\N/g' arquivo.csv
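
As a quick check of the lookahead approach, the sketch below wraps the one-liner in a hypothetical function (fill_nulls_lookahead is an illustrative name) and feeds it two sample records on standard input. The value maria in the last field is a placeholder, not data from the question:

```shell
#!/bin/sh
# Hypothetical wrapper around the perl one-liner; reads CSV lines on stdin.
# The lookahead (?=;|$) checks that the next character is ";" or the end of
# the line without consuming it, so consecutive empty fields all match.
fill_nulls_lookahead() {
    perl -p -e 's/;(?=;|$)/;\\N/g'
}

printf '%s\n' 'Maria;Souza;8899-0011;;maria' 'Carlos;Oliveira;;;' | fill_nulls_lookahead
# prints:
# Maria;Souza;8899-0011;\N;maria
# Carlos;Oliveira;\N;\N;\N
```

Because the lookahead does not consume the following semicolon, that semicolon is still available as the start of the next match, so no loop is needed.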
  • Very interesting, @Lias. But the command you posted does not handle the last field of a record when the line ends with ;, which means the last field is null and should be replaced by ;\N. Can I add a second -e parameter, like this? perl -p -e 's/;(?=;)/;\\N/g' -e 's/;$/;\\N/g' arquivo.csv > novo-arquivo.csv

  • @ricidleiv true, I just fixed it, thanks.

  • And what about, for example, deleting a word recursively in a line, like the word nulo in the string nnuloulochavenulnuloo, so that only the word chave is left? How could I do that in Perl?

  • Well, in that case I don’t think there is any way to avoid a loop. The one-liner probably looks nicer in sed itself: sed ':loop; s/nulo//g; t loop;' -- hard to beat that. =)

  • You can simplify the case of a line ending with ; by using perl -p -e 's/;(?=;|$)/;\\N/g;'

  • @rodrigorgs good, Rodrigo! I’ll incorporate the suggestion! -- although it gets a little harder to read, right?
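
The sed loop suggested in the comments above for the nulo example can be tried as follows. This is a minimal sketch: strip_word_repeatedly is a hypothetical name, and the expressions are passed with separate -e options rather than separated by semicolons, which is assumed to be the more portable spelling:

```shell
#!/bin/sh
# Hypothetical helper: delete "nulo" again and again until none is left.
# "t loop" jumps back to the label only if the last s command changed the line.
strip_word_repeatedly() {
    sed -e ':loop' -e 's/nulo//g' -e 't loop'
}

printf '%s\n' 'nnuloulochavenulnuloo' | strip_word_repeatedly
# prints: chave
```

The first pass leaves nulochavenulo, because removing the inner occurrences creates new ones; the second pass removes those and the third finds nothing, which ends the loop.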



I’m used to Java development environments, both on Linux and Windows, so I would use an Ant task to perform cross-platform file-handling operations like this.

Ant is a powerful and versatile tool used for automation, builds (compilation and packaging) and file processing. It is important to note that Ant is not a programming language, as some think, but a declarative way of describing the activities (tasks) to be executed.

Installing Ant

Download the binary package here, unpack it into a folder and add its bin directory to the PATH of your operating system.

Example in Windows:

set path=%path%;c:\caminho\apache-ant-1.9.3\bin

Writing the Ant Build

The following Ant project replaces lines in a given file:

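<project name="MyProject" default="replace" basedir=".">
    <target name="replace">
        <!-- file/match/replace are reconstructed assumptions; only flags survived in this copy -->
        <replaceregexp file="${file}" match=";(?=;|\r?\n|$)" replace=";\\\\N" flags="gs" />
    </target>
</project>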

Running the Project

Ant automatically looks for a file called build.xml in the current directory. So, if file.txt is the file to be processed, the following command performs the replacement:

ant -Dfile=file.txt

If the Ant build file has another name, you can use the -f parameter:

ant -f /caminho/meu-build.xml -Dfile=file.txt

Learning more about Ant

Just read the manual in full.


The Linux sed command supports labels, which are useful for repeating a substitution until it no longer matches.

For example, it can be used as follows:

$ sed -e ':loop' -e 's/;;/;\\N;/g' -e 't loop' -e 's/;$/;\\N/' arquivo.csv > novo-arquivo.csv
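
To see the loop at work, the sketch below wraps the same expressions in a hypothetical function (fill_nulls_loop is an illustrative name) and runs it on two sample records. The value maria in the last field is a placeholder:

```shell
#!/bin/sh
# Hypothetical wrapper around the looped sed command; reads CSV lines on stdin.
# ":loop" defines a label, and "t loop" jumps back to it whenever the previous
# s command changed the line, so ";;" is replaced until none is left.
fill_nulls_loop() {
    sed -e ':loop' -e 's/;;/;\\N;/g' -e 't loop' -e 's/;$/;\\N/'
}

printf '%s\n' 'Maria;Souza;8899-0011;;maria' 'Carlos;Oliveira;;;' | fill_nulls_loop
# prints:
# Maria;Souza;8899-0011;\N;maria
# Carlos;Oliveira;\N;\N;\N
```

For the Carlos record the loop body runs three times: two passes that each replace one ;; and a final pass that finds nothing. Only then does the last expression handle the trailing semicolon.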

Keep in mind that if the file was generated on Windows and you are running the command on Linux, you should first convert the file from DOS to Unix format, because the end-of-line characters differ. The same applies in the other direction.

You can use the dos2unix or unix2dos commands for that.
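
Where dos2unix is not installed, one possible workaround is to drop the carriage returns with tr before sed sees the line. The sketch below uses a hypothetical function name (fill_nulls_crlf_safe) and assumes the data never contains a carriage return inside a field, since tr removes every one of them:

```shell
#!/bin/sh
# Hypothetical helper: strip carriage returns, then apply the looped substitution.
# Without the tr step, a line ending in ";\r\n" would not match ";$",
# because the carriage return sits between the ";" and the end of the line.
fill_nulls_crlf_safe() {
    tr -d '\r' | sed -e ':loop' -e 's/;;/;\\N;/g' -e 't loop' -e 's/;$/;\\N/'
}

# A record with DOS line endings (CR LF).
printf 'Carlos;Oliveira;;;\r\n' | fill_nulls_crlf_safe
# prints: Carlos;Oliveira;\N;\N;\N
```

The output uses Unix line endings; if the consumer expects DOS line endings, unix2dos can convert the result back.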

  • By the way, sed also supports separating commands with ; and can edit the file in place, so you could do: sed -i ':loop; s/;;/;\\N;/g; t loop; s/;$/;\\N/' arquivo.csv -- the result is equivalent to the second perl command in my answer =)
