Remove duplicate characters from a string unless they form a digraph


Viewed 1,079 times

1

How can I remove repeated characters from a string using a regex, unless they form a digraph (rr, ss)? For example:

Oiiiii => Oi

Aloooo => Alo

Passado => Passado

Carro => Carro

If rr or ss appears at the beginning or end of a word, the duplicate can be removed, e.g.:

Carross => Carros

  • Can r or s appear more than twice in a row? If so, what should happen in that case?

  • If that happens, keep only two and remove the rest, e.g. carrrro => carro

  • @Thiagor. What have you tried?

1 answer

2


I would like to start by saying that this regex does not cover 100% of your cases, but it is correct in almost all of them. Honestly, I don’t see a way to cover the rest without complicating the regex drastically, and possibly handling them by hand in code.

But let’s start with the regex itself:

([^rs])(?=\1+)|(rr)(?=r+)|(ss)(?=s+)

See it on regex101

Explanation:

([^rs])  - Any character other than r or s
(?=\1+)  - That is followed by one or more copies of itself
|(rr)    - Or two r's
(?=r+)   - That have more r's ahead
|(ss)    - Or two s's
(?=s+)   - That have more s's ahead

Then replace each match with nothing (the empty string), because what gets matched is exactly the duplicated letters you want to remove.

Testing. Input:

oiiiiiiii amiggggos passssado Carrrrrrrros

Output:

oi amigos passado Carros
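To try the same substitution outside regex101, here is a minimal Python sketch. The thread does not name a language, so Python and the function name remove_duplicates are assumptions made only for illustration; the pattern itself is the one from this answer.

```python
import re

# The pattern from the answer: each match is a surplus character,
# or a surplus "rr"/"ss" pair that still has more of the same letter ahead.
PATTERN = re.compile(r"([^rs])(?=\1+)|(rr)(?=r+)|(ss)(?=s+)")


def remove_duplicates(text: str) -> str:
    """Delete every match, i.e. replace it with the empty string."""
    return PATTERN.sub("", text)


if __name__ == "__main__":
    # Print each test word next to its cleaned-up form.
    for word in ["oiiiiiiii", "amiggggos", "passssado", "Aloooo"]:
        print(word, "=>", remove_duplicates(word))
```

The match is case-sensitive and works on any character, not only letters, so repeated spaces or punctuation are collapsed as well.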

You can always adapt the regex to other letters that are allowed to double: change the class [^rs] and add a group like (rr) for each extra letter you want.
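If the set of letters changes often, the pattern could be generated instead of edited by hand. This is only a sketch in Python: build_pattern and its doubled parameter are hypothetical names, and the groups for the doubled letters are non-capturing here, which does not change what is matched.

```python
import re


def build_pattern(doubled: str = "rs") -> re.Pattern:
    """Build the answer's regex for a chosen set of letters allowed to double.

    `doubled` is a hypothetical parameter: every character in it is treated
    the way r and s are treated in the original pattern.
    """
    if not doubled:
        raise ValueError("at least one letter is required")
    escaped = [re.escape(ch) for ch in doubled]
    # Any character outside the set, followed by at least one copy of itself.
    parts = [r"([^" + "".join(escaped) + r"])(?=\1+)"]
    # For each allowed letter: a pair that still has more copies ahead.
    parts += [f"(?:{e}{e})(?={e}+)" for e in escaped]
    return re.compile("|".join(parts))


if __name__ == "__main__":
    # Allow l to double in addition to r and s.
    pattern = build_pattern("rsl")
    print(pattern.sub("", "pas" + "s" * 3 + "ado"))
```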

Note that if you pass in Carross, the regex cannot tell that it was supposed to be Carros; handling that would complicate things considerably.
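One way to cover this case is to give up on a single regex and apply a few substitutions in sequence. The sketch below is not part of the original answer; it assumes Python, lowercase r and s, and a hypothetical function name normalize. It also keeps exactly two letters for any inner run of r or s, whereas the single regex above leaves only one when the run has an odd length (three r's become one).

```python
import re


def normalize(text: str) -> str:
    # 1. Any character other than r or s: collapse a run to a single copy.
    text = re.sub(r"([^rs])\1+", r"\1", text)
    # 2. r or s repeated at the start or end of a word: keep a single letter.
    text = re.sub(r"\b([rs])\1+", r"\1", text)
    text = re.sub(r"([rs])\1+\b", r"\1", text)
    # 3. Remaining runs of three or more r or s inside a word: keep exactly two.
    text = re.sub(r"([rs])\1{2,}", r"\1\1", text)
    return text


if __name__ == "__main__":
    for word in ["Carross", "passssado", "oiiiiiiii"]:
        print(word, "=>", normalize(word))
```

The order matters: the word-boundary steps run before the inner runs are capped at two, so a doubled letter at the edge of a word is reduced to one rather than kept as a digraph.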

  • Very good! That’s enough for me. Thank you so much for your help!
