How to insert a character in the middle of the sentence

Viewed 896 times

5

I am implementing a function that checks whether within a word there are two equal letters in a row, and if so, add an x between the two letters to separate them.

Example:

  • Input: pizza
  • Output: pizxza

def checagem_letras(self):
    i = 0
    for letra in self.__frase:
        if self.__frase[i] == self.__frase[i+1]:
            ?????????

I don’t know what method to implement to make the code work.

  • Diovana Valim, will there be cases where the input has more than two repeated letters in a row? Example: 'Pizzza', 'Carburarrrrrrrrrr'

  • Complementing @Augustovasques's comment: I understood that if the string is 'aaa', the result should be 'axaxa' (an "x" inserted between each occurrence of two repetitions of the same letter), and I even answered with code that does this...

3 answers

4

One solution is to iterate over the characters of the string and add the "x" only if the next character is equal to the current one:

class Test:

  def __init__(self, frase):
    self.__frase = frase

  def checagem_letras(self):
    result = ''
    for i, letra in enumerate(self.__frase):
        result += letra
        if i < len(self.__frase) - 1 and letra == self.__frase[i + 1]:
            result += 'x'
    return result

print(Test("pizza").checagem_letras()) # pizxza
print(Test("pizzaiollo").checagem_letras()) # pizxzaiolxlo
print(Test("aaa bbbb").checagem_letras()) # axaxa bxbxbxb

I use enumerate to traverse the characters of the string while also getting the respective index. That way I can check the next character, taking care not to do so on the last character (the condition i < len(self.__frase) - 1), because in that case there is no next character: I would try to access an index that does not exist, and an IndexError would be raised.

If the next character is equal to the current one, I add the "x". At the end I return the changed string.

The catch is that I had to build another string, since strings are immutable in Python (see the documentation), so it is not possible to change individual positions of an existing string. That is, this code:

s = 'abc'
s[1] = 'x'

raises a TypeError, because I tried to assign to a string index. So the only way to do what you want is to create another string.
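To make the immutability point concrete, here is a minimal sketch of two common ways to "insert" a character by building a new string. The variable names are only illustrative.

```python
s = 'abc'

# Item assignment on a str raises TypeError because strings are immutable.
try:
    s[1] = 'x'
except TypeError as exc:
    error_message = str(exc)

# Option 1: slice around the position and concatenate into a new string.
inserted = s[:1] + 'x' + s[1:]

# Option 2: edit a mutable list of characters, then join it back together.
chars = list(s)
chars.insert(1, 'x')
joined = ''.join(chars)

print(inserted)  # axbc
print(joined)    # axbc
```

In both cases the original s is left untouched; only a new string is produced.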

It is also worth noting that in the case of 3 repetitions (aaa) I understood that an "x" should be inserted between each occurrence of two repeated letters, and therefore the result should be axaxa (it was not clear whether this case can happen, nor what should happen if it does).


It is unclear whether the method should return the changed string or just modify the current phrase. If you only want to modify the current phrase, do:

def checagem_letras(self):
    result = ''
    for i, letra in enumerate(self.__frase):
        result += letra
        if i < len(self.__frase) - 1 and letra == self.__frase[i + 1]:
            result += 'x'
    # modify the phrase instead of returning it
    self.__frase = result

Also, the above algorithm inserts the "x" for any repeating character (not just letters). If you want to restrict it to letters, you can change the condition of the if. For example:

if i < len(self.__frase) - 1 and letra.isalpha() and letra == self.__frase[i + 1]:

I used isalpha(), which checks if it is a letter (thus other characters will be ignored even if they are repeated). Change the condition to whatever you need.
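As a rough sketch, the same condition can be placed in a standalone function to see its effect on non-letter characters. The function name insert_x_between_letters is hypothetical and not part of the original class.

```python
def insert_x_between_letters(text):
    """Return text with 'x' inserted between equal adjacent letters only."""
    result = []
    for i, ch in enumerate(text):
        result.append(ch)
        is_last = i == len(text) - 1
        # Only letters trigger the insertion; repeated digits or spaces are ignored.
        if not is_last and ch.isalpha() and ch == text[i + 1]:
            result.append('x')
    return ''.join(result)


print(insert_x_between_letters("pizza 1100"))  # pizxza 1100
```

Here the pieces are collected in a list and joined at the end, which is a common alternative to repeated string concatenation.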


Another way of doing it - a little more complicated, and I admit that for this case it is somewhat overkill, since the above solution is much simpler - is to use regular expressions (regex):

import re

class Test:

  def __init__(self, frase):
    self.__frase = frase

  def checagem_letras(self):
    return re.sub(r'([a-zA-Z])(?=\1)', r'\1x', self.__frase)

print(Test("pizza").checagem_letras()) # pizxza
print(Test("pizzaiollo").checagem_letras()) # pizxzaiolxlo
print(Test("aaa bbbb").checagem_letras()) # axaxa bxbxbxb

The regex uses the character class [a-zA-Z], which matches one letter from a to z (lowercase or uppercase), and since it is in parentheses, it forms a capture group.

Then I use a lookahead (the part between (?= and )), which checks whether something exists ahead. That something is \1, a backreference meaning "the same thing that was captured by capture group 1". In this case, group 1 is the first pair of parentheses, which contains a letter from a to z. That is, the regex checks whether a letter is immediately followed by the same letter. Because the lookahead does not consume the next character, that character can itself be matched afterwards, which is why 'aaa' becomes 'axaxa'.

Then in the substitution I use \1x, that is, the letter that the regex detected as repeated (the backreference \1), followed by an "x".

In this case I am being very strict and I only insert the "x" when it is a repeated letter. But if you want to be more generic, like the first option above, and consider any character (not just letters), you can swap the character class for a dot:

def checagem_letras(self):
    return re.sub(r'(.)(?=\1)', r'\1x', self.__frase)

This works because, in a regex, the dot matches any character (except line breaks).
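To compare the two patterns side by side, here is a small sketch. The constant names and the sample string are made up for illustration.

```python
import re

# Letters only: repeated digits, spaces, etc. are left alone.
LETTERS_ONLY = re.compile(r'([a-zA-Z])(?=\1)')
# Any character except a line break.
ANY_CHAR = re.compile(r'(.)(?=\1)')

sample = "pizza 2200"
letters_result = LETTERS_ONLY.sub(r'\1x', sample)
any_char_result = ANY_CHAR.sub(r'\1x', sample)

print(letters_result)   # pizxza 2200
print(any_char_result)  # pizxza 2x20x0
```

Note that the backreference compares characters exactly, so a pair such as "Aa" is not treated as a repetition by either pattern.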

  • 2

    I think it’s worth mentioning that the approach has to be this way because strings are immutable in Python - so it will always be necessary to build another string, as is done with the variable "result" here. There is no way to insert the letter directly into the string that is already in self.__frase.

  • @jsbueno Good point, I have updated the answer, thank you!

3


You can solve your problem in a rather "pythonic" way using a generator combined with a state machine, see:

def substituir(string):
  ultimo = None
  for atual in string:
    if atual == ultimo:
      yield 'x'
    yield atual
    ultimo = atual

s = "pizza carro passaro"
print(''.join(substituir(s)))

Output:

pizxza carxro pasxsaro

See working on Repl.it
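As a quick check of how this generator behaves on the edge cases raised in the comments on the question (runs of three or more equal letters) and on an empty string, the sketch below repeats the function so that it runs on its own:

```python
def substituir(string):
    ultimo = None  # state: the previously seen character
    for atual in string:
        if atual == ultimo:
            yield 'x'
        yield atual
        ultimo = atual


triple = ''.join(substituir("aaa"))
mixed = ''.join(substituir("aaa bbbb"))
empty = ''.join(substituir(""))

print(triple)  # axaxa
print(mixed)   # axaxa bxbxbxb
print(repr(empty))  # ''
```

Because only the previous character is remembered, every adjacent equal pair gets an "x", and no index bounds check is needed.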

1

You can use the sub function from the re module, which provides regular expression matching operations.

re.sub(pattern, repl, string, count=0, flags=0)

sub() returns the string obtained by replacing the occurrences of pattern in string with the replacement repl. If the pattern is not found, string is returned unchanged.
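A minimal illustration of the behaviour described above, including the optional count argument that limits the number of replacements (the sample text is made up):

```python
import re

text = "aa bb cc"

replaced_all = re.sub(r'a', 'x', text)             # every occurrence is replaced
replaced_one = re.sub(r'a', 'x', text, count=1)    # only the first occurrence
no_match = re.sub(r'z', 'x', text)                 # pattern not found: unchanged

print(replaced_all)  # xx bb cc
print(replaced_one)  # xa bb cc
print(no_match)      # aa bb cc
```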

import re

frase = 'Zicco, hojje tem pizza.'

# Two capture groups are created: group 1 (`\1`) is the whole run of repeated characters,
# and group 2 (`\2`) is a single character of that run. The run is replaced by `\2x\2`, i.e. the character twice with an x in between.
frase_pocessada = re.sub(r'((\w)\2+)',r'\2x\2',frase)

print(frase_pocessada)

Code in Repl.it: https://repl.it/repls/DarkorchidEverlastingScript

  • 1

    Just one detail: if the string is 'aaa', I understood that the result should be axaxa (and my answer has a regex that does this). But since it is not clear whether this case happens, your regex already solves the case of only two repeated letters :-)

  • 1

    @hkotsubo I even asked the OP whether there will be different kinds of input, because my regex treats every repetition as if it were always a double.
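The difference discussed in these comments can be seen in a short sketch. The function names double_only and every_pair are hypothetical, and the lookahead pattern is written with \w here only so that both functions treat the same set of characters.

```python
import re

def double_only(text):
    # Pattern from this answer: a whole run of equal characters collapses to "cxc".
    return re.sub(r'((\w)\2+)', r'\2x\2', text)

def every_pair(text):
    # Lookahead variant: an "x" goes between each adjacent equal pair.
    return re.sub(r'(\w)(?=\1)', r'\1x', text)


print(double_only("pizza"))  # pizxza
print(every_pair("pizza"))   # pizxza
print(double_only("aaa"))    # axa
print(every_pair("aaa"))     # axaxa
```

For inputs with at most two equal letters in a row the two give the same result; they differ only for longer runs.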
