Find substring with REGEX

I am trying to convert every N_(...) part to uppercase, and I thought a regex would be the most appropriate tool. The problem is that I am struggling even to capture the N_(...) part; converting it to uppercase afterwards I can manage on my own:

My file:

abafada,abafado.A+H_PRE+pol=no+N_abafado:fs abafadas,abafado.A+H_PRE+pol=no+N_abafado:fp abafado,.A+H_PRE+pol=no+N_abafado:ms abafados,abafado.A+H_PRE+pol=no+N_abafado:mp abafante,.A+H_PRE+pol=no:+N_abafante:ms

Script:

import re

with open("word_upper.txt", "r") as f:
    text = f.read()

    pattern = re.findall(r'N_(\w+)', text)
    upper_word = pattern.group(1)

    print(upper_word)

Output:

Traceback (most recent call last):
  File "teste_lemme.py", line 14, in <module>
    upper_word = pattern.group(1)
AttributeError: 'list' object has no attribute 'group'

Desired output:

abafado abafado abafado abafado abafante

Then I thought about simply converting this list to uppercase (using the upper method) and then substituting the results back with the replace method. So I would have:

abafada,abafado.A+H_PRE+pol=no+N_ABAFADO:fs

What do you think?

1 answer

You can use the re.sub function to make a replacement based on a regular expression. If you pass a callable as the replacement value, it is called with each match object, and the matched text is replaced by the function's return value.

Something like:

import re

with open('words_upper.txt') as stream:
    text = stream.read()
    # The lambda receives each match object and returns its text in uppercase
    edited = re.sub(r'(N_\w+)', lambda match: match.group(0).upper(), text)

Thus, edited would take the value:

abafada,abafado.A+H_PRE+pol=no+N_ABAFADO:fs abafadas,abafado.A+H_PRE+pol=no+N_ABAFADO:fp abafado,.A+H_PRE+pol=no+N_ABAFADO:ms abafados,abafado.A+H_PRE+pol=no+N_ABAFADO:mp abafante,.A+H_PRE+pol=no:+N_ABAFANTE:ms
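As for the AttributeError in the question: re.findall returns a list of strings (the captured groups, when the pattern has a single group), not match objects, so there is no group method to call on it. Below is a minimal sketch of the alternatives, assuming a short in-memory string in place of the file. The `sample` variable and its two entries are illustrative only, and the last call is a variation on the re.sub above that uppercases just the captured word.

```python
import re

# Illustrative sample standing in for the file contents (an assumption)
sample = (
    "abafada,abafado.A+H_PRE+pol=no+N_abafado:fs\n"
    "abafante,.A+H_PRE+pol=no:+N_abafante:ms\n"
)

# With one capturing group, findall returns a list of the captured strings
words = re.findall(r'N_(\w+)', sample)
upper_words = [word.upper() for word in words]
print(upper_words)

# finditer yields match objects, which do have a group() method
for match in re.finditer(r'N_(\w+)', sample):
    print(match.group(1).upper())

# Uppercase only the captured word, keeping the "N_" prefix as written
edited = re.sub(r'(N_)(\w+)', lambda m: m.group(1) + m.group(2).upper(), sample)
print(edited)
```

Since "N_" is already uppercase, uppercasing the whole match with group(0) gives the same result here; splitting the pattern into two groups only matters if the prefix could contain lowercase letters.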
