As an example I have the following sentence:
texto = O gato subiu no telhado de 10m, um homem jogou-lhe uma pedra e ele caiu de uma altura de 20m.
I want to extract the following information:
(O gato subiu 10m, O gato caiu 20m)
I tried:
(gato).*(subiu|caiu).*(?=m)
but it only returned gato subiu 10m.
I can also use:
>>search_1=re.findall(re.compile('gato.*subiu.*(?=m)'),texto)
>>search_1=[gato subiu 10]
>>search_2=re.findall(re.compile('gato.*caiu.*(?=m)'),texto)
>>search_2=[gato caiu 20]
and then I put the two lists together.
But I still believe there must be a more optimized way to write this in just one line of code.
Note:
Sentences always follow this order: [gato / word / number followed by "m"]
But your sentence does not contain "o gato caiu"; it has "ele caiu". Must the expression be able to understand that "ele" refers to "gato"? – Woss
If the word gato appeared twice it would be easy, but it only appears once. – Mueladavc
If I use '(o gato).+caiu.+(?=m)' it returns what I need, but then I would have to do the same for every verb (subiu, caiu, etc.). – Mueladavc
But first answer our question: how do you expect it to return "gato caiu" when that expression does not appear in the text? Shouldn't it return "ele caiu 20m"? – Woss
The only possibility I can see is https://ideone.com/rqHi3v. But if you really need it to return "gato caiu", a little more code is needed. – Woss
@Andersoncarloswoss if I write a pattern '(o gato).+caiu.+(?=m)' it returns o gato caiu 20 m. I would then have to write another pattern, '(o gato).+subiu.+(?=m)', which returns o gato subiu 10m. That gives me my result, but it takes two searches, and I wanted to know whether there is an optimized way that does not require a second pattern. – Mueladavc
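One way to avoid a separate search per verb might look like the sketch below. It is not the code behind the ideone link. It assumes that the subject (gato) appears once, that every later verb refers to it (including the one after "ele"), and that each verb is followed by its own number before the next verb appears. The function name extrair and the verb list are hypothetical, chosen only for illustration.

```python
import re

texto = ("O gato subiu no telhado de 10m, um homem jogou-lhe uma pedra "
         "e ele caiu de uma altura de 20m.")

# One pattern for every verb: the verb, then the fewest possible
# non-digit characters, then the number that precedes "m".
PADRAO = re.compile(r"\b(subiu|caiu)\b\D*?(\d+)m\b")


def extrair(texto, sujeito="gato"):
    """Return 'O gato <verb> <number>m' for each verb found after the subject.

    Assumption: every verb after the subject refers to it, even when the
    text uses a pronoun such as "ele".
    """
    achado = re.search(rf"\b{re.escape(sujeito)}\b", texto)
    if achado is None:
        return []
    # Start scanning right after the subject, so verbs before it are ignored.
    pares = PADRAO.findall(texto, achado.end())
    return [f"O {sujeito} {verbo} {numero}m" for verbo, numero in pares]


print(extrair(texto))
# ['O gato subiu 10m', 'O gato caiu 20m']
```

The lazy \D*? is what keeps each verb paired with the nearest following number; the greedy .* in the original attempts runs to the last "m" in the sentence, which is why only one match comes back. Adding another verb only means extending the alternation (subiu|caiu), not writing a new search.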