Regex: negating prepositions in street names

0

I could use some help with a regex. I need to match streets against the post office database, but because of the prepositions many streets are not found.

Example:

Input file = STREET BEACH OF ARMACAO

Post Office Base = RUA PRAIA OF ARMACAO

Output file = Street not found

I need a regex that ignores prepositions and matches everything before and after them.

I managed to arrive at an expression that matches the prepositions: (\sde\s|\sda\s|\sdo\s).

But when I add the negation (?!\sde\s|\sdo\s), nothing is returned.

It does not need to be for a specific language; working in a text editor is enough for what I need. The street search will be done with FME's StringReplacer.

  • What language are you using? Please post the relevant code.

  • @Guilhermecostamilam It is not for any specific language. Working in a text editor is enough for what I need. The search will be done in FME using StringReplacer.

  • 1

    It is important to say which regular expression engine you are using. Otherwise, someone may give an answer that is valid for sed but not for, say, PCRE, or valid for PCRE but not recognized by Eclipse. (PS: these are only examples; I do not know how compatible sed is with PCRE or with Eclipse/Java regex.)

  • Please put the list of the streets in a text table. You can use this website.

2 answers

1

FME has a transformer called FuzzyStringComparer (or FuzzyStringCompareFrom2Datasets).

This transformer calculates a similarity index for two attributes, which may or may not be in the same dataset. After calculating the similarity, you can apply a threshold filter to keep only the records that returned a high similarity.
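To illustrate the idea outside FME, here is a minimal Python sketch. It is not the algorithm FME uses: `difflib.SequenceMatcher` from the standard library stands in for the similarity index, and the helper names, the sample street names, and the 0.85 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


def best_match(street, candidates, threshold=0.85):
    """Return (candidate, score) for the most similar candidate,
    or None if no candidate reaches the threshold (hypothetical value)."""
    scored = [(c, similarity(street, c)) for c in candidates]
    best = max(scored, key=lambda pair: pair[1], default=None)
    if best is not None and best[1] >= threshold:
        return best
    return None


# Hypothetical sample data: the base has the preposition, the input does not.
base = ["RUA PRAIA DE ARMACAO", "RUA DAS FLORES"]
match = best_match("RUA PRAIA ARMACAO", base)
print(match)
```

The threshold plays the role of the filter described above: raising it reduces false matches, lowering it tolerates more differences between the two spellings.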

0

Assuming that each address has only one preposition and we only look for the prepositions "DE", "DA" and "DO", we can simplify the comparisons using the expression:

 /^(.*?)((?:\sD[AEO])(\s.*))?$/

As a consequence, if a preposition exists, the relevant parts are captured separately, in group 1 and group 3 (if it exists), rather than in a single group. If the code then builds a new string by concatenating the two groups, the comparison can be made.
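The concatenation step could look like the following Python sketch. The function name and the sample street names are hypothetical, and the surrounding `/` characters are dropped because they are regex-literal delimiters in languages such as JavaScript, not part of the pattern itself.

```python
import re

# The expression from above, without the / delimiters.
PATTERN = re.compile(r"^(.*?)((?:\sD[AEO])(\s.*))?$")


def strip_preposition(street: str) -> str:
    """Join group 1 and group 3, dropping the first DE/DA/DO found."""
    m = PATTERN.match(street)
    if m is None:
        # Can happen for multi-line input; return the text unchanged.
        return street
    # Group 3 is None when the street has no preposition.
    return m.group(1) + (m.group(3) or "")


# Hypothetical sample data: both spellings reduce to the same key.
key_a = strip_preposition("RUA PRAIA DE ARMACAO")
key_b = strip_preposition("RUA PRAIA ARMACAO")
print(key_a == key_b)
```

Applying the same function to both the input file and the post office base gives keys that can be compared directly. Under the stated assumption of one preposition per address, only the first occurrence is removed, and the pattern is case-sensitive, so lowercase "de" would need a case-insensitive flag.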

  • Thanks for the answer! But it didn't work. I put the regex in the workflow, but it found nothing, either in FME or in the text editor.
