Regular expression for a collection of words with wildcards

I am stuck in the following situation: I have a dictionary of words in a MongoDB database, where each word is stored in its own document along with additional information about it.

I need a regular expression that can find each document by its word, bearing in mind that some words in the database end with a wildcard (*). A word with a wildcard must be found whenever the search string starts with the part before the wildcard. The wildcard works as a kind of regular expression inside the database, meant to simplify registering words: a document only needs to hold the required part of the word, followed by the wildcard, to be returned when the search string "matches" it. Registering every possible word would be impractical and also completely unnecessary, since many words share exactly the same additional information.

To make the situation clearer, here are some examples:

  • If the word in the database is "pergunt*", the document should be returned whenever the search string starts with "pergunt" (pergunta, perguntar, perguntou, etc.)
  • If the word in the database is "amig*", the document should be returned whenever the search string starts with "amig" (amigo, amiga, amigável, etc.)
  • If the word in the database is "amor", the document should be returned only when the search string is exactly "amor".

I need a single regular expression that fits all of the above (wildcard at the end, no wildcard).

If possible, please also show a solution for cases where the wildcard can appear at the beginning of the word as well (*word*), again with a single regular expression for all situations (wildcard at the beginning, wildcard at the end, wildcard at both ends, no wildcard).
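
To pin down the expected behaviour of the four situations, here is a minimal Python sketch that uses plain string tests rather than a regular expression. The function name `matches` is hypothetical, and the sketch assumes that "*" only ever appears as the first and/or last character of a stored word and that matching is case-sensitive.

```python
def matches(stored_word: str, search: str) -> bool:
    """Return True if `search` satisfies `stored_word`.

    Assumption: '*' appears only at the start and/or end of `stored_word`.
    """
    leading = stored_word.startswith("*")
    # A lone "*" is counted as a leading wildcard only.
    trailing = stored_word.endswith("*") and len(stored_word) > 1
    core = stored_word.strip("*")  # the literal part of the word

    if leading and trailing:
        return core in search            # *word*  -> contains
    if trailing:
        return search.startswith(core)   # word*   -> prefix
    if leading:
        return search.endswith(core)     # *word   -> suffix
    return search == core                # word    -> exact match


print(matches("pergunt*", "perguntar"))
print(matches("amor", "amoroso"))
```

Any regular expression proposed for the problem should agree with this function on every pair of stored word and search string.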

Thanks in advance. It will help me a lot.

1 answer


Do you want the user to send "pergunta" and have it turned into a regular expression that validates "pergunt*"? That makes no sense: to build a regular expression you need to know the format of what you are validating, and "pergunta" by itself carries no information about that.

If a regular expression is to be part of this process, it should be the one stored in the database. For example, if the expression /^pergunt/i were stored, the user could send "pergunta", "perguntar" or "perguntou", and all of these words would be valid.

/^amor$/i would validate "amor" but not a longer word such as "amoroso", and /palavra/i validates any string containing "palavra", whether at the beginning, middle or end.
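
As a quick check of how these three kinds of expression behave, here is a small Python sketch using the standard `re` module, with `re.IGNORECASE` standing in for the /i flag. The variable names and the `is_match` helper are only illustrative.

```python
import re

# The three expressions discussed above, written for Python's `re` module.
starts_with = re.compile(r"^pergunt", re.IGNORECASE)  # prefix match
exact = re.compile(r"^amor$", re.IGNORECASE)          # whole-string match
contains = re.compile(r"palavra", re.IGNORECASE)      # match anywhere


def is_match(pattern: re.Pattern, text: str) -> bool:
    """True if `pattern` is found in `text` (anchors decide where)."""
    return pattern.search(text) is not None


for pattern, text in [
    (starts_with, "Perguntar"),
    (exact, "amor"),
    (exact, "amoroso"),
    (contains, "uma palavra qualquer"),
]:
    print(pattern.pattern, repr(text), is_match(pattern, text))
```

The anchors do all the work here: `^` alone gives a prefix match, `^` together with `$` gives an exact match, and no anchor at all gives a "contains" match.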

  • Thank you for answering. I'm using an existing database that already holds the words in this format, and I thought of this solution to avoid having to update them. I'm only familiar with the basics of regular expressions, so I thought it might be possible to solve the problem the way I asked. In any case, your answer helped a lot. We will evaluate updating the documents in the database so that the words are stored as regular expressions.
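
If the documents do end up being updated, the existing wildcard words could be translated mechanically into regular expressions. Below is a rough Python sketch of that translation. The function name `wildcard_to_regex` is hypothetical, it assumes "*" appears only at the start and/or end of a stored word, and how the resulting pattern is stored and queried in MongoDB is left out.

```python
import re


def wildcard_to_regex(stored_word: str) -> str:
    """Translate a stored word such as 'pergunt*' into a regex source string.

    Assumption: '*' appears only at the start and/or end of the word.
    """
    leading = stored_word.startswith("*")
    trailing = stored_word.endswith("*") and len(stored_word) > 1
    # Escape the literal part so characters like '.' are not treated as regex syntax.
    core = re.escape(stored_word.strip("*"))
    prefix = "" if leading else "^"   # no leading wildcard -> anchor at start
    suffix = "" if trailing else "$"  # no trailing wildcard -> anchor at end
    return f"{prefix}{core}{suffix}"


for word in ["pergunt*", "amig*", "amor", "*palavra*"]:
    print(word, "->", wildcard_to_regex(word))
```

A pattern produced this way can then be tested against the search string, for example with `re.search(wildcard_to_regex("amig*"), "amigo", re.IGNORECASE)`.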
