How to search for similar words or synonyms in PostgreSQL


Viewed 1,250 times

2

I need a search that returns similar words.

I came across phonetic search, which could be used to refine the search I need, but I don't think it is ideal.

For example, my database holds several professionals, but their professional experience is not described with any standardized terms.

When searching for "Supermarket Manager", I would like to get results such as:

  • "Manager of Supermarkets"
  • "Management of Supermarkets"
  • "Director of Supermarket"

Would anyone have any suggestions?

Thank you

  • What you apparently need is semantic search. I don’t remember seeing anything native for it in PostgreSQL, only things like Semantic MediaWiki. There are also more heavyweight solutions that offer this functionality, such as those based on Lucene (Hibernate Search and Solr, for example).

  • Thank you Bruno César, I will research these tools.

  • Okay, just make sure it really is what you need. If you choose something based on Lucene, I can help by posting something to get you started.

  • @Brunocésar, it seems to me that semantic search really is the path to follow. But for that, do I need to prepare my own knowledge base, or can I use some mechanism that already provides the information to compare against in my search? As for PostgreSQL, I found an article that deals with tsvector and full text search, but I didn’t quite understand how to build this structure and feed it the information I have.

  • Solr already ships with dictionaries for some of the more common cases, Raphael. One even exists for the Portuguese language, and you can improve it by creating your own dictionary. Things you should consider are stemming (from linguistic analysis) and the use of synonyms. To build this, you could even work with machine learning. Anyway, there are several ways.

  • In PostgreSQL, to_tsvector is a function that extracts the lexemes from a string (tsvector is the data type it returns). It seems like an interesting place to start, though I have never used it and do not know the details of its behavior. It is one of the functions PostgreSQL provides for full-text search, a technique that differs from ordinary search because it is based on metadata and other things like documents and such. See this; it may help with using full-text search in PostgreSQL. If I can find some time, I will test this in PostgreSQL with the dictionaries I use in Lucene.


1 answer

-1

Rafael,

I believe you will achieve this using Full Text Search. http://www.postgresql.org/docs/current/static/textsearch.html

-- rows whose text contains both words, after Portuguese stemming
select *
from tabela
where to_tsvector('portuguese', campo_text) @@ to_tsquery('portuguese', 'Gerente & Supermercado');
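
The query above only finds rows that contain both words (after stemming), so a title such as "Diretor de Supermercado" would not match. As a minimal sketch, and an assumption on my part rather than part of the original query, alternatives can be listed by hand with the | (OR) operator of to_tsquery. 'Diretor' here is just an illustrative, hand-picked synonym:

```sql
-- Same table and column as the query above.
-- 'Diretor' is a hand-picked synonym, used only for illustration.
-- In a tsquery, & means AND, | means OR, and parentheses group alternatives.
SELECT *
FROM tabela
WHERE to_tsvector('portuguese', campo_text)
      @@ to_tsquery('portuguese', 'Supermercado & (Gerente | Diretor)');
```

For a longer list of synonyms, PostgreSQL's synonym or thesaurus dictionaries may be a better fit than editing every query.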

Depending on the amount of data to process, you can create a column of type tsvector in your table, so that the WHERE clause only needs to call to_tsquery() with the search terms.
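
A minimal sketch of that stored column, assuming PostgreSQL 12 or later (for generated columns). The names campo_tsv and tabela_campo_tsv_idx are hypothetical; tabela and campo_text come from the query above:

```sql
-- Hypothetical column name: campo_tsv. Requires PostgreSQL 12+.
-- The column is recomputed automatically whenever campo_text changes.
ALTER TABLE tabela
    ADD COLUMN campo_tsv tsvector
    GENERATED ALWAYS AS (to_tsvector('portuguese', coalesce(campo_text, ''))) STORED;

-- Hypothetical index name. A GIN index speeds up the @@ match operator.
CREATE INDEX tabela_campo_tsv_idx ON tabela USING GIN (campo_tsv);

-- The WHERE clause now only needs to_tsquery().
SELECT *
FROM tabela
WHERE campo_tsv @@ to_tsquery('portuguese', 'Gerente & Supermercado');
```

On older versions, the column would probably have to be kept up to date with a trigger instead.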

I hope I’ve helped :)
