Search for similar names or with typo - Mysql

Asked

Viewed 71 times

2

I am looking to perform a search in a table containing the name of the products. I created a FULLTEXT to optimize the performance of the queries.

SELECT  * FROM tabela WHERE MATCH (campo) AGAINST ('Picant*Defumad*' IN BOOLEAN MODE);

Upshot:

1 LOMBO DEFUMADO PICANTE FATIADO PRIETO BANDEJA

2 CAFÉ PICANTE

3 BISC TAP SNACK CURCUMA E PAPPRICA DEFUMADA 25G

In this search I already obtained an improvement of not being considered the order that the words were typed, as in the first result presented. But I also need that if the user type "PICATE", it seeks records that are similar as in the case of PICANTE .

I have tried to use other forms of HOW as in the case of IN NATURAL LANGUAGE MODE, but did not solve the problem.

  • "But I also need the case or user to type "SPICY"" I didn’t quite understand the doubt, it would be like mouthing "PICANT*", but in this case "PICANTE*"

  • In this case, I need to search for similar words, : "PICATE", where the N, he searches every record I have "SPICY".

  • 2

    there is a resource called SOUNDEX, look into it

  • 1

    The SOUNDEX function does not fit my search form, because it uses a phonetic algorithm, and in this case I have a typo that can compromise the phonetics of the text. .

  • in this case there is nothing ready to help you, you will need to create dictionaries of similar words and move to the query

  • The problem that the table even contains more than 1 million record. In the application I use PHP, and I was able to do a search that already returns to me what I need, but I allocate all the values of the BD in memory so I can handle and it consumes a lot of machine resource.

  • I’m seeing a way to replicate PHP code for a Mysql function.

  • No Oracle já fiz por Jaro-Winkler e edit_distance combinadas , achei para Mysql https://gist.github.com/TheHiddenHaku/7229861 doc Oracle https://docs.oracle.com/cd/E11882_01/appdev.112/e40758/u_match.htm#ARPLS352

  • pq do not use REGEX?

Show 4 more comments

1 answer

0

One option is to use REGEXP. Following replicable example:

CREATE DATABASE supermarket;
use supermarket;
CREATE TABLE products (id INT, name VARCHAR(100));
INSERT INTO products (id,name)
VALUES (1,"LOMBO DEFUMADO PICANTE FATIADO PRIETO BANDEJA");
INSERT INTO products (id,name)
VALUES (2,"CAFÉ MOGIANA 250g");
INSERT INTO products (id,name)
VALUES (3,"CAFÉ PICANTE");
INSERT INTO products (id,name)
VALUES (4,"CATCH-UP PICATE");

SELECT *
FROM supermarket.products
WHERE name REGEXP 'PICA(N|)TE'

Returns:

+------+-----------------------------------------------+
| id   | name                                          |
+------+-----------------------------------------------+
|    1 | LOMBO DEFUMADO PICANTE FATIADO PRIETO BANDEJA |
|    3 | CAFÉ PICANTE                                  |
|    4 | CATCH-UP PICATE                               |
+------+-----------------------------------------------+

In this example I used a very simple regex, using the information you gave. But this regular expression can become more complex to cover the cases you want.

  • If the user slips his finger and writes PICAMTE will have to create another rule just to cover this error and so should be made a rule for each possible typo which makes this approach unfeasible.

  • then it would not be enough to do PICA(M|N|)TE?

  • ë a search if type "LONBO" will have to create the rule of loin, but also can it type "LONBOO", "LAMBO", "LAMVO", for each of the infinite words that the user can type will have to create endless rules of error handling.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.