SQL query does not work when it has accents

Asked

Viewed 981 times

2

I have in my database a category defined as LAND/AREAS that when making the following SQL query does not return any result of the existing 18.

SELECT * FROM immovable WHERE 1=1 AND CATEGORY IN ('LAND/AREAS')

Already this consultation works:

SELECT * FROM immovable WHERE 1=1 AND CATEGORY IN ('APARTMENT')

Some connoisseur of SQL know why the first statement does not return result? Is it problem with characters?

Updating

The IN is because the values obtained are derived from a array

// se $tipo for um array e for maior que 0
if (is_array($tipo) && count($tipo) > 0) {
   $where .= " CATEGORIA IN ('".implode("','", $tipo)."') AND ";
} else {
   $where .= " CATEGORIA = '{$tipo}' AND ";
}

And is AREAS same, a term coming from a web service whose software is very badly done and that needs to be inserted into the database in this way.

  • 3

    Areas? Wouldn’t be AREAS? And IN doesn’t make much sense in that context, IN serves to locate on a data set. Maybe ( CATEGORIA LIKE '%TERRENO%' AND CATEGORIA LIKE '%REAS%') is better. UPDATE based on the question Edit: Beware, because if the webservice has problem, it may have some invisible character between the à and the REAS. Have you tried to sanitize the data before applying the query?

  • I did not try to sanitize !!! Is that to eat? Sorry to joke :)

  • 3

    @Marcosvinicius may be that you need to convert from one encoding to another, it would be nice to analyze the size of the string to see if it has the number of characters you are seeing on the screen. Most likely if you used a binary search, it would solve your problem. Ex: House = 0b0001, Land =0b0010, Areas = 0b0100, so when searching for Land OR Areas, would make a Binary OR with 0b0110, for example. Mysql has FIND_IN_SET for this: http://dev.mysql.com/doc/refman/5.0/en/set.html

1 answer

3


Literal text accentuation as a database search filter is a problem, as encoding may differ between the application and DBMS - you would have to ensure synchronization between these settings.

Options:

  • Use parameters instead of concatenating the value in the SQL command (in this case build the query dynamically using OR instead of IN).

  • Agree not to use special characters in this field (best applicable if the field is an identifier and not from a register).

In your case, as the text comes from an external service and needs to be both persisted and searched, I suggest "normalize" the text always before persisting and before searching.

Behold:

http://php.net/manual/en/normalizer.normalize.php

In doing so, you protect yourself from future fixes that the service provider will make in your string encoding.

  • I will use (I am already using) the second option to agree not to use special characters in the field or even in the database.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.