Problem with Regex in PHP

2

I have the following function, which is used to remove special characters from a string:

function removeSpecialChars($string){
  //List (Array) of special chars
  $pattern = array("/(á|à|ã|â|ä)/","/(Á|À|Ã|Â|Ä)/","/(é|è|ê|ë)/","/(É|È|Ê|Ë)/","/(í|ì|î|ï)/","/(Í|Ì|Î|Ï)/","/(ó|ò|õ|ô|ö)/","/(Ó|Ò|Õ|Ô|Ö)/","/(ú|ù|û|ü)/","/(Ú|Ù|Û|Ü)/","/(ñ)/","/(Ñ)/","/(ç)/","/(Ç)/","/(\'|\"|\^|\~|\;|\:|\°|\?|\&|\*|\+|\@|\#|\$|\%|\!|\\|\/|\(|\)|\||\=|\.|\,)/");

  //List (Array) of letters
  $replacement = array('a', 'A', 'e', 'E', 'i', 'I', 'o', 'O', 'u', 'U', 'n', 'N', 'c', 'C', '');

  return preg_replace($pattern , $replacement, $string);
}

It works well; the only problem is that it cannot replace the slash, whether backslash (\) or forward slash (/), or the dollar sign ($). Yet if I paste the same regular expression into a regex-testing site, it works normally.

Does anyone know why it doesn’t work in PHP?

2 answers

3


This happens because inside a PHP string literal (between the quotes), a backslash must itself be escaped and written as \\ (as described in the documentation).

That is why \/ should be written as \\/, \\ should be written as \\\\, and so on. Also, the quote character itself (") must be escaped and written as \", so inside the string the regex snippet \" is written as \\\".
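To make the effect visible, here is a minimal sketch that prints what the regex engine actually receives. The variable names ($asAsked, $asAnswered, $subject) and the reduced three-alternative pattern are illustrative assumptions, not part of the original function.

```php
<?php
// The same regex fragment written two ways inside a double-quoted PHP string.
$asAsked    = "/(\$|\\|\/)/";       // escapes as in the question
$asAnswered = "/(\\$|\\\\|\\/)/";   // escapes doubled as described above

// What the regex engine receives after PHP has processed the string literal:
echo $asAsked, "\n";    // /($|\|\/)/    -> "$" is an anchor, "\|\/" is the literal text |/
echo $asAnswered, "\n"; // /(\$|\\|\/)/  -> literal $, literal \, literal /

$subject = 'a/b\\c$d';  // single-quoted: the text a/b\c$d

echo preg_replace($asAsked, '', $subject), "\n";    // a/b\c$d (nothing removed)
echo preg_replace($asAnswered, '', $subject), "\n"; // abcd
```

In the first pattern PHP consumes one level of backslashes before PCRE sees the text, so the dollar sign becomes an end-of-string anchor and the lone backslash ends up escaping the following pipe.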

Then your expression would look like this:

function removeSpecialChars($string){
  //List (Array) of special chars
  $pattern = array("/(á|à|ã|â|ä)/","/(Á|À|Ã|Â|Ä)/","/(é|è|ê|ë)/","/(É|È|Ê|Ë)/","/(í|ì|î|ï)/","/(Í|Ì|Î|Ï)/","/(ó|ò|õ|ô|ö)/","/(Ó|Ò|Õ|Ô|Ö)/","/(ú|ù|û|ü)/","/(Ú|Ù|Û|Ü)/","/(ñ)/","/(Ñ)/","/(ç)/","/(Ç)/","/(\\'|\\\"|\\^|\\~|\\;|\\:|\\°|\\?|\\&|\\*|\\+|\\@|\\#|\\$|\\%|\\!|\\\\|\\/|\\(|\\)|\\||\\=|\\.|\\,)/");

  //List (Array) of letters
  $replacement = array('a', 'A', 'e', 'E', 'i', 'I', 'o', 'O', 'u', 'U', 'n', 'N', 'c', 'C', '');

  return preg_replace($pattern , $replacement, $string);
}

echo removeSpecialChars("áçõ/\\?&"); // aco

On the website where you tested it, there was no need to escape the \ because the regex there is not inside a PHP string.


Another option is to use character classes (delimited by []), taking care to add the u flag, because without it accented characters may not work properly inside a character class:

function removeSpecialChars($string){
  // The last pattern also gets the u flag, because ° is a multibyte character in UTF-8
  $pattern = array("/[áàãâä]/u","/[ÁÀÃÂÄ]/u","/[éèêë]/u","/[ÉÈÊË]/u","/[íìîï]/u","/[ÍÌÎÏ]/u","/[óòõôö]/u","/[ÓÒÕÔÖ]/u","/[úùûü]/u","/[ÚÙÛÜ]/u","/ñ/u","/Ñ/u","/ç/u","/Ç/u", '/[\'"\^~;:°?&*+@#$%!\(|\)=.,\/\\\\]/u');

  $replacement = array('a', 'A', 'e', 'E', 'i', 'I', 'o', 'O', 'u', 'U', 'n', 'N', 'c', 'C', '');

  return preg_replace($pattern , $replacement, $string);
}


echo removeSpecialChars("áçõ/\\?&");
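To see why the u flag matters inside a character class, the sketch below runs the same class with and without it. It assumes the source file is saved as UTF-8 (precomposed characters); the variable names are illustrative only.

```php
<?php
// Assumes this file is saved as UTF-8, so 'ã' is the two bytes c3 a3.
$input = 'ã';

// Without /u the class [áà] is a set of single bytes (c3, a1, a0).
// The lead byte c3 of 'ã' matches and is replaced; the trailing byte a3 is left behind.
$byteMode = preg_replace('/[áà]/', 'a', $input);

// With /u the class holds the code points U+00E1 and U+00E0, so 'ã' is not matched.
$utf8Mode = preg_replace('/[áà]/u', 'a', $input);

echo bin2hex($byteMode), "\n"; // 61a3 (an 'a' followed by a stray byte: invalid UTF-8)
echo bin2hex($utf8Mode), "\n"; // c3a3 ('ã' untouched)
```

Alternations such as (á|à) do not have this problem, because each alternative is matched as a whole byte sequence even without the flag.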

Another option is to use the Normalizer class to remove the accents. To use it, you must enable the intl extension:

function removeSpecialChars($string){
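  // Decompose accented letters (FORM_D), then strip the combining marks (\p{M})
  $semAcentos = preg_replace('/\p{M}/u', '', Normalizer::normalize($string, Normalizer::FORM_D));
  // Remove the special characters; the u flag is needed because ° is multibyte in UTF-8
  return preg_replace('/[\'"\^~;:°?&*+@#$%!\(|\)=.,\/\\\\]/u' , '', $semAcentos);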
}

echo removeSpecialChars("áçõ/\\?&");

This way all the accents are removed in a single step, and a regex is only needed at the end to remove the special characters.
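As a rough illustration of what the decomposition step does, the sketch below prints the bytes produced by FORM_D for a single character. The helper name stripAccents is hypothetical, and the example assumes a UTF-8 source file and a loaded intl extension (hence the class_exists guard).

```php
<?php
// Assumes a UTF-8 source file and the intl extension (which provides Normalizer).

// Hypothetical helper: decompose to NFD, then drop every combining mark (\p{M}).
function stripAccents(string $text): string
{
    $decomposed = Normalizer::normalize($text, Normalizer::FORM_D);
    return preg_replace('/\p{M}/u', '', $decomposed);
}

if (class_exists('Normalizer')) {
    // "\u{E7}" is ç (bytes c3 a7). FORM_D splits it into 'c' (63)
    // followed by U+0327 COMBINING CEDILLA (cc a7).
    echo bin2hex(Normalizer::normalize("\u{E7}", Normalizer::FORM_D)), "\n"; // 63cca7
    echo stripAccents('áçõ Ñ'), "\n"; // aco N
}
```

Because the base letter and its accent become separate code points, a single \p{M} pattern can replace the fourteen per-letter patterns of the earlier versions.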

3

The simplest approach is to put everything inside brackets [] (a character class, also called a character set), escaping the slashes and, of course, the double quotes that delimit the string:

"/['\"^~;:°?&*+@#$%!\/()|=.,\\\]/"
    ↑               ↑       ↑↑
 escape          escape   escape

I also noticed that other characters are missing, such as the brackets [] and the braces {}. If you are going to include them, you need to escape the brackets as well:

"/[{}\[\]'\"^~;:°?&*+@#$%!\/()|=.,\\\]/"
     ↑ ↑  ↑               ↑       ↑↑
    escapes            escape   escape
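A quick way to check the extended class is sketched below. The function name stripPunctuation is hypothetical, and the u flag is an addition not present in the pattern above; it is included on the assumption that the file is UTF-8, where ° is a multibyte character.

```php
<?php
// Hypothetical helper name; the class is the one shown above plus the /u flag.
function stripPunctuation(string $text): string
{
    // Double-quoted PHP string: \" and \\ are consumed by PHP,
    // the remaining backslashes (\[ \] \/) reach the regex engine unchanged.
    $pattern = "/[{}\[\]'\"^~;:°?&*+@#$%!\/()|=.,\\\]/u";
    return preg_replace($pattern, '', $text);
}

// Single-quoted subject: the text a{b}[c]/\$d
echo stripPunctuation('a{b}[c]/\\$d'), "\n"; // abcd
```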
  • Good idea too, it will simplify the code, thank you.

  • @Gabrielqueirozschicora I was going to suggest this, but when I tested it with preg_replace it didn’t work. Maybe I’m too rusty with PHP, I don’t know if I missed something :-/

  • Either way, it is a valid tip for simplifying the regex, so +1 :-)
