What exactly is the "u" modifier for?

What exactly does the u modifier do in regular expressions used with the preg_ functions in PHP?

Is it recommended to use it whenever processing strings that contain accented characters?

$valor = 'ãẽi ouã';
preg_match_all('/\w+/u', $valor, $matches);

$matches[0]; // array(2) { 'ãẽi', 'ouã' }

1 answer

The /u modifier enables Unicode support: both the pattern and the subject string are treated as UTF-8.

For example, if you want to write a regex that matches Japanese text, you need to use it.

preg_match('/[\x{2460}-\x{2468}]/u', $str);

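Here \x{hex} denotes a character by its hexadecimal Unicode code point (not by its UTF-8 byte sequence). The range above, U+2460 to U+2468, covers the circled digits ① to ⑨.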
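As a minimal sketch of the \x{hex} notation, assuming PHP 7 or later (for the "\u{...}" string escape) and a sample string that is not part of the original answer:

```php
<?php
// Hypothetical sample text: "Step " followed by U+2462 (circled digit three).
$str = "Step \u{2462}";

// With /u, \x{2460}-\x{2468} is a range of Unicode code points.
$found = preg_match('/[\x{2460}-\x{2468}]/u', $str, $m);

// $found is 1 when a character in the range is present;
// $m[0] holds the matched character, encoded as UTF-8 (three bytes here).
var_dump($found, $m[0] === "\u{2462}", strlen($m[0]));
```

Without /u the same pattern does not compile, because code points above \x{FF} are only accepted in UTF-8 mode.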

Running the following regex:

$valor = 'ãẽi ouã';
preg_match_all('/\w+/u', $valor, $matches);

returns:

array (
  0 => 
  array (
    0 => 'ãẽi',
    1 => 'ouã',
  ),
)

Running the following regex (without the modifier):

$valor = 'ãẽi ouã';
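preg_match_all('/\w+/', $valor, $matches);

returns (the multibyte characters are split into individual bytes; the exact result depends on the locale):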

array (
  0 => 
  array (
    0 => '�',
    1 => '��',
    2 => 'i',
    3 => 'ou�',
  ),
)
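The byte-versus-character difference can also be shown in a way that does not depend on the locale. A minimal sketch, assuming PHP 7 or later and using . instead of \w (a substitution made for illustration, not the pattern from the question):

```php
<?php
// "\u{00E3}" is ã (U+00E3), stored as two bytes in UTF-8.
$valor = "\u{00E3}";

// Without /u the subject is treated as raw bytes, so . matches each byte.
$bytes = preg_match_all('/./', $valor, $withoutU);

// With /u the subject is treated as UTF-8, so . matches the whole character.
$chars = preg_match_all('/./u', $valor, $withU);

var_dump($bytes, $chars);
```

Here $bytes counts the two bytes of ã separately, while $chars counts a single character.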

Note that /u does not make an unaccented letter match its accented variants. For example:

$valor = 'ãẽi ouã';
preg_match_all('/a/u', $valor, $matches);

returns:

array (
 0 => 
  array (
  ),
)
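If the goal is to match a letter together with its accented variants, one option is to list the variants explicitly in a character class and keep /u so that each accented letter is handled as a single character. A minimal sketch, assuming PHP 7 or later and precomposed characters; the set of variants chosen here is only an illustration:

```php
<?php
// ã ẽ i, a space, then o u ã, written with escapes so the encoding is unambiguous.
$valor = "\u{00E3}\u{1EBD}i ou\u{00E3}";

// A plain "a" does not match "ã", even with /u.
$plain = preg_match_all('/a/u', $valor, $none);

// "a" plus the code points U+00E0 to U+00E3 (à, á, â, ã) matches both "ã".
$variants = preg_match_all('/[a\x{E0}-\x{E3}]/u', $valor, $matches);

var_dump($plain, $variants, $matches[0]);
```

The first call finds nothing, while the second finds the two occurrences of ã.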


  • Thank you for the answer. I edited part of the second question, because I thought it was important to what I am asking. What I actually meant to ask is whether I always have to use "u" when processing words that have accents (UTF-8 covers a lot of things).

  • @Wallacemaxters OK, I'll add to the answer.

  • @Wallacemaxters I've edited the answer.
