The /u modifier enables Unicode support: the pattern and the subject string are both treated as UTF-8.
For example, if you want to write a regex that matches words in Japanese, you need to use it.
preg_match('/[\x{2460}-\x{2468}]/u', $str);
Where \x{hex} is the character's Unicode code point written in hexadecimal (not its UTF-8 byte sequence).
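As a quick check of the range above, here is a small sketch that tests a few strings against it. The sample strings are my own and are written with \u{...} escapes (PHP 7+) so the result does not depend on how the file is saved.

```php
<?php
// Range from the answer: U+2460 through U+2468 (circled digits 1 to 9).
$pattern = '/[\x{2460}-\x{2468}]/u';

// Illustrative samples, not from the original answer.
$samples = [
    "\u{2460}",       // first character of the range
    "item \u{2468}",  // last character of the range, inside other text
    "\u{2469}",       // one code point past the range
    'abc',            // plain ASCII
];

$results = [];
foreach ($samples as $s) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    $results[] = preg_match($pattern, $s);
}

var_dump($results);
```

Only the first two samples should match, since the third sits just outside the range.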
Running the following regex:
$valor = 'ãẽi ouã';
preg_match_all('/\w+/u', $valor, $matches);
returns:
array (
  0 =>
  array (
    0 => 'ãẽi',
    1 => 'ouã',
  ),
)
Running the following regex (without the modifier):
$valor = 'ãẽi ouã';
preg_match_all('/\w+/', $valor, $matches);
returns:
array (
  0 =>
  array (
    0 => '�',
    1 => '��',
    2 => 'i',
    3 => 'ou�',
  ),
)
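The broken fragments appear because, without /u, the pattern works on individual bytes rather than whole characters. A minimal sketch of that difference, assuming a UTF-8 string (built here with \u{...} escapes, PHP 7+, so it does not depend on the file encoding), counts what "." matches in each mode:

```php
<?php
// Same text as in the answer, 'ãẽi ouã', written as explicit code points.
$valor = "\u{E3}\u{1EBD}i ou\u{E3}";

// Without /u, "." consumes one byte per match.
$bytes = preg_match_all('/./', $valor);

// With /u, "." consumes one whole UTF-8 character per match.
$chars = preg_match_all('/./u', $valor);

echo "bytes: $bytes, chars: $chars\n";
```

The two counts differ because each accented letter here takes two or three bytes in UTF-8.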
The modifier does not make a plain letter match its accented forms, so it should not be relied on to pick up accented vowels. For example:
$valor = 'ãẽi ouã';
preg_match_all('/a/u', $valor, $matches);
returns:
array (
  0 =>
  array (
  ),
)
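If the goal is to match a letter together with its accented variants, one option is to list the variants explicitly in a character class; /u is what lets each multibyte character count as a single class member. This is only a sketch, and the set of variants below is chosen for illustration, not taken from the answer:

```php
<?php
// 'ãẽi ouã' written as explicit code points (PHP 7+ escapes).
$valor = "\u{E3}\u{1EBD}i ou\u{E3}";

// "a" plus a few accented forms: ã (E3), á (E1), à (E0), â (E2).
// Illustrative list, not exhaustive.
preg_match_all('/[a\x{E3}\x{E1}\x{E0}\x{E2}]/u', $valor, $matches);

print_r($matches[0]);
```

Here both occurrences of ã are found, while a bare /a/u finds nothing in the same string.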
Testing site: Link
Documentation: Link
Thank you for the answer. I edited part of the second question because I thought it was important to what I was asking. What I actually meant was: do I always have to use "u" when matching words that have accents? (UTF-8 covers a lot of things.)
– Wallace Maxters
@Wallacemaxters ok, I'll add a complement to the answer
– Ricardo
@Wallacemaxters I've edited the answer
– Ricardo