Try it like this:
print_r(array_count_values(str_word_count($texto, 1, "óé")));
Result:
Array (
[Hoje] => 1
[nós] => 1
[vamos] => 1
[falar] => 1
[de] => 2
[PHP] => 2
[uma] => 1
[linguagem] => 1
[criada] => 1
[no] => 1
[é] => 1
[ano] => 1
)
To understand how array_count_values works, see the PHP manual.
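As a quick illustration of what array_count_values does, here is a minimal sketch. The $words array is made-up sample data, not taken from the question. The function returns an array whose keys are the distinct values and whose values are the number of occurrences, and string comparison is case-sensitive:

```php
<?php
// Hypothetical sample data, used only for illustration.
$words = ['de', 'PHP', 'de', 'php'];

// Keys are the distinct values, values are how often each one occurs.
// 'PHP' and 'php' are counted separately because the comparison is case-sensitive.
$counts = array_count_values($words);

print_r($counts);
```

If "PHP" and "php" should count as the same word, the strings would have to be normalized to a single case before counting.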
Edit
A smarter solution (language independent)
With the previous solution it is necessary to specify the entire set of UTF-8 special characters (just as was done with the ó and the é).
The following solution is more complicated, but it eliminates the need to list the special characters.
$text = str_replace(".","", "Hoje nós vamos falar de PHP. PHP é uma linguagem criada no ano de ...");
$namePattern = '/[\s,:?!]+/u';
$wordsArray = preg_split($namePattern, $text, -1, PREG_SPLIT_NO_EMPTY);
$wordsArray2 = array_count_values($wordsArray);
print_r($wordsArray2);
In this solution I use a regular expression to split the text into words and then use array_count_values to count them. The result is:
Array
(
[Hoje] => 1
[nós] => 1
[vamos] => 1
[falar] => 1
[de] => 2
[PHP] => 2
[é] => 1
[uma] => 1
[linguagem] => 1
[criada] => 1
[no] => 1
[ano] => 1
)
This solution also does the job. However, the periods must be removed before splitting the words; otherwise the result will contain both words with a trailing . and words without it. For example:
...
[PHP.] => 1
[PHP] => 1
...
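One possible way to avoid the separate str_replace step is to split on anything that is not a letter or digit, using Unicode property classes. This is a different pattern from the one above, offered only as a sketch. It assumes the source file and the text are UTF-8 encoded and that PCRE was built with Unicode support. The variable names are chosen for illustration.

```php
<?php
// Same sample sentence as above; assumed to be UTF-8 encoded.
$text = "Hoje nós vamos falar de PHP. PHP é uma linguagem criada no ano de ...";

// Split on every run of characters that are neither letters (\p{L})
// nor digits (\p{N}). Periods, commas and spaces all act as separators,
// so "PHP." and "PHP" end up as the same word.
$words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

$counts = array_count_values($words);

print_r($counts);
```

Note that this pattern also splits on hyphens and apostrophes, so hyphenated words would be counted as separate words. Whether that is acceptable depends on the text being processed.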
Counting words is never as simple as it seems. You need to know the string whose words you want to count well before settling on a definitive solution.
What is the best solution? The question remains...
– Jorge B.
Yes, I will try to set up a performance test as soon as I have time to evaluate the best result, but all solutions are very interesting.
– Kazzkiq
In PHP every string is considered an array.
– Ivan Ferrer