A slightly different approach that avoids depending on similar_text(), or at least makes it optional, by removing dots and spaces conditionally with regular expressions.
Ideally this would use preg_replace_callback(), but two consecutive preg_replace() calls, each with its own regular expression, are cleaner:
$var1 = "M. D. AQUI";
$var2 = "MD AQUI";
$var1 = preg_replace( '/(\w)\.\s+(?!\w{2,})/', '$1', $var1 ); // MD. AQUI
$var1 = preg_replace( '/(\w)\.\s+(?=\w{2,})/', '$1 ', $var1 ); // MD AQUI
if( $var1 != $var2 ) {
    similar_text( $var1, $var2, $percentual );
    if( $percentual > 70 ) {
        // Strings are similar, do something
    }
} else {
    // Strings are equal
}
The first substitution removes the dot and the whitespace after a letter when what follows is not a word of two or more letters.
The second handles the opposite case: if the letter and dot are followed by a longer word, it removes the dot but keeps a single space, so the result doesn't end up "all stuck together".
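For comparison, here is a minimal sketch of the single-pass preg_replace_callback() variant mentioned above. The function name collapseInitials is hypothetical, and the pattern is an assumption rather than the answer's original code: it captures the word characters that follow inside a lookahead and lets the callback decide whether to keep a space.

```php
<?php
// Hypothetical helper, not part of the original answer.
function collapseInitials(string $text): string
{
    return preg_replace_callback(
        // Letter, dot, whitespace; the lookahead captures the next run of word characters.
        '/(\w)\.\s+(?=(\w*))/',
        function (array $m): string {
            $next = $m[2] ?? '';
            // Keep one space before a word of two or more letters, otherwise glue the letters together.
            return strlen($next) >= 2 ? $m[1] . ' ' : $m[1];
        },
        $text
    );
}

var_dump(collapseInitials("M. D. AQUI") === "MD AQUI"); // bool(true)
```

Whether this reads better than the two preg_replace() calls is a matter of taste. The callback keeps the decision in one place but hides it inside a closure.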
This approach has two advantages:
- It handles only one of the strings, which is useful if the second comes from a fixed source that you cannot or should not change
- It does not require similar_text() because, at least in the scenario above, the strings end up equal. If they do not and you want to rely on similar_text() as a fallback, normalizing first reduces the chance of the percentage coming out misleadingly low.
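The two points above can be sketched as a pair of helpers: only the first string is normalized, and similar_text() is consulted only when the strings still differ. The names normalizeInitials and isSameName are hypothetical, and the 70% default threshold is simply the value used in the snippet above.

```php
<?php
// Hypothetical helpers; the names and the default threshold are illustrative.
function normalizeInitials(string $text): string
{
    // Letter + dot + whitespace not followed by a longer word: drop dot and whitespace.
    $text = preg_replace('/(\w)\.\s+(?!\w{2,})/', '$1', $text);
    // Letter + dot + whitespace followed by a longer word: drop the dot, keep one space.
    return preg_replace('/(\w)\.\s+(?=\w{2,})/', '$1 ', $text);
}

function isSameName(string $raw, string $fixed, float $threshold = 70.0): bool
{
    // Only $raw is modified; $fixed is compared exactly as received.
    $clean = normalizeInitials($raw);
    if ($clean === $fixed) {
        return true;
    }
    // Fallback: accept the pair if similar_text() reports a high enough percentage.
    similar_text($clean, $fixed, $percent);
    return $percent > $threshold;
}

var_dump(isSameName("M. D. AQUI", "MD AQUI")); // bool(true)
var_dump(isSameName("M. D. AQUI", "XYZ"));     // bool(false)
```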
In str_replace(". ", "", $var1); remove the space and leave only the dot, like this: str_replace(".", "", $var1);
– Franchesco
I’ve done it that way, but there’s still a problem. For example, if I remove the dots from "M. D. AQUI" the result is "M D AQUI", and comparing that to "MD AQUI" returns false.
– LUCAS WILLIAM
"Faca" (knife) is different from "Faça" (do), but if I take out the cedilla, they look the same. What’s the real logic behind changing the word? What kind of comparison do you need to make? Do you take accents into account?
– Papa Charlie