Text comparison

Asked 10 years, 8 months ago

I have a question about comparing variables.

I receive a variable value in a string and I need to compare it to another string.

For example:

$var1 = "M. D. AQUI";
$var2 = "MD AQUI"; // WITH OR WITHOUT PUNCTUATION, WITH OR WITHOUT SPACES.

I tried a replace on the variable, swapping the periods for nothing, but the spaces remain. I can't remove the spaces as well, because then the text would all run together.

$result = str_replace(". ", "", $var1); // result: MDAQUI / with this I can't do the similarity comparison.

Could someone help with the code or indicate a study tool?

In str_replace(". ", "", $var1); drop the space from the search string and leave only the period, like this: str_replace(".", "", $var1);

– Franchesco

2014/11/21 at 11:49
I've already done it that way, but there's still a problem. For example: if I remove the periods from M. D. AQUI, the result is M D AQUI, and comparing that to MD AQUI returns false.

– LUCAS WILLIAM

2014/11/21 at 11:53
"Faca" (knife) is different from "Faça" (do), but if I take out the cedilla they look the same. What is the real logic behind changing the word? What kind of comparison do you need to make? Do you take accents into account?

– Papa Charlie

2014/11/21 at 21:14

2 answers

by Jorge B. • **11,427** points · Answer 1 · 2014-11-21T11:51:57+00:00

What you have to do is remove all spaces and periods from BOTH strings:

$var1 = str_replace(".", "", $var1);
$var1 = str_replace(" ", "", $var1);
$var2 = str_replace(".", "", $var2);
$var2 = str_replace(" ", "", $var2);

var_dump($var1 == $var2); // bool(true), both are now "MDAQUI"
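
The four calls above can also be folded into one, since str_replace() accepts an array of search strings. A minimal sketch, assuming a helper named strip_dots_and_spaces() (a hypothetical name, not part of the answer):

```php
<?php
// Hypothetical helper (not from the answer): strips periods and spaces in one call.
function strip_dots_and_spaces(string $text): string
{
    // Every string in the search array is replaced by the empty string.
    return str_replace([".", " "], "", $text);
}

$var1 = "M. D. AQUI";
$var2 = "MD AQUI";

// Compare the stripped copies; the original variables are left untouched.
var_dump(strip_dots_and_spaces($var1) === strip_dots_and_spaces($var2)); // bool(true)
```

Comparing stripped copies instead of overwriting $var1 and $var2 keeps the original text available for display.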

If you want to compare by similarity, as you mention in your code comment, you can use the similar_text function:

$var1 = strtoupper("M. D. AQUI");
$var2 = strtoupper("MD AQUI");

similar_text($var1, $var2, $percentagemDeSemelhanca);
echo $percentagemDeSemelhanca;

// result => 82.3529411765

That gives you the percentage of similarity between the two strings. I used strtoupper so that differences in letter case do not lower the score.
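
As a small illustration of how that call behaves: similar_text() returns the number of matching characters and writes the percentage into its third argument, which is passed by reference. The variable names below are illustrative, not from the answer:

```php
<?php
// Illustrative inputs in mixed case; strtoupper() makes the comparison case-insensitive.
$a = strtoupper("m. d. aqui");
$b = strtoupper("MD Aqui");

// Return value: count of matching characters. $percent is filled in by reference.
$matching = similar_text($a, $b, $percent);

echo $matching, " matching characters, ", round($percent, 2), "%\n";
// 7 matching characters, 82.35%
```

The percentage is the matching count times two, divided by the combined length of both strings: 7 * 2 / (10 + 7) * 100.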

Phpfiddle example

by Bruno Augusto • **8,661** points · Answer 2 · 2014-11-21T13:14:01+00:00

A slightly different approach that lets you depend less on similar_text(), making its use conditional, by removing periods and spaces selectively with regular expressions.

Ideally this approach would use preg_replace_callback(), but with two consecutive preg_replace() calls the regular expressions are cleaner:

$var1 = "M. D. AQUI";
$var2 = "MD AQUI";

$var1 = preg_replace( '/(\w)\.\s+(?!\w{2,})/', '$1', $var1 ); // MD. AQUI

$var1 = preg_replace( '/(\w)\.\s+(?=\w{2,})/', '$1 ', $var1 ); // MD AQUI

if( $var1 != $var2 ) {

    similar_text( $var1, $var2, $percentual );

    if( $percentual > 70 ) {

        // Similar strings, do something
    }

} else {

    // Identical strings
}

The first substitution removes the period and the following whitespace after a single letter when that letter is not followed by a word of two or more characters.

The second handles the opposite case. If the letter and period are followed by a longer word, it removes the period but keeps a single space.

So it doesn’t get "all stuck together".
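
For comparison, here is a rough sketch of the single-pass preg_replace_callback() variant mentioned earlier. The function name collapse_initials() and the pattern are assumptions made for illustration, not the answer's code, and the sketch is only checked against inputs like the one in the question:

```php
<?php
// Hypothetical helper name; the pattern is a sketch, not the answer's code.
function collapse_initials(string $text): string
{
    return preg_replace_callback(
        // Group 1: the initial. Group 2 (inside the lookahead): the following
        // word, captured only when it has two or more word characters.
        '/(\w)\.\s+(?=(\w{2,})?)/',
        function (array $m): string {
            // Group 2 is absent or empty when another initial follows.
            $followedByWord = ($m[2] ?? '') !== '';
            // Keep one space before a longer word, none before another initial.
            return $followedByWord ? $m[1] . ' ' : $m[1];
        },
        $text
    );
}

echo collapse_initials("M. D. AQUI"), "\n"; // MD AQUI
```

The callback makes the decision that the two separate patterns encode in their lookaheads, at the cost of a denser expression.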

This approach has the advantages:

It handles only one of the strings, which is useful if the second comes from a fixed source that you cannot or should not change.
It does not require similar_text() because, at least in the scenario above, the strings end up equal. If they do not, and you want to rely on similar_text() as a fallback, normalizing first reduces the chance of a misleading result caused by a very low score.
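
Both points can be wrapped in one function. This is a sketch under stated assumptions: the name compare_labels(), the three return values, and the default threshold (70, borrowed from the example above) are illustrative choices.

```php
<?php
// Hypothetical wrapper; the name, return values, and default threshold are illustrative.
function compare_labels(string $input, string $reference, float $threshold = 70.0): string
{
    // Normalize only the incoming string; the reference stays untouched.
    $input = preg_replace('/(\w)\.\s+(?!\w{2,})/', '$1', $input);
    $input = preg_replace('/(\w)\.\s+(?=\w{2,})/', '$1 ', $input);

    if ($input === $reference) {
        return 'equal';
    }

    // Fallback: similarity percentage, written into $percent by reference.
    similar_text($input, $reference, $percent);

    return $percent > $threshold ? 'similar' : 'different';
}

echo compare_labels("M. D. AQUI", "MD AQUI"), "\n"; // equal
echo compare_labels("M. D. AQUY", "MD AQUI"), "\n"; // similar
echo compare_labels("XYZ", "MD AQUI"), "\n";        // different
```

The right threshold depends on the data, so it is worth checking against real samples before relying on the 'similar' result.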