Compare using String.Contains() disregarding accents and case


Viewed 866 times

3

I know there is already a question about this (I even reused its title to draw attention), but that one is about C#. I'm having the same problem in Java. Code:

// historico and searchC are ArrayLists. To handle case I used
// toUpperCase() so both sides match, but when accents are involved nothing
// is returned, even when nome is exactly equal to the value in the array row
int i = 0;
    for (String[] linha : historico){
        if(linha[2].contains(nome.toUpperCase())) {
            searchC.add(historico.get(i));

        }
        i++;
    }

If the variable nome contains any kind of accent, nothing is added to searchC. Does anyone know how to do the comparison ignoring accents so that it matches directly?

2 answers

3


Adapting this answer from SOen (the English Stack Overflow), you can use the Normalizer class for this:

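import java.text.Normalizer;

// Returns true if b occurs in a, ignoring accents and case.
// NFD splits each accented letter into base letter + combining mark; the regex drops the marks.
public boolean contanisIgnoreAccents(String a, String b) {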
    String input1 = Normalizer.normalize(a, Normalizer.Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
            .toLowerCase();

    String input2 = Normalizer.normalize(b, Normalizer.Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
            .toLowerCase();

    return input1.contains(input2);
}

The comparison below:

System.out.println(contanisIgnoreAccents("Este é joao", "João"));
System.out.println(contanisIgnoreAccents("Onde está joÂo", "João"));

will return:

true
true

As can be seen in ideone: https://ideone.com/cJcF6M

The method checks whether the second string passed as an argument is contained in the first.¹

¹ I edited the example to make this definition clear, because passing the arguments in the other order can give the wrong result.
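To connect this back to the loop in the question, here is a minimal sketch of the same normalization applied to each row of historico. The class name AccentSearch, the helper names fold and search, the column parameter and the sample rows are assumptions made for illustration; they are not from the original code. It also precompiles the regex and uses Locale.ROOT for lowercasing, which are optional choices rather than requirements. Accented letters are written as Unicode escapes only so that the file compiles regardless of source encoding.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class AccentSearch {

    // Compiled once instead of on every replaceAll call.
    private static final Pattern MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    // Decompose accented letters (NFD), drop the combining marks, then lowercase.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return MARKS.matcher(decomposed).replaceAll("").toLowerCase(Locale.ROOT);
    }

    // Keep every row whose given column contains nome, ignoring accents and case.
    static List<String[]> search(List<String[]> historico, int column, String nome) {
        String needle = fold(nome); // fold the search term once, outside the loop
        List<String[]> searchC = new ArrayList<>();
        for (String[] linha : historico) {
            if (fold(linha[column]).contains(needle)) {
                searchC.add(linha); // add the row itself, no index counter needed
            }
        }
        return searchC;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; \u00C3 is an uppercase A with a tilde.
        List<String[]> historico = new ArrayList<>();
        historico.add(new String[] {"1", "2018", "JO\u00C3O SILVA"});
        historico.add(new String[] {"2", "2018", "Maria Souza"});

        System.out.println(search(historico, 2, "joao").size()); // prints 1
    }
}
```

Both the row value and the search term go through fold, so neither side needs toUpperCase() any more, and the row is always the first operand of contains, matching the argument order described above.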

2

I believe the simplest and quickest way is to use the StringUtils class from Apache Commons Lang.

In build.gradle, add the dependency:

dependencies {
  ...
  implementation 'org.apache.commons:commons-lang3:3.7'
}

Java code

import org.apache.commons.lang3.StringUtils;

public boolean contanisIgnoreAccents(String a, String b) {

    // Strip the accents and convert to lowercase
    String str1 = StringUtils.stripAccents(a).toLowerCase();
    String str2 = StringUtils.stripAccents(b).toLowerCase();

    return str1.contains(str2);
}

public void onButtonClick() {
    System.out.println(contanisIgnoreAccents("João", "joao"));
    System.out.println(contanisIgnoreAccents("João", "joÃo"));
}

Result

true
true

Kotlin code

fun contanisIgnoreAccents(a: String, b: String) = (
    StringUtils.stripAccents(a).toLowerCase().contains(StringUtils.stripAccents(b).toLowerCase())    
)
  • Copied my example :p

  • From what I understand, he wants to check whether one string contains the other, not whether they are equal. Checking whether "This is John" contains "John" returns false in your example, even though it does contain it.

  • @Articuno, I took your example to show in another way :D

  • True! I used equals instead of contains. Fixing the code!
