How to compare each character of a Java String?

Question

Asked 5 years, 3 months ago

Viewed 1,699 times

5

I’m creating a Java application in which I iterate over a String with a for loop, and I need to check every character of that String. Example:

for (int i = 0; i < texto.length(); i++) {
    char caractere = texto.charAt(i);

    if (char.equals("?")){
        System.out.println("Você não pode adicionar ao texto interrogação.");

    } else if (char.equals(" ")) {
        System.out.println("Você não pode adicionar ao texto espaços.");
    }
    // ...
}

In the example above, I know that char has no equals method; it is only there to illustrate what I would like to do.

How can I compare each character of a String in Java? Is it possible?

2 answers

9

Since the charAt method returns a char, which is a primitive type, you can compare for equality with ==, but you need to put the character between single quotes to indicate that it is a char:

caractere == '?'

It would look more or less like this:

class Main {
  public static void main(String[] args) {
    String texto = "Teste ? ";

    for (int i = 0; i < texto.length(); i++) {
        char caractere = texto.charAt(i);

        if (caractere == '?'){
            System.out.println("Você não pode adicionar ao texto interrogação.");

        } else if (caractere == ' ') {
            System.out.println("Você não pode adicionar ao texto espaços.");
        }
    }
  }
}

See online: https://repl.it/repls/LegitimateExcellentAddition
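As a complement, the same == comparison can be wrapped in a small method. This is only a sketch: the class name CharCompareSketch and the helper countChar are hypothetical names made up for this illustration, not something from the question.

```java
class CharCompareSketch {
    // Hypothetical helper: counts how many chars in texto are equal to alvo.
    static int countChar(String texto, char alvo) {
        int total = 0;
        for (int i = 0; i < texto.length(); i++) {
            // char is a primitive type, so == compares the values directly
            if (texto.charAt(i) == alvo) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String texto = "Teste ? ";
        System.out.println(countChar(texto, '?')); // 1
        System.out.println(countChar(texto, ' ')); // 2
    }
}
```

Returning a count instead of printing inside the loop lets the caller decide whether to show a message once or once per occurrence.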

by hkotsubo • **55,826** points · Answer 1 · 2020-04-16T20:27:18+00:00

Just for the record, to check whether a String contains a given character, you do not need to traverse all the characters in a loop. Just use the contains method:

if (texto.contains("?")) {
    System.out.println("Você não pode adicionar ao texto interrogação.");
} else if (texto.contains(" ")) {
    System.out.println("Você não pode adicionar ao texto espaços.");
}

Note that here I had to use double quotes, because the method takes a String (more precisely, a CharSequence) and not a char.

The difference, of course, is that your loop goes through all the characters, so if the String has both a space and a ?, both messages will be displayed (and if there is more than one occurrence, the message will be displayed multiple times). In the code above, only one of them is displayed, unless you remove the else, in which case both are displayed:

if (texto.contains("?")) {
    System.out.println("Você não pode adicionar ao texto interrogação.");
} 
if (texto.contains(" ")) {
    System.out.println("Você não pode adicionar ao texto espaços.");
}

Another difference is that contains is not limited to checking only one character:

System.out.println("abcdef".contains("cde")); // true
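If the value to look for is already a char, indexOf is one possible alternative to contains, since it accepts a char directly and returns -1 when nothing is found. A minimal sketch, in which the class IndexOfSketch and the helper hasChar are hypothetical names used only for illustration:

```java
class IndexOfSketch {
    // Hypothetical helper: true when texto has at least one occurrence of c.
    static boolean hasChar(String texto, char c) {
        // String.indexOf(int) returns -1 when the character is absent
        return texto.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        String texto = "Teste ? ";
        System.out.println(hasChar(texto, '?')); // true
        System.out.println(hasChar(texto, '!')); // false
        // indexOf also has an overload that takes a String, like contains
        System.out.println(texto.indexOf("? ")); // 6
    }
}
```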

Going a little further, comparing char by char works well for texts in Portuguese (and several other languages), but it has its limitations, since nowadays it is possible to have code like this:

String texto = "a💩";
for (int i = 0; i < texto.length(); i++) {
    char c = texto.charAt(i);
    System.out.printf("%c - %06X\n", c, (int) c);
}

Yes, an emoji directly in the code. If your IDE does not support this, you can build the same String like this:

int[] codepoints = { 0x61, 0x1f4a9 };
String texto = new String(codepoints, 0, codepoints.length);

Although the String has two "characters" (the letter a and the 💩 emoji), the output shows 3 chars:

a - 000061
? - 00D83D
? - 00DCA9

That is because a char in Java has 16 bits and can only store values up to 65535 (0xFFFF). Unicode defines far more characters than that, so a character such as the emoji PILE OF POO, whose code point is U+1F4A9 (a value greater than a char can hold), is "broken" in two, in this case 0xD83D and 0xDCA9. This is called a "surrogate pair", and it happens because Java represents the contents of a String as UTF-16.
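To make the split concrete, here is a small sketch that derives the two halves of U+1F4A9 and compares them with what Character.toChars produces. The class SurrogateSketch and the helper splitByHand are hypothetical names for this illustration; the arithmetic follows the UTF-16 encoding rules for code points above 0xFFFF.

```java
class SurrogateSketch {
    // Hypothetical helper: splits a code point above 0xFFFF into its surrogate pair.
    static char[] splitByHand(int codePoint) {
        int offset = codePoint - 0x10000;              // 20 bits remain
        char high = (char) (0xD800 + (offset >> 10));  // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF)); // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int codePoint = 0x1F4A9; // PILE OF POO

        // Character.toChars returns the UTF-16 code units for a code point
        char[] units = Character.toChars(codePoint);
        System.out.printf("%04X %04X%n", (int) units[0], (int) units[1]); // D83D DCA9

        char[] manual = splitByHand(codePoint);
        System.out.printf("%04X %04X%n", (int) manual[0], (int) manual[1]); // D83D DCA9

        // Character.toCodePoint joins the two halves back together
        System.out.printf("%X%n", Character.toCodePoint(manual[0], manual[1])); // 1F4A9
    }
}
```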

In other words, if I want to search for the emoji, there is no point in going through the chars one by one. A not very good solution would be to also check the next character, to find out whether the two form a surrogate pair:

for (int i = 0; i < texto.length(); i++) {
    char c = texto.charAt(i);
    // checks the surrogate pair (the next character has to be checked too)
    if (c == 0xd83d && i < texto.length() - 1 && texto.charAt(i + 1) == 0xdca9) {
        System.out.println("tem emoji");
    }
}

The if above could also be written like this:

// no need to know the next value, only that the two chars form a surrogate pair (so this matches any pair that starts with 0xd83d)
if (c == 0xd83d && i < texto.length() - 1 && Character.isSurrogatePair(c, texto.charAt(i + 1))) {
    System.out.println("tem emoji");
}

But in this case, I think it is better to use contains:

if (texto.contains("💩")) {
    System.out.println("tem emoji");
}

Or iterate over the code points of the String:

for (int i = 0; i < texto.length(); ) {
    // codePointAt takes a char index, not a code point index
    int cp = texto.codePointAt(i);
    if (cp == 0x1f4a9) {
        System.out.println("tem emoji");
    }

    // comparing with a char literal still works for values up to 0xffff
    if (cp == 'a') {
        System.out.println("Tem letra 'a'");
    }

    // advance by 1 or 2 chars, depending on how many this code point uses
    i += Character.charCount(cp);
}
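Another way to visit the code points, assuming Java 8 or later, is the codePoints() stream, which already joins each surrogate pair into a single int. The sketch below uses a hypothetical class CodePointSketch and helper hasCodePoint, and writes the emoji with \u escapes so that it does not depend on the file encoding.

```java
class CodePointSketch {
    // Hypothetical helper: true when texto contains the given code point.
    static boolean hasCodePoint(String texto, int codePoint) {
        // codePoints() yields one int per code point, joining surrogate pairs
        return texto.codePoints().anyMatch(cp -> cp == codePoint);
    }

    public static void main(String[] args) {
        // "\uD83D\uDCA9" is the surrogate pair for U+1F4A9, written with escapes
        String texto = "\uD83D\uDCA9a";
        System.out.println(hasCodePoint(texto, 0x1F4A9)); // true
        System.out.println(hasCodePoint(texto, 'a'));     // true
        System.out.println(texto.length());               // 3 (chars)
        System.out.println(texto.codePoints().count());   // 2 (code points)
    }
}
```

Here the emoji comes first on purpose: in that position, an index that counts code points and one that counts chars stop agreeing from the second element on.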