The indexes of a string start at zero, so they go from 0 to length - 1. That makes j<=x.length() wrong: the loop runs one extra iteration and throws an error when it tries to access a position that doesn't exist. The right thing is to use < instead of <=.
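To see the difference between the two bounds, here is a minimal sketch. The class and method names (IndexBoundsDemo, countVisited, failsWithInclusiveBound) are made up for this illustration and are not from the question's code:

```java
class IndexBoundsDemo {
    // Correct bound: j goes from 0 to length - 1, so every charAt(j) is valid.
    static int countVisited(String x) {
        int visited = 0;
        for (int j = 0; j < x.length(); j++) {
            x.charAt(j);
            visited++;
        }
        return visited;
    }

    // Wrong bound: the last iteration has j == x.length(), and charAt throws.
    static boolean failsWithInclusiveBound(String x) {
        try {
            for (int j = 0; j <= x.length(); j++) {
                x.charAt(j);
            }
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(countVisited("joaoo"));            // 5
        System.out.println(failsWithInclusiveBound("joaoo")); // true
    }
}
```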
As for the algorithm, just keep track of the previous character and append the current character to the new string only if it is different from the previous one. Like this:
String[] array = {"lleonardo", "joaoo"};
String[] result = new String[array.length];
for (int i = 0; i < array.length; i++) {
    String atual = array[i];
    StringBuilder sb = new StringBuilder();
    char anterior = 0; // previous character; 0 means "none yet"
    for (int j = 0; j < atual.length(); j++) {
        char c = atual.charAt(j);
        if (c != anterior) { // append only if it differs from the previous one
            sb.append(c);
        }
        anterior = c;
    }
    result[i] = sb.toString();
}
To build the new string I used a StringBuilder, which is more efficient than concatenating strings directly when there are many successive concatenations in a loop.
At the end, the result array will hold the strings without consecutive repeated characters.
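If it helps, the same idea can be wrapped in a method so it is easy to call and check. This is only a sketch: the class and method names are mine, and instead of the 0 sentinel I test j == 0, so that a string starting with the character '\0' is not treated specially:

```java
import java.util.Arrays;

class ConsecutiveDedup {
    // Keeps a character only when it differs from the one right before it.
    static String removeConsecutiveRepeats(String s) {
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < s.length(); j++) {
            char c = s.charAt(j);
            if (j == 0 || c != s.charAt(j - 1)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] array = {"lleonardo", "joaoo"};
        String[] result = new String[array.length];
        for (int i = 0; i < array.length; i++) {
            result[i] = removeConsecutiveRepeats(array[i]);
        }
        System.out.println(Arrays.toString(result)); // [leonardo, joao]
    }
}
```

Note that only consecutive repetitions are removed: "banana" stays "banana".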
But this solution has limits. If your strings only contain Portuguese text, you probably won't have any problems. But if you have something like this:
// yes, an emoji directly in the code
String[] array = { "" };
It no longer works. The "short" explanation is that Java internally stores strings in UTF-16 (the documentation itself says: "A String represents a string in the UTF-16 format"), and some characters end up occupying 2 chars. Print the length() of the string above and you'll see that the result is 4: each emoji needs 2 chars to be stored, and length() counts chars (UTF-16 code units), not characters. The long explanation for understanding all these details is here.
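A quick way to check this is the sketch below. The emoji is an assumption of mine (U+1F600, written as Unicode escapes so the source file's encoding doesn't matter); any character outside the Basic Multilingual Plane behaves the same way:

```java
class SurrogatePairDemo {
    // Two copies of U+1F600 (an arbitrary emoji chosen for illustration),
    // each one stored as a surrogate pair: \uD83D \uDE00.
    static final String TWO_EMOJI = "\uD83D\uDE00\uD83D\uDE00";

    public static void main(String[] args) {
        String s = TWO_EMOJI;
        System.out.println(s.length());                      // 4 chars
        System.out.println(s.codePointCount(0, s.length())); // 2 code points
        // No two consecutive chars are equal, so the char-based loop removes nothing:
        System.out.println(s.charAt(0) == s.charAt(1));      // false
        System.out.println(s.charAt(1) == s.charAt(2));      // false
    }
}
```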
Anyway, if you want to remove the repeated characters in this case as well, you have to iterate over the code points of the string:
String[] array = { "aaxybb" };
String[] result = new String[array.length];
for (int i = 0; i < array.length; i++) {
    String atual = array[i];
    StringBuilder sb = new StringBuilder();
    int anterior = -1, cp; // -1 is not a valid code point, so it means "none yet"
    // advance by 1 or 2 chars, depending on how many the current code point uses
    for (int j = 0; j < atual.length(); j += Character.charCount(cp)) {
        cp = atual.codePointAt(j);
        if (cp != anterior) {
            sb.appendCodePoint(cp);
        }
        anterior = cp;
    }
    result[i] = sb.toString();
}
It will still fail if the string has grapheme clusters or accented characters normalized in NFD, but if you want to delve into these cases, I suggest reading here.
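Just to illustrate the NFD case, here is a sketch that normalizes to NFC with java.text.Normalizer before removing the repetitions. The class and method names are mine. Keep in mind this is only a partial workaround: it helps when a precomposed form exists (like "é"), and it does not handle grapheme clusters in general:

```java
import java.text.Normalizer;

class NormalizeThenDedup {
    // Same code point loop as above, extracted to a method.
    static String removeRepeatedCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        int anterior = -1;
        for (int j = 0; j < s.length(); ) {
            int cp = s.codePointAt(j);
            if (cp != anterior) {
                sb.appendCodePoint(cp);
            }
            anterior = cp;
            j += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Composes base letter + combining accent into one code point, when possible.
    static String normalizeThenRemove(String s) {
        return removeRepeatedCodePoints(Normalizer.normalize(s, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        // "éé" in NFD: 'e' followed by U+0301 (combining acute accent), twice
        String nfd = "e\u0301e\u0301";
        System.out.println(removeRepeatedCodePoints(nfd).length()); // 4 (nothing removed)
        System.out.println(normalizeThenRemove(nfd).length());      // 1 (a single U+00E9)
    }
}
```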
Another option is to use Pattern.compile("(.)\\1+"), which already matches one or more repetitions of the character (instead of only one), and then use replaceAll("$1") for the substitution. This also avoids the lookahead, which makes the regex a little more efficient (not that regex is the most efficient thing in the world, but anyway, compare the number of steps here and here). – hkotsubo
Good suggestion @hkotsubo ... I improved the regex.
– Alex de Moraes
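For reference, the regex suggested in the comment above could be used roughly like this. It is a sketch: the class and method names are made up, and the pattern is compiled once and reused, since compiling it on every call would be wasteful:

```java
import java.util.regex.Pattern;

class RegexDedup {
    // (.) captures one character; \1+ matches one or more repetitions of it.
    private static final Pattern REPEATED = Pattern.compile("(.)\\1+");

    // Replaces each run of a repeated character with a single occurrence.
    static String removeConsecutiveRepeats(String s) {
        return REPEATED.matcher(s).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(removeConsecutiveRepeats("lleonardo")); // leonardo
        System.out.println(removeConsecutiveRepeats("joaoo"));     // joao
    }
}
```

By default the dot doesn't match line terminators, so repeated line breaks are left untouched unless the pattern is compiled with Pattern.DOTALL.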