Search for a string inside a Java list


I have an array with information about several movies, and I need to find movies by name regardless of whether the letters are uppercase or lowercase. For example, when searching for "for", it should return all the movies that have "for" in their name, whether it appears at the beginning or at the end. I found two ways:

if (filmes.getNomeFilme().contains(name)) {

This way it returns the matching movies, but the comparison is case-sensitive.

The other way:

if (filmes.getNomeFilme().equalsIgnoreCase(name)) {

This way it ignores case, but the user has to type the full name of the movie to find it.

  • Just convert to lowercase to compare: filmes.getNomeFilme().toLowerCase().contains(name.toLowerCase());

  • Thanks guys, Ricardo’s answer worked out, now it’s all working right!

2 answers

1


You can use the Stream API to process your data, for example:

import java.util.stream.Stream;

public class Nomes {
    public static void main(String[] args) {
        String[] nomes = {"lara", "LAURA", "MaUra", "Bruna"};

        // Lowercase every name, then keep only the ones containing "a"
        Stream.of(nomes).map(f -> f.toLowerCase()).filter(f -> f.contains("a")).forEach(System.out::println);
        System.out.println("--------------------------------------------");
        // Same, but keeping only the names containing "au"
        Stream.of(nomes).map(f -> f.toLowerCase()).filter(f -> f.contains("au")).forEach(System.out::println);
    }
}

It converts every name to lowercase and then keeps only the ones that contain the text passed as a parameter.
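Applied to the question's list of movies, the same idea could look like the sketch below. The Filme class here is a hypothetical stand-in (only getNomeFilme comes from the question), and the FilmeStreamSearch and search names are made up for illustration. Unlike the example above, it lowercases only for the comparison, so the names come back with their original casing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class FilmeStreamSearch {

    // Hypothetical stand-in for the question's movie class;
    // only the getNomeFilme() getter is taken from the question.
    static class Filme {
        private final String nomeFilme;

        Filme(String nomeFilme) {
            this.nomeFilme = nomeFilme;
        }

        String getNomeFilme() {
            return nomeFilme;
        }
    }

    // Returns the original names of the movies whose name contains `name`, ignoring case.
    static List<String> search(List<Filme> filmes, String name) {
        String needle = name.toLowerCase(Locale.ROOT); // lowercase the search text once
        return filmes.stream()
                .map(Filme::getNomeFilme)
                .filter(n -> n.toLowerCase(Locale.ROOT).contains(needle))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Filme> filmes = Arrays.asList(
                new Filme("Forrest Gump"),
                new Filme("Star Wars: The FORCE Awakens"),
                new Filme("Matrix"));

        // prints [Forrest Gump, Star Wars: The FORCE Awakens]
        System.out.println(search(filmes, "for"));
    }
}
```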

0

To start with, just convert everything to lowercase:

if (filmes.getNomeFilme().toLowerCase().contains(name.toLowerCase()))

Or to uppercase:

if (filmes.getNomeFilme().toUpperCase().contains(name.toUpperCase()))

Just keep in mind the inefficiency of creating new strings every time (each call to toUpperCase or toLowerCase returns a new, modified string).

Not to mention some "strange" cases that depend on the language (detailed here). But if you only have Portuguese text, these problems do not occur, so the link is there just for reference.
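One well-known example of such a "strange" case is Turkish, where an uppercase 'I' lowercases to the dotless 'ı' (U+0131) instead of 'i'. The sketch below shows the effect and one possible way around it, passing Locale.ROOT explicitly so the result does not depend on the machine's default locale. The LocaleCaseDemo class and its containsIgnoreCase helper are names made up for this illustration.

```java
import java.util.Locale;

class LocaleCaseDemo {

    // Locale-independent variant of the toLowerCase/contains check.
    static boolean containsIgnoreCase(String text, String part) {
        return text.toLowerCase(Locale.ROOT).contains(part.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // toLowerCase() without arguments uses Locale.getDefault(),
        // so on a machine configured for Turkish it behaves like the first line.
        System.out.println("TITANIC".toLowerCase(turkish));     // dotless 'ı' in place of 'i'
        System.out.println("TITANIC".toLowerCase(Locale.ROOT)); // plain "titanic"

        System.out.println("TITANIC".toLowerCase(turkish).contains("tit")); // false
        System.out.println(containsIgnoreCase("TITANIC", "tit"));           // true
    }
}
```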


Another alternative is to use regionMatches, which is not as straightforward but is more efficient (code based on this answer on SOen):

public static boolean containsIgnoreCase(String src, String what) {
    final int length = what.length();
    if (length == 0)
        return true; // an empty search string always matches

    final char firstLo = Character.toLowerCase(what.charAt(0));
    final char firstUp = Character.toUpperCase(what.charAt(0));

    for (int i = src.length() - length; i >= 0; i--) {
        // cheap check on the first character, to avoid calling regionMatches unnecessarily
        final char ch = src.charAt(i);
        if (ch != firstLo && ch != firstUp)
            continue;

        if (src.regionMatches(true, i, what, 0, length))
            return true;
    }

    return false;
}

And then just do:

if (containsIgnoreCase(filmes.getNomeFilme(), name))
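In case the regionMatches call itself is not clear: it compares a region of one string against a region of another, starting at the given offsets, and its first argument turns case-insensitivity on. A small sketch of a single call follows, where RegionMatchesDemo and matchesAt are names made up for the illustration. The loop in containsIgnoreCase simply tries this at every possible position.

```java
class RegionMatchesDemo {

    // True if `src` has `what` starting exactly at index `offset`, ignoring case.
    static boolean matchesAt(String src, int offset, String what) {
        // regionMatches(ignoreCase, offset in src, other string, offset in other, length to compare)
        return src.regionMatches(true, offset, what, 0, what.length());
    }

    public static void main(String[] args) {
        String title = "The FORCE Awakens";

        System.out.println(matchesAt(title, 4, "for"));  // true: "FOR" starts at index 4
        System.out.println(matchesAt(title, 0, "for"));  // false: the title starts with "The"
        System.out.println(matchesAt(title, 15, "for")); // false: the region would run past the end
    }
}
```

Note that an offset too close to the end just returns false instead of throwing an exception.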

Just out of curiosity, I ran a little test with JMH (Java Microbenchmark Harness), and the result was:

Benchmark                                         Mode  Samples        Score  Score error  Units
o.s.StringContainsBenchmark.testLowerCase        thrpt      200  5983873,852    79952,187  ops/s
o.s.StringContainsBenchmark.testRegionMatches    thrpt      200  7047000,300   199599,775  ops/s

In this case, the solution with regionMatches reached just over 7 million operations per second, against about 5.98 million for the solution with toLowerCase (the scores above use a comma as the decimal separator).

Keep in mind, of course, that this may vary according to the hardware and the data being tested. But in general, regionMatches did better than toLowerCase.

Test code:

import java.util.Arrays;
import java.util.List;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@State(Scope.Benchmark)
public class StringContainsBenchmark {

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder().include(StringContainsBenchmark.class.getSimpleName()).warmupIterations(5).measurementIterations(5).forks(3).build();
        new Runner(opt).run();
    }

    List<String> nomes;
    String busca;

    @Setup
    public void setup() {
        nomes = Arrays.asList("Fulano de tal", "ciclano de qual", "beltrano da SILVA", "Trajano de sOUZa");
        busca = "ano";
    }

    @Benchmark
    public void testRegionMatches(Blackhole bh) {
        for (String s : nomes) {
            bh.consume(comRegionMatches(s, busca));
        }
    }

    @Benchmark
    public void testLowerCase(Blackhole bh) {
        for (String s : nomes) {
            bh.consume(comLowerCase(s, busca));
        }
    }

    private boolean comRegionMatches(String src, String busca) {
        final int length = busca.length();
        if (length == 0)
            return true; // an empty search string always matches

        final char firstLo = Character.toLowerCase(busca.charAt(0));
        final char firstUp = Character.toUpperCase(busca.charAt(0));

        for (int i = src.length() - length; i >= 0; i--) {
            // cheap check on the first character, to avoid calling regionMatches unnecessarily
            final char ch = src.charAt(i);
            if (ch != firstLo && ch != firstUp)
                continue;

            if (src.regionMatches(true, i, busca, 0, length))
                return true;
        }

        return false;
    }

    private boolean comLowerCase(String src, String busca) {
        return src.toLowerCase().contains(busca.toLowerCase());
    }
}

The other answer used streams to iterate through the strings, which also works, but I find it overkill for a case that a simple loop with an if solves.

Despite being "cool", streams have a cost, and for that reason they are slower than a "traditional" loop. Not that using them is wrong, as long as you're aware of the implications.
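For reference, the simple loop with an if could look something like the sketch below. Filme is again a hypothetical stand-in for the question's class (only getNomeFilme comes from the question), and FilmeLoopSearch and search are names made up for the example. It uses the toLowerCase comparison, but the if could just as well call a regionMatches-based helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class FilmeLoopSearch {

    // Hypothetical stand-in for the question's movie class;
    // only the getNomeFilme() getter is taken from the question.
    static class Filme {
        private final String nomeFilme;

        Filme(String nomeFilme) {
            this.nomeFilme = nomeFilme;
        }

        String getNomeFilme() {
            return nomeFilme;
        }
    }

    // Collects every movie whose name contains `name`, ignoring case.
    static List<Filme> search(List<Filme> filmes, String name) {
        String needle = name.toLowerCase(Locale.ROOT); // lowercased once, outside the loop
        List<Filme> found = new ArrayList<>();
        for (Filme filme : filmes) {
            if (filme.getNomeFilme().toLowerCase(Locale.ROOT).contains(needle)) {
                found.add(filme);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Filme> filmes = Arrays.asList(
                new Filme("Forrest Gump"),
                new Filme("Matrix"),
                new Filme("Star Wars: The FORCE Awakens"));

        for (Filme filme : search(filmes, "FoR")) {
            System.out.println(filme.getNomeFilme());
        }
    }
}
```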
