How to find the position of each occurrence of a word in a file in Java?

I have a college assignment in which I need to read a text file word by word, store the words in a hash table and then, for each word read from a second file, report its occurrences. So far so good!

The problem is that I also need to store the starting position of each occurrence, and I don't know how to read the file word by word in a way that lets me record that position. The only approach I can think of is RandomAccessFile, but how would I read word by word with it?

I am currently reading the words as follows:

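// imports needed: java.io.File, java.io.IOException, java.util.Scanner
File arq = new File("teste.txt");
try (Scanner in = new Scanner(arq)) { // try-with-resources closes the file
    while (in.hasNext()) {
        // next() returns the next whitespace-delimited token
        String palavra = in.next().toLowerCase();
    }
} catch (IOException e) {
    e.printStackTrace(); // report I/O errors instead of silently ignoring them
}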

I left out the rest of the code, because what really matters here is how the words are read.

  • Do you want to search a text for specific words from your list and save the starting and ending position of each word found?

  • @Diegoschmidt I am saving the words of my text in a hash table. Besides the word itself, I need to save its starting position in the file. Then I will search for a few words and display whether each one is in the file, how often it appears, and the starting position of each occurrence.

1 answer

I don't know if I fully understand what you want, but here is an example:

First you need to read the whole content of the text file, so create the following method in the main class:

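// imports needed: java.io.BufferedReader, java.io.FileReader, java.io.IOException
private static String getTexto(String nomeArquivo) throws IOException {
    StringBuilder conteudo = new StringBuilder();
    try (BufferedReader reader = new BufferedReader(new FileReader(nomeArquivo))) {
        String linha;
        while ((linha = reader.readLine()) != null) { // null marks the end of the file
            // readLine() drops the line terminator; put one back so the last word of a
            // line is not glued to the first word of the next line
            conteudo.append(linha).append('\n');
        }
    }
    return conteudo.toString();
}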
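If the reported positions have to match character offsets in the original file, the line terminators must be kept exactly as they are on disk (a `\r\n` counts as two characters). The following is only a minimal sketch of that idea, assuming Java 7 or later and a UTF-8 encoded file; the class name `FileText` and the method name `readAll` are illustrative and not part of the original answer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileText {
    // Reads the whole file into one String without touching line terminators,
    // so an index into the String is a character (not byte) offset into the
    // decoded file content.
    // UTF-8 is an assumption; use the charset the file is really written in.
    static String readAll(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String texto = readAll(Paths.get("texto.txt"));
        System.out.println(texto.length() + " characters read");
    }
}
```

The result could be passed to the methods below in place of the value returned by getTexto.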

Then create the method that returns a LinkedHashMap containing the words found in the text and, for each word, the starting position of its first occurrence:

// imports needed: java.io.IOException, java.util.LinkedHashMap, java.util.Map
private static Map<String, Integer> getPalavrasDoTexto(String conteudoTexto) throws IOException {
    Map<String, Integer> listaDePalavrasDoTexto = new LinkedHashMap<>();
    int inicioDaPalavra = -1; // index where the current word starts, -1 while outside a word
    // go one position past the end so that a word ending the text is stored as well
    for (int i = 0; i <= conteudoTexto.length(); i++) {
        boolean letra = i < conteudoTexto.length()
                && Character.isAlphabetic(conteudoTexto.charAt(i)); // only letters form words
        if (letra) {
            if (inicioDaPalavra < 0) {
                inicioDaPalavra = i; // first letter of a new word
            }
        } else if (inicioDaPalavra >= 0) {
            String palavra = conteudoTexto.substring(inicioDaPalavra, i).toLowerCase();
            // keep only the first occurrence; the start index is tracked directly, so a
            // word can never be "found" inside a longer word, as could happen with indexOf
            listaDePalavrasDoTexto.putIfAbsent(palavra, inicioDaPalavra);
            inicioDaPalavra = -1;
        }
    }

    return listaDePalavrasDoTexto;
}
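Since the question asks for the starting position of every occurrence, the same scan can collect a list of positions per word instead of a single one. This is only a sketch under that reading of the requirement; the class name `WordPositions` and the method name `index` are hypothetical, and words are taken to be runs of alphabetic characters, as in the method above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WordPositions {
    // Maps each lower-cased word to the start index of every occurrence in the text.
    static Map<String, List<Integer>> index(String texto) {
        Map<String, List<Integer>> posicoes = new LinkedHashMap<>();
        int inicio = -1; // start of the current word, -1 while between words
        // go one position past the end so a word that ends the text is recorded
        for (int i = 0; i <= texto.length(); i++) {
            boolean letra = i < texto.length() && Character.isAlphabetic(texto.charAt(i));
            if (letra) {
                if (inicio < 0) {
                    inicio = i;
                }
            } else if (inicio >= 0) {
                String palavra = texto.substring(inicio, i).toLowerCase();
                posicoes.computeIfAbsent(palavra, k -> new ArrayList<>()).add(inicio);
                inicio = -1;
            }
        }
        return posicoes;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> posicoes = index("O rato roeu a roupa. O rato fugiu.");
        List<Integer> rato = posicoes.get("rato");
        // prints: rato: 2 occurrence(s) at [2, 23]
        System.out.println("rato: " + rato.size() + " occurrence(s) at " + rato);
    }
}
```

With such a map, looking up a search word gives both its frequency (the size of the list) and all starting positions, without scanning the text again.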

Also create the method that returns the words to search for:

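// imports needed: java.io.BufferedReader, java.io.FileReader, java.io.IOException,
// java.util.ArrayList, java.util.List
private static List<String> getPalavrasParaBuscar(String nomeArquivo) throws IOException {
    List<String> listaDePalavrasBusca = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(new FileReader(nomeArquivo))) {
        String palavra;
        while ((palavra = reader.readLine()) != null) { // null marks the end of the file
            palavra = palavra.trim().toLowerCase();
            if (!palavra.isEmpty()) { // skip blank lines
                listaDePalavrasBusca.add(palavra);
            }
        }
    }

    return listaDePalavrasBusca;
}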

Create a file called texto.txt containing the whole text and another called palavras.txt with one word per line, both in the project's root directory.

Inside the main method, place the following:

// imports needed: java.util.List, java.util.Map, java.util.regex.Matcher, java.util.regex.Pattern
String texto = getTexto("texto.txt");
List<String> palavrasParaBuscar = getPalavrasParaBuscar("palavras.txt");
Map<String, Integer> palavrasDoTexto = getPalavrasDoTexto(texto);

for (String palavraParaBuscar : palavrasParaBuscar) {
    if (palavrasDoTexto.containsKey(palavraParaBuscar)) { // only search for words that exist in the text

        System.out.println("-----");
        System.out.println("BUSCANDO PALAVRA: " + palavraParaBuscar + ", POS INICIO TEXTO: "+ palavrasDoTexto.get(palavraParaBuscar));

        // quote the word so it is matched literally, and use lookarounds so it is not
        // matched inside a longer word (for example "rato" inside "barato")
        Pattern pattern = Pattern.compile(
                "(?<!\\p{IsAlphabetic})" + Pattern.quote(palavraParaBuscar) + "(?!\\p{IsAlphabetic})",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher matcher = pattern.matcher(texto);

        while (matcher.find()) { // one iteration per occurrence
            String palavraEncontrada = matcher.group();
            int posicaoDeInicio = matcher.start(); // index of the first character
            int posicaoFinal = matcher.end(); // index just after the last character

            System.out.println();
            System.out.println("PALAVRA ENCONTRADA: " + palavraEncontrada);
            System.out.println("POS INICIO: " + posicaoDeInicio);
            System.out.println("POS FINAL: " + posicaoFinal);
        }
    }
}
