Working with two txt files

1

I have two different .txt files. As an example, arquivo1.txt has 1854 lines with 6 numbers on each line, separated by " " (a space). The other file, arquivo2.txt, has more than 1 million lines, also with 6 numbers on each line separated by a space. I want to take the first line of arquivo2.txt and compare it against every line of arquivo1.txt, counting how many equal numbers there are in each line of arquivo1.txt, then move on to the next line of arquivo2.txt and repeat the same analysis, until the end of arquivo2.txt. The problem is that my code only compares the first line of arquivo1.txt with the first line of arquivo2.txt and then jumps to the next line of both files. Can someone help me write this code the way I'd like? The code is as follows:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.RandomAccessFile;

public class Confere {

    public static void main(String[] args) throws FileNotFoundException {

        try {
            // get the txt files
            File file = new File("C:/Users/Usuario/Documents/Vander/mega.txt");
            File file2 = new File("C:/Users/Usuario/Documents/Vander/resultadomega.txt");

            FileReader fileReader = new FileReader(file);
            BufferedReader bufferedReader = new BufferedReader(fileReader);

            FileReader fileReader2 = new FileReader(file2);
            BufferedReader bufferedReader2 = new BufferedReader(fileReader2);

            while (bufferedReader.ready()) {

                bufferedReader2.ready();

                String linha = bufferedReader.readLine(); // reads one line...
                String linha2 = bufferedReader2.readLine(); // reads one line...

                if (linha.toString().contains(linha2.toString())) { // checks whether the lines are equal
                    System.out.println("igual");
                    // #####################################################
                    // appends the matching line to the report file
                    RandomAccessFile raf = new RandomAccessFile("C:/Users/Usuario/Documents/Vander/relatorio.txt", "rw");

                    raf.seek(raf.length());
                    raf.writeBytes(linha + "\r\n");
                    raf.close();
                    // ######################################################
                    System.out.println(linha);
                } else {
                    System.out.println("diferente");
                }
            }

        } catch (IOException e) {
            throw new RuntimeException(e);
        }

    }
}

I’d be very grateful, because it’s very difficult...

  • In short: you want to count the occurrences of each line of file 1 in file 2?

  • It would be showing each line of file 2 and how many occurrences it had, for example: line 1 = 4 occurrences, and so on until the end of file 2.

2 answers

1

I had no way to test this, but try storing the contents of the files in two lists, for example:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

public class Confere {

    public static void main(String[] args) throws Exception {

        File file = new File("C:/Users/Usuario/Documents/Vander/mega.txt");
        File file2 = new File("C:/Users/Usuario/Documents/Vander/resultadomega.txt");

        List<String> arquivo1 = new ArrayList<>();
        List<String> arquivo2 = new ArrayList<>();

        // try-with-resources closes both readers when the block ends
        try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
             BufferedReader bufferedReader2 = new BufferedReader(new FileReader(file2))) {

            String linha;
            // readLine() returns null at the end of the file
            while ((linha = bufferedReader.readLine()) != null)
                arquivo1.add(linha);

            while ((linha = bufferedReader2.readLine()) != null)
                arquivo2.add(linha);
        }

        // for each line of file 1, count the identical lines in file 2
        arquivo1.stream().forEach(linhaArquivo1 -> {
            long qtdOcorrencia = arquivo2.stream().filter(linhaArquivo2 -> linhaArquivo2.equals(linhaArquivo1)).count();
            System.out.println("Conteúdo: " + linhaArquivo1 + " Quantidade Ocorrência: " + qtdOcorrencia);
        });
    }
}
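
The approach above scans the whole second list once for every line of the first list. As a rough sketch of an alternative (not tested against the real files), the second list can be counted once into a map and then looked up. The class name ContagemLinhas, the method contarOcorrencias and the sample data are made up for illustration; repeated lines in the first list collapse into a single map entry.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContagemLinhas {

    // Counts how many times each line of arquivo1 appears in arquivo2.
    static Map<String, Long> contarOcorrencias(List<String> arquivo1, List<String> arquivo2) {
        // One pass over the second list: line -> number of times it appears.
        Map<String, Long> frequencia = new HashMap<>();
        for (String linha : arquivo2) {
            frequencia.merge(linha, 1L, Long::sum);
        }
        // One pass over the first list, keeping its order in the result.
        Map<String, Long> resultado = new LinkedHashMap<>();
        for (String linha : arquivo1) {
            resultado.put(linha, frequencia.getOrDefault(linha, 0L));
        }
        return resultado;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the two files.
        List<String> arquivo1 = Arrays.asList("1 2 3 4 5 6", "7 8 9 10 11 12");
        List<String> arquivo2 = Arrays.asList("1 2 3 4 5 6", "1 2 3 4 5 6", "13 14 15 16 17 18");
        contarOcorrencias(arquivo1, arquivo2).forEach((linha, qtd) ->
                System.out.println("Line: " + linha + " Occurrences: " + qtd));
    }
}
```

Like the original, this only finds lines that are exactly identical; it does not count equal numbers inside different lines.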
  • Almost there... I applied the code and it worked very well and analyzed all the lines, but I would like to know how to analyze the content of each line and compare it with the content of the other line, showing how many equal numbers exist in each line. I tried using split and trim inside the for loop, but I couldn't get it to work. Would a tokenizer solve the problem?

  • I guess now it solves your problem.

1


To get all lines from a file, use Files#readAllLines:

List<String> linhas = Files.readAllLines(Paths.get("C:/foo.txt"));

To split a string into an array of the items separated by whitespace, pass the regular expression \s+ to String#split:

String []valores = "Stack Overflow".split("\\s+"); // ["Stack", "Overflow"]

To get the items that two arrays have in common, one solution is to create a temporary ArrayList, passing the first array (wrapped with Arrays.asList) to its constructor. Calling retainAll on that list then keeps only the elements that also exist in the other collection, for example:

String []a = {"stack", "overflow", "em", "português"};
String []b = {"stack", "overflow"};

List<String> duplicados = new ArrayList<>(Arrays.asList(a));
duplicados.retainAll(Arrays.asList(b)); // ["stack", "overflow"]
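
If only the number of equal items is needed, the same idea can be wrapped in a small helper. This is just a sketch: the class name NumerosIguais and the method contarIguais are made up for illustration, and it assumes non-empty lines whose numbers are written the same way in both files (values are compared as text, so "05" and "5" would not match). It uses a HashSet instead of a list so that a value repeated within one line is counted only once.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class NumerosIguais {

    // Returns how many distinct values the two lines have in common.
    static int contarIguais(String linhaA, String linhaB) {
        // trim() avoids an empty first item when a line starts with whitespace.
        Set<String> numerosA = new HashSet<>(Arrays.asList(linhaA.trim().split("\\s+")));
        Set<String> numerosB = new HashSet<>(Arrays.asList(linhaB.trim().split("\\s+")));
        numerosA.retainAll(numerosB); // keep only the values present in both lines
        return numerosA.size();
    }

    public static void main(String[] args) {
        System.out.println(contarIguais("1 2 3 4 5 6", "4 5 6 7 8 9")); // prints 3
    }
}
```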

With this, you can read all the lines of both files, split each line on whitespace, and check which items of one line also exist in the other.


Example

List<String> linhasA = Files.readAllLines(Paths.get("C:/a.txt"));
List<String> linhasB = Files.readAllLines(Paths.get("C:/b.txt"));

linhasB.forEach(linhaB -> {
   linhasA.forEach(linhaA -> {
       String []valoresLinhaB = linhaB.split("\\s+");
       String []valoresLinhaA = linhaA.split("\\s+");

       List<String> duplicados = new ArrayList<>(Arrays.asList(valoresLinhaB));
       duplicados.retainAll(Arrays.asList(valoresLinhaA));

       if(duplicados.size() > 0){
          String mensagem = String.format("Linha B: %10s | Linha A: %10s | Duplicados: %15s",
                            linhaB, linhaA, duplicados);
          System.out.println(mensagem);
       }
   });
});

Example of output:

Linha B:        2 4 | Linha A:  1 2 3 4 5 | Duplicados:     [2, 4]
Linha B:        2 4 | Linha A:    2 3 4 5 | Duplicados:     [2, 4]
Linha B:        2 4 | Linha A:      1 4 5 | Duplicados:        [4]
Linha B:        2 4 | Linha A:        1 4 | Duplicados:        [4]
Linha B:        2 4 | Linha A:      3 4 5 | Duplicados:        [4]
===
Linha B:      2 4 5 | Linha A:  1 2 3 4 5 | Duplicados:  [2, 4, 5]
Linha B:      2 4 5 | Linha A:    2 3 4 5 | Duplicados:  [2, 4, 5]
Linha B:      2 4 5 | Linha A:      1 4 5 | Duplicados:     [4, 5]
Linha B:      2 4 5 | Linha A:        1 4 | Duplicados:        [4]
Linha B:      2 4 5 | Linha A:      3 4 5 | Duplicados:     [4, 5]
===
Linha B:    1 2 5 4 | Linha A:  1 2 3 4 5 | Duplicados: [1, 2, 5, 4]
Linha B:    1 2 5 4 | Linha A:    2 3 4 5 | Duplicados:    [2, 5, 4]
Linha B:    1 2 5 4 | Linha A:      1 4 5 | Duplicados:    [1, 5, 4]
Linha B:    1 2 5 4 | Linha A:        1 4 | Duplicados:       [1, 4]
Linha B:    1 2 5 4 | Linha A:      3 4 5 | Duplicados:       [5, 4]
...
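
Since one of the files in the question has more than 1 million lines, a possible variation (a sketch only, not measured) is to keep just the small file in memory and read the large one line by line. The names ConfereStreaming, numeros and conferir and the output format are made up for illustration. The sketch reads from a StringReader so it can run on its own; for real files the reader could come from Files.newBufferedReader.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

public class ConfereStreaming {

    // Splits a line on whitespace and returns its values as a set.
    static Set<String> numeros(String linha) {
        return new HashSet<>(Arrays.asList(linha.trim().split("\\s+")));
    }

    // For every line of the large input, reports how many values it shares
    // with each line of the small input. Pairs with nothing in common are skipped.
    static void conferir(List<String> linhasPequenas, BufferedReader grande, Consumer<String> saida) {
        // Parse the small file once, instead of once per line of the large file.
        List<Set<String>> conjuntos = new ArrayList<>();
        for (String linha : linhasPequenas) {
            conjuntos.add(numeros(linha));
        }
        // lines() reads lazily, so the large input is never fully in memory.
        grande.lines().forEach(linhaGrande -> {
            Set<String> numerosGrande = numeros(linhaGrande);
            for (int i = 0; i < conjuntos.size(); i++) {
                int iguais = 0;
                for (String n : numerosGrande) {
                    if (conjuntos.get(i).contains(n)) {
                        iguais++;
                    }
                }
                if (iguais > 0) {
                    saida.accept(linhaGrande + " | " + linhasPequenas.get(i) + " | " + iguais);
                }
            }
        });
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the two files.
        List<String> pequenas = Arrays.asList("1 2 3 4 5", "1 4");
        BufferedReader grande = new BufferedReader(new StringReader("2 4\n7 8\n"));
        conferir(pequenas, grande, System.out::println);
    }
}
```

Passing the output as a Consumer means each result can be printed or written to a report file as soon as it is found, rather than being collected in a list first.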
  • Thanks for the help Renan, that's exactly what I was trying to do and couldn't. With your help I can now continue my project. Thank you.
