How to manipulate a CSV file column?

2

Hey, guys! I am learning Java and need help with CSV files. I can read the file, but I need to perform operations on a few of its columns. Here is an example, with fictitious data, of the table I am using:

ID, Nome, Idade, Cargo, Salario
15, Alessandro Martins, 25, Assistente Administrativo, 1800
36, Fátima Ribeiro, 30, Gerente Administrativa, 3000
99, Roberta Menezes, 32, Vendedora, 2500

The table I’m using has 17,795 rows and 185 columns.

I need to compute the average age and find the highest salaries, but I do not know how to iterate over only the Idade (age) or Salario (salary) column from the other methods I created.

This method should calculate the average age:

public static double mediaIdade(/* How do I pass the column as a parameter? */) {
   // How do I calculate the average of the Idade column in here?
   return media;
}

This one should find the top 10 salaries:

public static double dezMaioresSalarios(/* How do I pass the column as a parameter? */) {
   // How do I collect the 10 highest salaries in here?
   return media;
}

This is the code I used to read the file:

public static BufferedReader lendoCSV(String arquivo, String separador) {
        BufferedReader conteudoArquivo = null;
        String linha = "";

        try{
            conteudoArquivo = new BufferedReader(new FileReader(arquivo));
            linha = conteudoArquivo.readLine();
            while((linha = conteudoArquivo.readLine()) != null) {
                jogadores = linha.split(separador);
            }
            System.out.println("A leitura do arquivo deu certo!");
        } catch(FileNotFoundException e) {
            System.out.println("Arquivo não encontrado: \n" + e.getMessage());
        } catch(ArrayIndexOutOfBoundsException e ) {
            System.out.println("Indice fora dos limites: \n" + e.getMessage());
        } catch(IOException e) {
            System.out.println("Erro de entrada de dados: \n" + e.getMessage());
        } finally {
            if(conteudoArquivo != null) {
                try {
                    conteudoArquivo.close();
                } catch(IOException e) {
                    System.out.println("Erro de entrada de dados: \n" + e.getMessage());
                }
            }
        }
        return conteudoArquivo;
    }

Can someone help me?

Thanks in advance!

  • It seems to me that after you split the row, you just need to access the array elements. Since it is a CSV, each column is always at the same position. E.g., jogadores[2] will always contain the age.

  • Yes. The problem is: how do I add up the whole age column and average it? How do I go through this column to get the data I want? For example, the age column (jogadores[2]) has 17,995 lines. How do I add up all these ages to get the average if each age is a single entry of a String array and not part of an integer array? Also, how do I convert each row of this column to a double?

  • Inside the while. Each iteration of it is one line being read. Just keep one external variable to store the sum and another to store the number of items. When the loop ends, all lines will have been read, and you only need to do the average calculation.

2 answers

2

First, you create a class to represent the player:

public class Jogador {
    private final int id;
    private final String nome;
    private final int idade;
    private final String cargo;
    private final int salario;

    public Jogador(int id, String nome, int idade, String cargo, int salario) {
        this.id = id;
        this.nome = nome;
        this.idade = idade;
        this.cargo = cargo;
        this.salario = salario;
    }

    public static Jogador parse(String linha, String separador) {
        String[] partes = linha.split(separador);
        if (partes.length != 5) {
            throw new IllegalArgumentException("Linha de jogador mal-formada: " + linha);
        }
        try {
            return new Jogador(
                Integer.parseInt(partes[0].trim()),
                partes[1].trim(),
                Integer.parseInt(partes[2].trim()),
                partes[3].trim(),
                Integer.parseInt(partes[4].trim()));
        } catch (NumberFormatException x) {
            throw new IllegalArgumentException("Linha de jogador mal-formada: " + linha);
        }
    }

    public int getId() { return id; }
    public String getNome() { return nome; }
    public int getIdade() { return idade; }
    public String getCargo() { return cargo; }
    public int getSalario() { return salario; }
}

Note the parse method, which turns a CSV line into a Jogador instance.
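The parse method above expects exactly 5 fields per line, as in the sample data, while the question mentions a real table with 185 columns. As a rough sketch of one way to handle a wider file, the code below looks up a column position by its header name and parses only that column. The class `ColumnLookupSketch` and the helpers `indiceDaColuna` and `colunaComoInt` are hypothetical names introduced for illustration, and the sketch assumes the chosen column holds plain integers.

```java
import java.util.Arrays;
import java.util.List;

class ColumnLookupSketch {

    // Hypothetical helper: finds the position of a named column in the header line.
    static int indiceDaColuna(String cabecalho, String separador, String nome) {
        String[] nomes = cabecalho.split(separador);
        for (int i = 0; i < nomes.length; i++) {
            if (nomes[i].trim().equals(nome)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Coluna não encontrada: " + nome);
    }

    // Hypothetical helper: parses one column of every data row as an int.
    // Assumes the first element of linhas is the header.
    static int[] colunaComoInt(List<String> linhas, String separador, String nome) {
        int indice = indiceDaColuna(linhas.get(0), separador, nome);
        return linhas.stream()
                .skip(1) // skip the header line
                .mapToInt(l -> Integer.parseInt(l.split(separador)[indice].trim()))
                .toArray();
    }

    public static void main(String[] args) {
        List<String> linhas = Arrays.asList(
                "ID, Nome, Idade, Cargo, Salario",
                "15, Alessandro Martins, 25, Assistente Administrativo, 1800",
                "36, Fátima Ribeiro, 30, Gerente Administrativa, 3000",
                "99, Roberta Menezes, 32, Vendedora, 2500");
        System.out.println(Arrays.toString(colunaComoInt(linhas, ",", "Idade")));
    }
}
```

Like the rest of this answer, the sketch splits naively on the separator, so it would misread quoted fields that contain the separator themselves.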

To read all lines in the file, you can use the method Files.readAllLines(Path, Charset):

Files.readAllLines(new File(arquivo).toPath(), StandardCharsets.UTF_8)

That gives you a List<String> with one entry per line of the file. It also removes the need to open and close the file yourself and to read the lines one by one.
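To see what that call returns, here is a minimal sketch that writes two of the sample rows from the question to a temporary file and reads them back. The class `ReadAllLinesSketch`, the helper `readSample`, and the temporary file are assumptions made only for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class ReadAllLinesSketch {

    // Hypothetical helper: writes sample rows to a temp file and reads them back.
    static List<String> readSample() {
        try {
            Path tmp = Files.createTempFile("jogadores", ".csv");
            try {
                Files.write(tmp, Arrays.asList(
                        "ID, Nome, Idade, Cargo, Salario",
                        "15, Alessandro Martins, 25, Assistente Administrativo, 1800",
                        "36, Fátima Ribeiro, 30, Gerente Administrativa, 3000"),
                        StandardCharsets.UTF_8);
                // One String per line, header included; the file is opened
                // and closed inside readAllLines.
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> linhas = readSample();
        System.out.println(linhas.size() + " lines, header: " + linhas.get(0));
    }
}
```

Note that readAllLines loads the whole file into memory, which should be acceptable for a table of the size described in the question.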

To convert these lines into players, you can use a Stream:

public static List<Jogador> lerJogadores(String arquivo, String separador) throws IOException {
    return Files
            .readAllLines(new File(arquivo).toPath(), StandardCharsets.UTF_8)
            .stream()
            .skip(1)
            .map(s -> Jogador.parse(s, separador))
            .collect(Collectors.toList());
}

Note the .skip(1), which skips the CSV header. The result is a List<Jogador>.

To calculate the average age of the players, you can use IntStream, which already has an average() method for exactly this purpose. It returns an OptionalDouble, so orElse(0.0) supplies the result for an empty list:

public static double mediaIdade(List<Jogador> jogadores) {
    return jogadores.stream().mapToInt(Jogador::getIdade).average().orElse(0.0);
}
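As a small standalone illustration of the same call, the sketch below averages the three ages from the sample table and also shows the empty case. The class `AverageSketch` and the method `media` are hypothetical names used only here.

```java
import java.util.OptionalDouble;
import java.util.stream.IntStream;

class AverageSketch {

    // average() returns an OptionalDouble because an empty stream has no average;
    // orElse(0.0) picks the value to use in that case.
    static double media(int[] idades) {
        OptionalDouble resultado = IntStream.of(idades).average();
        return resultado.orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println(media(new int[] {25, 30, 32})); // prints 29.0
        System.out.println(media(new int[] {}));           // prints 0.0
    }
}
```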

And for the top 10 salaries:

public static int[] dezMaioresSalarios(List<Jogador> jogadores) {
    return jogadores.stream().mapToInt(Jogador::getSalario).map(x -> -x).sorted().limit(10).map(x -> -x).toArray();
}

Here’s a little trick. The sorted() method orders the IntStream in ascending order, so limit(10) on its own would return the 10 lowest salaries instead of the 10 highest. The solution is to put one .map(x -> -x) before .sorted() and another after .limit(10): negating the values makes the ascending sort put the largest salaries first, and the second negation restores the original values. The limit(10) ensures that you get only the 10 that matter.
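For comparison, here is a sketch of an alternative that is not part of the approach above: box the values and sort them with Comparator.reverseOrder(). It costs some boxing overhead but avoids negation, which overflows for Integer.MIN_VALUE. The class `TopSalariesSketch` and the method `maiores` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

class TopSalariesSketch {

    // Returns the n largest values in descending order by sorting boxed
    // Integers in reverse natural order instead of negating them.
    static int[] maiores(int[] salarios, int n) {
        return IntStream.of(salarios)
                .boxed()
                .sorted(Comparator.reverseOrder())
                .limit(n)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        int[] salarios = {1800, 3000, 2500};
        System.out.println(Arrays.toString(maiores(salarios, 2))); // prints [3000, 2500]
    }
}
```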

After that, writing main is easy. Note the Arrays.toString(int[]) call, which formats the array for printing:

public static void main(String[] args) throws IOException {
    List<Jogador> jogadores = lerJogadores("xxx.csv", ",");
    System.out.println("Média de idade: " + mediaIdade(jogadores));
    System.out.println("10 maiores salários: " + Arrays.toString(dezMaioresSalarios(jogadores)));
}

Here is the full code:

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Teste {

    public static void main(String[] args) throws IOException {
        List<Jogador> jogadores = lerJogadores("xxx.csv", ",");
        System.out.println("Média de idade: " + mediaIdade(jogadores));
        System.out.println("10 maiores salários: " + Arrays.toString(dezMaioresSalarios(jogadores)));
    }

    public static List<Jogador> lerJogadores(String arquivo, String separador) throws IOException {
        return Files
                .readAllLines(new File(arquivo).toPath(), StandardCharsets.UTF_8)
                .stream()
                .skip(1)
                .map(s -> Jogador.parse(s, separador))
                .collect(Collectors.toList());
    }

    public static double mediaIdade(List<Jogador> jogadores) {
        return jogadores.stream().mapToInt(Jogador::getIdade).average().orElse(0.0);
    }

    public static int[] dezMaioresSalarios(List<Jogador> jogadores) {
        return jogadores.stream().mapToInt(Jogador::getSalario).map(x -> -x).sorted().limit(10).map(x -> -x).toArray();
    }

    public static class Jogador {
        private final int id;
        private final String nome;
        private final int idade;
        private final String cargo;
        private final int salario;

        public Jogador(int id, String nome, int idade, String cargo, int salario) {
            this.id = id;
            this.nome = nome;
            this.idade = idade;
            this.cargo = cargo;
            this.salario = salario;
        }

        public static Jogador parse(String linha, String separador) {
            String[] partes = linha.split(separador);
            if (partes.length != 5) {
                throw new IllegalArgumentException("Linha de jogador mal-formada: " + linha);
            }
            try {
                return new Jogador(
                    Integer.parseInt(partes[0].trim()),
                    partes[1].trim(),
                    Integer.parseInt(partes[2].trim()),
                    partes[3].trim(),
                    Integer.parseInt(partes[4].trim()));
            } catch (NumberFormatException x) {
                throw new IllegalArgumentException("Linha de jogador mal-formada: " + linha);
            }
        }

        public int getId() { return id; }
        public String getNome() { return nome; }
        public int getIdade() { return idade; }
        public String getCargo() { return cargo; }
        public int getSalario() { return salario; }
    }
}

1


I haven’t programmed in Java for a long time, but I think I can get the concept across.

Inside the while, where each CSV line is read, convert the age value and add it to an accumulator. At the end, calculate the average.

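conteudoArquivo = new BufferedReader(new FileReader(arquivo));

double soma_idade = 0;  // running sum of the Idade column
int qtde = 0;           // number of data rows read

// The first readLine() consumes the header; the loop then reads the data rows.
String linha = conteudoArquivo.readLine();
while((linha = conteudoArquivo.readLine()) != null) {
    String[] jogador = linha.split(separador);
    // Idade is the third column (index 2); trim() drops the space after the separator.
    soma_idade += Double.parseDouble(jogador[2].trim());
    qtde++;
}
double media_idade = soma_idade / qtde;
System.out.println("A leitura do arquivo deu certo!");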
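
A self-contained version of the same accumulator idea might look like the sketch below. It reads from any Reader so that it can run on an in-memory string, uses try-with-resources to close the reader, and returns 0.0 when there are no data rows. The class `AccumulatorSketch`, its `mediaIdade` signature, and the choice of 0.0 for an empty file are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class AccumulatorSketch {

    // Sums the third column (index 2) of every data row and divides by the row count.
    static double mediaIdade(Reader origem, String separador) {
        double soma = 0;
        int qtde = 0;
        try (BufferedReader leitor = new BufferedReader(origem)) {
            leitor.readLine(); // discard the header line
            String linha;
            while ((linha = leitor.readLine()) != null) {
                String[] campos = linha.split(separador);
                soma += Double.parseDouble(campos[2].trim());
                qtde++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Assumption: report 0.0 instead of NaN when the file has no data rows.
        return qtde == 0 ? 0.0 : soma / qtde;
    }

    public static void main(String[] args) {
        String csv = "ID, Nome, Idade, Cargo, Salario\n"
                + "15, Alessandro Martins, 25, Assistente Administrativo, 1800\n"
                + "36, Fátima Ribeiro, 30, Gerente Administrativa, 3000\n"
                + "99, Roberta Menezes, 32, Vendedora, 2500\n";
        System.out.println(mediaIdade(new StringReader(csv), ",")); // prints 29.0
    }
}
```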

  • That’s cool! Thank you! While I was thinking about how to make my question clearer, I had an idea that worked, and it looked like yours. The only difference is that I chose to put all the converted values in another array, which I called age. Now it works! Thank you very much!
