Java 8 (Stream) - Grouped sum

2

Good morning.

I have the following function:

private List<Object[]> calcularTotal(List<Object[]> lista, int chave, int valor){
    return lista.stream()
            .map(l -> new Object[] {l[chave], l[valor]})
            .collect(Collectors.groupingBy(o -> o[0], Collectors.summingDouble(v -> ((BigDecimal) v[1]).doubleValue())))
            .entrySet()
            .stream()
            .map(o -> new Object[] {o.getKey(), o.getValue()})
            .collect(Collectors.toList());
}

List : {[45, 100], [45, 200], [50, 30]}

Function result: {[45, 300], [50, 30]}

I managed to accumulate one value, but I have no idea how to accumulate a second value at the same time.

For example:

List : {[45, 100, 200], [45, 200, 400], [50, 30, 60]}

Desired result: {[45, 300, 600], [50, 30, 60]}

In both examples, the first position is the key to group by.

3 answers

2

Here is a solution based on map and reduce.

Map

Given the raw input List<Object[]> lista, you are only interested in the columns whose indexes are listed in int... valores. You also want to extract the value at a given index to use as the key.

Reduce

Here we have two operations inside the pipeline:

  1. Group by key.
  2. Add all selected values, column by column.

You have already discovered Collectors.groupingBy, which performs the first reduction of the pipeline; now we need to write the mapping step and the summing step of the reduction. I will therefore develop an auxiliary class representing our "table":

public class Holder {

    /**
     * Second step of the reduction.
     * @return sum of the selected columns
     */
    public static Object[] combine(Object[] first, Object[] second) {
        return IntStream.range(0, first.length).mapToObj(i -> {
            final BigDecimal x = (BigDecimal) first[i];
            final BigDecimal y = (BigDecimal) second[i];
            return x.add(y);
        }).toArray();
    }

    final BigDecimal key;
    final Object[] selectedValues;

    /**
     * Mapping step.
     *
     * @param values raw input row
     * @param keyIndex index of the key
     * @param valueIndexes indexes of the selected columns
     */
    public Holder(Object[] values, int keyIndex, int... valueIndexes) {
        this.key = (BigDecimal) values[keyIndex];
        this.selectedValues = Arrays
                .stream(valueIndexes)
                .mapToObj(i -> values[i])
                .toArray();
    }

    /**
     * Identity (neutral) element for the reduction.
     * @param size number of columns in the identity element
     */
    public Holder(int size) {
        this.key = null;
        this.selectedValues = new Object[size];
        Arrays.fill(selectedValues, BigDecimal.ZERO);
    }

    public Object[] getSelectedValues() {
        return selectedValues;
    }

    public BigDecimal getKey() {
        return key;
    }
}

Once you have this structure, the processing is straightforward:

public static Map<BigDecimal, Object[]> calcularTotal(List<Object[]> lista, int chave, 
                                                      int... valores) {
    final Map<BigDecimal, Object[]> results = lista
            .stream()
            .map(o -> new Holder(o,chave, valores))
            .collect(
                    Collectors.groupingBy(
                            Holder::getKey,
                            Collectors.reducing(
                                    new Holder(valores.length).getSelectedValues(),
                                    Holder::getSelectedValues,
                                    Holder::combine)));
    return results;
}

I see no reason to map the Map<BigDecimal, Object[]> back into a List<Object[]>; however, if you really need to, the conversion is trivial using the entrySet, as in your own answer.
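If the List<Object[]> shape is really needed, the conversion over the entrySet could look roughly like the sketch below. The class name MapToRows and the helper toRows are hypothetical names chosen for this example, and the map in main is filled by hand only to keep the sketch self-contained.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MapToRows {

    // Hypothetical helper: turns each entry {key -> sums} into one row [key, sum1, sum2, ...].
    public static List<Object[]> toRows(Map<BigDecimal, Object[]> totals) {
        return totals.entrySet()
                .stream()
                .map(e -> Stream.concat(Stream.of(e.getKey()), Arrays.stream(e.getValue())).toArray())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the map returned by calcularTotal.
        Map<BigDecimal, Object[]> totals = new LinkedHashMap<>();
        totals.put(BigDecimal.valueOf(45), new Object[] {BigDecimal.valueOf(300), BigDecimal.valueOf(600)});
        totals.put(BigDecimal.valueOf(50), new Object[] {BigDecimal.valueOf(30), BigDecimal.valueOf(60)});

        toRows(totals).forEach(row -> System.out.println(Arrays.toString(row)));
    }
}
```

The row order follows the iteration order of the map, so a HashMap result may come back in a different order than the input.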



2

I got it working, but it took a lot of work:

import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * @author Victor
 */
public class StreamArray {

    // Taken from here: https://stackoverflow.com/a/15497288/540552
    public static Object[][] transposeMatrix(Object[][] m) {
        Object[][] temp = new Object[m[0].length][m.length];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[0].length; j++) {
                temp[j][i] = m[i][j];
            }
        }
        return temp;
    }

    // Converts the List<List<Object>> into Object[][], transposes it, and converts it back into List<List<Object>>.
    public static List<List<Object>> transpose(List<List<Object>> m) {
        Object[][] a = m.stream().map(List::toArray).collect(Collectors.toList()).toArray(new Object[][] {});
        Object[][] b = transposeMatrix(a);
        return Arrays.asList(b).stream().map(Arrays::asList).collect(Collectors.toList());
    }

    public static List<Object> join(List<List<Object>> lists) {
        // We receive a List<List<Object>> where each inner List is a set of keys and values.
        // The outer List is a list of sets of keys and values.
        return transpose(lists)
                .stream()

                // Now we have a different Stream<List<Object>>.
                // Each inner List is the set of elements found at the same position.
                // The outer Stream is a list of sets of elements at the same position.

                // In each inner List, sum all the values. The result is a Stream<Object>, where each Object is
                // the sum of the values at a given position.
                .map(x -> x.stream().collect(Collectors.summingDouble(v -> ((BigDecimal) v).doubleValue())))

                .collect(Collectors.toList());
    }

    private static Stream<List<Object>> calcularTotal2(Stream<List<Object>> st, int chave) {
        return st
                .collect(Collectors.groupingBy(o -> o.get(chave), Collectors.toList())) // Now we have a Map<Object, List<List<Object>>>
                .entrySet() // Now we have a Set<Map.Entry<Object, List<List<Object>>>>
                .stream()

                // Now we have a Stream<Map.Entry<Object, List<List<Object>>>>.
                // The inner list corresponds to an array holding the key and the values.
                // The intermediate list is a set of lists of keys and values that all share the same key.
                // The entry maps a key to its intermediate list.
                // The outer stream is the whole set.

                .map(e -> {
                    List<Object> in = StreamArray.join(e.getValue()); // Joins the intermediate lists.
                    in.set(chave, e.getKey()); // Puts the key back.
                    return in;
                });
    }

    private static List<Object[]> calcularTotal(List<Object[]> lista, int chave) {
        return calcularTotal2(lista.stream().map(Arrays::asList), chave)
                .map(x -> x.toArray(new Object[x.size()]))
                .collect(Collectors.toList());
    }

    public static void main(String... args) {
        // Creates some BigDecimal values to put in the List<Object[]>
        BigDecimal BD_30 = BigDecimal.valueOf(30);
        BigDecimal BD_45 = BigDecimal.valueOf(45);
        BigDecimal BD_50 = BigDecimal.valueOf(50);
        BigDecimal BD_60 = BigDecimal.valueOf(60);
        BigDecimal BD_100 = BigDecimal.valueOf(100);
        BigDecimal BD_200 = BigDecimal.valueOf(200);
        BigDecimal BD_400 = BigDecimal.valueOf(400);

        // This is the input for the calculation. Position 0 of each array is the key.
        List<Object[]> a = Arrays.asList(
                new Object[] {BD_45, BD_100, BD_200},
                new Object[] {BD_45, BD_200, BD_400},
                new Object[] {BD_50, BD_30, BD_60});

        // Do the magic.
        List<List<Object>> b = calcularTotal(a, 0)
                .stream()
                .map(Arrays::asList) // Turns the inner arrays into lists so that System.out prints them nicely.
                .collect(Collectors.toList());

        System.out.println(b);
    }
}

Here is the output:

[[45, 300.0, 600.0], [50, 30.0, 60.0]]

First, I used the matrix transposition code from here: https://stackoverflow.com/a/15497288/540552

The reason I need this is that, given a list of lists, getting a list of sums is just a matter of applying to each element of the outer list something that adds up all the elements of a list. This is much easier than building a list in which each element holds the sum of the inner lists' elements at that position.

With the matrix transposition code in hand, applying it to a list of lists means first converting it into a matrix, transposing, and then converting back into a list of lists.
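For comparison, the transposition can also be sketched directly on lists with IntStream.range, skipping the array round trip. This is not what the code above does; the class name ListTranspose is hypothetical, and the sketch assumes every inner list has the same size as the first one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ListTranspose {

    // Assumes a rectangular input: every inner list has the same size as the first one.
    public static <T> List<List<T>> transpose(List<List<T>> rows) {
        if (rows.isEmpty()) {
            return Collections.emptyList();
        }
        int columns = rows.get(0).size();
        // For each column index j, collect the j-th element of every row into a new list.
        return IntStream.range(0, columns)
                .mapToObj(j -> rows.stream().map(row -> row.get(j)).collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> rows = Arrays.asList(
                Arrays.asList(45, 100, 200),
                Arrays.asList(45, 200, 400),
                Arrays.asList(50, 30, 60));

        System.out.println(transpose(rows)); // [[45, 45, 50], [100, 200, 30], [200, 400, 60]]
    }
}
```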

In the code, notice that I made the method calcularTotal2 work with Stream<List<Object>> instead of Stream<Object[]> or List<Object[]>. The reason is that mixing arrays with streams is awkward, and the code would be much more complicated otherwise. The method calcularTotal just converts the List<Object[]> into a Stream<List<Object>> and converts the resulting Stream<List<Object>> back into a List<Object[]>.

Considerations about that:

  • It is certainly possible to transpose the list of lists without converting to a matrix and back, but that is a bit of a hassle: the matrix is created with all the necessary positions already in place, whereas in a list of lists both the outer and inner lists start out empty, and you cannot simply set an element at a position that does not exist yet.

  • In the end I am converting from array to list in calcularTotal, back to array in transpose, back to list in transpose, and back to array again in calcularTotal. Obviously I could skip all these conversions and work with arrays from start to finish, but again, arrays and streams do not mix well.

  • Finally, using arrays directly like this is usually a sign of a problem in the object-oriented design, even more so when the array is of type Object. What the array represents should perhaps be a specific class with specific methods, especially since one of its positions has a special meaning (the key). In that case, the algorithm/program would probably turn out much more elegant with more object orientation and fewer lists, maps and arrays.

  • Streams, like any other Java tool, are not a silver bullet and can be misused like anything else. That is, sometimes (but not always, obviously) it is easier and more practical to use a plain old for loop and leave streams aside. A hybrid strategy is also possible, with a Stream on the outside and a for loop inside, or vice versa.
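As a rough illustration of the last point, here is a sketch of the same grouped sum written with plain for loops. The class name LoopTotals, the int... valores parameter and the choice of LinkedHashMap are assumptions made for this example, not part of the code above.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LoopTotals {

    // Hypothetical loop-based variant: groups by row[chave] and sums the columns listed in valores.
    public static List<Object[]> calcularTotal(List<Object[]> lista, int chave, int... valores) {
        // LinkedHashMap keeps the keys in first-seen order.
        Map<Object, BigDecimal[]> sums = new LinkedHashMap<>();
        for (Object[] row : lista) {
            // Start each new key with an array of zeros, one per selected column.
            BigDecimal[] acc = sums.computeIfAbsent(row[chave], k -> {
                BigDecimal[] zeros = new BigDecimal[valores.length];
                Arrays.fill(zeros, BigDecimal.ZERO);
                return zeros;
            });
            for (int i = 0; i < valores.length; i++) {
                acc[i] = acc[i].add((BigDecimal) row[valores[i]]);
            }
        }

        // Build one output row per key: [key, sum1, sum2, ...].
        List<Object[]> result = new ArrayList<>();
        for (Map.Entry<Object, BigDecimal[]> e : sums.entrySet()) {
            Object[] out = new Object[valores.length + 1];
            out[0] = e.getKey();
            System.arraycopy(e.getValue(), 0, out, 1, valores.length);
            result.add(out);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object[]> a = Arrays.asList(
                new Object[] {BigDecimal.valueOf(45), BigDecimal.valueOf(100), BigDecimal.valueOf(200)},
                new Object[] {BigDecimal.valueOf(45), BigDecimal.valueOf(200), BigDecimal.valueOf(400)},
                new Object[] {BigDecimal.valueOf(50), BigDecimal.valueOf(30), BigDecimal.valueOf(60)});

        calcularTotal(a, 0, 1, 2).forEach(row -> System.out.println(Arrays.toString(row)));
    }
}
```

Note that BigDecimal.equals also compares scale, so keys such as 45 and 45.0 would land in different groups.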

  • 1

    +1 For the effort. Probably took time to create all this. :)

1


I managed to solve the problem.

Still, thanks for your help.

Here is how the function ended up:

public static List<Object[]> calcularTotal(List<Object[]> lista, int chave, int... valores){
    List<Map<Object, Double>> maps = new ArrayList<>();

    for (int i : valores)
        maps.add(lista.stream()
                      .collect(Collectors.groupingBy(o -> o[chave], Collectors.summingDouble(v -> ((BigDecimal) ((Object[]) v)[i]).doubleValue()))));

    List<Object[]> list = new ArrayList<>();

    maps.get(0).keySet().stream().sorted().forEach(o -> {
        List<Object> l = new ArrayList<>();
        l.add(o);
        maps.forEach(m -> l.add(m.get(o)));

        list.add(l.toArray());
    });

    return list;
}
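One thing to keep in mind is that summingDouble converts every BigDecimal to double, so the totals come back as Double and can lose precision. A possible single-pass variant that keeps the sums as BigDecimal is sketched below, using Collectors.toMap with a merge function. The class name BigDecimalTotals and the helpers pick and add are hypothetical, and this is an alternative technique rather than the function above.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class BigDecimalTotals {

    // Selects the columns listed in valores from one row.
    private static BigDecimal[] pick(Object[] row, int[] valores) {
        return Arrays.stream(valores).mapToObj(i -> (BigDecimal) row[i]).toArray(BigDecimal[]::new);
    }

    // Adds two arrays of the same length, position by position.
    private static BigDecimal[] add(BigDecimal[] a, BigDecimal[] b) {
        return IntStream.range(0, a.length).mapToObj(i -> a[i].add(b[i])).toArray(BigDecimal[]::new);
    }

    public static List<Object[]> calcularTotal(List<Object[]> lista, int chave, int... valores) {
        // toMap groups by key; rows with the same key are merged by summing their columns.
        Map<Object, BigDecimal[]> sums = lista.stream()
                .collect(Collectors.toMap(
                        row -> row[chave],
                        row -> pick(row, valores),
                        BigDecimalTotals::add,
                        LinkedHashMap::new)); // keeps first-seen key order

        // Each entry becomes one row: [key, sum1, sum2, ...].
        return sums.entrySet().stream()
                .map(e -> Stream.concat(Stream.of(e.getKey()), Arrays.stream(e.getValue())).toArray())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Object[]> a = Arrays.asList(
                new Object[] {BigDecimal.valueOf(45), BigDecimal.valueOf(100), BigDecimal.valueOf(200)},
                new Object[] {BigDecimal.valueOf(45), BigDecimal.valueOf(200), BigDecimal.valueOf(400)},
                new Object[] {BigDecimal.valueOf(50), BigDecimal.valueOf(30), BigDecimal.valueOf(60)});

        calcularTotal(a, 0, 1, 2).forEach(row -> System.out.println(Arrays.toString(row)));
    }
}
```

Unlike the version above, this sketch does not sort the keys; it keeps them in the order they first appear in the input.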
