Why are arrays covariant and generic are invariant?

Question

Why are arrays covariant and generic are invariant?

Asked 8 years, 8 months ago

Viewed 333 times

13

Arrays in Java are covariant, that is, it’s perfectly cool to do something like:

Number[] meuArray = new Integer[10];

Generic types are invariant. The line below does not compile:

List<Number> minhaLista = new ArrayList<Integer>();

Compiler output:

error: incompatible types: ArrayList<Integer> cannot be converted to List<Number>
        List<Number> minhaLista = new ArrayList<Integer>();
                                  ^

The question is, why did they decide to implement covariant arrays?

Original question: Soen - Why are arrays covariant but Generics are invariant?

1 answer

Browser other questions tagged java array generic language-design variance

You are not signed in. Login or sign up in order to post.

by Victor Stafusa • **63,338** points · Answer 1 · 2016-08-22T17:17:31+00:00

Covariant arrays are considered a language flaw. Ideally they should be invariant as well.

The main reason they were designed as covariants was a poor attempt to allow the implementation of generic object sets, in particular so that methods working with object arrays could work with arrays of any type of object. This concept is actually a gambiarra, but it was a gambiarra that made it possible to create some code that worked in a minimally generic way.

Codes that take advantage of the covariance of arrays are codes that write elements in the same array from where they are read, such as codes that order elements of an array, or shuffle them, or duplicate them, etc. A simple example:

public static void embaralhar(Object[] array) {
    for (int i = array.length - 1; i > 0; i--) {
        int j = (int) (Math.random() * (i + 1));
        Object temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }
}

Note that once the parameter type is Object[], then it can be used to String[], Thread[], Cachorro[], based on the reasoning that "if String is subclass of Object, then String[] is subclass of Object[]", i.e., the code works because of the covariance. The type of the variable temp is Object, but represents an object obtained from within the array itself, and thus can be safely put back into it again. Note that in case you try to use the covariance to pollute an array with something of a type that is incompatible (and therefore cannot have come from it), such as adding an Gato in an instance of Cachorro[] represented by a variable of the type Object[], you will take a ArrayStoreException:

Cachorro[] array = new Cachorro[3];
Object[] mesmoArray = array;
mesmoArray[0] = new Gato(); // Estoura uma ArrayStoreException

This is from the time of Java 1.0. In fact, it predates even Java 1.0, since it appeared in the early stages of the development of Java 1.0 when it was still called Oak. At that time, we didn’t have the generic types that exist today (that came up with Java 5). We also didn’t have the Collections framework developed as today, and everything that referred to object collections was treated as an array (a concept inherited from C and C++). The most we had at the time were the classes Vector, Hashtable and Stack, who suffer from very serious modeling problems and whose purpose was more to serve as an example or a puzzle than to make a decent data modeling.

C++ has templates, which could serve as inspiration. However, the templates were not well seen and did not want to imitate them in Java. The reason is that the C++ compiler creates a version of the class or method for each combination of templates found, which greatly increases the size of the executable code produced. As the idea of Java was to be able to run even on devices with little memory and low computational power, this alternative was prohibitive. And also the C++ templates were invariant, and therefore a similar model would not serve to solve many of the problems faced, such as ordering lists of elements without knowing the type. So the consensus at the time was that Java didn’t need this.

It took a long time (and a good deal of practical experience and abuse of Casts) so that such a consensus would be abandoned and some idea would be formed that would allow the introduction of generic types without increasing the size of the executable code produced and that would have good flexibility in terms of covariance and countervariance. From this came the (hated) type-Rasure and the super and the extends that appear in some generic types. The super and the extends provide countervariance and covariance respectively.

The type-Rasure ensures that in the JVM generic types disappear, making them just a trick introduced by the compiler. Basically, in Java with Generics, the compiler adds underneath the scenes the Cms that would be needed in Java without Generics, which is a completely different approach than that adopted by the C++ templates. However, this solution based on type-Rasure created some problems, as generic types cannot be protected from data pollution in the same way as arrays do (which have the ArrayStoreException for that). Moreover, the implementation of arrays could not be reconciled with Generics (and that is why mixing arrays with Generics is difficult).

An example of data pollution (heap Pollution):

List<Cachorro> cachorros = new ArrayList<>();

// O compilador dá uma warning unchecked.
List<Object> objetos = (List<Object>) cachorros;

// Poluição de dados aqui. Um gato é introduzido na lista de cachorros!
objetos.add(new Gato());

// Vai dar ClassCastException, mas não há nenhum cast neste código.
// Na verdade, o compilador introduziu um cast por debaixo dos panos.
Cachorro c = cachorros.get(0);

If Java had generics types properly implemented from the start, arrays would not need to be invariant and type-Rasure it wouldn’t have been necessary. However, much of the knowledge needed for Generics to actually be implemented and to demonstrate which implementation strategies would work or not, was only developed thanks to the experience accumulated after a large number of Java developers suffered from this problem and sought various solutions themselves. In addition, Generics are a significantly complicated little monster and implementing them from the start would have hindered and delayed the release of Java as a programming language, so only having covariant arrays turned out to be the simplest output.