What is generic programming?

Question

Asked 9 years, 2 months ago

What is generic programming?
Is it a technique or a programming paradigm?
When should we use this concept?

According to Google, it looks like programming with generic types, like Class<T>. I had never heard that there was a paradigm name for this :-) Although "programming paradigm" is the term used in the Portuguese wiki; in the English one it is "programming style".

– Caffé

2016/06/02 at 14:29

1 answer

by Maniero • **444,682** points · Answer 1 · 2016-06-02T14:44:24+00:00

Term

A paradigm is a programming style in which a technique is applied. So it is a technique, and it can also be a paradigm, according to the recognized specialist literature. We can say that it is a limited form of the metaprogramming paradigm.

The paradigm was formalized in 1989 by Musser and Stepanov (of STL fame), but the technique had already been in use since at least the 1970s (possibly even earlier).

One can say that the term is just a way of generalizing a solution. But, excuse the pun (actually, no apology, I loved making it :) ), that is a generalization of the term :P

Put in better context, genericity is the ability to write code that properly serves more than one data type. It is a way of writing code in which a type is a wildcard until the code is instantiated; then the real type is put in place of the wildcard.

In typed languages some types have always been generic (parametric), notably arrays and pointers. Genericity is the ability to create your own types in a generic way.

Technically it does not have to be just types. Some languages accept other forms of parameterization. You could, for example, have a class String<N> where N is the size of the text. This creates a new type, a String of exactly N characters. It would be different from the common string we know, whose size is kept internally.

In SQL terms, the ordinary string is VARCHAR and the one in my example is CHAR. Remember that VARCHAR has only a maximum size (and even that has details I will not go into here), not a fixed size, just like the string type of most languages. People usually store a CHAR(10) in an ordinary string anyway, but the two types are not really equivalent.
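As a rough illustration of parameterizing by a value instead of a type, here is a minimal C++ sketch. The name FixedString and its padding behavior are assumptions made for this example, not something from a real library.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical fixed-size text type: N is part of the type itself,
// similar in spirit to CHAR(N) in SQL.
template <std::size_t N>
class FixedString {
public:
    // Copies at most N characters and pads the remainder with spaces.
    explicit FixedString(const std::string& text) {
        for (std::size_t i = 0; i < N; ++i) {
            chars_[i] = i < text.size() ? text[i] : ' ';
        }
    }

    // The size is a compile-time constant; it is not stored in the object.
    static constexpr std::size_t size() { return N; }

    std::string str() const { return std::string(chars_.begin(), chars_.end()); }

private:
    std::array<char, N> chars_;
};

// Different values of N produce different, unrelated types.
static_assert(!std::is_same<FixedString<5>, FixedString<10>>::value,
              "FixedString<5> and FixedString<10> are distinct types");
```

With this sketch, FixedString<5>("abc") holds "abc" followed by two spaces, and a FixedString<2> built from the same text keeps only "ab".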

Some of the information here is based on the best-known implementations; the paradigm itself makes few demands about exactly how things must work.

What problem does it solve

Instead of writing one function that works for int and another that does exactly the same thing for float, why not write the code only once? In dynamic languages this is normal: whatever comes in is accepted. But in statically typed languages it is not possible, because every memory access must be clearly based on the data type. That brings some advantages, which are beside the point here.

The obvious drawback is duplicated effort. Genericity solves this problem by making the same code work for several types.

It is very common for the compiler to take a function that has a generic type and replicate it on its own for each type used in the application.

This helps performance, but it can make the generated code large. That is why the solution is not always ideal (though it mostly is). It is good because it reduces repetition and keeps the source code DRY.

The exact mechanism is not defined by the paradigm; each language can choose the approach it thinks best.

The main advantage is type safety while reusing the same source code. The second is usually better performance, because certain things are resolved at compile time (how far this goes varies greatly between languages).

Generic function

Example in pseudocode:

T Soma<T>(T p1, T p2) { return p1 + p2; }
Soma(1, 2);
Soma(1.5, 2.7);

This works perfectly. In the first call the parameters and return value are of type int, and in the second they are float. Because the function is used in these two ways, the compiler creates two concrete functions in the generated target code:

int Soma(int p1, int p2) { return p1 + p2; }
float Soma(float p1, float p2) { return p1 + p2; }

Of course, any type used here needs to support addition. If that is not guaranteed, there will be compilation or execution problems, depending on how each language works; I won’t go into detail now.

You may think it is just a syntactic frill, but the concrete type there has implications for memory layout and eventually for the semantics of the code (even though the code is syntactically the same except for the type).
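For a concrete version of the pseudocode, this is roughly how it could look as a C++ function template. It is only a sketch; note that in C++ the literals 1.5 and 2.7 are double, not float.

```cpp
#include <string>
#include <type_traits>

// One generic definition; the compiler instantiates a concrete
// function for each type T that is actually used.
template <typename T>
T soma(T p1, T p2) {
    return p1 + p2;
}

// The return type follows the argument type in each instantiation.
static_assert(std::is_same<decltype(soma(1, 2)), int>::value,
              "int instantiation returns int");
static_assert(std::is_same<decltype(soma(1.5, 2.7)), double>::value,
              "double instantiation returns double");
```

The same source also accepts std::string, where + means concatenation: soma(std::string("ab"), std::string("cd")) gives "abcd". The syntax is identical, but the memory layout and the meaning of the operation change with the type.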

The "super variable" that defines the type (T in the example) can be used anywhere in the code where a type is expected, within the scope of its definition, of course.

Generic classes

The idea can be expanded. It is possible to write a class with a generic parameter wherever you need one. When you see a List<int>, it is because the class List was defined as List<T>, and everywhere the definition needs the type of the list’s elements, T will be replaced by the actual type, int in this example. It is almost a text substitution in the written source code. Of course, the compiler does this robustly (unlike a preprocessor, which does not really understand what it is doing and can cause problems).

var lista = new List<int>();
lista.Add(1); // ok
lista.Add("1"); // error, it does not accept a string

I guess I don’t have to tell you that, deep down, there will be a new class for every type you use. But they will all be compatible, and there are some optimizations to save space.
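To make the "new class for every type" idea visible, here is a small C++ sketch. The class Lista is a hypothetical minimal wrapper written for this example; real code would simply use std::vector.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Minimal generic list, written once in terms of the placeholder T.
template <typename T>
class Lista {
public:
    void add(const T& item) { items_.push_back(item); }
    const T& get(std::size_t index) const { return items_.at(index); }
    std::size_t count() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// Each instantiation is its own class as far as the type system is concerned.
static_assert(!std::is_same<Lista<int>, Lista<double>>::value,
              "Lista<int> and Lista<double> are different types");
```

With a Lista<int>, calling add(1) compiles, while add("1") is rejected at compile time because a string literal does not convert to int.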

Some languages optimize more than others. C# is very good at this: it replicates only what really needs it and reuses code for types with a compatible memory layout (because of the indirection). C++ is not, but it is more flexible, has a good reason for not being, and has a technique to avoid too much duplication. The best in this respect is Java, which does not replicate at all, but it has several disadvantages because of that and loses part of the reason to use genericity (while keeping the main one, which is type safety). I won’t go into detail; it won’t fit here.

You can think of the T we see in the code as a super variable whose possible values are any of the available types. This variable simply lives at a higher level and is resolved at compile time.

Obviously, if the application only uses a generic class with one type, only one version of it will exist. The same goes for functions. It is the use of the parameterized (generic) item that determines how many copies there will be.

This is a feature that can, in some languages, be used to achieve static (parametric) polymorphism, which has advantages and disadvantages compared with the more traditional dynamic (subtype) polymorphism, which relies on virtual methods. Ad hoc polymorphism (static or dynamic), known as function overloading, is something else.
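The difference between the two kinds of polymorphism can be sketched in C++ as follows. All the names here (Shape, Circle, Square and the two describe functions) are made up for illustration.

```cpp
#include <string>

// Dynamic (subtype) polymorphism: the call is resolved at run time
// through a virtual method.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

std::string describe_dynamic(const Shape& s) {
    return "shape: " + s.name();
}

// Static (parametric) polymorphism: any type with a name() member works,
// and the call is resolved at compile time for each T that is used.
struct Square {  // note: no base class and no virtual method
    std::string name() const { return "square"; }
};

template <typename T>
std::string describe_static(const T& s) {
    return "shape: " + s.name();
}
```

describe_dynamic only accepts types derived from Shape and exists once in the binary; describe_static accepts both Circle and Square, at the cost of one instantiation per type.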

Remember the problem with the function that adds values of any type? It is possible to restrict the types that can be used generically. Example:

T Soma<T : Addable>(T p1, T p2) { return p1 + p2; }

Here the type has to implement the interface Addable to work, i.e., it has to be able to perform an addition. This solves the problem. Note that this is an abstract example in a fictional language. I don’t know any language that has exactly this interface (some have something similar).
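C++ has no Addable interface either, but something close can be approximated with a type trait. The sketch below assumes a hand-written trait named is_addable (hypothetical, not part of the standard library) and checks it with static_assert; newer C++ versions could express the same idea with concepts.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical trait standing in for the fictional Addable interface:
// true when "a + b" is a valid expression for two values of type T.
template <typename T, typename = void>
struct is_addable : std::false_type {};

template <typename T>
struct is_addable<T, decltype(void(std::declval<T>() + std::declval<T>()))>
    : std::true_type {};

// The constraint is checked when the template is instantiated, so a type
// without operator+ triggers the message below at compile time.
template <typename T>
T soma(T p1, T p2) {
    static_assert(is_addable<T>::value, "T must support operator+");
    return p1 + p2;
}

struct NoPlus {};  // example type that cannot be added
```

Here is_addable<int>::value is true and is_addable<NoPlus>::value is false, so soma(2, 3) compiles while soma(NoPlus(), NoPlus()) does not.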

class MinhaClasse<T : Widget> {
    private T widget;
    void set(T w) { widget = w; }
    T get() { return widget; }
}
var x = new MinhaClasse<Button>(); // Button is a Widget
x.set(new Button());
var z = new MinhaClasse<TextEdit>(); // TextEdit is a Widget
z.set(new TextEdit());
print(z.get()); // prints something from the TextEdit
var y = new MinhaClasse<Stream>(); // does not work because Stream is not a Widget

This saves you from having to write the following (which is exactly what the compiler will do for you):

// generated for MinhaClasse<Button>
class MinhaClasse {
    private Button widget;
    void set(Button w) { widget = w; }
    Button get() { return widget; }
}
// generated for MinhaClasse<TextEdit>
class MinhaClasse {
    private TextEdit widget;
    void set(TextEdit w) { widget = w; }
    TextEdit get() { return widget; }
}
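A possible C++ rendering of the restricted class is sketched below. The Widget hierarchy is invented for the example, and the restriction is expressed with static_assert and std::is_base_of, which is just one of several ways to do it.

```cpp
#include <string>
#include <type_traits>

// Hypothetical widget hierarchy used only for this example.
struct Widget {
    virtual ~Widget() = default;
    virtual std::string name() const = 0;
};
struct Button : Widget {
    std::string name() const override { return "Button"; }
};
struct TextEdit : Widget {
    std::string name() const override { return "TextEdit"; }
};

// T is restricted to types derived from Widget; anything else fails to
// compile when the class is instantiated.
template <typename T>
class MinhaClasse {
    static_assert(std::is_base_of<Widget, T>::value, "T must be a Widget");

public:
    void set(const T& w) { widget_ = w; }
    const T& get() const { return widget_; }

private:
    T widget_;  // stored by value, so the object layout depends on T
};
```

MinhaClasse<Button> and MinhaClasse<TextEdit> are two separate classes generated from the same source, and MinhaClasse<std::string> is rejected by the static_assert.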

I put this on GitHub for future reference.

It is possible that the language has optimizations, or allows techniques, that avoid duplicating these classes. In fact, it is common to adopt dynamic polymorphism internally to avoid the duplication.

Alternatives

The alternative to this is a dynamic type, one that accepts anything by definition, through a language mechanism or by technique (such as void * in C, or Object as was done in C# and Java in the past), which gives up type safety.
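To show what is lost, here is a C++ sketch that imitates the old Object-based style. The types Object, Integer and Text and the helper as_integer are hypothetical, written only for this illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical root type playing the role of Object in older C# and Java.
struct Object {
    virtual ~Object() = default;
};
struct Integer : Object {
    explicit Integer(int v) : value(v) {}
    int value;
};
struct Text : Object {
    explicit Text(std::string v) : value(std::move(v)) {}
    std::string value;
};

// A non-generic list accepts anything derived from Object, so the compiler
// cannot tell what is stored inside it.
using ObjectList = std::vector<const Object*>;

// The caller has to guess the type and check the cast at run time;
// a wrong guess yields nullptr instead of a compile-time error.
const Integer* as_integer(const Object* item) {
    return dynamic_cast<const Integer*>(item);
}
```

Putting a Text into a list that was meant to hold integers compiles without complaint; the mistake only shows up when as_integer returns nullptr at run time.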

Or have a text-replacement macro system that rewrites the code with the type when needed. This can even work, but done properly it is essentially genericity in a simplified form, without context, as C does it. It is asking for a headache, even a syntactic one.
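The macro approach might look like the sketch below, which also compiles as C. The macro name DEFINE_SOMA is an assumption made for this example.

```cpp
// Hypothetical macro that stamps out one function per type by plain text
// substitution; the preprocessor knows nothing about types.
#define DEFINE_SOMA(T) \
    T soma_##T(T p1, T p2) { return p1 + p2; }

DEFINE_SOMA(int)     // expands to: int soma_int(int p1, int p2) { ... }
DEFINE_SOMA(double)  // expands to: double soma_double(double p1, double p2) { ... }
```

The syntactic headache appears quickly: DEFINE_SOMA(unsigned int) does not compile, because pasting the type into the function name only works when the type is a single token.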

Where and when to use?

In languages that value type safety, it should be used all the time. It is not possible to get this functionality otherwise without writing repetitive blocks of code. Wherever you notice that code will be repeated to serve more than one type, genericity should be applied.

Criticism

Some people say that the target code ends up getting too big, but there are auxiliary techniques, such as dynamic polymorphism, with which this does not occur. And there are others with which the programmer can help avoid it (specialization and derivation, for example).
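One derivation-based technique of this kind, sketched in C++ under my own assumptions, is to keep the real work in a non-generic base class and make the template a thin typed wrapper over it. The class names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Non-generic core: compiled once, works on untyped pointers.
class PointerStackBase {
public:
    std::size_t size() const { return items_.size(); }

protected:
    void push_raw(void* p) { items_.push_back(p); }

    // Precondition: the stack is not empty.
    void* pop_raw() {
        void* p = items_.back();
        items_.pop_back();
        return p;
    }

private:
    std::vector<void*> items_;
};

// Thin generic wrapper: only these small casts are replicated per type,
// and the typed interface keeps callers from mixing types.
template <typename T>
class PointerStack : private PointerStackBase {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
    using PointerStackBase::size;
};
```

Callers still get a type-safe PointerStack<int> or PointerStack<Widget>, but the storage logic exists only once in the generated code.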

Some say the code is less readable, but that is very questionable. Perhaps it is slightly harder, since you have to think a little more about the problem, but that seems appropriate to me.

Finally, error messages are not the most helpful for the programmer. It depends on the implementation. It is true that in many cases they will be at least a little worse, but after some evolution it is rare to get completely meaningless messages, as happened in the past.

Relevant questions with examples of use: