How does the (anti-)EAV (Entity Attribute Value) standard work?

Asked

Viewed 1,131 times

6

I would like a functional example, because I do not understand very well this dynamic relationship of creating table of types, with table of data, a dynamic relationship, that many consider confusing.

2 answers

7


Functioning

It is very simple. Instead of having tables and columns, as is well known in the relational model, there is some table(s) with basically two columns, a key pair and value.

In the key there is an identification of which is the column of data you are using there and a unique identifier of what would be the row (something like the id which we use in tables). Eventually it is possible that this key is separated into two columns (one of them may be the virtual table the column belongs to).

The value is equal to what you would have in a column even. But as in most databases the column type has static type, you have to opt for some convention to record all data. Can be a character type with no limit size.

Obviously you need a data dictionary indicating the columns, the actual types of data so that there can be a conversion when you need the data with "right" typing and provide other relevant information. This dictionary may be in the database or only in the application.

There are extreme cases where there is only one table and the key also has additional information identifying which data table that data refers to. It used to be a anti-Pattern.

Think of something like that:

CREATE TABLE TABELA (
    TABELA INT NOT NULL,
    COLUNA INT NOT NULL,
    LINHA INT NOT NULL,
    VALOR VARCHAR,
    PRIMARY KEY (TABELA, COLUNA, LINHA));

Everything is recorded there. In some cases table and column identifiers are text with the name, which can be worse by generating slower access and taking up more space.

If I had tables like this:

CREATE TABLE PESSOA (
    ID INT NOT NULL PRIMARY KEY,
    NOME VARCHAR(60),
    NASCTO DATE,
    SALARIO MONEY);

CREATE TABLE DIVIDA (
    ID INT NOT NULL PRIMARY KEY,
    CLIENTE INT,
    VENCTO DATE,
    VALOR MONEY);

Transposing to EAV would be recorded like this:

INSERT INTO TABELA (TABELA, COLUNA, LINHA, VALOR) VALUES
                   (1, 1, 1, "JOÃO"), //nome da pessoa linha 1
                   (1, 2, 1, "15/07/1980"), //nascto
                   (1, 3, 1, "2000,00"), //salario
                   (1, 1, 2, "JOSÉ"), //linha 2 da pessoa
                   (1, 2, 2, "28/10/1986"),
                   (1, 3, 2, "1500,00"),
                   (2, 1, 1, "1"), //tabela divida, relaciona pessoa 1
                   (2, 2, 1, "20/08/2016"), //vencto
                   (2, 3, 1, "100,00"); //valor

I put in the Github for future reference.

Imagine the difficulty of making relationships with this data. Imagine how more complicated optimizations are. Imagine how much confusion will occur when all data is written as text, using a technique called stringly typed.

To use it right you almost need to reproduce "in the hand" what the database already does for you. What is fashionable, but does not mean that it is good.

When to use

Of course this has its usefulness there. Nothing is so bad that it has no use case. You need to know how to use it to the right extent when it solves a problem that can’t be solved better. The problem occurs when the person adopts it because they do not know what they are doing.

Whenever you need flexibility in the layout of the model, it is suitable. This way you go over the rigidity of the relational model. This example above is not so suitable since it simulates the relational model, but I did it to better understand how it works coming from relational.

It makes more sense when the end user can determine which columns need to be used. Whether by choice among several options, or by free creation. If it is by choice between options it will compensate more if there really are several options and almost always few will be used. In free creation, there’s no other way.

Some consider the Nosql an alternative to the EAV (functioning). In fact Nosql is the ready-to-use EAV (at least in some Nosql modalities). The term Nosql is somewhat misused, in fact what they mean is Norelational. One of the great advantages of Nosql is precisely not having scheme.

In fact, if you’re going to use it that way, there’s a good chance that a database called Nosql is a better solution (or it was, today many relational people can do the same thing in a simple way, technologies evolve). But that’s all it takes. In a relational database (SQL or not) it’s easy to choose parts to be like this, parts not. I don’t know if you can have the same flexibility in most banks called Nosql, but almost.

Some relational databases allow optimizations for "open schema" scenarios. An example is SQL Server, I do not know if Oracle has something like this. The simplest banks, such as Sqlite, do not have.

Articles

Wikipedia article.

Example of use in a complete article.

Another article.

A comparison of models.

Response in the OS showing some difficulties.

  • Very enlightening, the people are really wanting to use this structure, arguing that it increases performance, and makes it possible, more consistency of data, and I had no idea if it was worth it, now that I’ve gained a little more sense of the thing, You might think about maybe not having to adopt her. Thank you.

3

The name itself is quite explanatory. This standard serves to bring a certain dynamicity in the structure of the data that will be saved in the database.

As you well know, usually a table is created to represent a certain information, where each column of this table will represent an attribute of the data saved. Take as an example a table Pessoas.

Pessoas
Id      Nome      Idade      Sexo      Twitter      Facebook
---------------------------------------------------------------
1       Mario     37         M         @mario.2     fb.com/ma
2       José      21         N         @jose.3      fb.com/ze

The EAV standard gives you the possibility to "disengage" this information so that the attributes of Pessoa are no longer represented by columns but are represented by rows in another table. The relation of an entity to the attributes and values is then made by means of relations between the tables.

Note that using this approach attributes can be created "at runtime", after all it is not necessary to change the structure of your table to add new attributes. And also, I think it’s important to point out that data integrity is somewhat compromised, after all, in most cases the column that saves the value of attributes will be text type.

Take an example:

Pessoas           Atributos          Valores
Id      Nome      Id      Descr      IdPessoa    IdAtributo      Valor
--------------   ----------------   --------------------------------------
1       Ari       1       Idade      1           1               37         
2       José      2       Sexo       1           2               M
                  3       Twitter    1           3               @mario.2
                  4       Facebook   1           4               fb.com/ma
                                     2           1               21
                                     2           2               M
                                     2           3               @jose.3
                                     2           4               fb.com/ze

See also this question:

  • Although it is clear to me, as "would work", I could not visualize a practical model of this. The relationship I even understand, I just haven’t managed to create a functional structure of it yet... I would like a practical example, in my view, in this other question, we only have the resolution of part of the problem, in practice, everything is very different.

  • What you want to know is when to use this model, that?

  • Actually @bigown was able to clarify clearly what I wanted. But thank you for your reply, it was also very useful for introducing the meaning of this (anti-)standard.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.