A bit of history
The discussion on the use of null
is discussed by developers of various platforms. I’ve heard a lot about this in Java as well.
On one occasion, talking to a colleague after a NullPointerException
have burst into production, he was arguing that null
should not exist, that someone, I do not know where, was trying to remove the null
of Java, etc. For object-oriented languages, some people suggest the use of a Pattern design (design standard) called Null Object (Null Object).
Besides, I’ve picked up some problems in several procedures and darlings arising from the failure to process fields null
, mainly with null values in Boolean expressions, resulting in unforeseen behaviors (see the "table" @mgibsonbr posted in his reply).
An interesting historical account of the null
is in the Wikipedia[in English].
A case where the nay use of null
creates problems
Imagine a table of patients from a system for medical offices, whose field idade
is unsigned integer not null
. Let’s analyze the following code in PHP:
$idade = empty($campo_idade) ? 0 : intval($campo_idade);
if (!$idade) throw new Exception('A idade deve ser preenchida.');
insere_cliente(..., $idade, ...);
The above code tries to verify if the age field has a value that is considered true (true
) for PHP before entering into the database. Does this work? Yes, as long as the office does not serve infants under 1 year of age.
Now let’s suppose we sell the system to a pediatric clinic. What do we do now that the value 0
(zero) should be valid? We only remove the validation (if
), since the database is "protected" against null and negative values.
So we start getting a lot of complaints that certain patients are showing up at age zero. We identified that secretaries often forget to fill the age and the registration is included with zero age by default. This is easy to solve, just stop the registration to be done if the field is empty, it is not?
But let’s assume that some of the client clinics have an emergency room and sometimes it is necessary to register a patient without knowing his exact age. Managers of clinics need to know which records nay have the age filled to collect this information later. And now? Now one is missing value to say that no value and we know that zero is not an option.
What null
means to you?
A null value makes a lot of sense when it is part of the business rule to treat a value uninformed or nonexistent.
The truth is that using other values, such as negative numbers or textual constants, to represent non-existent values is a mixture of concepts and generates various problems. Would that, in the absence of salt, someone should put sugar in the salt shaker to avoid the disappointment of a guest to find the empty container?
The design pattern Null Object, for example, you can avoid an exception in OO languages, but if the value is not properly treated, it can generate very strange behaviors. Imagine a user making a query and the screen opening with all empty fields because the query record was not found and a null Object was returned and "displayed" in the system interface.
Some treatments for null
in SQL
Every programming language has its methods for treating nulls, but I’m not going to go into that here.
Dbms may exhibit different value behaviors null
, but in my experience, just following some good development practices can avoid these problems. One of them is to not only use the operator =
, but also the IS NULL
or IS NOT NULL
when searching columns that can be null, for example:
select * from tabela where (nome is null or nome = '')
Another example, according to the case outlined in the previous topic, which returns patients with no age filled:
select * from paciente where idade is null
The same logic applies to any type of field or variable. Example in Transact-SQL:
create procedure atualiza_paciente_sem_idade(
@id int,
@nova_idade int
)
as
declare @idade_atual int
set @idade_atual = (select idade from paciente where id = @id)
if @idade_atual is null
begin
update paciente set idade = @nova_idade where id = @id
insert into log (...)
end
Also, it is important to use the proper routine to handle null fields before trying to manipulate the data. Example in SQL Server:
select upper(isnull(campo_possivelmente_nulo, '')) from tabela
Unfortunately, each database system seems to have implemented its own solution to return a value default in the case of a null
. Oracle uses NVL
, Microsoft SQL Server and Access use ISNULL
, Mysql uses IFNULL
and Postgresql does not have an equivalent function. See some examples at this link.
However, an alternative that seems to work in all the cited Sgbdrs is the COALESCE
. This function returns the first non-null value of a list of values. For example:
select coalesce(campo_possivelmente_nulo1, 'valor inexistente') from tabela
The effect of the above command is the same as the previous one.
Another approach is to create a custom function to handle this. For example, create a function ISNULL
on Oracle with the same implementation of NVL
existing. However, as there are several other divergences between the SQL implementations, I believe that this border is not valid. For systems supporting various types of databases, I prefer abstraction provided by a framework ORM (Object-Relational Mapping).
Another situation to get our attention is when joins are necessary. But it is nothing too complicated. Just know difference and know when to use INNER JOIN
and OUTER JOIN
. Know about LEFT JOIN
and RIGHT JOIN
also helps.
It is very common questions on the internet like: "my query does not list a clinic when there are no registered patients, which can be?"
This is because it was used INNER JOIN
in place of OUTER JOIN
.
Let’s take an example:
select clinica.id, avg(paciente.idade) as average
from clinica
join paciente on paciente.id_clinica = clinica.id
group by clinica.id
The above consultation aims to calculate the average age of patients in a clinic. The problem is that only clinics that have at least one patient will be listed.
Let’s change this query a little bit:
select clinica.id, avg(paciente.idade) as average
from clinica
left outer join paciente on paciente.id_clinica = clinica.id
group by clinica.id
Now, with the LEFT OUTER JOIN
, clinic without patients will return a null average. Note that in this case we do not want the average to be 0
(zero) when there are no patients, because if the clinic attended only newborns the average would be effectively zero years of age!
If anyone wants to test these queries, can use this sqlfiddle.
Considerations
The main issue is that, from the user’s point of view, an error or strange behavior caused by null
is so bad when an error, strange behavior or limitation due to the nay use of null
.
A very big decoy that is behind some arguments against the null
is: if there were no null
Fewer execution errors would occur. In fact, errors occur because developers in general do not effectively create "safe" code, that is, treating input values properly. So some want to get around the problem of code quality by removing values that make the code fail.
Validate any input, including user input and file and database readings, it is (and always will be) good programming practice. The "lazy" programmer is thinking: "my routine only released an unexpected exception because so-and-so sent the value XPTO
".
Finally, null fields, which you could call optional, are a necessity in various types of situation. The effort to limit their use does not effectively improve the quality of the systems. If developers do not create a proper logic for system routines, it is useless to use nulls or values to simulate empty fields.
And if, by a change in business rules, a field becomes nullable? Bugs will appear everywhere?
The SQL-ISO standard has NULL, the function
COALESCE()
and the operator "IS NULL"... It’s a state, meaning "uninformed value" or "non-existent value", and not a value. NULL is widely used, it makes no sense to question its usefulness. There is a small dispute of concepts and forms of modeling that can replace the use of NULL ISO. I think the question/answers is emphasizing a dichotomy ("NULL is good or evil?") that does not exist, and deviates the objective discussion on the concepts of modeling.– Peter Krauss
How about taking advantage of the enthusiasm to contribute to the poor Wikipedia Portuguese in this type of discussion? One can take advantage of the broad panorama raised by article in English, that already has in the answers below a great translation and didactic explanation of part of the topics(!)
– Peter Krauss