Automatic null value check versus types like "Option<T>"?


4

I recently started learning Rust and was introduced to the type Option<T>, which uses a sum type to represent the presence or absence of a value (mutually exclusive possibilities). The same idea exists in other languages, such as Maybe in Haskell or Optional in Java.

Alongside this, some languages, such as TypeScript, C# or Java, have the notion of a null value (usually written null) that can generally be assigned to a variable of any type. For example, in TypeScript, without the strictNullChecks flag enabled, this code is valid:

// strictNullChecks disabled.

const a: string = 'Luiz';
const b: string = null; // No compile error.

console.log(a.toUpperCase());
console.log(b.toUpperCase()); // Runtime error: Cannot read property 'toUpperCase' of null

See it on the TypeScript playground.

This behavior is known as the "billion-dollar mistake", and several solutions were later developed to mitigate it. Among them:

  • The creation of sum types such as Rust’s Option<T>, which explicitly indicate that an optional value may hold the "absent" variant.
  • The introduction of options such as TypeScript’s strictNullChecks or the nullable reference types of C# 8, which, when properly configured, make the compiler check at compile time whether a value may be null and emit an error if so.
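
As a rough sketch of the two mitigations side by side in TypeScript: the first function relies on a `string | null` union, which the compiler only enforces when strictNullChecks is enabled, and the second uses a hand-written sum type. The names `Option`, `some`, `none`, `shoutNullable` and `shoutOption` are hypothetical and not part of any standard library.

```typescript
// Mitigation 1: a nullable union. With strictNullChecks enabled, the compiler
// rejects `s.toUpperCase()` until null has been ruled out.
function shoutNullable(s: string | null): string {
  if (s === null) {
    return "(nothing)";
  }
  return s.toUpperCase(); // s is narrowed to string here
}

// Mitigation 2: an explicit sum type, similar in spirit to Rust's Option<T>.
// (Hypothetical definition, written by hand for this example.)
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

function some<T>(value: T): Option<T> {
  return { kind: "some", value };
}

function none<T>(): Option<T> {
  return { kind: "none" };
}

function shoutOption(s: Option<string>): string {
  // The wrapped value is only reachable after checking the variant.
  if (s.kind === "some") {
    return s.value.toUpperCase();
  }
  return "(nothing)";
}
```

In both versions the "no value" case has to be handled before the string can be used; the difference is whether that case is written as null or as a separate variant.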

In view of this, I have the following questions:

  • Both mitigation techniques solve this problem (usually associated with errors such as NullReferenceException, NullPointerException, Cannot read property ... of undefined/null, etc.), right?
  • Does one have an advantage over the other?
  • If there is a choice between Option<T> and settings such as strictNullChecks, what are the advantages and disadvantages of each, so that I can make an informed choice?

2 answers

4


I’ll start by saying that Java has null, and it was the language that did the most to spread the indiscriminate use of null. It now has another option, just as other languages now have one, either officially or through third-party libraries. Some older languages that depend on legacy code are creating strict modes in which nullable types are not the default.

Null breaks the language’s static and strong typing. A variable can hold more than one kind of value: the valid value and the invalid one, which in practice belongs to another type.

Purely functional languages avoid side effects, so exceptions do not fit them well. Imperative languages are adopting a more functional style (and some of their users are abusing it). The question overlooks other ways of handling the absence or invalidity of a value, but this matters: the mechanism is about more than just avoiding null.

Without exceptions, it becomes complicated to report that something did not work, unless, instead of returning a simple value, you:

  1. return more than one value, with one of them indicating failure, if the language allows it (internally the values get wrapped in another type, which can seem magical);
  2. pass a parameter by reference that receives the extra information on whether it worked, which is convoluted;
  3. create a type that holds either the value, if it is valid, or information about the invalidity.
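
To make the three options concrete, here is a small TypeScript sketch of the same hypothetical port-parsing function written each way. Every name in it (`isPort`, `parsePortTuple`, `parsePortOut`, `parsePortResult`, `Result`) is an assumption made up for the example.

```typescript
// Shared helper: true when n is a whole number in the valid TCP port range.
function isPort(n: number): boolean {
  return Math.floor(n) === n && n >= 1 && n <= 65535;
}

// 1. Return more than one value, one of them flagging failure.
function parsePortTuple(text: string): [number, boolean] {
  const n = Number(text);
  return isPort(n) ? [n, true] : [0, false];
}

// 2. Pass an "out" object by reference that receives the extra information.
function parsePortOut(text: string, out: { ok: boolean }): number {
  const n = Number(text);
  out.ok = isPort(n);
  return out.ok ? n : 0;
}

// 3. Return a type that holds either the valid value or the failure details.
type Result<T> = { kind: "ok"; value: T } | { kind: "err"; error: string };

function parsePortResult(text: string): Result<number> {
  const n = Number(text);
  if (isPort(n)) {
    return { kind: "ok", value: n };
  }
  return { kind: "err", error: `not a valid port: ${text}` };
}
```

In the first two versions the caller can ignore the flag and use the placeholder 0 anyway; in the third, the number is only reachable after checking which variant came back.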

The same goes for a value within an object that may be invalid, where you create:

  1. a new field indicating invalidity and a mechanism that controls in the object this invalidity;
  2. a type that controls the invalidity of its own value.

The last item in each of these lists is exactly the Optional type (in some contexts it may be Maybe or Result).
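
For a value stored inside an object, the two approaches might look like this in TypeScript. `UserWithFlag`, `UserWithOption`, `emailDomain` and the `Option` type are hypothetical names used only for illustration.

```typescript
// 1. A separate field tracks validity. Nothing stops code from reading
//    `email` while `hasEmail` is false.
interface UserWithFlag {
  name: string;
  email: string;
  hasEmail: boolean;
}

// 2. The field's own type carries the validity information.
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

interface UserWithOption {
  name: string;
  email: Option<string>;
}

function emailDomain(user: UserWithOption): string {
  // The address is only reachable after checking which variant is present.
  if (user.email.kind === "some") {
    const at = user.email.value.indexOf("@");
    return at >= 0 ? user.email.value.slice(at + 1) : "";
  }
  return "";
}
```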

This technique solves the "billion-dollar mistake" because null no longer exists; the language does not normally accept it (some may accept it optionally, mainly for legacy code). Strictly speaking, the technique is not required to get rid of null, but its arrival and inclusion in the language or standard library encourages more organized code.

As an analogy: to represent a string, C (by default) uses a pointer to the characters plus an extra piece of data holding its size (yes, that is the right thing to do; don’t rely on the terminator). The old C standard functions work with the terminator, but there are better ones that take the size as extra information, which is safer and more correct. Some people create a type that keeps the two pieces of information together, which makes things much easier. Other languages abstract this better. It is the same idea here: the optional type is the value plus the extra information it needs, so you don’t have to pass something separate alongside it and leave the caller to deal with it. The type makes you do the right thing and hides the implementation detail that doesn’t concern you.

Everything has advantages and disadvantages. I talk a little about this in "Why should we avoid returning error codes?". I like the optional type more than null, an exception, or even a more explicit error code.

  • Depending on what you compare it with, it is a more universal and standardized technique to adopt.
  • In reasonable languages it is more abstract and makes the intent clearer; it has semantics.
  • It forces you to do the right thing and handle the case properly, without seeming so magical. It is not a time bomb waiting to explode, as null or an out-of-control exception is.
  • It gives more flexibility in how it is used.
  • It tends to be more efficient (depending on the case and on how the mechanisms being compared are implemented).
  • It has a stronger theoretical basis.

It’s not all roses:

  • In some cases, having to handle it instead of leaving the normal flow may cost a little more. It depends somewhat on the style of the code, and it happens in very few cases. It can also carry a slight extra memory cost to store the state, but that depends on the case and the implementation (C# does not have this cost in most cases).
  • Some consider being explicit to be noise in the code (usually people who argue for being explicit in many other things, so I don’t put much stock in what they say). In many cases there is no more code, and there may be less; it depends somewhat on the language (my understanding is that C# has no official Optional because it is waiting for ADTs in order to do it right).
  • Some people may have difficulty understanding this form once they are used to another.
  • I have seen people say that it can be misused, but every mechanism can, and what we see in practice is that it is used better than its "competitors".

Yeah, they’re not real drawbacks :)

Of course, you will not use it where an exception fits properly. Where the optional type is the better replacement, the exception should never have been used in the first place. That is why Rust’s only exception-like mechanism is the panic, which covers the cases where exceptions are useful, and it has no limitations in the cases that only an exception can solve. Well, there are some cases, such as foreign exceptions, where an exception can be better than receiving an optional type and having to deal with it, even on the happy path, which should be the common one (in good implementations).

You should opt for null only for compatibility and legacy reasons.

Ideally, you avoid having an invalid or missing value in the first place; prefer that. Failing that, use Optional, and if it does not exist or is specifically contraindicated for the case, use another mechanism.

Of course, some languages do not have much of a culture of using it, and that counts when deciding what to use. I find that regrettable.

Note that the Optional type is not the same as Nullable, which is not as good because it is still a null in another form.
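
One commonly cited difference, which may or may not be what is meant here, is nesting: a nullable of a nullable collapses into a single null, while an Optional of an Optional keeps both layers apart. The TypeScript sketch below illustrates that reading only; `settings`, `lookupNullable`, `lookupOption` and `Option` are hypothetical names.

```typescript
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

// Hypothetical settings store: a key may be missing, or present but unset (null).
const settings: { [key: string]: string | null } = { nickname: null };

function hasKey(key: string): boolean {
  return Object.prototype.hasOwnProperty.call(settings, key);
}

// Nullable lookup: "key missing" and "key present but unset" both come back as null.
function lookupNullable(key: string): string | null {
  return hasKey(key) ? settings[key] : null;
}

// Option lookup: the outer Option says whether the key exists,
// the inner Option says whether it has a value.
function lookupOption(key: string): Option<Option<string>> {
  if (!hasKey(key)) {
    return { kind: "none" };
  }
  const raw = settings[key];
  const inner: Option<string> =
    raw === null ? { kind: "none" } : { kind: "some", value: raw };
  return { kind: "some", value: inner };
}
```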

0

Both solve the same problem.

Option<Tipo> x is more explicit, and looks more like a collection that can contain 0 or 1 elements.

Tipo? x is less verbose and more ergonomic when the underlying APIs make heavy use of the object-or-null idiom, where returning null usually signals some kind of error.

I believe that’s why Kotlin and Swift (and maybe C#) have adopted nullable types: the Android and iOS APIs use the object-or-null idiom a lot, and changing that could take decades.

An argument against Option<Tipo> is that if a method accepted Tipo as a parameter and now requires Option<Tipo>, all callers of this method will need to be updated. Whereas if it asked for Tipo and starts to ask for Tipo?, callers need not change.

Others say this is a good thing: changing the type from Tipo to Option<Tipo> changes the semantics of the method, so it is desirable for compilation to "break" in the callers so that they get reviewed. (Personally, I agree.)
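
The maintenance argument can be sketched in TypeScript, using `string | null` as a stand-in for Tipo? and a hand-written `Option` type as a stand-in for Option<Tipo>. The function names and the `Option` definition are assumptions made for the example.

```typescript
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

// Widening the parameter from `string` to `string | null` keeps old callers valid.
function greetNullable(name: string | null): string {
  return name === null ? "Hello, stranger" : `Hello, ${name}`;
}

// Changing the parameter from `string` to `Option<string>` forces callers to adapt.
function greetOption(name: Option<string>): string {
  return name.kind === "some" ? `Hello, ${name.value}` : "Hello, stranger";
}

// An existing call site written for the old `string` parameter still compiles.
const unchangedCall = greetNullable("Luiz");

// The equivalent old call no longer type-checks and has to be reviewed:
// greetOption("Luiz");
const updatedCall = greetOption({ kind: "some", value: "Luiz" });
```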

Source: https://softwareengineering.stackexchange.com/questions/410724/why-f-rust-and-others-use-option-type-instead-of-nullable-types-like-c-8-or-t
