How to write easy-to-maintain, readable code?

Asked

Viewed 5,450 times

128

How to know if the code is easy to read and maintain? I usually abstract a large part of my codes, I have a habit of using a lot lambda functions in C#. As this function for CPF validation.


public bool ValidacaoCPF(int[] cpf)
{
    int DV1 = 0, DV2 = 0, aux = 10;

    if (!cpf.All(x => x == cpf.First())) //Verifica se todos os números são iguais
        if (cpf[9] == ((DV1 = (cpf.Take(9).Sum(x => (x * aux--)) % 11)) < 2 ? 0 : 11 - DV1)) //Cálculo do primeiro digito verificador
            return cpf[10] == (((aux = 11) * 0) + (DV2 = (cpf.Take(10).Sum(x => (x * aux--)) % 11)) < 2 ? 0 : 11 - DV2); //Cálculo do segundo digito verificador

    return false;
}

In this example I commented to facilitate who will read or change the code.

  • The solution would be to comment on and/or use as few ready-made functions as possible (which speed up a lot of development)?
  • What other techniques can be used to improve code readability?
  • 2

    Your question is whether you should encapsulate these 3 commented codes in functions with a relevant name and call these functions instead of putting the code in line and comment on what he does (as you did)?

  • 1

    @bigown Not my doubt how to know that a code you are writing will be easy to maintain for the next programmer, sometimes we programmers get excited and do things that were to become giant in a few lines (but the next one to get our code, can spend a lot of time to understand 100% of the logic we use). I wanted to know how abstraction is good and feasible from a code maintainability point of view.

  • 1

    I don’t know if it’s easy to answer that without resorting to an opinion, but it’s a good try. What I can say is that most programmers are usually more comfortable with more imperative style, separating the steps further. I already think that big code is hard to follow. But I know that I am minority.

  • @Bigown I asked the question based on a Programers, I believe it is possible to get an answer without going to the side of opinion.

  • 5

    There the answers help little, interestingly have better answer c/zero votes the most voted are anti-respostas. Here also did not start very well, passing the tangent, but I hope that still comes something good.

  • 1

    I just wanted to leave my thanks to this issue. Regards

Show 1 more comment

7 answers

133


First to be objective I have to suggest something subjective.

Teamwork

You say "make it easy for anyone to touch the code after". That "who" is very important. As much as you have a correct way of dealing with the subject technically one aspect cannot be overlooked. Politically you have to take into consideration specifically who is this "who".

Following what your team expects in many cases is more important than following rules, especially the blind rules. How do you expect a code to be easy to maintain if everyone who will probably touch it does not expect and possibly do not understand what you have written, no matter how correct it is?

It gets worse if you don’t even know who these people are, IE, when you are thinking of future programmers who replace you in software development.

Functional style X imperative style

Looking at your code, he has a problem for most programmers. I would guess that 90% of programmers who might come to maintain their software don’t quite understand the functional style of programming. Even those who understand the use of collection extension methods for queries do not fully understand their implications (Lazy Evaluation and side effects for example). And what will happen when this programmer is going to mess with your code? He will probably abandon his and write his own version.

You can argue that it is the programmer’s obligation to know how to use all language resources. And I’d be with you on this. I’ve even resigned from a job because I was required to make such simplistic codes for the most beginner intern in programming. I wasn’t politically correct, I just made a choice for my life.

The common programmer better understands imperative style, he understands how the loop works. It is no use saying that your code is obviously taking 9 elements from a collection and applying a formula to each of these elements and adding up all the results. It’s obvious to me how this works, but it’s not for many people. Maybe the programmer can even understand, but doesn’t know how to change it to achieve a different result that maintenance requires. We realize this happening right here website all the time.

And imagine in a more complex example. Your example is simple and not so hard to understand. I’m considering that it’s just a short example rightly posted like this, but the problem will actually appear in more complicated situations.

In some cases it can be difficult to make the functional style properly since we do not always know the implementation. In imperative style an inefficiency would be more obvious.

I know this example doesn’t need performance, but the programmer will have difficulty improving more complex code by not fully understanding what is happening in the code.

The intern

So I’ll tell you that depending on the possible scenario of who is going to maintain this code is difficult maintainability because it uses resources that few people are familiar enough.

I myself would not follow this rule of the lowest common denominator, but this is my decision and I am willing to bear the consequences of it.

Variable names

By making a specific analysis in your code (only in the parts that matter most), I say that your variables use good names.

Yes, variables must have significant names. And yours do. Because auxiliar is more readable than aux or because primeiroDigitoVerificador is better than dv1? (I used lowercase simply because this is the convention adopted in language)

Who said to use contadorDasVezesQueAlgumaCoisaOcorreEmUmLaço is better than i. Too long? I did what the "rule" says! But even if I use only contador it does not help understanding more than i in a loop.

Of course the further away from your declaration a variable will be used, the more the name needs to be meaningful. But when it will only be used in limited scope, the name is not so important. Of course, don’t use random names (you knew a programmer who used swear words in all variables). You dosed well.

Using a meaningful name can be useful for semantics to a chunk of code and avoid commenting, but it’s not always that important. Comments should be avoided, but you have to analyze the context, it’s more wrong to follow a rule blindly.

It is not curious that virtually every example of teaching code lambda uses x how argument? Examples written by geniuses of programming! Don’t they know the books that say variables should have significant names? Or do they just know how to use the rule the right way and not blindly?

If you are going to analyze it well, in your example if the argument was called elemento or item would be more descriptive, but everyone uses x and no one who knows the mechanism used fails to understand that there is an element of the collection.

The lesson here is to be careful with established rules. Whether they are spoken by strangers on the Internet or by established computer authors.

Even the best most popular books have been written in contexts that most people are unaware of, are usually vague when something should be used (and it has to be, there are so many scenarios) and they don’t usually explain how to reach that conclusion (when explaining, does not emphasize this, emphasizes the simplified rule that everyone will keep). So a book that seems very good and well-intentioned ends up, by failure of readers, leading people to commit barbarities, where the intention was opposite. And worse, almost no one notices this happening. They start using the book as a crutch for their atrocities.

DRY

For me, one of the most important principles in coding is the Don’t Repeat Yoursef. He preaches that, simply, codes should not be repeated.

In your code there is repetition. An excerpt can be abstracted into a function and eliminate repetition.

When you have code repetition, it’s harder to maintain. The amendment at one point does not ensure that the other point is amended in the same way.

Of course the chances of error in the example posted are not great. But they exist.

This is similar to the comment problem. It is common for the comment to be repetitive, even when it does not appear to be. When you change a code and do not change the comment to reflect the change, potentially you may have dissonance. Remember this is not a rule, I’m not saying I shouldn’t use comment.

Note that abstracting in function the repeated code in question will not make the code more concise, but will make it have its unique representation. DRY is not to make code concise. Concision does not always help maintenance, having a single point to change an algorithm helps maintenance. As long as creating this abstraction is no more complicated than leaving the code inline.

I admit that I exaggerate about DRY, but ignoring it has always brought me more maintenance problems.

Also do not follow DRY blindly.

Tips

(are not rules)

  • ask others to analyze your code;
  • try to understand your old programs;
  • try reading your code without the help of editors who "embellish" the code;
  • know all the coding rules and learn how nay use them (how to use is easy).

References

Some of the books that talk the most about the subject are Clean Code and Code Complete. The Portuguese version is not so good.

I was a little reluctant to put them on because they’re a little dogmatic, they try to sell ideas that don’t work that well in all situations, they sell ideas that seem absolute when they’re not, and they can create a trend for those who are not experienced and think that the book works like a manual. If you don’t know how to use the tips that books give, risks becoming dogmatic too.

The first sees a very specific way of developing, practically obliges to choose certain tools (not software but paradigms, methodologies and techniques) for everything. If you know how to take only what is most important to apply in their situations can take advantage of your content.

The second is a little broader in the considerations but tries to be too specific in some cases and does not contextualize well. It is common to give tips that work well in one language and not in another, but it does not deal with this.

Concluding remarks

You still have a problem of readability by other factors, mainly because of the ternary (I love this operator) it was hard to read the code. It seems that you were so happy that you managed to do everything in one line, that you forgot the readability. Or not, after all you are here asking her :)

Keeping many lines of code can affect maintenance as well. Concision well thought out, and not artificial, is a good thing too. Just can’t overdo it.

Not always when a person cannot understand your code means that it is unreadable, may be that you are dealing with a "parachutist", increasingly common in our midst.

I put some naming conventions in that reply.

And I try define what is clean code. And also elegant code.

  • 1

    Thanks for the answer, I really liked it. This code gets an array of integers like you mentioned, but it’s just Poc that I created, a programmer friend of mine challenged me and I made this expression lambda to meet this requirement only that during this process I came across this question: Cool a functional code, spared me several loops, but how much does it make up for? If the programmers who are moving don’t understand the way I thought?

  • 1

    The algorithm of module 11 it’s pretty clear how it works, I think the basic requirement to understand my code is to know a little lambda expression, so I was in doubt program for everyone or program for those who are at the same level as me (understanding)? I understood that this goes from the environment where you work (would not have felt this question in an environment where all programmers have full knowledge of the tools they use).

  • 3

    A solution would also be all have the same level, because if writing code always worried about the next do not understand (by the abstraction level used) ends up limiting the programmer, as it accumulates knowledge that is not put into practice.

  • 1

    This is a question that only you can answer in every situation you encounter. There is no magic formula. But you have subsidies to help make the decision.

  • 1

    I agree, the answers to this question are adding me a good knowledge, I will do everything to take advantage of the proposals shared here. I will wait for the bonus period to give reward. Thanks again.

  • 1

    "For me, one of the most important principles in coding is Don’t Repeat Yoursef. He preaches that (simply) codes should not be repeated." For me one of the most important principles. I’ve been there, getting code like that, and it wasn’t pretty...

  • 1

    Excellent answer! Just one comment on DRY: "When you change a code and don’t change the comment to reflect the change, you could potentially have dissonance." Extrapolating, no one should document your code, because if one changes and the other does not have the same dissonance... IMHO redundancy is not at all bad, in spoken language it occurs all the time (as a form of "error correction"). For the machine it is harmful - because it can cause it to receive conflicting instructions. But the reader seeing a comment that does not match the code is a good thing, since it leads to question him.

  • 3

    And since I brought it up, a "thumb rule" I read once and always try to use is just to abstract something that was repeated at least 3 times. In the same way that repeated code has brought me a headache in the past, I’ve also spent hours trying to "generalize" something that I ended up using only once...

  • 4

    @mgibsonbr Look who speaks excellent answer :) I’m not saying there shouldn’t be comments. There is always the danger of people interpreting it as a golden rule, of course. It’s just a reason to think about minimizing its use. And on the end, that’s true, eliminating repetition is one thing, generalizing is another. It doesn’t always pay off. I just don’t like following rules. I like to know them (I think this is the most important thing about rules). I can abstract something that does not repeat itself and in extreme cases do not do it even if it has 4, 5 repetitions. But I understood what you said. I’m still gonna edit out a few things.

  • A correction: the method All will not validate the other digits unnecessarily if the second digit is different from the first, in the implementation itself of the All<TSource> in the Enumerable we can see that there is a return in case the predicate gives you a false answer. http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.Cs,be4bfd025bd2724c

  • @Gabrielkatakura is true, this example does not have a problem, I left the more generic answer, thank you.

Show 6 more comments

47

1. Concision

As a rule, the smaller the code the better. Your decision to use higher level functions is, in my opinion, correct: at the same time it is clear what its commands do in general (All: checks whether every element in the list satisfies a predicate; Sum: sum up), the specific details are expressed as simply as possible (the ambles, telling which test is to be applied to each individual element, and what is being summed up according to the individual digits).

This article by Paul Graham (in English) expresses well the advantages of concision as well as its impact on readability. For example, would an explicit loop be more readable than the use of both? Perhaps, but what matters is not the readability of a single line of code - but of a program as a whole. Translating freely from the article:

If you are used to reading novels and newspaper articles, your first experience reading a mathematical publication can be discouraging. It can take you half an hour to read a single page. Still, I’m sure notation is not the problem, even if it seems to be. Publishing is hard to read because ideas are hard. If you expressed the same ideas in prose (as mathematicians needed to do before creating succinct notations), they wouldn’t be any easier to read, as the publication would grow to the size of a book.

The same occurs in programming languages: at first glance, a regular expression may look like a comic character swearing, but [unless it’s too big] it briefly expresses something that would require lines and more lines of code to be performed otherwise. Dense languages, such as J, also seem difficult to read at first glance, but the resulting programmes have far fewer lines than those expressed in a traditional language (even functional ones).

2. Presentation

However, a caveat: concision does not mean "fewer lines of code" (e.g..: "one-Liners" gigantic) nor "variables with shorter name" - this only hinders readability, instead of improving it. Instead, it refers to the number of logical elements that make up your program. Again citing the linked article:

I believe that a better measure for the size of a program would be its number of elements, where an element is anything that would be a distinct node if you drew a tree representing the source code. The name of a variable or function is an element; an integer or floating point number is an element; a literal text segment is an element; an element of a pattern, or formatting directive, is an element; a new block is an element. There are border cases (-5 is one or two elements? ) but I think most of them are the same for all language, so they don’t affect the comparison too much.

This metric needs more detail, and it can be interpreted in the case of specific languages, but I think it tries to measure the right thing, which is the number of parts a program has. I think the tree that you would draw in this exercise is the same tree that you would have to do in your head in order to design the program, so its size is proportional to the amount of effort you would have to write it or read it.

In other words, even if the number of elements is the same, the way of arranging it in lines, the use of white spaces, etc can greatly affect the readability. Example:

if (!cpf.All(x => x == cpf.First())) //Verifica se todos os números são iguais

    //Cálculo do primeiro digito verificador
    if (cpf[9] == (
        (DV1 = (
            cpf.Take(9).Sum(x => (x * aux--)) % 11
        )) < 2 ? 0 : 11 - DV1
    )) 
    {
        //Cálculo do segundo digito verificador
        aux = 11;
        return cpf[10] == (
          (DV2 = (  cpf.Take(10).Sum(x => (x * aux--)) % 11  )) < 2 ? 
          0 : 
          11 - DV2);
    }

Note that I have not touched anything in your program code except for the placement of blank spaces and line breaks [and I have moved the comment to fit]. I did it two different ways. Which of the three [the original and mine] is more readable is debatable, personally I consider the original as easy as the others of understand (I had no difficulty seeing the logic when reading your original code) but not so easy to debug (are so many parentheses that it took me to figure out what was closing what).

3. Separation of Responsibilities

Just as a function (or a class) should not do more than one thing at a time, a line of code that does this is harder to read than two that do each one thing alone. One example that I consider to be quite positive is jQuery, which at the same time allows the chaining of methods (method chaining) encourages the programmer to put each invocation on a separate line:

$(seletor)
    .hide()
    .children(subseletor)
        .addClass(classe)
        .show()
        .end()
    .css(...)
    .show();

Although at first it’s a one-Liner which mixes various responsibilities, the form of presentation makes quite clear what is being done, and facilitates even the change - in the sense of adding lines or removing lines (instead of fiddling in the middle of a line). Not always is this style of encoding possible, and in the absence of the same I suggest transferring multiple responsibilities to different instructions:

if (!cpf.All(x => x == cpf.First())) //Verifica se todos os números são iguais
{
    //Cálculo do primeiro digito verificador
    DV1 = cpf.Take(9).Sum(x => (x * aux--)) % 11;
    if ( cpf[9] == (DV1 < 2 ? 0 : 11 - DV1) ) 
    {
        //Cálculo do segundo digito verificador
        aux = 11;
        DV2 = cpf.Take(10).Sum(x => (x * aux--)) % 11;
        return cpf[10] == (DV2 < 2 ? 0 : 11 - DV2);
    }
}

Note that when separating the calculation of DV1 and DV2 of its use, I could also eliminate several unnecessary parentheses - which in turn allowed the reduction of the number of line breaks and blank spaces, increasing again the concision [in the most general sense].

4. Level of abstraction

It is important for readability that a function/subroutine always operates at the same level of abstraction. For example, if a function needs all lines of a file, it is ideal to create a separate function to open the file, read line by line and return in a collection or iterator/generator [if your language no longer does so]. Otherwise, we would have a high level reasoning "broken" by the presence of a piece of code at a low level. In my opinion, more than concision this is the main reason why I found your use of All, Take and Sum right - you did not "pollute" the reasoning of "how to calculate CPF digits" with details of "how to sum up a list", etc.

You just have to be careful not to create too many levels: if a function calls too many external functions, understanding what the program does requires you to stop reading the main function to query each external function, then go back to where you left off reading. It is only good to break a function into smaller functions if each one is at a lower abstraction level (see that answer for an example). At this point my opinion is contrary to the general recommendation to "avoid very long functions" - I would say that if a function has a single responsibility and operates at the same level of abstraction, it can be long as necessary without impairing readability.

5. Comments

A recommendation I always read is not to "narrate the program" in the comments. For example, if your first comment was:

if (!cpf.All(x => x == cpf.First())) //Verifica se todos os números são iguais ao primeiro

it would be a useless comment - just read the code to understand that the program does it! Omitting "to the first" is a little better, implying that "all the same digits" are an "interesting" case, but without clarifying why. A little better would be "If all digits are equal, the CPF is invalid". The other comments are OK [explain at a high level what the code snippet does].

In addition, it is good to use comments to draw attention to particular or "unusual" cases. For example, if you are implementing a classic algorithm, and for some particular need you need to vary a detail in the implementation or embed an operation/verification just in that case, a comment explaining why that line/chunk exists may be interesting - although the rest of the entire algorithm is not commented on. How pointed out by Peter, the code must [ideally] be self-explanatory, only the intent behind a piece of code is what should be clarified (the goal of the code must be in your documentation, I refer to explain how a certain passage helps in achieving this goal).

And if the intention of your code is not apparent given its representation (e.g., you are using a low-level language, or your way of representing the data is unusual) then it helps a lot to be generous in the comments. For each low-level instruction, comment at a high level on what it does. Let me give you an example of a code I wrote a while ago - implementing a hash table in Javascript (treating collisions by chaining):

var toSet = this[':' + key.hash];
if ( toSet === undefined ) { // No key found, create one
    this['::' + key.hash] = key;
    this[':' + key.hash] = value;
}
else if ( toSet instanceof CompoundField ) // Multiple keys found, select the right one (or add)
    toSet.field(key, value);
else if ( this['::' + key.hash].match(key) ) // Same key found, update the value
    this[':' + key.hash] = value;
else { // Single key found, but not the right one; split into multiple keys
    this[':' + key.hash] = new CompoundField(this['::' + key.hash], toSet, key, value);
    delete this['::' + key.hash];
}

There is a place in my documentation where it is well explained how this representation works (and without reading it, it is likely that nobody understands 100% of the above code), but as it is nothing abstract - and therefore not a little self-explanatory - the comments serve as indications of "what" the code is doing. As in your case, a comment like "Calculation of the first digit checker" is a high-level explanation than the code makes, not as what makes.

  • +1 Whoever quotes Paul Graham cannot be wrong :P Seriously, it is a good answer, by indicating that the real problem is not the imperative question x functional, but rather the readability.

28

There are several techniques to improve the readability of the code, and in some cases this may even be relative.

A very good book on the subject is Clean Code (Clean Code) by Uncle Bob.

These two practices are simple and help greatly in the readability of the code:

  • Small methods that perform only one task
  • Variables with self-explanatory names

About commenting on the code, some programmers consider it bad practice because the code should be self-explanatory, but in some situations it is difficult to write code that does not need a comment (often by using verbose languages and API’s)

In your case I would turn these conditionals that are in Ifs into two boolean variables and their name would be self-explanatory.

Using many nested commands and if’s ternaries can make readability very difficult, not always the smaller code is more readable and stylish.

  • 3

    I love that book. According to him the quality of a code is measured by the amount of swearing your colleagues will say during a job review.

  • +1 "If a code has an extra comment, it probably has one less method". A block of annotated code can potentially turn into a new method, with a self-explanatory name, making the comment unnecessary.

19

The question is what other techniques can be used to improve the readability of a code?

In my view it can be a conjugation of everything that has already been said. All are valid. I think these are the main points:

  • Try to separate code blocks by functionality;
  • If necessary to comment on the code, better a comment here and there than two months trying to understand the code;
  • If this block is too large, or outside the scope of the function we want to create, separate it into an auxiliary function;
  • Give suggestive names to functions and variables;
  • Try not to repeat code (copy/Paste), which becomes very complicated when you want to change something at some point;

Finally, those who tend to be minimalist in coding, I think you should be careful to comment on what you are doing. I think either way is good, as long as you can make the code legible, I think it’s the most important thing. Better to be more careful than less.

  • 2

    +1 by "Try not to repeat code (copy/Paste)", once I worked with a programmer who preferred to create copies of my functions "not to risk getting in my way/breaking my code". The intention could be good, but when it came to changing something it was hell: several copies of the function needing the same change, and if you forgot a good luck to debug later...

  • I’ve had to take a code that’s all been replicated everywhere and I’m still suffering from it, and it’s been almost two years. Whenever it is necessary to change something is a punishment. When I have time, I clean the copies.

14

I prefer to use variable names and full functions even if it gets a little long, but it helps in reading the code, decreasing the need for comments because the code is self-explanatory.

Ex:

int totalAlunosPresentes = TotalizarAlunosPresentesNaSala(IdSala);

About creating functions, separate the blocks of your code that make a specific function, making this set of individual functions interact forming the result you need and enabling the reuse of these functions if necessary.

Making it very clear that this is my opinion, based on the experiences and readings I’ve had.

  • 1

    "separate the blocks of your code that make a specific function, making this set of individual functions interact forming the result you need."I agree, as long as the parts are at a lower abstraction level. Reading a function having all the time to stop to read a different function is laborious, I’ve had to work with codes of this type and I say it’s not easy. The problem is not "separating everything", but when some things are separated and others are not (i.e. the same level of abstraction is "pulverized" between different functions).

11

First, you must "indent", indentation consists of organizing categories, codes and source code elements of any language. For this, you have to organize the codes, lines and use comments to assist you in the maintenance of the software. See an example in C++:

Code in C++ not indented:

#include <iostream>
int main()
{
int x;
x = 1;
cout << x << "\n";
}

Indented code:

#include <iostream> // Inclui uma biblioteca para entrada e saída de dados

int main() // Função principal
{
     // Declaração de variável:
     int x = 1; // "x" recebe 1

     // Mensagem:
     cout << "x é igual a " << x << "\n";
}

Which of these did you think was the best one to read? It’s definitely the last one, because it explains what the commands are for and it’s organized. For example, you know that the command cout is within the function int main().

10

Here are some notes that I explained of some research and daily living:

  • Reduce complexity: Create a routine and automatically you’re hiding part of the code in one place. This makes your code less complex, because reading and understanding are easier. Not to mention maintenance.

  • Avoid duplicate code: Imagine repeated code in various parts of the system. With the use of routines you avoid code duplication and still save on system size.

  • Simplify Complex Boolean Tests: Developing routines that test the validations needed to perform some purpose helps a lot in understanding code.

  • Improve performance: In maintenance it is much easier to change in one place and solve the problem than to change in several other places and still have the possibility to forget some place.

Even simple operations must be converted into routines.

Many people think that because an operation is simple (two or three lines of code) it does not need to be refactored and end up copying the block of code in various parts of the system. What programmers should know is that a simple routine makes code simple and readable.

Learn to choose a good name for your routine

  1. Make code simple, readable and small.
  2. Waste time and learn to choose a good name for your routines.

source: linhadecodigo

Browser other questions tagged

You are not signed in. Login or sign up in order to post.