To what extent should I follow the conventions, and where can I apply style patterns of my own?

Viewed 885 times

23

How far should I follow the conventions in Java code? That is, to what extent is a convention a rule?

Can I develop and apply my own coding style in my code, as long as I use it consistently throughout?

  • 4

    I don’t understand why this question was downvoted. I find it completely valid, and it was put in a very interesting way: "How far is a convention a rule?" Could whoever downvoted it explain here?

  • Convention is not a rule.

2 answers

26


In fact, the only real rules are those imposed by the compiler; the rest is convention.

In practice, two conventions dominate. The Eclipse community tends to format code differently from Oracle (which inherited the Sun style). Most groups follow either the Eclipse or the Oracle style, while some smaller groups mix them or adopt conventions that diverge from both in places. In most respects, such as where to place spaces around operators, casts, methods, parentheses, generic declarations and array indexes, the styles agree. The major differences concern the position of {, indentation, and switch.

Getters and setters

Many frameworks deduce the properties of your objects by looking at their getter and setter methods. If you use a different convention here, those frameworks will not find the properties or will not manipulate them correctly, so in this case you are effectively required to follow the convention.

A getter should have a name starting with get followed by a capital letter, a return type other than void, and no parameters. If the return type is boolean (the primitive type), the prefix may be either get or is. Some frameworks (but not all) also allow the is prefix when the return type is Boolean (the wrapper class).

A setter should have a name starting with set followed by a capital letter, exactly one parameter, and a void return type. Some (but not all) frameworks allow the setter to return the class that declares the method. In most cases, the setter’s parameter type should also match the getter’s return type.
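As a rough illustration of how a framework can discover properties from method names alone, the sketch below uses the JDK’s own java.beans.Introspector. The classes BeanPropertiesDemo and Pessoa and the method obterIdade are hypothetical names invented for this example; real frameworks may apply their own, slightly different rules.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

class BeanPropertiesDemo {

    // Hypothetical bean, used only for this illustration.
    public static class Pessoa {
        private String nome;
        private boolean ativo;

        public String getNome() { return nome; }
        public void setNome(String nome) { this.nome = nome; }

        // For a primitive boolean, the "is" prefix is accepted.
        public boolean isAtivo() { return ativo; }
        public void setAtivo(boolean ativo) { this.ativo = ativo; }

        // Wrong prefix: by convention this is not a getter,
        // so no "idade" property is detected.
        public int obterIdade() { return 0; }
    }

    // Maps each detected property name to "r", "w" or "rw".
    static Map<String, String> describe(Class<?> type) {
        Map<String, String> result = new TreeMap<>();
        try {
            // Object.class as stop class leaves out the inherited "class" property.
            BeanInfo info = Introspector.getBeanInfo(type, Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                String access = (pd.getReadMethod() != null ? "r" : "")
                        + (pd.getWriteMethod() != null ? "w" : "");
                result.put(pd.getName(), access);
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describe(Pessoa.class)); // prints {ativo=rw, nome=rw}
    }
}
```

Renaming getNome to something like obterNome would make the nome property write-only from the introspector’s point of view, which is the kind of silent breakage described above.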

Identifiers

As for the rest, strictly speaking you’re usually free to name things however you want, as long as the compiler accepts it. However, it is still worth following the conventions. One reason is syntax highlighting, as in the code below:

class Classe1 { // The word Classe1 will be rendered bluish.
}

class classe2 { // The word classe2 will be rendered black.
}

int teste; // The word teste will be rendered black.
int TESTE; // The word TESTE will be rendered black.
int Teste; // The word Teste will be rendered bluish.

Note the color difference. The idea is that class names appear bluish while other identifiers appear black. However, if you don’t follow the language conventions, Stack Overflow will color the words incorrectly. This problem is not unique to Stack Overflow; several other programs behave the same way.

The naming convention for identifiers is as follows:

  • Names of classes, interfaces, enums and constructors must capitalize the first letter of each word, with the remaining letters in lowercase. The words are not separated. Digits are permitted. For example: StringBuilder, JPanel, Consumer, NullPointerException, LayoutManager2.

  • Names of methods, local variables, instance variables, static variables that are not constants, method parameters and lambda parameters must start with a lowercase letter, with the first letter of every word after the first in uppercase and all other letters in lowercase. For example, toString, getClass, isVisible, setEnabled, clone, element, start, getX1, temp, x, y, f1, etc.

  • Package and module names (Java 9+) must follow the reverse domain pattern and be all lowercase, with no separation between words. Digits are allowed. For example, java.lang.annotation, javax.swing, javax.validation.constraintvalidation, org.json, org.apache.commons.lang3, org.springframework.web.servlet, java.base, android.service.quicksettings. Names starting with java, javax and javafx are, or should be, reserved for the JDK; all others should ideally follow the reverse domain pattern (although this does not always happen in practice, as with android). After the part of the package name that corresponds to the domain, the division into subpackages should follow the project’s organizational logic, not the individual words of its names. That is, you should use com.example.meuprojeto.bancodedados instead of com.example.meu.projeto.banco.de.dados.

  • Names of constants (i.e., immutable objects declared static and final) and enum elements must be in all capital letters, with words separated by _. Digits are allowed. For example, SOUTH_EAST, TOP, TIMED_WAITING, DECIMAL_FLOAT.

  • Generic type parameters are denoted by a single uppercase letter, such as Map<K, V> or List<E>.
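To see these rules side by side, here is a small sketch that applies each of them in one class. Every name in it (OrderHistory, MAX_ENTRIES, OverflowPolicy, addEntry and so on) is made up for illustration and does not come from any library.

```java
import java.util.ArrayList;
import java.util.List;

// Class name in UpperCamelCase; generic type parameter as a single uppercase letter.
class OrderHistory<E> {

    // Constant (static final and immutable): all capitals, words separated by _.
    static final int MAX_ENTRIES = 3;

    // Enum name in UpperCamelCase, enum elements in all capitals.
    enum OverflowPolicy { DROP_OLDEST, REJECT_NEW }

    // Instance variables: lowerCamelCase.
    private final List<E> entries = new ArrayList<>();
    private final OverflowPolicy overflowPolicy;

    OrderHistory(OverflowPolicy overflowPolicy) {
        this.overflowPolicy = overflowPolicy;
    }

    // Method and parameter names: lowerCamelCase.
    boolean addEntry(E newEntry) {
        if (entries.size() >= MAX_ENTRIES) {
            if (overflowPolicy == OverflowPolicy.REJECT_NEW) {
                return false;
            }
            entries.remove(0); // drop the oldest entry to make room
        }
        entries.add(newEntry);
        return true;
    }

    int getEntryCount() {
        return entries.size();
    }
}
```

Because the casing already says whether a name is a type, a constant or a variable, a reader can follow an expression such as overflowPolicy == OverflowPolicy.REJECT_NEW without looking up any declaration.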

Some frameworks also require method names to follow a certain structure, as with getters and setters. This was fairly common before annotations were introduced in Java 5: tools such as JUnit 3 required test methods to have names prefixed with test, and EJB 2 also had several naming rules. With the advent of annotations, these naming restrictions (considered annoying) were progressively abolished (as in JUnit 4 and EJB 3), but from time to time a framework still imposes something of the kind.

Spacing in general

As for spacing, the problem is much smaller, but readability can still suffer, so this issue also matters.

There are basic rules such as:

  • Never place a space before a comma or semicolon, but always place one after.

  • Always put spaces around binary operators, but not around unary operators.

  • Do not place a space immediately after (, { or [.

  • Never place spaces within types (possibly generic), except after the comma within a list of generic parameters. That is to say, List<Map<String, Thread>> and double[] are ok, while List <Map< String , Thread> > or double [] are not.

  • Put a space right after a cast. That is, int x = (int) y is okay, while int x = (int)y is not.

  • Never put spaces immediately before line breaks (they’re invisible and useless and only serve to create version conflicts in tools like Git and SVN).

  • Plus a lot of other little details.
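A short sketch with several of these spacing rules applied at once may help. SpacingDemo and its methods are hypothetical names chosen just for this example.

```java
import java.util.List;

class SpacingDemo {

    // Average word length, truncated toward zero.
    static int averageLength(List<String> words) { // no spaces inside List<String>
        if (words.isEmpty()) {
            return 0;
        }
        double total = 0;
        for (int i = 0; i < words.size(); i++) {   // a space after each ';', none before
            total += words.get(i).length();        // spaces around the binary '+='
        }
        return (int) (total / words.size());       // a space right after the cast
    }

    static int negate(int value) {
        return -value;                             // no space after the unary '-'
    }
}
```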

The location of {

There is disagreement about where to place the { that opens a class, interface, method, if, else, while, for, do...while, synchronized, try, catch or finally. This divergence has existed since the early days of C. There are essentially two styles in use:

  • Put the { at the end of the line that opens the block. This is the style that Sun adopted and that Oracle has followed. It was popularized by Brian Kernighan and Dennis Ritchie, authors of The C Programming Language, and was also adopted by Bjarne Stroustrup, who created C++, and Linus Torvalds, who created the Linux kernel. Example:

    if (x) {
        // blablabla
    }
    
  • Put the { alone on its own line. This style was started by Eric Allman, who wrote sendmail and other BSD Unix utilities in C, and was heavily influenced by the prevailing Pascal convention, which uses the keywords begin and end to delimit blocks, with begin normally placed on its own line. In Java, this is the standard adopted by the Eclipse community. Example:

    if (x)
    {
        // blablabla
    }
    

There are other ways to decide where the { goes, plus some variants for special cases. This divergence has caused lengthy debates and flame wars on mailing lists and internet forums (and obviously within Stack Exchange too). I personally follow the style adopted by Oracle, with one small exception: in method and constructor declarations, when the parameter list is long and spans several lines, I put the { on its own line so that it stands out, rather than leaving it hanging at the end of the last parameter’s line.

Maximum width of a line

This point can also be controversial. Most conventions dictate that 80 or 79 columns is the limit.

However, this limit comes from old terminals, consoles and printers of the 1980s and earlier, which were limited to 80 columns on the screen or paper. Today’s displays far exceed that limit.

In addition, Java is a very verbose language, so it is very easy and common to go past the eightieth column. Enforcing the 80-column limit can split instructions and expressions across so many lines (even more so with several levels of indentation) that the code becomes significantly harder to read and understand.

Therefore, I consider a limit somewhere between 120 and 160 columns ideal. I don’t give a specific number because it depends heavily on personal preference and on the particularities of each project, and any number I gave would just be my personal opinion.

Size of the indentation

This is the biggest divergence of all and the biggest cause of flame wars and fights about style on the internet.

There are two issues involved here. The first is whether to indent with tabs or with spaces. The second is, if spaces are chosen, how many.

First, whatever indentation criterion you choose, you must be consistent. Indenting some lines with tabs and others with spaces is the worst of both worlds, and it is even worse when the same line mixes tabs and spaces in its indentation. If you use spaces, always use the same number of spaces per indentation level; otherwise the code will look horrible and inconsistent.

Communities in C, C++ and other languages have a myriad of factions, each with its niche and its disputes with the others. In Java, there are essentially only two: indent with 4 spaces or indent with 1 tab. Sun originally recommended either form, and the Eclipse community adopted the latter. Later (around the time of Java 5, I think), Sun changed its mind and standardized on the first form only (4 spaces).

In my personal opinion, indentation with spaces is better, because:

  • In theory, tab-indented code should work with whatever tab size the reader chooses, so that each reader could pick the size that suits them. In practice, however, only the exact size used by whoever originally wrote the code will work, and if two or more developers have changed different parts of the same code using different tab sizes, it will look wrong regardless of the tab size used.

  • When using spaces, my neighbor sees the code I wrote exactly the way I see it. Likewise, I see the code my neighbor wrote exactly the way they see it.

  • Having to set the tab size in each editor for every different codebase I come across is a pain.

  • Much software, including email clients and web browsers, does not easily allow the tab size to be configured (most treat a tab as 8 spaces, some as 4). Such software also has no way to guess which tab size would be most suitable in each situation.

  • Having to worry about tab width is something that a user who is just occasionally browsing a page or reading emails on a mailing list shouldn’t have to do.

  • Respecting the maximum width of a line is much harder when using tabs instead of spaces, since my tab size may differ from my neighbor’s tab size.

  • If you get an error message saying that something on column 33 of line 82 is wrong, and that line is indented with tabs, finding out exactly which position is column 33 on that line can be a bit difficult.
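The last two points can be made concrete with a small calculation. The sketch below assumes the common rule that a tab advances to the next multiple of the tab size; TabWidthDemo and visualWidth are hypothetical names, not part of any tool.

```java
class TabWidthDemo {

    // Width of the line as displayed, assuming each tab advances
    // to the next multiple of tabSize.
    static int visualWidth(String line, int tabSize) {
        int column = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '\t') {
                column += tabSize - (column % tabSize);
            } else {
                column++;
            }
        }
        return column;
    }

    public static void main(String[] args) {
        String line = "\t\tx"; // two tabs followed by one character
        System.out.println(visualWidth(line, 4)); // prints 9
        System.out.println(visualWidth(line, 8)); // prints 17
    }
}
```

The same three characters occupy 9 columns for one reader and 17 for another, so neither a line-length limit nor a column number in an error message means the same thing to both.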

In practice, many communities in various programming languages are very slowly abandoning tabs and adopting indentation with spaces only. Sun itself ended up doing so for the reasons described above. Python 3’s indentation conventions also discourage tabs and accept them only to remain compatible with code written for previous versions.

This migration from tabs to spaces is very slow (taking decades) because many people refuse to give up tabs or simply hate indenting with spaces, and because much software uses tabs as its default way of indenting. This is the most vehement debate and disagreement over code-writing conventions.

The cases of switch

This is also a point of disagreement, although a much smaller one than the previous three. In practice, two conventions compete with each other:

  1. Put the cases and the default at the same indentation level as the switch:

    switch (x) {
    case 1:
        // Blablabla
    case 2:
        // Blablabla
    default:
        // Blablabla
    }
    
  2. Put the cases and the default one indentation level deeper than the switch:

    switch (x) {
        case 1:
            // Blablabla
        case 2:
            // Blablabla
        default:
            // Blablabla
    }
    

The } before else, catch and finally

This one is a silly detail, but there are three different styles:

  1. if (x)
    {
        // blabla
    }
    else
    {
        // blabla
    }
    
    try
    {
        // blabla
    }
    catch (AlgumaException x)
    {
        // blabla
    }
    finally
    {
        // blabla
    }
    
  2. if (x) {
        // blabla
    } else {
        // blabla
    }
    
    try {
        // blabla
    } catch (AlgumaException x) {
        // blabla
    } finally {
        // blabla
    }
    
  3. if (x) {
        // blabla
    }
    else {
        // blabla
    }
    
    try {
        // blabla
    }
    catch (AlgumaException x) {
        // blabla
    }
    finally {
        // blabla
    }
    

People who put { on its own line almost always adopt the first style.

Those who put { on the same line as the statement that opens the block tend to adopt the second style, but in some cases may prefer the third.

Variables of type array

There are two equally valid ways to declare an array in Java:

  1. public static void main(String[] args)
    
  2. public static void main(String args[])
    

In general the first form is considered superior, because it follows the pattern [variable type + variable name] that applies to every other variable declaration in the language. The second form is much less readable and exists only because it was inherited from C and C++: you first declare part of the variable’s type, then the variable name, and then the remaining part of the type, so the type information is needlessly spread across two places.
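One concrete consequence of splitting the type in two, not mentioned above but defined by the language, shows up when several variables are declared in one statement. In the sketch below, ArrayDeclarationDemo and describe are hypothetical names used only for illustration.

```java
class ArrayDeclarationDemo {

    static String describe() {
        int[] a, b;   // brackets on the type: both a and b are int[]
        int c[], d;   // brackets on the name: c is int[], but d is a plain int

        a = new int[] {1, 2};
        b = new int[] {3};
        c = new int[] {4, 5, 6};
        d = 7;        // assigning an array to d would not compile

        return a.length + " " + b.length + " " + c.length + " " + d;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints 2 1 3 7
    }
}
```

With the brackets on the type, the declaration reads the same way for every variable on the line, which is another argument for the first form.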

Alignment of parameters on multiple lines

This one is somewhat controversial and concerns the positioning of method and constructor parameters when there are many of them. Consider the following cases:

  1. public String metodo(
            int x,
            int y,
            int z);
    
  2. public String metodo(int x,
                         int y,
                         int z);
    

Both forms are found in the wild, but I personally favor only the first, for the following reasons:

  • The first form maintains the same indentation pattern as the rest of the code, and does not cause any line to end up indented by a number of spaces that is not a multiple of the tab size.

  • The second form is fragile: if you change the method name, the return type or any of the modifiers static, public, protected, private, strictfp, abstract, final, default or native, you will have to worry about not messing up the alignment of the parameters.

  • If the second form is written with tabs, the result will be a disaster. It will often be necessary to mix spaces and tabs in the indentation, because the indentation of the parameter lines may not be a multiple of the tab size. In addition, only one specific tab size will produce the proper alignment, and the idea that any tab size would suit goes out the window.

In the first form above, the parameters are normally indented two levels relative to the method declaration. The reason is to place them at a different indentation level from both the method body and the declaration itself. For example:

public String meuMetodo(
        int parametro1,    // Two indentation levels beyond the header.
        int parametro2)
{
    return "abc";          // One indentation level beyond the header.
}

Braces after if, else, while and for

The use of braces ({}) after if, else, while or for is optional in Java when the body is a single statement (a feature inherited from C and C++). This is because the body of these constructs is defined as either a single statement or a set of statements delimited by braces.

Note the following two forms:

  1. if (x) {
        fazerAlgumaCoisa();
    } else {
        fazerOutraCoisa();
    }
    
    for (int i = 0; i < 10; i++) {
        System.out.println(i);
    }
    
  2. if (x)
        fazerAlgumaCoisa();
    else
        fazerOutraCoisa();
    
    for (int i = 0; i < 10; i++)
        System.out.println(i);
    

The two forms are equivalent, and some people like the second form. I strongly oppose the second form because it is too prone to accidental mistakes:

if (x)
    fazerAlgumaCoisa();
    fazerOutraCoisa();

estamosForaDoIf();

Note that in this case the indentation is deceptive: it makes the call to fazerOutraCoisa(); appear to be inside the if when it is actually outside. This often leads programmers to fool themselves and write buggy code, which could be prevented by always using braces in if, else, for and while blocks.
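To check this behavior rather than take it on faith, here is a minimal sketch that records which calls actually happen. MissingBracesDemo and its two methods are hypothetical, and the strings merely stand in for the method calls in the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

class MissingBracesDemo {

    static List<String> withoutBraces(boolean x) {
        List<String> calls = new ArrayList<>();
        if (x)
            calls.add("fazerAlgumaCoisa");
            calls.add("fazerOutraCoisa"); // indented like the if body, but always runs
        return calls;
    }

    static List<String> withBraces(boolean x) {
        List<String> calls = new ArrayList<>();
        if (x) {
            calls.add("fazerAlgumaCoisa");
            calls.add("fazerOutraCoisa"); // really inside the if
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(false)); // prints [fazerOutraCoisa]
        System.out.println(withBraces(false));    // prints []
    }
}
```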

Another disastrous case:

if (x)
    //estamosDentroDoIf();

estamosForaDoIf();

In the above case, when the line inside the if was commented out, the next line, which was outside the if, ended up sneaking into it!

Another case:

if (x)
    if (y)
        System.out.println("x and y are true.");
else
    System.out.println("x is false.");

Note that the else appears to belong to the outer if, but it actually belongs to the inner if (y), so the code won’t do what the programmer thinks it does.
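The difference is easy to verify with a sketch that returns the message instead of printing it. DanglingElseDemo, misleading and intended are hypothetical names; the first method mirrors the snippet above and the second shows what the indentation suggested.

```java
class DanglingElseDemo {

    // Same structure as the snippet above: the else binds to the inner if.
    static String misleading(boolean x, boolean y) {
        String message = "nothing printed";
        if (x)
            if (y)
                message = "x and y are true.";
        else
            message = "x is false.";
        return message;
    }

    // What the indentation suggested: braces attach the else to the outer if.
    static String intended(boolean x, boolean y) {
        String message = "nothing printed";
        if (x) {
            if (y) {
                message = "x and y are true.";
            }
        } else {
            message = "x is false.";
        }
        return message;
    }

    public static void main(String[] args) {
        System.out.println(misleading(true, false));  // prints x is false.
        System.out.println(misleading(false, false)); // prints nothing printed
        System.out.println(intended(false, false));   // prints x is false.
    }
}
```

In the version without braces, the "x is false." message appears precisely when x is true and y is false, the opposite of what the text claims.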

There is only one exception where I think it’s worth omitting the braces in an if (but that’s my personal opinion): when the if has no else and fits on a single line:

if (x) fazerAlgumaCoisa();
estamosForaDoIf();

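The try, catch, finally, switch and synchronized blocks do not suffer from this problem because braces are mandatory in them. In do...while the braces are technically optional, but the closing while marks the end of the body, so the same confusion does not arise.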

Indentation of commented lines

Another difference between the convention used by Sun/Oracle and Eclipse concerns the indentation of commented lines.

  • Sun/Oracle style:

    public class X {
        public void x() {
            // This line is a comment.
            int x = 5;
            // x++;
        }
    }
    
  • Eclipse style:

    public class X {
        public void x() {
    //      This line is a comment.
            int x = 5;
    //      x++;
        }
    }
    

Personally, I hate the Eclipse style, because the spaces between the // and the text are no longer indentation as it is defined (spaces at the beginning of the line), and if the indentation is done with tabs, the result is tabs in the middle of the line rather than only at the beginning, which is horrible. Also, an inattentive reader may not notice that the x++; line is commented out, especially if there are many levels of indentation and the editor has no syntax highlighting or an inadequate one.

Other

There are other points to consider as well, such as:

  • Should line breaks be placed before or after binary operators in very long logical or mathematical expressions? Breaking before the operator is the prevailing choice, because it makes clear that the line continues the previous one.

  • Where to break lines in method calls with many complex parameters?

  • What is the best way to order attributes, methods, constructors and inner classes within a given class?

  • In what order should annotations be applied to classes, attributes, methods and constructors?

  • What are the best ways to give good names to classes, methods, attributes, parameters and local variables, avoiding names that are too long while keeping them sufficiently descriptive and understandable?

  • Code everything in English, or use identifiers with names in Portuguese (or some other language)? Are cases that lead to identifiers mixing two different languages acceptable? If you want everything in English, are the project’s programmers fluent enough in English?

  • Place a line break at the end of the file or not?

  • Where to put blank lines inside the code of some method?

  • Should line breaks in source code be \r (classic Mac OS), \n (Unix/Linux) or \r\n (Windows)?

  • Should the character encoding be UTF-8 or ISO-8859-1? UTF-8 has proved increasingly advantageous in this dispute due to better standardization, a lower probability of unpleasant surprises with encodings, and the ability to encode any character from anywhere in the world, including emojis.

  • A lot of other little details you can imagine.

The convention you must adopt

Finally, which conventions to follow is ultimately at your discretion. For naming identifiers, I see little reason to depart from the convention: even if some other scheme might actually have been better, you will already be using many library classes and methods that follow the standard convention (including those in the java.lang package), so going against it would leave you with code in a heterogeneous, inconsistent style.

On the other hand, the choice of tabs vs. spaces, indentation size, { at the end of the line that opens the block or on its own line, maximum line length, where to place spaces, etc., is more at your discretion and gives you more freedom. Just weigh the pros and cons of each approach before you decide and, whatever you decide, be consistent and stick with it.

Checkstyle

There is also a tool widely used in professional-grade Java projects called Checkstyle. It checks whether Java code matches the style rules you define for the project, reporting any violation, however minor.

The tool is quite flexible and configurable: it integrates with all widely used IDEs and lets you specify the style rules to adopt in an XML file. It is free and open source, and its development is very active, with frequent updates and support for Java 11 already in place.

  • 6

    And I thought my answers were long:P

  • 1

    Perfect, maybe even more than I asked for. Great information, and I feel safer knowing that several quirks are just quirks and nothing more, i.e., they have no theoretical or practical basis. I think that in the end what should count is the standard adopted by the team/software/company; the problem is that this standard is not always defined beforehand, so you end up with a mess.

0

The ideal is to use what your IDE (e.g. Eclipse, Android Studio) uses, because then auto-formatting does the work for you.

As far as I know, Java does not have an official coding style as ubiquitous as Go’s (which ships with gofmt) or even Python’s, whose coding style is defined by the PEP 8 document (which I don’t follow; I prefer tabs of size 8).
