Why does the absence of the suffix L cause the long variable to be interpreted as int?

15

When I assign a small number to a long, as in long long1 = 9797;, the number is accepted even without the suffix L. However, with a much larger number (close to the minimum or maximum long value, for example), the value is only accepted as long if I use the suffix L. Otherwise, the compiler says it is out of range.

I first used a smaller value that was accepted without problems and without need of suffix and then larger values, which were only accepted with L:

public class Dúvida_sobre_long {

    public static void main(String[] args) {
        long long1 = 9797;
        long long2 = 922337203685477807;  // Compilation error on this line.
        long long3 = 922337203685477807L;

        System.out.println(long1);
        System.out.println(long2);
        System.out.println(long3);
    }

}

The build error is this:

The literal 922337203685477807 of type int is out of range.

I understand that without the suffix L the value is interpreted as int, but why does this happen with larger values, given that I declared the variable as long?

  • Just to complement: float values behave the same way. If you write float x = 10.0; the 10.0 is a double literal, not a float. You need to write float x = 10.0F; or float x = 10.0f; for the value to be "truly" float.

2 answers

13

Because the language creators decided so. There is no better explanation :) It is in the specification.

In fact, what is written there is an int literal, which the compiler implicitly converts to long. So it reserves 8 bytes, the size of a long, and stores an integer value that would only need 4 bytes; the remaining bytes are filled in by sign extension (zeros for non-negative values), so the value is the same. There is no runtime cost; it is just something the compiler has to deal with when building the code.
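A minimal sketch of the implicit widening described above. The class name WideningSketch and the helper widen are hypothetical, introduced only for illustration; returning an int from a method declared as long goes through the same int-to-long widening that applies to a small literal.

```java
class WideningSketch {
    // Hypothetical helper: returning an int where a long is expected
    // triggers an implicit widening conversion from int to long.
    static long widen(int value) {
        return value;
    }

    public static void main(String[] args) {
        long small = 9797;   // int literal, widened to long by the compiler
        long negative = -1;  // widening preserves the sign (upper bits become ones)
        System.out.println(small);
        System.out.println(Long.toHexString(negative));
    }
}
```

Printing the negative value in hexadecimal makes the upper bytes visible, which is an easy way to see what the extra 4 bytes were filled with.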

Could they have required the suffix L even for "small" values? Yes, they could, but they did not; they felt it was not necessary. To me this is inconsistent, but that is how it is and you must follow these rules. If you want to be more consistent, use the L wherever you can, even when it is not needed.

Could they have dropped the suffix requirement for large values, at least in a declaration or in cases with no ambiguity, and so inferred that the literal is a long? They could. They probably thought that would be bad. Maybe they thought it would add a chunk of code to the compiler, and therefore compilation time to handle it, that was not worth it.

I find that cost negligible next to what the compiler already does, especially given how inconsistent its behavior is: sometimes it needs to understand what comes next, sometimes it refuses to. But Java started out aiming for low cost, inferring nothing. Today it already infers some things.

I disagree with Victor Stafusa's conclusion, although his answer is correct and very good. If they wanted to simplify the compiler, they would require the suffix on every long literal. As it stands, the compiler makes exceptions. But exceptions are not the end of the world. Local inference (from the point of view of code analysis) does not create major difficulties for the compiler; what is complicated is inferring from something distant. And something like inference is done anyway, at least in this case: after all, the compiler has to decide that there is no ambiguity before accepting the literal without the suffix. The compiler would be simpler if the suffix were always mandatory, even for 0L. Everything in his answer shows that they did not make it that simple. Either make it really simple or go all the way in the other direction, instead of keeping one foot in each canoe.

Part of the reason is also that Java was based on languages that handled it this way.

  • Just to complement: the lowercase suffix (l) is also valid, but it is best avoided because it looks like the digit 1.

  • 1

    That alias is another piece of nonsense: since it is bad, it should not even be allowed, and it also used up a suffix that could be useful in the future (not so much this one, but all the others).

  • Relevant link: language specification

  • Yes, I agree. It was just to complement, because for float values it is "acceptable" to use the suffix in either lower or upper case.

  • @Carlosheuberger "not so inconsistent" is still inconsistent. It is inconsistent for L to exist while no S or B is required, or for the compiler to work out that the value is a long and then reject it just because there is no L, when the other types have no such suffix.

  • Then we are even, since you did not understand mine either. Never mind.

9

The structure of the compiler

Internally, the compiler is divided into several parts: lexical analysis, syntactic analysis, semantic analysis, code generation and code optimization.

The first of these parts, the lexical analysis, is responsible for breaking the source code into tokens. For example, given public static void main(String[] args) {, the lexical analyzer will see 11 different tokens: public, static, void, main, (, String, [, ], args, ) and {. In addition, the lexical analysis already makes a basic classification of each token: public, static and void are language keywords; main, String and args are identifiers; (, ), [, ] and { are special symbols. Indentation and comments are discarded by the lexical analysis and do not produce tokens.
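The tokenization step can be sketched with a toy lexer. This is only an illustration under simplifying assumptions, not how javac works: the class ToyLexer and its regular expression are hypothetical, it recognizes only identifier-like words and a handful of single-character symbols, and it silently skips everything else (including whitespace).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ToyLexer {
    // Either an identifier/keyword, or one of the symbols ( ) [ ] { }.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*|[()\\[\\]{}]");

    // Returns the tokens in order; unmatched characters are skipped.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("public static void main(String[] args) {");
        System.out.println(tokens.size() + " tokens: " + tokens);
    }
}
```

Running it on the method header from the paragraph above yields the same 11 tokens listed there.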

In the syntactic analysis, the tokens are grouped so that the compiler can work out the program structure. At this stage, it sees that modifiers (public and static) followed by a type (void), a name (main) and a parameter list in parentheses correspond to a method declaration.

Parsing transforms the code into a tree-like structure where, below the node representing the class, there are nodes representing fields, constructors and methods. Within the nodes that represent methods, there are nodes for the return type, the modifiers, the parameters, the exceptions and the body. Within each node that corresponds to a method body, there are further nodes, one for each statement of the method.

Semantic analysis is the step responsible for verifying that the program obtained from the syntactic analysis makes sense: that all the variables used have been declared and initialized, that all the methods called exist and receive arguments of the correct types, that there are no variables with repeated names in the same scope, and so on.

The literals int and long

The lexical analysis, on finding 9797, emits an int literal token, and on finding 9797L emits a long literal token. The answer to your question is that this distinction is made during lexical analysis. See here the lexical specification of that part.

So that the lexical analyzer can tell an int literal from a long literal, they decided that if it has the suffix L or l it is a long literal, and otherwise it is an int literal. This is a very simple rule, easy to understand.
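The suffix rule can be illustrated with a small classifier. This is a toy sketch, not the real lexer: the class LiteralKind and its methods are hypothetical, it handles only plain decimal digits, and it ignores hexadecimal, octal and binary forms, underscores, and the special treatment of a literal that follows a unary minus.

```java
import java.math.BigInteger;

class LiteralKind {
    // The suffix alone decides the type of the literal.
    static String typeOf(String literal) {
        return (literal.endsWith("L") || literal.endsWith("l")) ? "long" : "int";
    }

    // Once the type is fixed, the value must fit in that type's range.
    static boolean isAccepted(String literal) {
        boolean isLong = typeOf(literal).equals("long");
        String digits = isLong ? literal.substring(0, literal.length() - 1) : literal;
        BigInteger value = new BigInteger(digits);
        BigInteger max = BigInteger.valueOf(isLong ? Long.MAX_VALUE : Integer.MAX_VALUE);
        return value.compareTo(max) <= 0;
    }

    public static void main(String[] args) {
        for (String literal : new String[] {"9797", "922337203685477807", "922337203685477807L"}) {
            System.out.println(literal + " -> " + typeOf(literal)
                    + (isAccepted(literal) ? " (accepted)" : " (out of range)"));
        }
    }
}
```

The point of the sketch is the order of the decisions: the type is chosen from the suffix first, and only then is the range checked, which is why the declared type of the variable plays no role.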

Couldn't it be different?

It is true that they could have done otherwise, but compiler design is easier if the lexical analyzer can already separate int literals from long literals, at the cost of exposing this detail in the language through the suffix l or L. The same is true of the float literal, which requires the suffix f or F to distinguish it from a double literal.

The need to distinguish these literals explicitly is justified in particular by autoboxing:

Object a = 555;
Object b = 555L;
System.out.println(a.getClass().getName()); // java.lang.Integer
System.out.println(b.getClass().getName()); // java.lang.Long

Without the suffix, building 555 as a long would require a cast.
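For a value that fits in an int, the cast and the suffix end up producing the same boxed type. A minimal sketch, with the class name BoxingSketch and its two helpers being hypothetical names used only for this illustration:

```java
class BoxingSketch {
    // int literal, cast to long, then boxed as java.lang.Long
    static Object viaCast() {
        return (long) 555;
    }

    // long literal, boxed directly as java.lang.Long
    static Object viaSuffix() {
        return 555L;
    }

    public static void main(String[] args) {
        System.out.println(viaCast().getClass().getName());
        System.out.println(viaSuffix().getClass().getName());
    }
}
```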

If there were no L suffix, you would have to write this to create a long:

long y = (long) 922337203685477807;

But this does not work, because the number 922337203685477807 is already outside the valid range of int, so it cannot be built before the cast is applied. It must be built as a long from the start. That is why the L suffix exists.

They could have made these numbers long by default, but then when writing this:

int x = 555;

You would have a problem, because the literal is long and the variable is int. To solve this, you would either have to put a suffix on int literals, which would be much worse (having to write 555i instead of 555), or always use explicit casts to int, which would be horrible, since ints are everywhere.

Another possibility would be for the compiler to do a contextual analysis to determine whether the number fits in an int. But this is not feasible. For example:

int f = 150096 * g - h / 5;

How can one know whether this fits in an int without casts or specific suffixes? It is possible in principle, but it complicates the compiler's syntactic and semantic analysis to solve a minor language detail. That is, it would make the compiler's structure considerably more complicated.
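A small sketch of why context alone is not enough: the value of g is only known at run time, so the literal's suffix is what decides whether the multiplication is done in int or in long arithmetic. The class OverflowSketch and the value 20000 are assumptions chosen for illustration.

```java
class OverflowSketch {
    // int literal: the product is computed in 32 bits and may wrap around.
    static int intProduct(int g) {
        return 150096 * g;
    }

    // long literal: g is widened and the product is computed in 64 bits.
    static long longProduct(int g) {
        return 150096L * g;
    }

    public static void main(String[] args) {
        int g = 20000;
        System.out.println(intProduct(g));   // wrapped result of int arithmetic
        System.out.println(longProduct(g));  // full result of long arithmetic
    }
}
```

With the suffix rule, the programmer states the intended arithmetic explicitly instead of leaving the compiler to guess it from the surrounding expression.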

Another possibility would be for the lexical analyzer to check whether the number fits in an int, emitting an int literal token if it does or a long literal token if it does not. But that would have somewhat confusing side effects:

Object a = 2147483647;        // java.lang.Integer
Object b = (long) 2147483647; // java.lang.Long - the cast is required
Object c = 2147483648;        // java.lang.Long - Surprise! Now the cast is no longer needed.

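In the case of byte and short, there are no literals for them, which is quite annoying, so a cast from int, long or char is needed whenever the compiler cannot narrow the value by itself (a constant int expression that fits, as in byte b = 123;, can be assigned directly). For example: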

byte b = (byte) 123;
short s = (short) 1234L;

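However, since int and long are larger than byte and short, the cast works here: any byte or short value can first be written as an int or long literal. That is not the case going from int to long, where a large value cannot be written as an int literal in the first place.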
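A minimal sketch of these narrowing casts, including what happens when the value does not fit in the target type. The class NarrowingSketch and its helpers are hypothetical names used only for illustration.

```java
class NarrowingSketch {
    // Narrowing keeps only the low-order 8 bits of the int.
    static byte toByte(int value) {
        return (byte) value;
    }

    // Narrowing keeps only the low-order 16 bits of the long.
    static short toShort(long value) {
        return (short) value;
    }

    public static void main(String[] args) {
        System.out.println(toByte(123));     // fits in a byte, value preserved
        System.out.println(toShort(1234L));  // fits in a short, value preserved
        System.out.println(toByte(200));     // does not fit, high bits are discarded
    }
}
```

Unlike the int-to-long case, the source literal here can always be written, so the cast has something valid to operate on; the only risk is losing the high-order bits.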

In the case of float and double, the reverse occurs: the smaller type requires the suffix, which frees the larger type from needing one. Adopting the same approach for int and long would not be practical, because then it is int that would need the suffix (555i).
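The floating-point counterpart of the earlier autoboxing example can be sketched the same way. The class FloatLiteralSketch and its helpers are hypothetical names for illustration.

```java
class FloatLiteralSketch {
    // No suffix: a double literal, boxed as java.lang.Double
    static Object plain() {
        return 10.0;
    }

    // Suffix F: a float literal, boxed as java.lang.Float
    static Object suffixed() {
        return 10.0F;
    }

    public static void main(String[] args) {
        System.out.println(plain().getClass().getName());
        System.out.println(suffixed().getClass().getName());
    }
}
```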
