What is an interpreted language? Is Java interpreted?

30

In another question I stated that Java is an interpreted language, because that is how I had always understood it. But a comment corrected me, saying that Java is no longer interpreted.

So I have some questions:

  • What is an interpreted language?
  • So Java used to be interpreted but no longer is? Is that right?

3 answers

23

Interpretation

An interpreted language executes code directly from the source.

Interpretation works much like compilation (translation): it goes through lexical, syntactic, and semantic analysis, but does so on demand. The source code is read (line by line or in some other unit), analyzed through these steps, and then something is executed according to what was written.
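As a rough illustration of "read, analyze, execute on demand", here is a minimal sketch of a line-by-line interpreter. The class LineInterpreter and its tiny language (integer assignments with "+" only) are invented for this example; a real interpreter is far more elaborate.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy interpreter: analyzes one statement at a time and executes it immediately. */
class LineInterpreter {
    private final Map<String, Integer> variables = new HashMap<>();

    /** Analyzes and executes a single line such as "y = 2" or "x = y + z;". */
    void execute(String line) {
        // Lexical/syntactic analysis: split into "target = expression".
        String[] sides = line.replace(";", "").split("=");
        if (sides.length != 2) {
            throw new IllegalArgumentException("Syntax error: " + line);
        }
        String target = sides[0].trim();
        int value = 0;
        // The only operator this sketch understands is "+".
        for (String operand : sides[1].split("\\+")) {
            value += valueOf(operand.trim());
        }
        // Execution: the effect happens right away; no code is generated.
        variables.put(target, value);
    }

    /** Semantic check on demand: an operand is a literal or an already known variable. */
    private int valueOf(String operand) {
        if (operand.matches("\\d+")) {
            return Integer.parseInt(operand);
        }
        Integer value = variables.get(operand);
        if (value == null) {
            throw new IllegalStateException("Undefined variable: " + operand);
        }
        return value;
    }

    /** Returns the current value of a variable that has already been assigned. */
    int get(String name) {
        return variables.get(name);
    }

    public static void main(String[] args) {
        LineInterpreter interpreter = new LineInterpreter();
        for (String line : new String[] {"y = 2", "z = 3", "x = y + z;"}) {
            interpreter.execute(line);
        }
        System.out.println(interpreter.get("x")); // prints 5
    }
}
```

Note that an error on a later line is only discovered when that line is reached, which is typical of this style of execution.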

Java

Java worked like this in early versions.

There is still some confusion about this, because the code generated by the compiler still goes through a kind of "interpretation". But it is normally not considered interpreted code, since even this "interpretation" does not happen instruction by instruction.

To understand this better, note that Java code goes through the same analysis steps mentioned above. What changes is how the code is traversed and what is done at the end, and that is what distinguishes interpretation from compilation:

  • Interpretation works on short excerpts of the program, possibly line by line, and at the end executes whatever that excerpt specified.

  • Compilation works on larger units (functions, classes, packages), trying to understand the whole, and at the end it generates code. There is a translation into another form.

    In this case the output is code for the Java virtual machine (JVM; I like to link the page in Portuguese, but do not forget to read the English one, which is always better). It is like the machine code a computer understands, but specific to the Java platform rather than to a processor. So the program is compiled, but it cannot run directly on the processor, unlike languages such as C or Pascal, which normally produce code the processor understands directly.
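To make the "code for the virtual machine" concrete, here is a small sketch. The class Adder is invented for illustration; the comments show the JVM instructions javac emits for the method body, which can be inspected with the JDK's javap tool.

```java
/** Hypothetical example class; the names are illustrative only. */
class Adder {
    // Compiled by javac, the body of this method becomes these JVM instructions:
    //   iload_0   (push the first int argument)
    //   iload_1   (push the second int argument)
    //   iadd      (pop both, push their sum)
    //   ireturn   (return the int on top of the stack)
    static int add(int y, int z) {
        return y + z;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

Assuming the file is saved as Adder.java, running javac Adder.java produces Adder.class, and javap -c Adder prints the instructions above. None of them is an instruction of a physical processor.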

Jitter

This virtual machine code, called bytecode, is compiled as well, but by an extremely simple process: it is in a format that is easy to read and understand for this new compiler, which is completely different from the compiler for the language’s source code. It also does not have to worry about whether the code is correct, since that was checked before. And, above all, this compilation does not happen instruction by instruction.

This is done by a JIT (Just-in-Time) compiler, a compiler that generates the processor’s machine code, called native code. In Java’s case, this JIT compiler (or "jitter") turns the bytecode into native code, applying some optimizations that are only possible when you know the environment where the program is running: not only the computer, operating system, and settings, but also the other components (packages) being used together.

JIT compilation understands all of the intermediate code and generates native code on demand, as it becomes necessary. But there is a way to force this compilation to happen a little earlier.
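The text does not say which mechanism it has in mind for forcing earlier compilation. One possibility, assuming the HotSpot JVM, is to change the JIT behavior through command-line options. The class HotLoop below is a hypothetical workload written only to have something to observe.

```java
/** Hypothetical workload: a method called often enough to become "hot". */
class HotLoop {
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long checksum = 0;
        // Repeated calls give the JIT compiler a reason to compile sumOfSquares.
        for (int round = 0; round < 20_000; round++) {
            checksum += sumOfSquares(1_000);
        }
        System.out.println(checksum);
    }
}
```

On HotSpot, java -Xint HotLoop runs in interpreter-only mode, java -Xcomp HotLoop compiles methods on their first invocation instead of waiting for them to become hot, and java -XX:+PrintCompilation HotLoop logs each method as it is compiled. Other JVMs may use different options.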

This jitter did not exist in early versions of Java. A jitter normally does not affect the semantics of the language, so any language that was previously interpreted or compiled to bytecode can gain JIT compilation later. In fact, this is increasingly common. Examples include JavaScript, Lua, and PHP, which gained JIT compilation later in independent implementations.

A jitter usually only has to understand the bytecode format and the code of the processor it will run on; it does not need to know anything about the language. But there are jitters that work directly on source code, so in a way there is compilation on demand (at the moment the code is about to run), as opposed to the better-known ahead-of-time compilation. Even this on-demand compilation is not interpretation, because it generates code to be executed instead of executing it directly.

Compiled languages without machine code

There are language implementations that, strictly speaking, cannot be considered interpreted. They execute bytecode (sometimes called pseudo-code or p-code) but are not JIT-compiled. Execution is faster than pure interpretation but slower than JIT-compiled code, because in a way this bytecode is "interpreted" and run directly, without being transformed into native code. Lua (plain, without LuaJIT) is an example today.
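A minimal sketch of what "executing bytecode without a JIT" can look like: a loop that fetches one instruction at a time and acts on it. The class TinyVm and its four opcodes are invented for this example and do not correspond to the bytecode of Lua, Java, or any other real platform.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stack machine; the opcodes below are invented for this sketch. */
class TinyVm {
    static final int PUSH = 0; // followed by one operand: the constant to push
    static final int ADD = 1;  // pops two values, pushes their sum
    static final int MUL = 2;  // pops two values, pushes their product
    static final int HALT = 3; // stops and returns the top of the stack

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("Unknown opcode: " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Bytecode for (2 + 3) * 4, as a compiler might have emitted it earlier.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```

No parsing or error checking of source text happens here, which is why this is cheaper than pure interpretation, yet every instruction still pays for the dispatch in the switch, which native code would not.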

This is not new. One of the first mainstream languages, very successful in several parts of the world, including Brazil, was Clipper (a dialect that survives in modern form is Harbour). It worked this way, but because it produced an executable, many programmers believed it generated the same kind of code as C. In reality it was just p-code packaged inside the .exe. This is similar to what .NET does today: its programs look like native executables, but internally they contain bytecode.

But this technique has been around since the 1950s.

There are languages that do not generate bytecode but rather an AST (Abstract Syntax Tree). This is a step before code generation. A compiler normally (in virtually all known implementations) builds an AST after lexical and syntactic analysis, and the subsequent steps operate on this tree. The default Ruby implementation uses (or used; I may be out of date) this AST to run. Interpretation still happens on the AST, but it is not the usual process of interpretation. Either way, there was a prior compilation step.
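To illustrate execution directly on an AST, here is a minimal tree-walking sketch. The node types (AstNode, NumberNode, AddNode, MulNode) are hypothetical and are not taken from Ruby or any other real implementation; the tree is built by hand where a parser would normally build it.

```java
/** Demo entry point: evaluates a hand-built tree. */
class AstDemo {
    public static void main(String[] args) {
        // Tree a parser could build for the source text "2 + 3 * 4".
        AstNode tree = new AddNode(new NumberNode(2),
                new MulNode(new NumberNode(3), new NumberNode(4)));
        System.out.println(tree.evaluate()); // prints 14
    }
}

/** Hypothetical AST node: every node knows how to compute its own value. */
interface AstNode {
    int evaluate();
}

/** Leaf node holding an integer literal. */
final class NumberNode implements AstNode {
    private final int value;
    NumberNode(int value) { this.value = value; }
    public int evaluate() { return value; }
}

/** Inner node for addition: evaluates both children, then adds. */
final class AddNode implements AstNode {
    private final AstNode left;
    private final AstNode right;
    AddNode(AstNode left, AstNode right) { this.left = left; this.right = right; }
    public int evaluate() { return left.evaluate() + right.evaluate(); }
}

/** Inner node for multiplication: evaluates both children, then multiplies. */
final class MulNode implements AstNode {
    private final AstNode left;
    private final AstNode right;
    MulNode(AstNode left, AstNode right) { this.left = left; this.right = right; }
    public int evaluate() { return left.evaluate() * right.evaluate(); }
}
```

Operator precedence was already resolved when the tree was built, so running the program is just a recursive walk; no bytecode or native code is ever produced.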

Of course, there are Ruby implementations that work differently, including some that run on top of the Java platform: in the end, JRuby generates the same kind of bytecode that Java generates, and the JVM then JIT-compiles it. This shows the flexibility of the JIT infrastructure.

Some people consider these languages to be still interpreted (or semi-interpreted), since they do not execute native machine code. The interpretation is lighter, because part of the necessary work was done beforehand by a compiler, which produced something simple to manipulate with the "guarantee" that it contains no errors. But a program is still needed to understand this code and execute something indirectly. That would be interpretation.

From what I remember, Java was like this from the beginning; I think direct interpretation of source code never existed. That is, there was always javac, and the JVM interpreted the bytecode.

So it’s quite complicated to classify languages or even implementations as interpreted or compiled.

Languages are not interpreted

We cannot say that languages are interpreted, compiled, or even JIT-compiled. At most we can say that implementations have these characteristics. And they are not mutually exclusive. Although some people will say these are different implementations shipped together, it is possible for all three forms to exist in one implementation.

Conclusion

Obviously an interpreted program runs much more slowly than a compiled program whose machine code was generated in advance. JIT-compiled code pays a cost to generate machine code, but that cost is much lower than direct interpretation. Besides, this is done once and the machine code is then reused.

Pure interpretation today only makes sense during development or for running very short scripts. Hence any language used to build systems should have some form of compilation, even if optional.

  • +1 You have no idea how much this answer has helped me. Eternally grateful.

  • @bigown do you have a degree in Java?

  • @Marcusbecker how does one get a degree in Java?

22


An interpreted language is one that needs a special program, called an interpreter, for its programs to run. Contrast this with a compiled language, whose programs go through a process of translation from [semi-]human language to machine language.

When you write something like:

x = y + z;

you are telling the computer that you want to assign to the variable x the sum of the variables y and z. However, although that is the intention, at first it is just a text file. This file has to serve as input to another program, which will do something with it so that, at some point, your intention in writing the code is carried out.

  • The simplest way is interpretation: the program analyzes the instruction and then does what it says. Simple and direct!
  • A more complex way is compilation: the program analyzes the instruction, translates it into machine language, and outputs another program whose behavior should be to perform what you expressed in the code.

Between the two, there are several middle grounds:

  • You can analyze the instruction, convert it to machine code, and execute that machine code immediately, saving it in a kind of "cache" so that, if the same instruction has to be executed again, the previously generated machine code is reused and the analysis is not repeated. This process is called Just-In-Time compilation (JIT). It is the strategy used, for example, in the V8 JavaScript engine (used in Chrome and Node.js).

  • You can analyze the instruction and convert it not to machine code directly, but to another format (usually binary) that is simpler to interpret and/or compile. This is useful when you want to analyze the sources only once, but without "tying" the output to any specific platform. This strategy is used by Java, whose javac tool converts programs from textual format to bytecode format.

    In this case, the generated code is said to be executed by a "virtual machine": something similar to a real machine, with its own architecture, its own instruction set, and everything that in principle would describe a machine. Only this machine is not physical: it is just an interpreter/compiler of intermediate code for a specific architecture.

    • And to answer your question, in the old days bytecodes were used as input for an interpreter; today they are used as input for a JIT compiler, as described in the previous item.
  • You can analyze the instruction and convert it to an equivalent instruction in another language, then send it to be compiled/interpreted by the tools of that second language. This is widely used when programming in a restricted environment (e.g., the browser, which only supports JavaScript) using a language other than the one supported in that environment.
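The "translate once, keep in a cache, reuse" idea behind JIT compilation can be sketched by analogy. The class TranslationCache below is hypothetical, and it produces Java lambdas rather than real machine code, so it only mimics the caching behavior, not actual code generation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

/** Hypothetical "translate once, reuse afterwards" cache, loosely mimicking what a JIT does. */
class TranslationCache {
    private final Map<String, IntBinaryOperator> cache = new HashMap<>();
    private int translations = 0;

    /** Stand-in for code generation: turns an operator symbol into something executable. */
    private IntBinaryOperator translate(String operator) {
        translations++;
        switch (operator) {
            case "+": return (a, b) -> a + b;
            case "-": return (a, b) -> a - b;
            case "*": return (a, b) -> a * b;
            default: throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    /** Executes an instruction, translating it only if it has not been seen before. */
    int execute(String operator, int left, int right) {
        // computeIfAbsent calls translate() only the first time an operator is seen.
        return cache.computeIfAbsent(operator, this::translate).applyAsInt(left, right);
    }

    /** Number of times the (expensive) translation step actually ran. */
    int translationCount() {
        return translations;
    }
}
```

Calling execute("+", 2, 3) and then execute("+", 10, 20) triggers a single translation; the second call goes straight to the cached result of the first.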

Finally, remember that the line separating compiled from interpreted is not so well defined: even what we call "machine code" often still has to be converted into what we call microinstructions, which are what the actual CPU executes. Not every architecture has this distinction (in some the "machine code" runs directly), but the most commonly used ones do. At the end of the day, the result of compilation is not targeted at one specific architecture, but rather at a family of similar architectures (e.g., x86 and x86-64, which cover a huge range of machines from the 1980s until today).

9

What is an interpreted language?

It is a programming language in which the high-level code written by the programmer is interpreted by another computer program and then executed, i.e., the written code is not transformed into machine code, but interpreted by another program.

Is Java interpreted or compiled?

First let’s understand some terms:

javac - Compiler that turns code written in Java into bytecode.

Bytecode - Code in bytes; it differs from machine code in that it is not immediately executable.

JIT - Just-In-Time compiler; compiles bytecode to machine code at runtime, applying performance optimizations.

JVM - Virtual platform that loads the class file into RAM, verifies the bytecode for access restriction violations, and converts it into executable machine code.

Therefore the high-level code written in Java by the programmer is compiled by javac, which turns it into bytecode. The JVM then compiles the bytecode, through the JIT compiler, into a sequence of machine code instructions at runtime, before executing them natively. The JIT compiler’s main goal is heavy performance optimization. Given this, we can say that Java is not interpreted but compiled, since the high-level code written by the programmer is not executed directly by another program.
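A running JVM can be asked whether it has a JIT compiler at all. The sketch below uses the standard java.lang.management API; the class name JitInfo and the placeholder text are invented for this example, and the reported name depends on the JVM in use.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper that reports the JIT compiler of the running JVM. */
class JitInfo {
    /** Returns the JIT compiler's name, or a placeholder when the JVM has none. */
    static String jitName() {
        // getCompilationMXBean() returns null if the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "no JIT compiler" : jit.getName();
    }

    public static void main(String[] args) {
        System.out.println("JIT compiler: " + jitName());
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        // Compilation time is only available on JVMs that support monitoring it.
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("Time spent compiling (ms): " + jit.getTotalCompilationTime());
        }
    }
}
```

The output varies between JVMs and versions, so no specific result is shown here.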
