How many Strings are created in the code below?

8

How many Strings does the JVM actually create while executing the code snippets below?

1:

String s1 = "s1";

2:

String s2 = new String("s2");

3:

String s3 = "s3";
String s4 = s3 + "s4";

4:

String s5 = "s5";
String s6 = s5 + "-" + "s6";

5:

final String s7 = "s7";
String s8 = s7 + "s8";

6:

final String s9 = "s9";
String s10 = s9 + "-" + "s10";
  • 1

    Now that I've seen there's a -1: I think that's quite wrong, the question is excellent.

4 answers

7


It’s not a complete answer, but it gives an idea.

I compiled the following class with javac 1.8.0_111:

public class Teste {
    public String x() {
        String s1 = "s1";

        String s2 = new String("s2");

        String s3 = "s3";
        String s4 = s3 + "s4";

        String s5 = "s5";
        String s6 = s5 + "-" + "s6";

        final String s7 = "s7";
        String s8 = s7 + "s8";

        final String s9 = "s9";
        String s10 = s9 + "-" + "s10";

        return s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9 + s10;
    }
}

And decompiled the result:

/*
 * Decompiled with CFR 0_118.
 */
public class Teste {
    public String x() {
        String string = "s1";
        String string2 = new String("s2");
        String string3 = "s3";
        String string4 = string3 + "s4";
        String string5 = "s5";
        String string6 = string5 + "-s6";
        String string7 = "s7s8";
        String string8 = "s9-s10";
        return string + string2 + string3 + string4 + string5 + string6 + "s7" + string7 + "s9" + string8;
    }
}

Using a different decompiler:

// 
// Decompiled by Procyon v0.5.30
// 

public class Teste
{
    public String x() {
        final String s = "s1";
        final String s2 = new String("s2");
        final String s3 = "s3";
        final String string = s3 + "s4";
        final String s4 = "s5";
        return s + s2 + s3 + string + s4 + (s4 + "-s6") + "s7" + "s7s8" + "s9" + "s9-s10";
    }
}

Note that the compiler was smart and already performed some of the concatenations itself, so the result will not always be obvious. However, it clearly could be even smarter than it is.

I tried a variant that removes the final modifiers, believing it would make no difference. To my surprise, final did make a difference. Here are the decompilation results:

/*
 * Decompiled with CFR 0_118.
 */
public class Teste2 {
    public String x() {
        String string = "s1";
        String string2 = new String("s2");
        String string3 = "s3";
        String string4 = string3 + "s4";
        String string5 = "s5";
        String string6 = string5 + "-s6";
        String string7 = "s7";
        String string8 = string7 + "s8";
        String string9 = "s9";
        String string10 = string9 + "-s10";
        return string + string2 + string3 + string4 + string5 + string6 + string7 + string8 + string9 + string10;
    }
}

//
// Decompiled by Procyon v0.5.30
// 

public class Teste2
{
    public String x() {
        final String s = "s1";
        final String s2 = new String("s2");
        final String s3 = "s3";
        final String string = s3 + "s4";
        final String s4 = "s5";
        final String string2 = s4 + "-s6";
        final String s5 = "s7";
        final String string3 = s5 + "s8";
        final String s6 = "s9";
        return s + s2 + s3 + string + s4 + string2 + s5 + string3 + s6 + (s6 + "-s10");
    }
}

I also tried one more variant, putting final on all variables:

/*
 * Decompiled with CFR 0_118.
 */
public class Teste3 {
    public String x() {
        String string = new String("s2");
        return "s1" + string + "s3" + "s3s4" + "s5" + "s5-s6" + "s7" + "s7s8" + "s9" + "s9-s10";
    }
}

//
// Decompiled by Procyon v0.5.30
// 

public class Teste3
{
    public String x() {
        return "s1" + new String("s2") + "s3" + "s3s4" + "s5" + "s5-s6" + "s7" + "s7s8" + "s9" + "s9-s10";
    }
}

That is, the code with final on every variable compiles best.
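
The difference can also be observed without a decompiler by comparing references with ==. The following is only a minimal sketch, assuming a standard javac; the class and method names are made up for illustration. It relies on the language rule that a final String variable initialized with a constant expression is itself a constant, so concatenating it with literals is folded at compile time and the result refers to the pooled String.

```java
public class FinalConcatDemo {
    // s7 is final and initialized with a literal, so s7 + "s8" is a
    // compile-time constant expression and refers to the pooled "s7s8".
    static boolean finalOperandIsFolded() {
        final String s7 = "s7";
        String s8 = s7 + "s8";
        return s8 == "s7s8"; // true
    }

    // s3 is not final, so s3 + "s4" is evaluated at run time and yields
    // a newly created String, which is not the pooled "s3s4".
    static boolean nonFinalOperandIsFolded() {
        String s3 = "s3";
        String s4 = s3 + "s4";
        return s4 == "s3s4"; // false
    }

    public static void main(String[] args) {
        System.out.println(finalOperandIsFolded());    // prints true
        System.out.println(nonFinalOperandIsFolded()); // prints false
    }
}
```

The == comparison is used here only as a probe for object identity; normal code should compare Strings with equals.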

  • string8 and string9: only the concatenated Strings were created. The literals were not created first and then concatenated. These two Strings concatenate literals with final variables.

  • "Not a complete answer": that was the best answer so far, haha. So there is in fact a considerable difference when concatenating Strings with final variables, right? @bigown

  • 3

    I really think this is the one that gave the most useful information; I just don't see it as a definitive answer. @igorventurelli yes, it is clear for this compiler. What was not clear to me is what role the decompiler plays in the optimizations there. For me it was a crazy result :)

  • 1

    @igorventurelli Answer edited. Surprising.

  • @bigown Answer edited.

  • @Victorstafusa if I were writing the compiler I would try to do this optimization. Some people think it's a bit over the top. And then we fall back into those old questions: http://answall.com/q/2130/101 and http://answall.com/q/2044/101. Very good question, too bad people did not appreciate it. I'll see if I can run some C# tests later.

  • @mustache All right. I'm already getting excited to take a look at the javac 9 code. The curious thing is that if you compile, decompile, and then recompile the decompiled output, you should in this case get more efficient bytecode than the compiler itself generates, and I would expect the opposite.

  • 2

    @bigown really was very instructive.


3

In total, the proposed code creates 16 Strings, as noted in the added comments:

1:

String s1 = "s1"; // Object 1, placed in the pool

2:

String s2 = new String("s2"); // Objects 2 and 3: one literal (which goes to the pool) and another created with new

3:

String s3 = "s3"; // Object 4, placed in the pool
String s4 = s3 + "s4"; // Object 5 (s4), not placed in the pool, and Object 6 ("s4")

4:

String s5 = "s5"; // Object 7, placed in the pool
String s6 = s5 + "-" + "s6"; // Object 8 (s6), not placed in the pool; Object 9 ("-"), placed in the pool; and Object 10 ("s6")

5:

final String s7 = "s7"; // Object 11, placed in the pool
String s8 = s7 + "s8"; // Object 12, not placed in the pool, and Object 13 ("s8")

6:

final String s9 = "s9"; // Object 14, placed in the pool
String s10 = s9 + "-" + "s10"; // Object 15, not placed in the pool (reuses Object 9 from the pool), and Object 16 ("s10")

String pool

Java keeps a pool of String objects. Before creating a new String from a literal, it first checks whether a String with the same content already exists in this pool; if so, it reuses it, avoiding two identical objects in memory.

It is important to note that Java only puts Strings created from literals into the pool. Strings created with the new operator are not automatically placed in the pool. Another important point is that Strings resulting from concatenating literals are also placed in the pool. But this only occurs when there are literals on both sides of the concatenation. If any of the operands is not a literal, the result will be a new object, outside the pool.
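
The pool behavior described above can be checked with reference comparisons. This is just an illustrative sketch; the class and method names are hypothetical, and == is used only to test whether two references point to the same object.

```java
public class StringPoolDemo {
    // Two identical literals refer to the same pooled object.
    static boolean literalsShareOneObject() {
        String a = "s1";
        String b = "s1";
        return a == b; // true
    }

    // new String(...) always creates a separate object outside the pool.
    static boolean newStringIsPooledObject() {
        String literal = "s2";
        String copy = new String("s2");
        return copy == literal; // false
    }

    // intern() returns the pooled object with the same content.
    static boolean internReturnsPooledObject() {
        String copy = new String("s2");
        return copy.intern() == "s2"; // true
    }

    // A concatenation of two literals is folded by the compiler and pooled.
    static boolean literalConcatIsPooled() {
        String joined = "-" + "s6";
        return joined == "-s6"; // true
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());    // prints true
        System.out.println(newStringIsPooledObject());   // prints false
        System.out.println(internReturnsPooledObject()); // prints true
        System.out.println(literalConcatIsPooled());     // prints true
    }
}
```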

  • String s4 = s3 + "s4"; (assuming that we have s3 declared with some literal), how many strings are created at that point?

  • 1

    On second thought, I believe 2: "s4" and the result of the concatenation.

  • Exactly. In addition, there is special treatment for concatenations with final.

  • 1

    You can post an answer yourself!

  • Yes! I'm looking for the source where I found this, to be sure.

  • I haven't seen anything about that. Aren't you confusing it with static?


3

I'm going to give an alternative answer, because either the question is a trick question or it is poorly worded.

The question talks about creation at runtime. So it doesn't count the Strings that were already created at compile time, right?

It depends on the JVM implementation and the Java compiler used. There may or may not be optimizations in some cases. In others, optimization is certainly not possible, or is only possible if done aggressively.

The language specification does not determine exactly how the runtime must behave in several of these cases.

The question is tagged Java, but the text does not mention Java, only the JVM. Judging by the syntax it is unlikely to be another language, but it could be, in which case the results could differ.

  • Yes, runtime! I included the various forms of creation to get to the point about final: is there some mechanism by which concatenating a literal with a final variable (String) causes fewer Strings to be created than concatenating a literal with a non-final variable?

  • 1

    So, yes, but not necessarily. If you want to know whether it is possible, yes, it is. Whether it actually happens, it depends.

  • Okay, but it depends on what? I saw a superficial explanation of this on the English-language SO long ago, but I can't find it again.

  • It depends on the compiler and the runtime. They have a certain freedom to create Strings as they see fit. Some things in the specification are mandatory, others are not.

0

First, the compiler will already create the Strings "s1", "s2", "s3", "s4", "s5", "-s6", "s7", "s8", "s9" and "-s10", and these will be present in the generated bytecode and in the String pool during execution. The compiler is smart enough to transform "-" + "s6" into "-s6" and "-" + "s10" into "-s10" (at least javac 1.8.0_111 is).

So we’ve already started with these 10 Strings.

// 1. Only points to a String that already exists in the pool.
String s1 = "s1";

// 2. Creates a new String: "s2".
String s2 = new String("s2");

// 3a. Only points to an already existing String.
String s3 = "s3";

// 3b. Creates a new String: s4 = "s3s4"
String s4 = s3 + "s4";

// 4a. Only points to an already existing String.
String s5 = "s5";

// 4b. Creates a new String: s6 = "s5-s6"
String s6 = s5 + "-" + "s6";

// 5a. Only points to an already existing String.
final String s7 = "s7";

// 5b. Creates a new String: s8 = "s7s8"
String s8 = s7 + "s8";

// 6a. Only points to an already existing String.
final String s9 = "s9";

// 6b. Creates a new String: s10 = "s9-s10"
String s10 = s9 + "-" + "s10";

Beyond the 10 Strings originally present in the bytecode, 5 more Strings were created, in steps 2, 3b, 4b, 5b and 6b.
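
To illustrate what "creates a new String" means for a runtime concatenation such as step 3b, here is a small sketch (the class and method names are hypothetical). It assumes only the language rule that a concatenation which is not a constant expression produces a newly created String each time it is evaluated.

```java
public class RuntimeConcatDemo {
    // The parameter is not a compile-time constant, so this
    // concatenation is performed at run time on every call.
    static String concat(String left) {
        return left + "s4";
    }

    // Two evaluations give two distinct objects...
    static boolean sameObjectTwice() {
        String s3 = "s3";
        return concat(s3) == concat(s3); // false
    }

    // ...that nevertheless hold equal contents.
    static boolean sameContentsTwice() {
        String s3 = "s3";
        return concat(s3).equals(concat(s3)); // true
    }

    public static void main(String[] args) {
        System.out.println(sameObjectTwice());   // prints false
        System.out.println(sameContentsTwice()); // prints true
    }
}
```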

  • Yes. But it comes back to the same point I'm discussing with @bigown. When we concatenate a literal String with a final variable (also a String), the behavior is (or may be) different. That's what I want to understand.

  • 1

    @igorventurelli final has no effect on execution; it is entirely a compile-time thing. In this example, final is completely unnecessary and has no effect. It would make a difference if you later tried s9 = "x";, because that would be a compilation error due to the final. final means that you can only assign the variable once.

  • Yes, I understand that, but it doesn't seem to be the case. Long ago I saw a reply/comment on the English-language SO suggesting special behavior of final in this context. I will search again and post it here. If I don't find it, I'll close the topic.

  • 1

    @Victorstafusa which leaves room for optimization. I doubt it does it, but it is possible.

  • @bigown See my other answer: http://answall.com/a/185771/132

  • @Victorstafusa the most curious thing is the decompiler optimizing :D Now I don't trust the result. I don't know how much of the optimization was the compiler's and how much was the decompiler's. Of course the second one has decompiler optimizations; in the first, I don't know who optimized it, it could be a mix of the two.
