Wednesday, May 04, 2005

Evil String arithmetic revisited

So, let's take the case of a simple method like:
        public void printStr(String s){
            System.out.println("prefix " + s + " suffix");
        }

We've all been taught that you should do something like the following instead, because it will give you better performance, yada yada yada.
        public void printStr2(String s){
            StringBuffer buf = new StringBuffer("prefix")
                .append(s)
                .append(" suffix");
            System.out.println(buf.toString());
        }

Then people talk about what the compiler does.
So, I say, why not look at what the compiler actually does? Running javap (with the -c flag, to disassemble) against the compiled code, I see the following:
public void printStr(java.lang.String);
0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
3: new #3; //class java/lang/StringBuffer
6: dup
7: invokespecial #4; //Method java/lang/StringBuffer."<init>":()V
10: ldc #5; //String prefix
12: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
15: aload_1
16: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
19: ldc #7; //String suffix
21: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
24: invokevirtual #8; //Method java/lang/StringBuffer.toString:()Ljava/lang/String;
27: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
30: return

and I get the following against the second version:
public void printStr2(java.lang.String);
0: new #3; //class java/lang/StringBuffer
3: dup
4: ldc #10; //String prefix
6: invokespecial #11; //Method java/lang/StringBuffer."<init>":(Ljava/lang/String;)V
9: aload_1
10: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
13: ldc #7; //String suffix
15: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
18: astore_2
19: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
22: aload_2
23: invokevirtual #8; //Method java/lang/StringBuffer.toString:()Ljava/lang/String;
26: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
29: return

So the only real difference is that when we manually create the StringBuffer, we initialize it with the first string, whereas the compiler starts with an empty StringBuffer and appends everything to it.
So, that's all well and good. But what about when we compile under the new Java 1.5?
It generates the following:
public void printStr(java.lang.String);
0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
3: new #3; //class java/lang/StringBuilder
6: dup
7: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
10: ldc #5; //String prefix
12: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: aload_1
16: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: ldc #7; //String suffix
21: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: invokevirtual #8; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
27: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
30: return

We get a StringBuilder instead of the StringBuffer, just by rebuilding the code. Nice. And, according to Sun, it's faster because StringBuilder is not synchronized. So by following the *bad* practice, we get equal performance on Java 1.3, and better performance on Java 1.5 by simply recompiling. Oh, and the code is more readable.
Now, I am not saying to go out and get rid of all your StringBuffers. They really do have a place. I am saying to think about what your code is doing, and if there is some debate about compiler output, just run javap and disassemble the byte code. Then you really know what the output is.
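
If you want to try this yourself, here is a minimal sketch. The class name ConcatDemo and its two methods are made up for illustration and are not from the code above; the methods return the string instead of printing it so the two results are easy to compare. What javap shows depends on the compiler and version you use, and newer javac releases may emit something different again (for instance, invokedynamic instead of explicit append calls), which is all the more reason to look.

```java
// Hypothetical demo class (not from the original post).
// Save as ConcatDemo.java, then compile and disassemble with:
//   javac ConcatDemo.java
//   javap -c ConcatDemo
class ConcatDemo {
    // Plain concatenation: the compiler decides how to build the result.
    static String concat(String s) {
        return "prefix " + s + " suffix";
    }

    // Hand-written version using an explicit StringBuilder (Java 1.5+).
    static String concatManual(String s) {
        StringBuilder buf = new StringBuilder("prefix ");
        buf.append(s).append(" suffix");
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("middle"));
        System.out.println(concatManual("middle"));
    }
}
```

Both methods build the same string, so any difference you see in the javap output is purely about how the compiler gets there.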
You don't have to be an assembly genius to know that
3: new #3; //class java/lang/StringBuilder
creates a new object, or that
12: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
appends to the StringBuilder. An average developer should be able to reason out what's happening by reading the disassembled code. Also remember that different compilers will generate different output. Test with the one you will use for your builds, but take a look at the output from the Eclipse compiler, Jikes, GCJ, and the IBM and Sun JDKs. Don't fall into the trap of writing code for one compiler. And remember the old mantra, "premature optimization is the root of all evil". Write good, clean, readable code first, then use a profiler to tell you where the optimizations are needed. It's easier to optimize clean, functioning code than it is to debug highly optimized code.

Just my $0.02


At 04 May, 2005 11:16, Blogger willCode4Beer said...

I've received various responses, which I should answer with, "think about what the code is doing".

Anyway, my simplistic example didn't cover a case where String arithmetic *is* evil: loops.
So if you have something like:

String out = new String();
for(int i=0;i<cnt; i++){
    out += s;
}
You'll find the compiler is not that smart and will create a new StringBuffer and store the result of toString on each iteration.

The moral is as before, think about what the code is doing. If there's a question, just decompile it.
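
For comparison, here is a rough sketch of the loop written both ways. The class and method names (LoopConcat, repeatNaive, repeatWithBuilder) are invented for this example, and it assumes Java 1.5 or later for StringBuilder; on older versions StringBuffer would take its place.

```java
// Hypothetical helper class for illustration only.
class LoopConcat {
    // String += in a loop: every iteration produces a new intermediate
    // String that holds a copy of everything accumulated so far.
    static String repeatNaive(String s, int cnt) {
        String out = "";
        for (int i = 0; i < cnt; i++) {
            out += s;
        }
        return out;
    }

    // One StringBuilder shared by all iterations; toString() runs once.
    static String repeatWithBuilder(String s, int cnt) {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < cnt; i++) {
            buf.append(s);
        }
        return buf.toString();
    }
}
```

The two methods return the same text. Whether the difference matters for your value of cnt is a question for a profiler, not for guesswork.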


At 27 September, 2009 13:40, Anonymous Anonymous said...

In your added example, it isn't primarily the fact that the code is in a loop that makes it "evil", but the fact that it is applied to a situation where the text CHANGES. That is very different from the example discussed in your article. The most basic rule is to use String for text that doesn't change, and StringBuilder/StringBuffer for text that does change. (The fact that a small time penalty gets so much worse in a loop is secondary.)

