In this post we are going to look at the cost of having a final static boolean in the code. Such a field can be very useful to enable or disable certain behavior, e.g. tracing or logging. The question is what kind of performance implications it has.
The reason for making this post is that I didn't know the implications and I asked the question on the Mechanical Sympathy mailing list. So I would like to thank the people on this mailing list for answering my question.
For this post we have the following assumptions:
- we only care about the output of the C2 compiler
- we are using Java HotSpot 1.8.0_91
Constant expression
Let's start with the most basic case where the final static field is initialized using a constant expression:
public class StaticFinal_ConstantExpression {
    public static void main(String[] args) {
        int result = 0;
        for (int k = 0; k < 100_000; k++) {
            result += doMath(k);
        }
        System.out.println(result);
    }
    final static boolean ENABLED = true;
    public static int doMath(int a) {
        if (ENABLED) {
            return a + 1;
        } else {
            return a - 1;
        }
    }
}
The actual logic in 'doMath' isn't terribly exciting. Its main purpose is to provide easy-to-understand bytecode and Assembly.
When we check the bytecode for the 'doMath' method using 'javap -c StaticFinal_ConstantExpression.class' we get the following:
public static int doMath(int);
  Code:
     0: iload_0
     1: iconst_1
     2: iadd
     3: ireturn
If we would convert this back to Java we would get:
public static int doMath(int a) {
    return a + 1;
}
The Javac has propagated the ENABLED constant and completely removed the dead code. We don't even need to look at the Assembly.
Be careful with final statics and constant expressions: Javac copies the value into every class that reads it, so if the value is changed and one or more classes that read this value are not recompiled, they will not see the new value.
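One way to observe that readers carry their own copy of the constant, without setting up separate compilation, is to check whether reading the field triggers initialization of the declaring class. The sketch below is only an illustration and was not part of the original experiment; the names 'ConstantInliningDemo', 'CompileTimeFlag', 'RuntimeFlag' and 'INITIALIZED' are made up for this example. It relies on the Java Language Specification rule that reading a constant variable does not initialize the class that declares it.

```java
import java.util.ArrayList;
import java.util.List;

public class ConstantInliningDemo {
    // Records which holder classes have run their static initializer.
    public static final List<String> INITIALIZED = new ArrayList<>();

    // Hypothetical holder whose flag is a compile-time constant.
    static class CompileTimeFlag {
        static final boolean ENABLED = true;
        static {
            INITIALIZED.add("CompileTimeFlag");
        }
    }

    // Hypothetical holder whose flag is only known at runtime.
    static class RuntimeFlag {
        static final boolean ENABLED = Boolean.getBoolean("enabled");
        static {
            INITIALIZED.add("RuntimeFlag");
        }
    }

    public static boolean readCompileTimeFlag() {
        // Javac replaces this read with the literal value.
        return CompileTimeFlag.ENABLED;
    }

    public static boolean readRuntimeFlag() {
        // This read is a real field access and initializes RuntimeFlag.
        return RuntimeFlag.ENABLED;
    }

    public static void main(String[] args) {
        readCompileTimeFlag();
        System.out.println("after constant read: " + INITIALIZED);
        readRuntimeFlag();
        System.out.println("after non-constant read: " + INITIALIZED);
    }
}
```

The first line printed should show an empty list, since 'CompileTimeFlag' is never touched at runtime; only the second read should add an entry. The same mechanism is what makes a stale copy possible when the declaring class is recompiled and the reader is not.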
Non constant expression
In the previous example there was a hard coded constant value for ENABLED. In practice you often want something more flexible, e.g. using some kind of System property. So let's change the ENABLED initialization so it gets its value from a System property 'enabled'.
public class StaticFinal_NonConstantExpression {
    public static void main(String[] args) {
        int result = 0;
        for (int k = 0; k < 100_000; k++) {
            result += doMath(k);
        }
        System.out.println(result);
    }
    final static boolean ENABLED = Boolean.getBoolean("enabled");
    public static int doMath(int a) {
        if (ENABLED) {
            return a + 1;
        } else {
            return a - 1;
        }
    }
}
And if we display the relevant bytecode using 'javap -c StaticFinal_NonConstantExpression.class', we get the following.
static final boolean ENABLED;
public static int doMath(int);
  Code:
     0: getstatic #6 // Field ENABLED:Z
     3: ifeq 10
     6: iload_0
     7: iconst_1
     8: iadd
     9: ireturn
    10: iload_0
    11: iconst_1
    12: isub
    13: ireturn
static {};
  Code:
     0: ldc #7 // String enabled
     2: invokestatic #8 // Method java/lang/Boolean.getBoolean:(Ljava/lang/String;)Z
     5: putstatic #6 // Field ENABLED:Z
     8: return
We can see that 'doMath' still contains the check and the logic for both branches. The Javac has not made any optimizations since it doesn't know which value ENABLED is going to have at runtime.
Let's go a level deeper and see what kind of Assembly we are going to get. To display the Assembly, we'll use the following parameters:
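Since the runtime value of ENABLED now depends entirely on 'Boolean.getBoolean', it is worth recalling how that method resolves a property. The sketch below is an illustration only; the helper 'flagFor' and the property name 'demo.enabled' are hypothetical. 'Boolean.getBoolean' returns true only when the system property exists and equals "true" ignoring case; any other value, or a missing property, gives false.

```java
public class GetBooleanDemo {
    // Hypothetical helper: sets the property (or clears it when value is null)
    // and returns what Boolean.getBoolean reports for it.
    public static boolean flagFor(String value) {
        String name = "demo.enabled";
        if (value == null) {
            System.clearProperty(name);
        } else {
            System.setProperty(name, value);
        }
        return Boolean.getBoolean(name);
    }

    public static void main(String[] args) {
        System.out.println(flagFor("true")); // true
        System.out.println(flagFor("TRUE")); // true, comparison ignores case
        System.out.println(flagFor("yes"));  // false, only "true" counts
        System.out.println(flagFor("1"));    // false
        System.out.println(flagFor(null));   // false, property not set
    }
}
```

Because ENABLED is a static final field, the property is read once, when the class is initialized. Changing the property afterwards has no effect on the field.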
-XX:+UnlockDiagnosticVMOptions
-XX:PrintAssemblyOptions=intel
-XX:-TieredCompilation
-XX:-Inline
-XX:CompileCommand=print,*.doMath
-Denabled=true
Tiered compilation is disabled since we are only interested in the C2 output. Inlining is disabled to prevent the 'doMath' method getting inlined into the main loop. Also we set the enabled system property to true.
When we run, we get the following Assembly:
Compiled method (c2) 248 8 com.constant_folding.StaticFinal_NonConstantExpression::doMath (14 bytes)
total in heap [0x00000001083a7a90,0x00000001083a7c60] = 464
relocation [0x00000001083a7bb0,0x00000001083a7bb8] = 8
main code [0x00000001083a7bc0,0x00000001083a7be0] = 32
stub code [0x00000001083a7be0,0x00000001083a7bf8] = 24
oops [0x00000001083a7bf8,0x00000001083a7c00] = 8
metadata [0x00000001083a7c00,0x00000001083a7c08] = 8
scopes data [0x00000001083a7c08,0x00000001083a7c18] = 16
scopes pcs [0x00000001083a7c18,0x00000001083a7c58] = 64
dependencies [0x00000001083a7c58,0x00000001083a7c60] = 8
Loaded disassembler from /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/jre/lib/hsdis-amd64.dylib
Decoding compiled method 0x00000001083a7a90:
Code:
[Disassembling for mach='i386:x86-64']
[Entry Point]
[Verified Entry Point]
[Constants]
# {method} {0x00000001d08e24b8} 'doMath' '(I)I' in 'com/constant_folding/StaticFinal_NonConstantExpression'
# parm0: rsi = int
# [sp+0x20] (sp of caller)
0x00000001083a7bc0: sub rsp,0x18
0x00000001083a7bc7: mov QWORD PTR [rsp+0x10],rbp ;*synchronization entry
; - com.constant_folding.StaticFinal_NonConstantExpression::doMath@-1 (line 16)
0x00000001083a7bcc: mov eax,esi
0x00000001083a7bce: inc eax ;*iadd
; - com.constant_folding.StaticFinal_NonConstantExpression::doMath@8 (line 17)
0x00000001083a7bd0: add rsp,0x10
0x00000001083a7bd4: pop rbp
0x00000001083a7bd5: test DWORD PTR [rip+0xfffffffffff74425],eax # 0x000000010831c000
; {poll_return}
0x00000001083a7bdb: ret
0x00000001083a7bdc: hlt
0x00000001083a7bdd: hlt
0x00000001083a7bde: hlt
0x00000001083a7bdf: hlt
[Exception Handler]
[Stub Code]
0x00000001083a7be0: jmp 0x000000010839af60 ; {no_reloc}
[Deopt Handler Code]
0x00000001083a7be5: call 0x00000001083a7bea
0x00000001083a7bea: sub QWORD PTR [rsp],0x5
0x00000001083a7bef: jmp 0x0000000108375d00 ; {runtime_call}
0x00000001083a7bf4: hlt
0x00000001083a7bf5: hlt
0x00000001083a7bf6: hlt
0x00000001083a7bf7: hlt
OopMapSet contains 0 OopMaps
That's a lot of output. Let's remove everything that isn't relevant:
0x00000001083a7bcc: mov eax,esi
;; copy the content of 'a' into eax
0x00000001083a7bce: inc eax
;; increase eax by one
The JIT has propagated the ENABLED constant and removed the dead code.
If we run with '-Denabled=false', we'll get similar Assembly:
0x000000010b4eb7cc: mov eax,esi
0x000000010b4eb7ce: dec eax ;*isub
; - com.constant_folding.StaticFinal_NonConstantExpression::doMath@12 (line 19)
So also in this case the JIT has propagated the constant and removed the dead code.
Original size of bytecode matters
So it seems that we can use a static final with a non-constant expression to enable or disable certain behavior at no cost. Unfortunately this isn't entirely true. Inlining can still be prevented because the decision to inline is based on the original bytecode size, not on what remains after dead code elimination. To demonstrate this we'll use the following code:
public class StaticFinal_OriginalSizeMatters {
    public static void main(String[] args) {
        int result = 0;
        for (int k = 0; k < 1_000_000; k++) {
            result += doMath(k);
        }
        System.out.println(result);
    }
    final static boolean ENABLED = Boolean.getBoolean("enabled");
    public static int doMath(int a) {
        if (ENABLED) {
            System.out.print("n");
            System.out.print("e");
            System.out.print("v");
            System.out.print("e");
            System.out.print("r");
            return a + 1;
        } else {
            return a - 1;
        }
    }
}
When we run using:
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining
-XX:FreqInlineSize=50
-Denabled=false
We'll see the following output:
@ 12 com.constant_folding.StaticFinal_OriginalSizeMatters::doMath (54 bytes) callee is too large
@ 27 java/io/PrintStream::println (not loaded) not inlineable
@ 12 com.constant_folding.StaticFinal_OriginalSizeMatters::doMath (54 bytes) callee is too large
@ 27 java/io/PrintStream::println (not loaded) not inlineable
@ 12 com.constant_folding.StaticFinal_OriginalSizeMatters::doMath (54 bytes) hot method too big
So even though ENABLED is false, the method is still too fat to get inlined because the original bytecode is used.
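A common way to keep such a guarded method small is to move the body of the optional branch into its own method, so that the bytecode of the hot method is little more than the check and a call. The sketch below was not part of the measurements above; the class name and the 'doMathTraced' helper are hypothetical, and whether 'doMath' then stays under the inlining threshold should be verified with 'javap -c' and '-XX:+PrintInlining' rather than assumed.

```java
public class StaticFinal_SlowPathExtracted {
    final static boolean ENABLED = Boolean.getBoolean("enabled");

    // Hot method: only the check, a call and the cheap branch remain here.
    public static int doMath(int a) {
        if (ENABLED) {
            return doMathTraced(a);
        }
        return a - 1;
    }

    // Hypothetical helper holding the bulky branch. Its bytecode no longer
    // counts towards the size of doMath.
    private static int doMathTraced(int a) {
        System.out.print("n");
        System.out.print("e");
        System.out.print("v");
        System.out.print("e");
        System.out.print("r");
        return a + 1;
    }

    public static void main(String[] args) {
        int result = 0;
        for (int k = 0; k < 1_000_000; k++) {
            result += doMath(k);
        }
        System.out.println(result);
    }
}
```

This keeps the check in one place, so callers do not have to test ENABLED themselves. It does not remove the size of the check and the call from 'doMath', so a caller that is itself close to the threshold can still be affected.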
Conclusion
A final static boolean with a constant expression is completely free. The Javac will do the constant propagation and dead code elimination and there is no price to pay.
A final static boolean with a non-constant expression will be fully optimized by the JIT. However, inlining can be prevented because the original size of the bytecode determines if something gets inlined, not what the JIT made out of it.
If I understood your point correctly, you can divide your "decider" method ('doMath' in your case) into two: one for ENABLED=true (let's call it 'doMathEnabled') and one for ENABLED=false ('doMathDisabled').
Thus, in case of ENABLED=true 'doMath' and 'doMathEnabled' will be inlined without the influence of the size of 'doMathDisabled'.
Thank you for your reply.
Good point, it should allow the inlining of the 'doMath' methods.
However, now the calling method could itself be prevented from being inlined into another method. Effectively the problem has been moved.
Another problem is that you don't want to litter the caller code with if(enabled)then this else that. If the 'doMath' method is used in more than one place, you need to solve the problem more than once.