I have been inspired to start blogging again by people like Gil Tene, Martin Thompson, Nits Wakart, Peter Lawrey, Cliff Click and many others. They provide the community with so much insight into low-level performance issues that I want to contribute as well.
Java assert
Java asserts were added in Java 1.4 and are an easy way to check preconditions, postconditions and invariants. A nice feature of asserts is that they can easily be enabled or disabled, either globally (`-ea` / `-da`) or at the package or class level. For more information see the following page.

Let's have a look at the following example:

    public class Assert {
        public static void main(String[] args) {
            long l = 0;
            for (int k = 0; k < 100_000; k++) {
                l += twice(k);
            }
            System.out.println(l);
        }

        public static int twice(int a) {
            assert a != 0 : "a can't be 0";
            return a * 2;
        }
    }

We have a loop that does a simple calculation. We also store the result and print it so that the JIT doesn't apply dead code elimination. In the `twice` method there is a simple assert that checks that `a` is not 0. The assert condition itself isn't very important; its primary function is to give easy-to-understand assembly.
If we compile the code and dump the content of the class using `javap -c Assert.class`, we get the following bytecode:

    public class com.asserts.Assert {
      static final boolean $assertionsDisabled;

      public com.asserts.Assert();
        Code:
           0: aload_0
           1: invokespecial #1    // Method java/lang/Object."<init>":()V
           4: return

      public static void main(java.lang.String[]);
        Code:
           0: lconst_0
           1: lstore_1
           2: iconst_0
           3: istore_3
           4: iload_3
           5: ldc           #2    // int 100000
           7: if_icmpge     24
          10: lload_1
          11: iload_3
          12: invokestatic  #3    // Method twice:(I)I
          15: i2l
          16: ladd
          17: lstore_1
          18: iinc          3, 1
          21: goto          4
          24: getstatic     #4    // Field java/lang/System.out:Ljava/io/PrintStream;
          27: lload_1
          28: invokevirtual #5    // Method java/io/PrintStream.println:(J)V
          31: return

      public static int twice(int);
        Code:
           0: getstatic     #6    // Field $assertionsDisabled:Z
           3: ifne          20
           6: iload_0
           7: ifne          20
          10: new           #7    // class java/lang/AssertionError
          13: dup
          14: ldc           #8    // String a can't be 0
          16: invokespecial #9    // Method java/lang/AssertionError."<init>":(Ljava/lang/Object;)V
          19: athrow
          20: iload_0
          21: iconst_2
          22: imul
          23: ireturn

      static {};
        Code:
           0: ldc           #10   // class com/asserts/Assert
           2: invokevirtual #11   // Method java/lang/Class.desiredAssertionStatus:()Z
           5: ifne          12
           8: iconst_1
           9: goto          13
          12: iconst_0
          13: putstatic     #6    // Field $assertionsDisabled:Z
          16: return
    }

It is interesting to see how the assert has been implemented. A new synthetic static final field `$assertionsDisabled` has been added to the class, which gets initialized when the class is loaded. See the `static {}` block for more detail.
The first instruction of the `twice` method loads this static field, followed by a check whether the field is set. If `$assertionsDisabled` is false, execution continues with the assert statement at offset 6. If `$assertionsDisabled` is true (so asserts are disabled), there is a jump to the actual logic at offset 20.
If we convert the 'twice' bytecode back to Java, we would get:

    public class Assert {
        // Mirrors the synthetic field that javac initializes in 'static {}'.
        static final boolean $assertionsDisabled = !Assert.class.desiredAssertionStatus();

        public static int twice(int a) {
            if (!$assertionsDisabled) {
                if (a == 0) {
                    throw new AssertionError("a can't be 0");
                }
            }
            return a * 2;
        }
    }

So the assert doesn't require any special new bytecode instructions. It is translated to a few simple checks.
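The flag that the `static {}` block computes can also be observed from ordinary Java code. The following is a minimal sketch, not part of the original example: the class name `AssertStatus` and its two helper methods are made up for illustration. It reads the status through `Class.desiredAssertionStatus()`, the same call the synthetic initializer uses, and through the common idiom of an assignment inside an `assert`.

```java
// Hypothetical helper class; the names are illustrative only.
final class AssertStatus {

    // Same call that the synthetic 'static {}' block makes for $assertionsDisabled.
    static boolean viaClassObject() {
        return AssertStatus.class.desiredAssertionStatus();
    }

    // The assignment only runs when the assert statement itself is executed,
    // i.e. when assertions are enabled for this class.
    static boolean viaAssertStatement() {
        boolean enabled = false;
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("desiredAssertionStatus(): " + viaClassObject());
        System.out.println("assert executed:          " + viaAssertStatement());
    }
}
```

Started with `java -ea AssertStatus`, both lines should report `true`; with `-da` or no flag at all, `false`. Narrower switches such as `-ea:com.asserts...` (a package and its subpackages) or `-ea:com.asserts.Assert` (a single class) follow the same pattern.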
Assert disabled
As a performance engineer, I'm always worried about paying the price for something that isn't used. In this case, on the bytecode level, there is one extra branch to deal with when assertions are disabled and two branches when assertions are enabled. So far I'm not terribly excited, since branching and high-performance code are not really a good mixture.

So let's see if the JIT can optimize it away if assertions are disabled. For this I'm using the following JVM options:

    -XX:+UnlockDiagnosticVMOptions
    -XX:PrintAssemblyOptions=intel
    -XX:-TieredCompilation
    -XX:-Inline
    -XX:CompileCommand=print,*Assert.twice

TieredCompilation is disabled so we only get the C2 assembly output. Inlining is disabled so that the `twice` method doesn't get inlined in the main loop.
This provides us the following output:

    Compiled method (c2)     190    8             com.asserts.Assert::twice (24 bytes)
     total in heap  [0x0000000101d7f310,0x0000000101d7f4e0] = 464
     relocation     [0x0000000101d7f430,0x0000000101d7f438] = 8
     main code      [0x0000000101d7f440,0x0000000101d7f460] = 32
     stub code      [0x0000000101d7f460,0x0000000101d7f478] = 24
     oops           [0x0000000101d7f478,0x0000000101d7f480] = 8
     metadata       [0x0000000101d7f480,0x0000000101d7f488] = 8
     scopes data    [0x0000000101d7f488,0x0000000101d7f498] = 16
     scopes pcs     [0x0000000101d7f498,0x0000000101d7f4d8] = 64
     dependencies   [0x0000000101d7f4d8,0x0000000101d7f4e0] = 8
    Loaded disassembler from /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/jre/lib/hsdis-amd64.dylib
    Decoding compiled method 0x0000000101d7f310:
    Code:
    [Disassembling for mach='i386:x86-64']
    [Entry Point]
    [Verified Entry Point]
    [Constants]
      # {method} {0x00000001ca2ba4f8} 'twice' '(I)I' in 'com/asserts/Assert'
      # parm0:    rsi       = int
      #           [sp+0x20]  (sp of caller)
      0x0000000101d7f440: sub    rsp,0x18
      0x0000000101d7f447: mov    QWORD PTR [rsp+0x10],rbp  ;*synchronization entry
                                                    ; - com.asserts.Assert::twice@-1 (line 14)
      0x0000000101d7f44c: mov    eax,esi
      0x0000000101d7f44e: shl    eax,1              ;*imul
                                                    ; - com.asserts.Assert::twice@22 (line 15)
      0x0000000101d7f450: add    rsp,0x10
      0x0000000101d7f454: pop    rbp
      0x0000000101d7f455: test   DWORD PTR [rip+0xfffffffffe87fba5],eax        # 0x00000001005ff000
                                                    ;   {poll_return}
      0x0000000101d7f45b: ret
      0x0000000101d7f45c: hlt
      0x0000000101d7f45d: hlt
      0x0000000101d7f45e: hlt
      0x0000000101d7f45f: hlt
    [Exception Handler]
    [Stub Code]
      0x0000000101d7f460: jmp    0x0000000101d70f60  ;   {no_reloc}
    [Deopt Handler Code]
      0x0000000101d7f465: call   0x0000000101d7f46a
      0x0000000101d7f46a: sub    QWORD PTR [rsp],0x5
      0x0000000101d7f46f: jmp    0x0000000101d4bd00  ;   {runtime_call}
      0x0000000101d7f474: hlt
      0x0000000101d7f475: hlt
      0x0000000101d7f476: hlt
      0x0000000101d7f477: hlt
    OopMapSet contains 0 OopMaps

There is quite a lot of output. If we only focus on the logic of the method and strip all ceremony, the following instructions remain:

    0x0000000101d7f44c: mov    eax,esi
    0x0000000101d7f44e: shl    eax,1              ;*imul
                                                  ; - com.asserts.Assert::twice@22 (line 15)

The `a` argument (stored in register esi) is copied into the eax register. Then we multiply by two by shifting the bits in eax one position to the left; the result stays in eax, which holds the return value. Replacing the multiply by a left shift is a peephole optimization (strength reduction), since bit shifting is cheaper than multiplying.

The conclusion we can make is that if asserts are disabled, the JIT completely removes the `$assertionsDisabled` check. This is possible because `$assertionsDisabled` is a static final field whose value is known once the class is initialized, so the JIT can treat it as a constant and drop the dead branch. That is good news; I hate to pay the price for something that isn't used.
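The replacement of the multiply by a shift is safe because both operations give the same 32-bit result for every `int`, including negative values and values that overflow. A small sketch to check this on a few edge cases (the class name `ShiftVsMultiply` and its methods are only illustrative):

```java
// Hypothetical class; compares the source-level multiply with the shift the JIT emitted.
final class ShiftVsMultiply {

    static int twiceByMultiply(int a) {
        return a * 2;
    }

    static int twiceByShift(int a) {
        return a << 1;
    }

    // Returns true if both variants agree on all given inputs.
    static boolean agreeOn(int... inputs) {
        for (int a : inputs) {
            if (twiceByMultiply(a) != twiceByShift(a)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Includes the overflow cases Integer.MAX_VALUE and Integer.MIN_VALUE.
        System.out.println(agreeOn(0, 1, -1, 21, Integer.MAX_VALUE, Integer.MIN_VALUE));
    }
}
```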
Assert enabled
What if assertions are enabled? For this we run with the following settings:

    -XX:+UnlockDiagnosticVMOptions
    -XX:PrintAssemblyOptions=intel
    -XX:-TieredCompilation
    -XX:-Inline
    -XX:-BackgroundCompilation
    -XX:CompileCommand=print,*Assert.twice
    -ea

And we get the following Assembly

    Compiled method (c2)     168    8             com.asserts.Assert::twice (24 bytes)
     total in heap  [0x0000000109e7c310,0x0000000109e7c548] = 568
     relocation     [0x0000000109e7c430,0x0000000109e7c440] = 16
     main code      [0x0000000109e7c440,0x0000000109e7c480] = 64
     stub code      [0x0000000109e7c480,0x0000000109e7c498] = 24
     oops           [0x0000000109e7c498,0x0000000109e7c4a0] = 8
     metadata       [0x0000000109e7c4a0,0x0000000109e7c4a8] = 8
     scopes data    [0x0000000109e7c4a8,0x0000000109e7c4d0] = 40
     scopes pcs     [0x0000000109e7c4d0,0x0000000109e7c540] = 112
     dependencies   [0x0000000109e7c540,0x0000000109e7c548] = 8
    Loaded disassembler from /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/jre/lib/hsdis-amd64.dylib
    Decoding compiled method 0x0000000109e7c310:
    Code:
    [Disassembling for mach='i386:x86-64']
    [Entry Point]
    [Verified Entry Point]
    [Constants]
      # {method} {0x000000011217a4f8} 'twice' '(I)I' in 'com/asserts/Assert'
      # parm0:    rsi       = int
      #           [sp+0x20]  (sp of caller)
      0x0000000109e7c440: mov    DWORD PTR [rsp-0x14000],eax
      0x0000000109e7c447: push   rbp
      0x0000000109e7c448: sub    rsp,0x10           ;*synchronization entry
                                                    ; - com.asserts.Assert::twice@-1 (line 14)
      0x0000000109e7c44c: test   esi,esi
      0x0000000109e7c44e: je     0x0000000109e7c460  ;*ifne
                                                    ; - com.asserts.Assert::twice@7 (line 14)
      0x0000000109e7c450: mov    eax,esi
      0x0000000109e7c452: shl    eax,1              ;*imul
                                                    ; - com.asserts.Assert::twice@22 (line 15)
      0x0000000109e7c454: add    rsp,0x10
      0x0000000109e7c458: pop    rbp
      0x0000000109e7c459: test   DWORD PTR [rip+0xffffffffff6ecba1],eax        # 0x0000000109569000
                                                    ;   {poll_return}
      0x0000000109e7c45f: ret
      0x0000000109e7c460: mov    esi,0x7
      0x0000000109e7c465: xchg   ax,ax
      0x0000000109e7c467: call   0x0000000109e47120  ; OopMap{off=44}
                                                    ;*new
                                                    ; - com.asserts.Assert::twice@10 (line 14)
                                                    ;   {runtime_call}
      0x0000000109e7c46c: call   0x000000010898fe44  ;*new
                                                    ; - com.asserts.Assert::twice@10 (line 14)
                                                    ;   {runtime_call}
      0x0000000109e7c471: hlt
      0x0000000109e7c472: hlt
      0x0000000109e7c473: hlt
      0x0000000109e7c474: hlt
      0x0000000109e7c475: hlt
      0x0000000109e7c476: hlt
      0x0000000109e7c477: hlt
      0x0000000109e7c478: hlt
      0x0000000109e7c479: hlt
      0x0000000109e7c47a: hlt
      0x0000000109e7c47b: hlt
      0x0000000109e7c47c: hlt
      0x0000000109e7c47d: hlt
      0x0000000109e7c47e: hlt
      0x0000000109e7c47f: hlt
    [Exception Handler]
    [Stub Code]
      0x0000000109e7c480: jmp    0x0000000109e6bf60  ;   {no_reloc}
    [Deopt Handler Code]
      0x0000000109e7c485: call   0x0000000109e7c48a
      0x0000000109e7c48a: sub    QWORD PTR [rsp],0x5
      0x0000000109e7c48f: jmp    0x0000000109e46d00  ;   {runtime_call}
      0x0000000109e7c494: hlt
      0x0000000109e7c495: hlt
      0x0000000109e7c496: hlt
      0x0000000109e7c497: hlt
    OopMapSet contains 1 OopMaps

    #0
    OopMap{off=44}

Let's remove a lot of the clutter:

    0x0000000109e7c44c: test   esi,esi
    0x0000000109e7c44e: je     0x0000000109e7c460  ;*ifne
                                                  ; - com.asserts.Assert::twice@7 (line 14)
    0x0000000109e7c450: mov    eax,esi
    0x0000000109e7c452: shl    eax,1              ;*imul
                                                  ; - com.asserts.Assert::twice@22 (line 15)
    0x0000000109e7c454: add    rsp,0x10
    0x0000000109e7c458: pop    rbp
    0x0000000109e7c459: test   DWORD PTR [rip+0xffffffffff6ecba1],eax        # 0x0000000109569000
                                                  ;   {poll_return}
    0x0000000109e7c45f: ret
    0x0000000109e7c460: mov    esi,0x7
    0x0000000109e7c465: xchg   ax,ax
    0x0000000109e7c467: call   0x0000000109e47120  ; OopMap{off=44}
                                                  ;*new
                                                  ; - com.asserts.Assert::twice@10 (line 14)
                                                  ;   {runtime_call}
    0x0000000109e7c46c: call   0x000000010898fe44  ;*new
                                                  ; - com.asserts.Assert::twice@10 (line 14)
                                                  ;   {runtime_call}

This one is a bit more complicated. Let's start at `0x0000000109e7c44c`, where a `test esi,esi` is done. It checks whether `a` is 0 and updates the ZF (Zero Flag) in the flags register accordingly. If ZF is 0, we continue with the regular logic: we move `a` into the eax register, multiply it by 2, and then the procedure exit ceremony is executed. This is no different than when asserts are disabled.
If ZF is 1 (so a == 0), we jump to `0x0000000109e7c460`. We totally bypass the regular logic and even skip the exit ceremony of the procedure. Everything we have done so far is frozen and then we shoot into space. The mechanism we are seeing is called an uncommon trap, and it is required for speculative optimizations. In this case, `twice` has never been called with 0, so for the JIT there was no reason to emit the `throw new AssertionError` code. As soon as I have figured out how this uncommon trap is handled, I'll write a blog post about it.
If we translated this back to Java, we would get something like:

    public class Assert {
        public static int twice(int a) {
            if (a == 0) {
                uncommonTrap(); // pseudocode: hands control back to the interpreter
            }
            return a * 2;
        }
    }

If `twice` were called with 0, the uncommon trap gets executed, the code is deoptimized, and the interpreter continues execution with the `throw new AssertionError` in place.
The conclusion we can make is: if assert is enabled, the `$assertionsDisabled` check is removed as well; only the check of the assert condition itself remains.
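A rough way to provoke this path is to warm up `twice` with non-zero arguments and only then pass 0. The sketch below is an illustration under assumptions, not the original benchmark: the class name `UncommonTrapDemo` and the `run` helper are invented, and whether and when the JIT compiles `twice` depends on the JVM and its flags.

```java
// Hypothetical demo class; names are illustrative only.
final class UncommonTrapDemo {

    static int twice(int a) {
        assert a != 0 : "a can't be 0";
        return a * 2;
    }

    // Warms up twice() with non-zero input, then calls it once with 0.
    // Returns a short description of what happened.
    static String run() {
        long sum = 0;
        for (int k = 1; k < 100_000; k++) {
            sum += twice(k);
        }
        try {
            int r = twice(0); // the case the compiled code never saw during warm-up
            return "no AssertionError, result=" + r + ", sum=" + sum;
        } catch (AssertionError e) {
            return "AssertionError: " + e.getMessage() + ", sum=" + sum;
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Run with `-ea`, the final call should end in the `AssertionError`. Adding `-XX:+PrintCompilation` should, if the code had been compiled by then, also show a "made not entrant" line for `twice` or its caller around that point. Without `-ea` the call simply returns 0.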
Size matters
So far we have seen that the JIT completely removes all assert logic if asserts are disabled. So it looks like assert is completely free if disabled. Unfortunately that isn't the case. The extra instructions inserted in the bytecode can prevent inlining, because inline limits are based on the size of the bytecode.

Let's prove this by inflating the assert logic with some bogus additional conditions:

    public class AssertFat {
        public static void main(String[] args) {
            long l = 0;
            for (int k = 1; k < 100_000; k++) {
                l += twice(k);
            }
            System.out.println(l);
        }

        public static int twice(int a) {
            assert a != 0 && a != -1 && a != -2 && a != -3 : "bad a";
            return a * 2;
        }
    }

We have inflated the size of the assert artificially by adding some additional bogus checks.
If we set FreqInlineSize to 50, the method will not get inlined, even when asserts are disabled:

    -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -XX:FreqInlineSize=50 -da

In the inlining output we'll see the following:

    @ 12   com.asserts.AssertFat::twice (53 bytes)   callee is too large

This means that the `twice` method, at 53 bytes, was too fat to get inlined, since the maximum size for a frequently called method was set to 50 bytes.
To prove that the `twice` method would have been inlined without the assert, let's remove the assert and run again:

    @ 12   com.asserts.AssertFat::twice (4 bytes)

Now the method does get inlined.
This means that even though the JIT is able to completely remove the assert code if asserts are disabled, the assert can still prevent inlining. Therefore the assert feature isn't completely free. That doesn't mean you should remove asserts from your code, but it is good to keep in the back of your mind.
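One way to limit this effect, sketched here under the assumption that the condition itself is the bulky part, is to move it into a separate method so that the caller's bytecode only carries the `$assertionsDisabled` test, one call and the throw. The class `AssertSlim` and the helper `isValid` are hypothetical names. Whether the method then fits under a given inline limit should be checked with `javap -c` and `-XX:+PrintInlining` rather than taken for granted.

```java
// Hypothetical variant of AssertFat; names are illustrative only.
final class AssertSlim {

    static int twice(int a) {
        // Only a single method call for the condition ends up in twice().
        assert isValid(a) : "bad a";
        return a * 2;
    }

    // The bogus checks from AssertFat, moved out of the hot method.
    static boolean isValid(int a) {
        return a != 0 && a != -1 && a != -2 && a != -3;
    }

    public static void main(String[] args) {
        long l = 0;
        for (int k = 1; k < 100_000; k++) {
            l += twice(k);
        }
        System.out.println(l);
    }
}
```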
Thanks for your very good summary about the optimization of asserts. It was a very good read. The inlining part shocked me a little. I thought the inlining size was measured differently, but it really is the size of the bytecode. Unfortunately, a useful parameter seems to be missing in JDK 9 jlink: I have not found a --strip-assertions. That would be very cool, so that there is no excuse anymore for not using assertions.