One of the things that always wondered me is how expensive an instanceof check is. In most cases this
is never a concern, but out of professional curiosity I wondered how well the JIT is able to optimize it.
Lets start with the simplest case, an instanceof of check using a final class as second argument. For this we use the
following example:
public class InstanceOfCheck {
public static void main(String[] args) {
for (int k = 0; k < 100000; k++) {
doSomething("");
}
System.out.println(doSomething(""));
}
public static String doSomething(Object o) {
if (o.getClass() instanceof String.class) {
return "string";
} else {
return "nostring";
}
}
}
For this post we use the following Java version:
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
I'm too lazy to update the JDK for my MacBook which I hardly use.
And the following JVM parameters:
-XX:+UnlockDiagnosticVMOptions
-XX:PrintAssemblyOptions=intel
-XX:-TieredCompilation
-XX:-Inline
-XX:-BackgroundCompilation
-XX:CompileCommand=print,*InstanceOfCheck.doSomething
-XX:+PrintCompilation
This will give us the Assembly of the 'doSomething' method.
When I run the above example, I get the following output:
125 1 b java.lang.String::indexOf (70 bytes)
128 2 % b java.lang.String::indexOf @ 37 (70 bytes)
137 3 b java.lang.String::hashCode (55 bytes)
192 4 b sun.nio.cs.UTF_8$Encoder::encode (359 bytes)
213 5 b java.lang.String::charAt (29 bytes)
216 6 b java.io.UnixFileSystem::normalize (75 bytes)
220 7 b java.lang.String::lastIndexOf (52 bytes)
243 8 b com.InstanceOfCheck::doSomething (13 bytes)
Compiled method (c2) 244 8 com.InstanceOfCheck::doSomething (13 bytes)
total in heap [0x0000000106479e10,0x000000010647a090] = 640
relocation [0x0000000106479f30,0x0000000106479f48] = 24
main code [0x0000000106479f60,0x0000000106479fc0] = 96
stub code [0x0000000106479fc0,0x0000000106479fd8] = 24
oops [0x0000000106479fd8,0x0000000106479fe0] = 8
metadata [0x0000000106479fe0,0x0000000106479ff0] = 16
scopes data [0x0000000106479ff0,0x000000010647a018] = 40
scopes pcs [0x000000010647a018,0x000000010647a078] = 96
dependencies [0x000000010647a078,0x000000010647a080] = 8
nul chk table [0x000000010647a080,0x000000010647a090] = 16
Loaded disassembler from /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/jre/lib/hsdis-amd64.dylib
Decoding compiled method 0x0000000106479e10:
Code:
[Disassembling for mach='i386:x86-64']
[Entry Point]
[Verified Entry Point]
[Constants]
# {method} {0x000000010e77a480} 'doSomething' '(Ljava/lang/Object;)Ljava/lang/String;' in 'com/InstanceOfCheck'
# parm0: rsi:rsi = 'java/lang/Object'
# [sp+0x20] (sp of caller)
0x0000000106479f60: mov DWORD PTR [rsp-0x14000],eax
0x0000000106479f67: push rbp
0x0000000106479f68: sub rsp,0x10 ;*synchronization entry
; - com.InstanceOfCheck::doSomething@-1 (line 14)
0x0000000106479f6c: mov r11d,DWORD PTR [rsi+0x8] ; implicit exception: dispatches to 0x0000000106479fa1
0x0000000106479f70: cmp r11d,0xf80002da ; {metadata('java/lang/String')}
0x0000000106479f77: jne 0x0000000106479f8f
0x0000000106479f79: movabs rax,0x7955dfce0 ; {oop("string")}
0x0000000106479f83: add rsp,0x10
0x0000000106479f87: pop rbp
0x0000000106479f88: test DWORD PTR [rip+0xffffffffff73f072],eax # 0x0000000105bb9000
; {poll_return}
0x0000000106479f8e: ret
0x0000000106479f8f: mov rbp,rsi
0x0000000106479f92: mov esi,0xffffffde
0x0000000106479f97: call 0x0000000106447120 ; OopMap{rbp=Oop off=60}
;*instanceof
; - com.InstanceOfCheck::doSomething@1 (line 14)
; {runtime_call}
0x0000000106479f9c: call 0x0000000104fdfe44 ; {runtime_call}
0x0000000106479fa1: mov esi,0xfffffff4
0x0000000106479fa6: nop
0x0000000106479fa7: call 0x0000000106447120 ; OopMap{off=76}
;*instanceof
; - com.InstanceOfCheck::doSomething@1 (line 14)
; {runtime_call}
0x0000000106479fac: call 0x0000000104fdfe44 ;*instanceof
; - com.InstanceOfCheck::doSomething@1 (line 14)
; {runtime_call}
0x0000000106479fb1: hlt
0x0000000106479fb2: hlt
0x0000000106479fb3: hlt
0x0000000106479fb4: hlt
0x0000000106479fb5: hlt
0x0000000106479fb6: hlt
0x0000000106479fb7: hlt
0x0000000106479fb8: hlt
0x0000000106479fb9: hlt
0x0000000106479fba: hlt
0x0000000106479fbb: hlt
0x0000000106479fbc: hlt
0x0000000106479fbd: hlt
0x0000000106479fbe: hlt
0x0000000106479fbf: hlt
[Exception Handler]
[Stub Code]
0x0000000106479fc0: jmp 0x000000010646bf60 ; {no_reloc}
[Deopt Handler Code]
0x0000000106479fc5: call 0x0000000106479fca
0x0000000106479fca: sub QWORD PTR [rsp],0x5
0x0000000106479fcf: jmp 0x0000000106446d00 ; {runtime_call}
0x0000000106479fd4: hlt
0x0000000106479fd5: hlt
0x0000000106479fd6: hlt
0x0000000106479fd7: hlt
OopMapSet contains 2 OopMaps
#0
OopMap{rbp=Oop off=60}
#1
OopMap{off=76}
247 9 % b com.InstanceOfCheck::main @ 2 (32 bytes)
248 9 % com.InstanceOfCheck::main @ -2 (32 bytes) made not entrant
string
After filtering out the non relevant parts, the following remains:
0x0000000106479f6c: mov r11d,DWORD PTR [rsi+0x8] ; implicit exception: dispatches to 0x0000000106479fa1
0x0000000106479f70: cmp r11d,0xf80002da ; {metadata('java/lang/String')}
0x0000000106479f77: jne 0x0000000106479f8f
0x0000000106479f79: movabs rax,0x7955dfce0 ; {oop("string")}
0x0000000106479f83: add rsp,0x10
0x0000000106479f87: pop rbp
0x0000000106479f88: test DWORD PTR [rip+0xffffffffff73f072],eax # 0x0000000105bb9000
; {poll_return}
0x0000000106479f8e: ret
The first instruction loads the class address of the 'o' argument in register r11d. And the second instruction it is
compared with the String class address. If the address is not equal, we jump to the uncommon trap since we have only
passed String arguments to the 'doSomething' method.
So it seems that doing an instanceof check using a final class is very cheap. And we get exactly the same Assembly as direct
address comparison in the Java code:
public static String doSomething(Object o) {
if (o.getClass() == String.class) {
return "string";
} else {
return "nostring";
}
}
Which gives the following relevant Assembly:
0x000000010e0b296c: mov r11d,DWORD PTR [rsi+0x8] ;*invokevirtual getClass
; - com.InstanceOfCheck::doSomething@1 (line 14)
; implicit exception: dispatches to 0x000000010e0b29b1
0x000000010e0b2970: cmp r11d,0xf80002da ; {metadata('java/lang/String')}
0x000000010e0b2977: jne 0x000000010e0b298f ;*if_acmpne
; - com.InstanceOfCheck::doSomething@6 (line 14)
0x000000010e0b2979: movabs rax,0x7955db818 ; {oop("string")}
0x000000010e0b2983: add rsp,0x10
0x000000010e0b2987: pop rbp
0x000000010e0b2988: test DWORD PTR [rip+0xffffffffff72a672],eax # 0x000000010d7dd000
; {poll_return}
0x000000010e0b298e: ret
Non final but concrete class
In the previous example we saw that the JIT perfectly optimized an instanceof check comparing with a final class. But what about a non final concrete class like e.g. LinkedList? Will the JIT be able to optimize that? For this we modify the example:
public class InstanceOfCheck {
public static void main(String[] args) {
List l = new LinkedList();
for (int k = 0; k < 100000; k++) {
doSomething(l);
}
System.out.println(doSomething(l));
}
public static String doSomething(Object o) {
if (o.getClass() == LinkedList.class) {
return "linkedlist";
} else {
return "nolinkedlist";
}
}
}
If we extract only the relevant Assembly:
0x000000010307ccec: mov r11d,DWORD PTR [rsi+0x8] ;*invokevirtual getClass
; - com.InstanceOfCheck::doSomething@1 (line 15)
; implicit exception: dispatches to 0x000000010307cd31
0x000000010307ccf0: cmp r11d,0xf8008c83 ; {metadata('java/util/LinkedList')}
0x000000010307ccf7: je 0x000000010307cd0f ;*if_acmpne
; - com.InstanceOfCheck::doSomething@6 (line 15)
0x000000010307ccf9: movabs rax,0x7955dd8d0 ; {oop("nostring")}
0x000000010307cd03: add rsp,0x10
0x000000010307cd07: pop rbp
0x000000010307cd08: test DWORD PTR [rip+0xffffffffff6642f2],eax # 0x00000001026e1000
; {poll_return}
0x000000010307cd0e: ret
This Assembly looks very similar to the String version. The JIT sees that we only have passed arguments of type LinkedList and therefor it optimizes this with a cheap class address comparison and adds an uncommon trap in case of a different argument being passed.
Non concrete class check?
What would happen if we would check instanceof using a non concrete class like List. In this case it isn't possible to do a cheap class comparison:
public class InstanceOfCheck {
public static void main(String[] args) {
List l = new LinkedList();
for (int k = 0; k < 100000; k++) {
doSomething(l);
}
System.out.println(doSomething(l));
}
public static String doSomething(Object o) {
if (o instanceof List) {
return "list";
} else {
return "nolist";
}
}
}
When we extract the relevant Assembly we get the following:
0x000000010807e66c: mov r11d,DWORD PTR [rsi+0x8] ; implicit exception: dispatches to 0x000000010807e6a1
0x000000010807e670: cmp r11d,0xf8008c83 ; {metadata('java/util/LinkedList')}
0x000000010807e677: jne 0x000000010807e68f
0x000000010807e679: movabs rax,0x7955ddf58 ; {oop("list")}
0x000000010807e683: add rsp,0x10
0x000000010807e687: pop rbp
0x000000010807e688: test DWORD PTR [rip+0xffffffffff61f972],eax # 0x000000010769e000
; {poll_return}
0x000000010807e68e: ret
This is a nice surprise! Because we have only passed LinkedLists as arguments, the instanceof with List as argument
is implemented as a cheap pointer comparison with LinkedList class-address. If anything else than a LinkedList is
passed, an uncommon trap is executed. So this means that even in this case the JIT is able to optimize!