Regarding PrioQueue
Because of how Java runs the JIT and GC in separate threads, I just tried a couple quick things:
If I force Java to run on only one core, I get this result:
- #1 Skilgannon's Cache-hit KDTree [0.0334]
- #2 Rednaxela's kd-tree (3rd gen) [0.0343]
- #3 Rednaxela's kd-tree (2nd gen) [0.0375]
- #4 Voidious' Linear search [0.5844]
If I force Java to run on two cores, I get this result:
- #1 Rednaxela's kd-tree (3rd gen) [0.0280]
- #2 Skilgannon's Cache-hit KDTree [0.0304]
- #3 Rednaxela's kd-tree (2nd gen) [0.0341]
- #4 Voidious' Linear search [0.4806]
Compared to allowing all 6 cores, allowing only 2 cores improved the linear search result (more dramatically than I expected!), but it still hurt all of the kd-trees.
Maybe that's why my linear search score is so much better than yours?
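For anyone repeating the core-limited runs, it may be worth confirming how many cores the JVM actually sees before trusting the numbers. Here is a minimal sketch, assuming the restriction is applied from outside the JVM (the post doesn't say how it was done). The class name `CoreCheck` is made up for illustration. On common HotSpot builds `availableProcessors()` generally reflects the process affinity, but treat that as an assumption to verify on your own setup.

```java
public class CoreCheck {
    /** Number of processors the JVM reports as available to it. */
    static int visibleCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        // If the process was pinned to N cores, this would be expected to print N
        // (assumption: the JVM in use honors the OS-level restriction).
        System.out.println("Cores visible to the JVM: " + visibleCores());
    }
}
```

If this still prints the full core count after pinning, the restriction probably isn't reaching the JVM, and the JIT and GC threads may still be running on other cores.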
BTW, newest code is a little bit faster.
So, it turns out that if I use Oracle Java on Windows instead of OpenJDK on Linux, the performance is pretty different:
- #1 Skilgannon's Cache-hit KDTree [0.0275]
- #2 Rednaxela's kd-tree (3rd gen) [0.0290]
- #3 Rednaxela's kd-tree (2nd gen) [0.0309]
- #4 Voidious' Linear search [0.5553]
The relative performance looks much more similar to what you saw: there is a smaller difference between my 3rd and 2nd gen trees, and yours performs better.
Between Oracle Java/Windows and OpenJDK/Linux, it seems my 3rd gen tree and your cache-hit tree swap places.
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
(It also turns out the server JVM is much better suited for the kd-tree test than the client JVM. I edited this post to switch the results to those from the server JVM.)
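Since the VM flavor clearly matters here, it might help to have the benchmark print which VM it is running on next to the results. This is a rough sketch using standard system properties. The class name `JvmInfo` is hypothetical, and `java.vm.info` is not guaranteed on every JVM, so it may come back as null outside HotSpot.

```java
public class JvmInfo {
    /** One-line summary of the running VM, similar to the last line of `java -version`. */
    static String describe() {
        String name = System.getProperty("java.vm.name");       // e.g. a "... Server VM" or "... Client VM" name on HotSpot
        String version = System.getProperty("java.vm.version"); // VM build string
        String info = System.getProperty("java.vm.info");       // e.g. "mixed mode"; may be null on some JVMs
        return name + " (build " + version + ", " + info + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging this with each run makes it easier to tell later whether two sets of numbers came from the same kind of VM.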