Mini-Sized

Fragment of a discussion from Talk:Kd-tree

I ran across this while doing my KD-Tree research: http://www.michaelpollmeier.com/selecting-top-k-items-from-a-list-efficiently-in-java-groovy/ It looks like once you've calculated the distances to everything (two for loops), it's only about 4 lines of code.

Skilgannon (talk) 18:37, 17 July 2013

The trick to the performance of a linear search is the method you use to keep track of the "N closest points so far". The methods in that article have a problem: they store every point in the list in the PriorityQueue, sorted list, or whatever other structure, which is inefficient. It's far better to store only the "N closest points so far". You can do this with a PriorityQueue ordered so that the highest-distance point sits at the head, removing that point whenever the queue's size is greater than N.
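As a rough sketch of the bounded PriorityQueue approach described above (not code from the linked article): the class name BoundedQueueSearch, the Entry wrapper, and the use of squared Euclidean distance are all assumptions made for illustration.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical example class; names are illustrative only.
class BoundedQueueSearch {
    // Wrapper pairing a point's index with its squared distance to the query.
    static final class Entry {
        final double distSq;
        final int index;
        Entry(double distSq, int index) { this.distSq = distSq; this.index = index; }
    }

    static double distSq(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the indices of the n points closest to query, nearest first.
    static int[] nearest(double[][] points, double[] query, int n) {
        // Reversed comparator: the farthest of the kept points is at the head.
        PriorityQueue<Entry> queue = new PriorityQueue<>(
                n + 1, Comparator.comparingDouble((Entry e) -> e.distSq).reversed());
        for (int i = 0; i < points.length; i++) {
            queue.add(new Entry(distSq(points[i], query), i));
            if (queue.size() > n) {
                queue.poll(); // drop the highest-distance point
            }
        }
        // Polling yields farthest first, so fill the result from the back.
        int[] result = new int[queue.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = queue.poll().index;
        }
        return result;
    }
}
```

Calling BoundedQueueSearch.nearest(points, query, n) never holds more than n + 1 entries in the queue, but it still allocates one Entry per point, which is the wrapper overhead mentioned in this thread.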

The optimal solution for the speed of a linear search is to use a bounded-size heap to store the "N closest points so far", because you waste the fewest operations on points that are not among the N closest. PriorityQueue is implemented using a heap like this; however, PriorityQueue is inefficient in practice due to the object-oriented overhead of Comparator/Comparable.

Personally, I'd probably try making the smallest custom implementation of a heap that I could. Alternatively, you could use Collections.sort() or PriorityQueue; however, that adds its own codesize, because you need some wrapper object around the points that implements Comparable/Comparator, and I fear that wrapper may add more codesize than a tiny specialized implementation of a heap would.
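To give an idea of what such a tiny specialized heap might look like, here is a minimal sketch of a bounded max-heap over primitive arrays. The class name TinyNearestHeap and its fields are hypothetical, and it has not been tuned for codesize; it only illustrates the idea of avoiding wrapper objects and comparators.

```java
// Hypothetical bounded max-heap: keeps the N smallest distances seen so far.
class TinyNearestHeap {
    final double[] dist; // heap-ordered distances, largest at index 0
    final int[] index;   // point index paired with each distance
    int size;

    TinyNearestHeap(int capacity) {
        dist = new double[capacity];
        index = new int[capacity];
    }

    void offer(double d, int pointIndex) {
        int i;
        if (size < dist.length) {
            // Not full yet: append at the end and sift up.
            i = size++;
            while (i > 0) {
                int parent = (i - 1) / 2;
                if (dist[parent] >= d) break;
                dist[i] = dist[parent];
                index[i] = index[parent];
                i = parent;
            }
        } else if (size > 0 && d < dist[0]) {
            // Full, and closer than the current farthest: replace the root and sift down.
            i = 0;
            while (true) {
                int child = 2 * i + 1;
                if (child >= size) break;
                if (child + 1 < size && dist[child + 1] > dist[child]) child++;
                if (dist[child] <= d) break;
                dist[i] = dist[child];
                index[i] = index[child];
                i = child;
            }
        } else {
            return; // not among the N closest, so it is discarded after one comparison
        }
        dist[i] = d;
        index[i] = pointIndex;
    }
}
```

In a linear search you would call offer(distance, i) once per point; afterwards the first size slots of index hold the N closest points in heap order (not sorted), with the farthest of them at slot 0.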

Rednaxela (talk) 20:07, 17 July 2013