Difference between revisions of "User talk:Duyn/kd-tree Tutorial"

 
Interesting work here. Personally I'd consider such a code-heavy tutorial to be more of a 'code-explanation' than a tutorial, but still very good. Also, pretty good job optimizing fairly well there :) --[[User:Rednaxela|Rednaxela]] 16:07, 27 February 2010 (UTC)
 
Also, I'd say that the 'Bounds-overlap-ball check' optimization is probably one of the most important things in how this tree benchmarks well. I also find it interesting that that optimization is in that paper; I've never seen it mentioned before in texts on kd-trees. However, when I implemented that type of check in my own kd-tree, it just came to mind in a "... why hasn't anyone else done this? It seems so obvious" type of moment. You may be interested in one detail of how I was doing it differently, though. I didn't use those bounds checks for the path order; I just used the conventional method based on the split. In addition, I only did the 'bounds-overlap-ball check' for evaluating whether 'second choice' branches are worthwhile. The reasons for this are that:
# Bounds checks are expensive.
# The 'first choice' branch to descend is very likely to have what we're looking for, since its parent branch was either a 'first choice' branch or had the bounds check done on it.
Benchmarks showed that those effects were significant enough that the detailed bounds check was only worthwhile in pruning needless 'second choice' branches. I'm curious whether your implementation would show similar results if it skipped the bounds check in those circumstances. --[[User:Rednaxela|Rednaxela]] 01:43, 28 February 2010 (UTC)
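For readers following along, here is a minimal sketch of that 'second choice only' bounds-overlap-ball arrangement. The class and method names (Node, searchNode, boundsOverlapBall and so on) are stand-ins invented for this example, not the actual code of either tree discussed here:
<pre>
import java.util.Collections;
import java.util.PriorityQueue;

/**
 * Minimal sketch of doing the bounds-overlap-ball (BOB) test only before
 * descending the 'second choice' branch. All names here are invented for
 * illustration; this is not duyn's or Rednaxela's actual tree code.
 */
public class SecondChoiceBobSketch {

    static class Node {
        boolean leaf;
        int splitDim;
        double splitValue;
        Node left, right;
        double[][] points;            // payload, only used in leaves
        double[] minBound, maxBound;  // bounding box of everything below this node
    }

    static PriorityQueue<Double> kNearestDistancesSq(Node root, double[] query, int k) {
        // Max-heap, so peek() is the worst of the k best squared distances so far.
        PriorityQueue<Double> best = new PriorityQueue<>(Collections.reverseOrder());
        searchNode(root, query, best, k);
        return best;
    }

    static void searchNode(Node node, double[] query, PriorityQueue<Double> best, int k) {
        if (node.leaf) {
            for (double[] p : node.points) {
                double d2 = distanceSq(p, query);
                if (best.size() < k) {
                    best.offer(d2);
                } else if (d2 < best.peek()) {
                    best.poll();
                    best.offer(d2);
                }
            }
            return;
        }
        // Conventional path ordering: the near child is picked from the split alone
        // and descended with no bounds check at all.
        Node near = query[node.splitDim] <= node.splitValue ? node.left : node.right;
        Node far = (near == node.left) ? node.right : node.left;
        searchNode(near, query, best, k);

        // Only the far ('second choice') child pays for the BOB test: visit it only
        // if its bounding box still intersects the ball around the kth-best distance.
        if (best.size() < k || boundsOverlapBall(far, query, best.peek())) {
            searchNode(far, query, best, k);
        }
    }

    static double distanceSq(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // True if the squared distance from the query to the node's bounding box
    // is within the squared ball radius.
    static boolean boundsOverlapBall(Node node, double[] query, double ballRadiusSq) {
        double sum = 0;
        for (int i = 0; i < query.length; i++) {
            double d = 0;
            if (query[i] < node.minBound[i]) {
                d = node.minBound[i] - query[i];
            } else if (query[i] > node.maxBound[i]) {
                d = query[i] - node.maxBound[i];
            }
            sum += d * d;
        }
        return sum <= ballRadiusSq;
    }
}
</pre>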
 
  
Thank you for the suggestion. I have tried it, but found it didn't make a significant impact on the performance of the tree:
  
<pre>
...[snip]...

RESULT << k-nearest neighbours search with duyn's Bucket kd-tree >>
: Average searching time       = 0.078 miliseconds
: Average worst searching time = 17.077 miliseconds
: Average adding time          = 2.49 microseconds
: Accuracy                     = 100%

...[snip]...

BEST RESULT:
 - #1 Rednaxela's Bucket kd-tree [0.0536]
 - #2 duyn's Bucket kd-tree [0.0777]
 - #3 Simonton's Bucket PR k-d tree [0.1625]
 - #4 Voidious' Bucket PR k-d tree [0.2203]
 - #5 Nat's Bucket PR k-d tree [0.3654]
 - #6 Voidious' Linear search [0.5836]

Benchmark running time: 554.32 seconds
</pre>
 
 
 
Although these results are perilously close to the rounding edge, it works out to a gain on previous performance of:
 
    <math>{0.0787 - 0.0777 \over 0.0787} = 1.27\%\ \mathrm{improvement}</math>
 
 
 
Profiling with NetBeans indicates it is the search() and searchXXX() functions, not the distance/bounds calculations, which are slowing the tree most. This suggests that:
* the tree's balance is not optimal, so we have to search a lot of sub-trees;
** unlikely: quickly hacking in support for re-balancing every time the number of exemplars in a tree doubles did not speed up searches.
* the regions are not very square, so our search hypersphere overlaps with a lot of hyperrects;
** unlikely: splitting on the widest dimension whenever it turns out to be twice the width of the thinnest dimension did not speed up searches.
* using a TreeMap for collecting results is slow; or
** this is now confirmed. A quick hack to use a PriorityQueue instead of a TreeMap gave the following results (still with your old kd-tree); a sketch of this kind of result collection follows this list:
 
<pre>
RESULT << k-nearest neighbours search with duyn's Bucket kd-tree (heap, less bounds checking) >>
: Average searching time       = 0.024 miliseconds
: Average worst searching time = 10.333 miliseconds
: Average adding time          = 2.77 microseconds
: Accuracy                     = 100%

RESULT << k-nearest neighbours search with duyn's Bucket kd-tree (heap) >>
: Average searching time       = 0.026 miliseconds
: Average worst searching time = 10.58 miliseconds
: Average adding time          = 2.37 microseconds
: Accuracy                     = 92%

RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree >>
: Average searching time       = 0.054 miliseconds
: Average worst searching time = 12.269 miliseconds
: Average adding time          = 2.33 microseconds
: Accuracy                     = 45%


BEST RESULT:
 - #1 duyn's Bucket kd-tree (heap, less bounds checking) [0.0242]
 - #2 duyn's Bucket kd-tree (heap) [0.0263]
 - #3 Rednaxela's Bucket kd-tree [0.0542]
</pre>
 
* there is little more to gain without another theoretical insight.
** i.e. the algorithm's good, but my coding sucks =).
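Since the PriorityQueue swap is what made the difference above, here is a minimal sketch of that kind of bounded result collection. The Result class and method names are made up for the example rather than taken from the tutorial; the point is only that a size-capped max-heap avoids the TreeMap's ordering overhead and (if the map was keyed on distance) its one-value-per-key behaviour:
<pre>
import java.util.Comparator;
import java.util.PriorityQueue;

/**
 * Sketch of collecting the k nearest results in a bounded max-heap instead of
 * a TreeMap. The Result class and names below are invented for illustration;
 * they are not the tutorial's actual types.
 */
public class ResultCollectionSketch {

    static final class Result {
        final double distanceSq;
        final double[] point;
        Result(double distanceSq, double[] point) {
            this.distanceSq = distanceSq;
            this.point = point;
        }
    }

    private final int k;
    // Max-heap: the root is the worst (farthest) of the k results kept so far,
    // so it can be evicted in O(log k) when something closer turns up.
    private final PriorityQueue<Result> heap;

    ResultCollectionSketch(int k) {
        this.k = k;
        this.heap = new PriorityQueue<>(
                Comparator.comparingDouble((Result r) -> r.distanceSq).reversed());
    }

    void offer(double[] point, double distanceSq) {
        if (heap.size() < k) {
            heap.add(new Result(distanceSq, point));
        } else if (distanceSq < heap.peek().distanceSq) {
            heap.poll();
            heap.add(new Result(distanceSq, point));
        }
        // Unlike a map keyed on distance, equal distances do not overwrite
        // each other here, which matters for exact accuracy checks.
    }

    double worstDistanceSq() {
        // Also usable as the pruning radius during the tree search.
        return heap.size() < k ? Double.POSITIVE_INFINITY : heap.peek().distanceSq;
    }
}
</pre>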
 
 
 
The small improvement from your suggestion becomes more significant now. I shall have to update the main page to reflect this eventually.&mdash;[[User:Duyn|duyn]] 04:41, 28 February 2010 (UTC)
 
 
 
Ahh yeah. Well, I didn't expect a miraculous improvement, and that is pretty close to the rounding amount. I do seem to remember from my tests, though, that while the improvement was small it was unambiguous, and I'd expect it to matter increasingly as the tree becomes deeper than what this test data produces. I doubt the TreeMap is the issue. While I used a max-heap in mine, since it's theoretically fastest and my implementation is very good, I doubt it's used quite enough to be an issue in this case. If you're concerned about that, though, perhaps try replacing it with the max-heap from my tree as a test? About the regions and balance, that's quite possible. Hmm... this reminds me, I really should experiment with making a tree that does "incremental near neighbor search", which instead of searching for a fixed number of nearest neighbors gives an iterator. Also, I should probably experiment with automatically rebalancing parts of the tree by tracking depth, since insertion is orders of magnitude faster than searching anyway... --[[User:Rednaxela|Rednaxela]] 05:44, 28 February 2010 (UTC)
 
 
 
By the way, if you're curious, the [[User:Rednaxela/kD-Tree|newest version of my tree]] is faster than the one bundled with the benchmark, which I'm pretty sure you compared to:
 
<pre>RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree BENCHMARK BUNDLED VERSION >>
: Average searching time       = 0.083 miliseconds
: Average worst searching time = 1.836 miliseconds
: Average adding time          = 7.17 microseconds
: Accuracy                     = 100%

RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree NEWEST VERSION >>
: Average searching time       = 0.065 miliseconds
: Average worst searching time = 1.326 miliseconds
: Average adding time          = 7.1 microseconds
: Accuracy                     = 100%</pre>--[[User:Rednaxela|Rednaxela]] 03:22, 1 March 2010 (UTC)
 
 
 
From your latest result above, it seems that your k-d tree lacks accuracy. You'll have to improve that, though.
 
 
 
Just a note about the accuracy: the benchmark uses the first implementation (as ordered in the <code>searchAlgorithms</code> array) to check the accuracy, so make sure that you still have <code>FlatKNNSearch.class</code> as the first implementation. Or, if you think this may slow down your benchmark, I believe Rednaxela's k-d tree is stable enough to be used as a reference implementation. The order of output isn't important for the accuracy calculation; it just checks that the result sets match. --[[User:Nat|<span style="color:#099;">Nat</span>]] [[User talk:Nat|<span style="color:#0a5;">Pavasant</span>]] 11:32, 1 March 2010 (UTC)
 
 
Thanks for pointing that out. That wasn't obvious from a quick scan of the source code. Accuracy does drop when using the PriorityQueue (probably because of incorrect handling of duplicates), but the intent was just to show that there is a lot to be gained by using a more efficient data structure than a TreeMap to collect the results.&mdash;[[User:Duyn|duyn]]
 
 
 
One note: with my own tree, I made a modified version that did "BOB" checks for 'first choice' branches and also used those "BOB" calculations for the path ordering like you do. The result? The average search time did not change, but the max search time decreased by 20%, which I found interesting. This means there is actually a significant trade-off happening which could be exploited. I'm now trying to determine an appropriate heuristic to decide if it's worth doing "BOB" on 'first choice' branches and for path selection. I currently suspect a good heuristic would be to use the somewhat expensive "BOB" checks (including for path selection) only when the total number of points in the tree below the node in question is above a certain count. I'll post results with that soon if you wish. --[[User:Rednaxela|Rednaxela]] 15:21, 1 March 2010 (UTC)
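For what it's worth, a sketch of what such a heuristic could look like. The threshold value and all names here are placeholders invented for the example, not measured results or either tree's actual code:
<pre>
/**
 * Sketch of the heuristic described above: only pay for the full BOB check
 * (and BOB-based path ordering) on nodes with enough points beneath them.
 */
public class BobHeuristicSketch {

    static class Node {
        int subtreeSize;              // number of points stored below this node
        double[] minBound, maxBound;  // bounding box of this subtree
    }

    // Purely illustrative cut-off; finding a good value is the open question above.
    static final int FULL_BOB_THRESHOLD = 1000;

    static boolean useFullBobCheck(Node node) {
        // Large, well-populated subtrees are worth the extra arithmetic of an
        // exact bounds-overlap-ball test; tiny ones are cheaper to just visit.
        return node.subtreeSize >= FULL_BOB_THRESHOLD;
    }
}
</pre>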
 

Latest revision as of 09:12, 17 March 2010

Interesting work here. Personally I'd consider such a code-heavy tutorial to be more of a 'code-explanation' than a tutorial, but still very good. Also, pretty good job optimizing fairly well there :) --Rednaxela 16:07, 27 February 2010 (UTC)

Notice: Some performance discussion here moved to User talk:Duyn/BucketKdTree --Rednaxela 01:34, 3 March 2010 (UTC)

I think we should move discussion of performance to User:Duyn/BucketKdTree since it was more of a footnote in this walkthrough.—duyn 20:42, 2 March 2010 (UTC)

Good catch. Done :) --Rednaxela 01:34, 3 March 2010 (UTC)

Nice tutorial work here! Quick question: Did you measure the effect of using the variance for splitting? I tried that before (I think last week) and didn't seem to see a performance advantage. --Rednaxela 17:26, 12 March 2010 (UTC)

Not rigorously—I tried splitting halfway along the widest dimension like you appear to do, but found that made search times worse.—duyn 18:02, 12 March 2010 (UTC)
I just did some more testing using variance for the split dimension like you do. As for where to put the exact split value, the middle of the bounds still seemed to work best. Using variance instead of the widest dimension seemed to gain just a few microseconds and lost just as many microseconds in adding time, and since there are twice as many data points as searches, at least in this benchmark, I'll stick with the middle of the widest dimension. --Rednaxela 18:10, 13 March 2010 (UTC)
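For readers comparing the two approaches, here is a small stand-alone sketch of the split-dimension choices being discussed: widest bounding-box dimension versus largest variance, with the split value at the middle of the node's bounds (the variant reported above as working best). This is illustrative code only, not either tree's actual implementation:
<pre>
/**
 * Sketch of two ways to pick a kd-tree split dimension. Simplified,
 * stand-alone code for illustration only.
 */
public class SplitChoiceSketch {

    // Widest bounding-box dimension of the node.
    static int widestDimension(double[] min, double[] max) {
        int best = 0;
        for (int d = 1; d < min.length; d++) {
            if (max[d] - min[d] > max[best] - min[best]) best = d;
        }
        return best;
    }

    // Dimension with the largest variance over the node's points.
    static int largestVarianceDimension(double[][] points) {
        int dims = points[0].length;
        int best = 0;
        double bestVar = -1;
        for (int d = 0; d < dims; d++) {
            double mean = 0;
            for (double[] p : points) mean += p[d];
            mean /= points.length;
            double var = 0;
            for (double[] p : points) {
                double diff = p[d] - mean;
                var += diff * diff;
            }
            if (var > bestVar) {
                bestVar = var;
                best = d;
            }
        }
        return best;
    }

    // Split value at the middle of the node's bounds along the chosen dimension.
    static double middleSplit(double[] min, double[] max, int dim) {
        return (min[dim] + max[dim]) / 2;
    }
}
</pre>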

Impressive speed with the final result there Duyn! I assume this is still comparing to the version of my tree in the old benchmark package? I'll have to compare it to my more up-to-date ones some time. That, with the d-ary non-implicit tree, also tempts me to perhaps make a heap benchmark... :) --Rednaxela 19:13, 12 March 2010 (UTC)

Yes, it's with the old one bundled in the package. I wanted people to be able to replicate the results.—duyn 02:10, 13 March 2010 (UTC)
Well, as far as replicating goes, just in case you missed it, KNN.jar here has since been updated (though it may not be worth downloading yet; I messed up the inclusion of the source, so it's there but a bit weird). I'll update KNN.jar again, both fixed and with your latest tree versions included, if you don't mind. --Rednaxela 02:17, 13 March 2010 (UTC)
Not at all.—duyn 06:20, 13 March 2010 (UTC)
[Addendum: ] Although you may prefer to wait until activity on this tutorial dies down since all three trees are essentially in-development right now.—duyn 11:58, 13 March 2010 (UTC)

By the way, the title 'Implicit d-ary Heap' seems misleading to me. The d-ary heap you implement is not an implicit heap; an implicit heap encodes its tree structure solely in array indices instead of creating objects for each node. --Rednaxela 08:31, 14 March 2010 (UTC)

I am not sure I understand. The heap as presented does encode its tree structure solely in array indices. It only creates new PrioNodes so users can have access to priorities after a poll(). That step has nothing to do with the structure of the heap. The heap is implicit in that the parent/children of a node are worked out from array indices (as seen in the siftXXX methods), rather than storing explicit references in each node.—duyn 11:34, 14 March 2010 (UTC)
Oh wait... Sorry, I misread the first bit of the code I was looking at, nevermind. --Rednaxela 15:01, 14 March 2010 (UTC)

By the way, I just tried out your implicit d-ary heap with my tree. The result shows that 3-ary and 4-ary are roughly tied for speed, both notably better than 2-ary and 5-ary. Despite this, it's still just slightly slower than using my heap (code link) in my tree. --Rednaxela 22:41, 15 March 2010 (UTC)

That fits with my experience. The Influence of Caches on the Performance of Heaps suggests that if you take the cost of a swap operation into account, remove-min is cheapest for a 3-ary/4-ary heap. It was at that point the heap stopped being a bottleneck in my tree and so I stopped optimising it.—duyn 07:12, 17 March 2010 (UTC)
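For readers unfamiliar with the term, here is a minimal sketch of an implicit d-ary min-heap of the kind discussed above: the parent/child relationships live purely in array-index arithmetic, and the arity d trades more comparisons per level against a shallower tree. This is simplified illustrative code, not the tutorial's actual heap:
<pre>
/**
 * Sketch of an implicit d-ary min-heap: no node objects, just a flat array
 * plus index arithmetic. Growth and empty-heap guards are omitted to keep
 * the sketch short.
 */
public class ImplicitDaryHeapSketch {

    final int d;       // arity; the tests above found 3 or 4 fastest
    double[] keys;     // flat array, element 0 is the root
    int size;

    ImplicitDaryHeapSketch(int d, int capacity) {
        this.d = d;
        this.keys = new double[capacity];
    }

    int parent(int i)     { return (i - 1) / d; }
    int firstChild(int i) { return d * i + 1; }

    void add(double key) {
        int i = size++;
        keys[i] = key;
        // Sift up: at most log_d(n) levels, one comparison per level.
        while (i > 0 && keys[parent(i)] > keys[i]) {
            swap(i, parent(i));
            i = parent(i);
        }
    }

    double removeMin() {
        double min = keys[0];
        keys[0] = keys[--size];
        // Sift down: up to d comparisons per level to find the smallest child,
        // but fewer, shallower levels. That trade-off is why arities like 3 or 4
        // can beat a binary heap once swap and cache costs are counted.
        int i = 0;
        while (true) {
            int smallest = i;
            int first = firstChild(i);
            for (int c = first; c < first + d && c < size; c++) {
                if (keys[c] < keys[smallest]) smallest = c;
            }
            if (smallest == i) break;
            swap(i, smallest);
            i = smallest;
        }
        return min;
    }

    private void swap(int a, int b) {
        double t = keys[a];
        keys[a] = keys[b];
        keys[b] = t;
    }
}
</pre>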