Talk:Kd-tree

From Robowiki
Revision as of 13:28, 16 August 2009 by Nat (talk | contribs) (it isn't the same, even Simonton's)
Jump to navigation Jump to search

Bucket PR k-d tree

I think you all know about Bucket PR k-d tree (aka Simonton's tree), right? I'll get directly to my point.

The Bucket PR k-d tree (I'll refer as just a tree from now on) is a binary tree split by the median of the bound, right? I wonder if I make it m-ary tree, let say, Ternary or Quaternary tree instead of just binary? Will it better? » Nat | Talk » 09:12, 15 August 2009 (UTC)

Note, it doesn't by definition have to be split by the median, though the median is usually considered the best place. As far as your second question... it may be better, or it may not. I'm pretty sure it will depend heavily on the implementation details and the data set in question. --Rednaxela 14:51, 15 August 2009 (UTC)

In a binary tree you do 1 comparison to narrow it down to 1/2 the tree (if it's well balanced). In a 3-ary you do (on average) 1.5 comparisons to narrow it to 1/3. 4-ary 2 comparisons for 1/4 of the tree. 1*1/2 = 1.5*1/3 = 2*1/4. They're all theoretically the same. --Simonton 15:33, 15 August 2009 (UTC)

  • Also note that with a 4-ary tree it's best to do a binary search at each node, which is pretty much the same thing as going back to a binary tree, except you'd be doing two comparisons for the same dimension instead of stepping to the next. --Simonton 15:36, 15 August 2009 (UTC)

(Edit conflict) It could be faster splitting 4-ways, sure, but my gut says not by very much. I'd probably recommend doing the normal 2-way (binary) one first, since it will be much simpler to develop and to debug, and then you can try to modify it to do 4-way. You'll need the 2-way version, anyway, in order to compare the speeds. Good luck. --Voidious 15:38, 15 August 2009 (UTC)

It seems that Ternary tree is the fastest of binary, ternary and quaternary. But I've some question, does the Kd-Tree output the same as linear (brute-force) search. Some of my test state that:

Starting Bucket PR k-d tree performance test...
Generating points...
Data generated.
Performing linear search...
Linear search complete; time = 0.003425156
Performing binary k-d tree search...
Binary tree search complete; time = 0.001995574
: accuracy = 0.4666666666666667
Performing ternary k-d tree search...
Ternary tree search complete; time = 2.91098E-4
: accuracy = 1.0
Performing quaternary k-d tree search...
Quaternary tree search complete; time = 3.22178E-4
: accuracy = 1.0

Data completely random. Accuracy calculate by number of result that is same as linear / cluster size. This test with a hundred data points and cluster size of 15. If I increase the data points to 26,000 then the accuracy drop to zero. Is this my tree problem or it is known problem with Kd-Tree? » Nat | Talk » 16:34, 15 August 2009 (UTC)

The kd-tree should definitely give the same results as a brute force search, so you must still have some bugs to work out. --Voidious 16:38, 15 August 2009 (UTC)

Really? Simonton's one sometimes wrong too! » Nat | Talk » 11:28, 16 August 2009 (UTC)