Difference between revisions of "User talk:Pedersen/kdTree"
(branching) |
|||
Line 4: | Line 4: | ||
: It could be (also) using the nearest value to determine if the next path is viable, rather than the furthest in the current matching set. --[[User:Chase-san|Chase]] 02:06, 6 March 2010 (UTC) | : It could be (also) using the nearest value to determine if the next path is viable, rather than the furthest in the current matching set. --[[User:Chase-san|Chase]] 02:06, 6 March 2010 (UTC) | ||
+ | |||
+ | I've suspected that the problem is when I split more than once in the same dimension, but that speculation hasn't borne out through trials. I know it has to do with the logic for talking alternate branches, since if I remove the conditional logic and force it (i.e. a full tree search) it gets 100% every time. --[[User:Pedersen|Martin]] 05:03, 6 March 2010 (UTC) |
Revision as of 06:03, 6 March 2010
I don't have a application for a tree yet (though conceivably it would work with a gun if I bothered to develop weights for the various dimensions, after determining what those dimensions should be) but I figured it would be fun to whip one up. It turns out it was pretty quick to crank out, and after a little debugging I not only had blazing fast speeds at high volumes, faster by far than the samples that came with the benchmark at about 5000 data points, I found that at about that range I started dipping below 100% accuracy. I have spent most of today, longer than it took to write the first draft, trying to find out what is going wrong. As the day winds to a close, I've found that I can get it to bleed accuracy by setting the capacities really low, which will also help with debugging (in terms of diagramming out in code and by hand). When I set the dimensions to 2, neighbors to 10, and sample size to 50, along with a bucket size of 4, I caused it to hit 90% accuracy. My time is up today though, so I'll have to proceed with debugging another evening. --Martin 00:27, 6 March 2010 (UTC)
The fact that accuracy bleeds as bucket capacity goes down, suggests to me that something is wrong with the logic for when to descend a branch or not is broken but the logic for processing points within a leaf is fine. Chances are you'll see accuracy decrease much further if you increase the count of fetched neighbors while keeping bucket size constant. Given the results you've said, it's quite possible that it is never descending into 'second choice' paths ever in fact. --Rednaxela 01:10, 6 March 2010 (UTC)
- It could be (also) using the nearest value to determine if the next path is viable, rather than the furthest in the current matching set. --Chase 02:06, 6 March 2010 (UTC)
I've suspected that the problem is when I split more than once in the same dimension, but that speculation hasn't borne out through trials. I know it has to do with the logic for talking alternate branches, since if I remove the conditional logic and force it (i.e. a full tree search) it gets 100% every time. --Martin 05:03, 6 March 2010 (UTC)