Difference between revisions of "User talk:Duyn/kd-tree Tutorial"
Latest revision as of 08:12, 17 March 2010
Interesting work here. Personally I'd consider such a code-heavy tutorial to be more of a 'code-explanation' than a tutorial, but it's still very good. Also, pretty good job on the optimizing there :) --Rednaxela 16:07, 27 February 2010 (UTC)
Notice: Some performance discussion here moved to User talk:Duyn/BucketKdTree --Rednaxela 01:34, 3 March 2010 (UTC)
I think we should move discussion of performance to User:Duyn/BucketKdTree since it was more of a footnote in this walkthrough.—duyn 20:42, 2 March 2010 (UTC)
- Good catch. Done :) --Rednaxela 01:34, 3 March 2010 (UTC)
Nice tutorial work here! Quick question: Did you measure the effect of using the variance for splitting? I tried that before (I think last week) and didn't seem to see a performance advantage. --Rednaxela 17:26, 12 March 2010 (UTC)
- Not rigorously—I tried splitting half way along the widest dimension like you appear to do, but found that made search times worse.—duyn 18:02, 12 March 2010 (UTC)
- I just did some more testing using variance to pick the split dimension like you do. As for where to put the exact split value, the middle of the bounds still seemed to work best. Using variance instead of the widest dimension seemed to gain just a few microseconds in search time, but lost just as many microseconds in adding time. Since there are twice as many data points as searches, at least in this benchmark, I'll stick with the middle of the widest dimension. --Rednaxela 18:10, 13 March 2010 (UTC)
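For readers following the thread, the two split-dimension rules being compared can be sketched roughly as below. This is only an illustration: `SplitChooser` and its method names are hypothetical and are not taken from either tree, and real implementations would typically track bounds incrementally rather than rescanning the points.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not code from either tree in this discussion. */
final class SplitChooser {
    /** Returns the dimension with the largest (max - min) extent. */
    static int widestDimension(List<double[]> points) {
        int best = 0;
        double bestWidth = -1.0;
        for (int d = 0; d < points.get(0).length; d++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double[] p : points) {
                min = Math.min(min, p[d]);
                max = Math.max(max, p[d]);
            }
            if (max - min > bestWidth) {
                bestWidth = max - min;
                best = d;
            }
        }
        return best;
    }

    /** Returns the dimension with the largest population variance. */
    static int highestVarianceDimension(List<double[]> points) {
        int best = 0;
        double bestVariance = -1.0;
        for (int d = 0; d < points.get(0).length; d++) {
            double mean = 0.0;
            for (double[] p : points) mean += p[d];
            mean /= points.size();
            double variance = 0.0;
            for (double[] p : points) variance += (p[d] - mean) * (p[d] - mean);
            variance /= points.size();
            if (variance > bestVariance) {
                bestVariance = variance;
                best = d;
            }
        }
        return best;
    }

    /** Split value halfway between the bounds of the given dimension. */
    static double midpoint(List<double[]> points, int dim) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] p : points) {
            min = Math.min(min, p[dim]);
            max = Math.max(max, p[dim]);
        }
        return (min + max) / 2.0;
    }

    public static void main(String[] args) {
        // Dimension 0 is wide because of one outlier; dimension 1 is spread more evenly.
        List<double[]> pts = Arrays.asList(
            new double[] {0, 0}, new double[] {0, 5}, new double[] {0, 0},
            new double[] {0, 5}, new double[] {0, 0}, new double[] {6, 5});
        System.out.println("widest dimension:   " + widestDimension(pts));
        System.out.println("highest variance:   " + highestVarianceDimension(pts));
        System.out.println("midpoint of widest: " + midpoint(pts, widestDimension(pts)));
    }
}
```

The sample data is chosen so that the two rules disagree: a single outlier makes dimension 0 the widest, while dimension 1 has the higher variance. Whether that difference matters for search or insertion time is exactly what the benchmark above measures; this sketch says nothing about speed.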
Impressive speed with the final result there Duyn! I assume this is still comparing to the version of my tree in the old benchmark package? I'll have to compare it to my more up-to-date ones some time. That with the d-ary non-implicit tree also tempts me to perhaps make a heap benchmark... :) --Rednaxela 19:13, 12 March 2010 (UTC)
- Yes, it's with the old one bundled in the package. I wanted people to be able to replicate the results.—duyn 02:10, 13 March 2010 (UTC)
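- Well, as far as replicating goes, just in case you missed it, KNN.jar here has since been updated (though maybe not worth downloading yet: I messed up the inclusion of the source, so it's there but a bit weird). If you don't mind, I'll update KNN.jar again, both fixed and with your latest tree versions included. --Rednaxela 02:17, 13 March 2010 (UTC)
- Not at all.—duyn 06:20, 13 March 2010 (UTC)
- [Addendum:] Although you may prefer to wait until activity on this tutorial dies down, since all three trees are essentially in development right now.—duyn 11:58, 13 March 2010 (UTC)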
By the way, the title 'Implicit d-ary Heap' seems misleading to me. The d-ary heap you implement is not an implicit heap; an implicit heap is one that encodes its tree structure solely in array indices instead of creating objects for each node. --Rednaxela 08:31, 14 March 2010 (UTC)
- I am not sure I understand. The heap as presented does encode its tree structure solely in array indices. It only creates new PrioNodes so users can have access to priorities after a poll(). That step has nothing to do with the structure of the heap. The heap is implicit in that the parent/children of a node are worked out from array indices (as seen in the siftXXX methods), rather than storing explicit references in each node.—duyn 11:34, 14 March 2010 (UTC)
- Oh wait... Sorry, I misread the first bit of the code I was looking at, never mind. --Rednaxela 15:01, 14 March 2010 (UTC)
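To make the "implicit" point concrete, here is a minimal sketch of a d-ary min-heap whose tree structure lives only in array indices: the parent of index i is (i - 1) / d and its children are d * i + 1 through d * i + d. This is not the tutorial's heap. The class name `ImplicitDaryHeap` is hypothetical, and it stores bare double keys rather than wrapping results in PrioNode objects.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

/** Illustrative sketch only; not the heap from the tutorial. */
final class ImplicitDaryHeap {
    private final int d;                    // number of children per node
    private double[] items = new double[16];
    private int size = 0;

    ImplicitDaryHeap(int d) {
        if (d < 2) throw new IllegalArgumentException("d must be >= 2");
        this.d = d;
    }

    int size() {
        return size;
    }

    void offer(double value) {
        if (size == items.length) items = Arrays.copyOf(items, size * 2);
        items[size] = value;
        size++;
        siftUp(size - 1);
    }

    /** Removes and returns the smallest key. */
    double poll() {
        if (size == 0) throw new NoSuchElementException();
        double min = items[0];
        size--;
        items[0] = items[size];             // move the last element to the root
        siftDown(0);
        return min;
    }

    private void siftUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / d;       // parent found by index arithmetic
            if (items[parent] <= items[i]) break;
            swap(i, parent);
            i = parent;
        }
    }

    private void siftDown(int i) {
        while (true) {
            int first = d * i + 1;          // first child found by index arithmetic
            if (first >= size) return;      // no children: i is a leaf
            int end = Math.min(first + d, size);
            int smallest = first;
            for (int c = first + 1; c < end; c++) {
                if (items[c] < items[smallest]) smallest = c;
            }
            if (items[i] <= items[smallest]) return;
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int a, int b) {
        double tmp = items[a];
        items[a] = items[b];
        items[b] = tmp;
    }
}
```

No node holds a reference to its parent or children, which is what distinguishes this layout from an explicit (pointer-based) heap. Allocating a wrapper object when returning a result from poll(), as the tutorial's heap does, would not change that.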
By the way, I just tried out your implicit d-ary heap with my tree. The results show that 3-ary and 4-ary are roughly tied for speed, both notably better than 2-ary and 5-ary. Despite this, it's still just slightly slower than using my heap (code link: http://bitbucket.org/rednaxela/knn-benchmark/src/tip/ags/utils/dataStructures/BinaryHeap.java) in my tree. --Rednaxela 22:41, 15 March 2010 (UTC)
- That fits with my experience. "The Influence of Caches on the Performance of Heaps" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.26.7774) suggests that if you take the cost of a swap operation into account, remove-min is cheapest for a 3-ary/4-ary heap. It was at that point the heap stopped being a bottleneck in my tree and so I stopped optimising it.—duyn 07:12, 17 March 2010 (UTC)
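A back-of-envelope way to see why the branching factor matters is to count worst-case work per remove-min: a wider heap is shallower (fewer swaps) but needs more comparisons per level. The sketch below is a deliberately simple model and not the analysis from the cited paper. `HeapCostModel` and `swapWeight` are hypothetical names, the model assumes up to d comparisons and one swap per level, and it ignores cache effects entirely.

```java
/** Simplified cost model for illustration; not the analysis from the cited paper. */
final class HeapCostModel {
    /** Depth of the last element (index n - 1) in an implicit d-ary heap of n elements. */
    static int depth(int n, int d) {
        int levels = 0;
        for (int i = n - 1; i > 0; i = (i - 1) / d) {
            levels++;                        // one step up to the parent per level
        }
        return levels;
    }

    /**
     * Worst-case remove-min cost under the assumed model: at each level,
     * up to d comparisons plus one swap costing swapWeight comparisons.
     */
    static double removeMinCost(int n, int d, double swapWeight) {
        return depth(n, d) * (d + swapWeight);
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        for (int d = 2; d <= 5; d++) {
            System.out.printf("d=%d depth=%d cost(swapWeight=0)=%.0f cost(swapWeight=2)=%.0f%n",
                d, depth(n, d), removeMinCost(n, d, 0.0), removeMinCost(n, d, 2.0));
        }
    }
}
```

Under these assumptions, with n = 1,000,000 and a swap weighted as two comparisons, the modelled costs for d = 2, 3, 4, 5 are 76, 65, 60 and 63, so the wider heaps only pull ahead once swaps are given a cost. Real timings depend on the cache behaviour the paper studies, which this model does not capture.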