Danger function

Fragment of a discussion from Talk:Cunobelin

Maybe I just had a better idea of what I was doing when I made my KNN gun, but I really think there's more to it than that. In KNN, you just decide which attributes are important and then, if you want to, try to guess how important each one is. In VCS, you decide which attributes to use in each set of segmentations, how fine each segmentation should be, and the conditions under which that segmentation gets used.
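
To make the difference in tuning effort concrete, here is a rough sketch of what each approach asks you to configure. Everything in it is an assumption for illustration: the attribute names, the weights, the `Segmentation` class, and the sample-count condition are hypothetical and not taken from any actual gun.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# KNN: pick the attributes and, optionally, guess a weight for each one.
# (Hypothetical attribute names and weights.)
knn_weights: Dict[str, float] = {"distance": 1.0, "lateral_velocity": 2.0}


@dataclass
class Segmentation:
    """One VCS segmentation: which attributes, how finely sliced, and when to use it."""
    bins: Dict[str, int]            # attribute -> number of slices
    usable: Callable[[int], bool]   # condition; here assumed to depend on sample count

    def index(self, situation: Dict[str, float]) -> Tuple[int, ...]:
        # Attribute values are assumed to be normalized to [0, 1].
        return tuple(
            min(int(situation[name] * count), count - 1)
            for name, count in self.bins.items()
        )


# VCS: every segmentation needs its own attributes, granularity, and condition.
coarse = Segmentation({"distance": 3}, usable=lambda samples: True)
fine = Segmentation(
    {"distance": 5, "lateral_velocity": 4},
    usable=lambda samples: samples >= 20,
)

situation = {"distance": 0.5, "lateral_velocity": 0.99}
coarse_bin = coarse.index(situation)
fine_bin = fine.index(situation)
```

The KNN side is one dictionary, while the VCS side needs three separate decisions per segmentation, which is the extra tuning surface I mean.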

A quick outline of my proof that a perfect VCS dominates KNN (or should dominate, in that it has identical or better data): Assuming infinite memory and CPU resources, you create every possible set of segmentations for VCS given certain attributes. When aiming, select the histogram that is centered on the current data point, with each dimension's bin sized so that it includes all points that the KNN search would return. That bin contains at least all the points from the KNN search (it may contain more if multiple points are exactly the same distance from the query point). Obviously, this assumes a range search with a hyperrectangle, but a more complicated algorithm could do the same thing for a hypersphere.
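
The containment step of that argument can be checked with a small sketch. It assumes Euclidean distance for the KNN search and uses made-up 2D points; the function names (`knn`, `half_widths`, `range_search`) are hypothetical helpers, not part of any existing gun.

```python
import math


def knn(points, query, k):
    """Return the k points nearest to query by Euclidean distance."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]


def half_widths(neighbors, query):
    """Per-dimension half-widths of the smallest query-centered box holding all neighbors."""
    return [
        max(abs(p[d] - query[d]) for p in neighbors)
        for d in range(len(query))
    ]


def range_search(points, query, widths):
    """Return every point inside the query-centered hyperrectangle."""
    return [
        p for p in points
        if all(abs(p[d] - query[d]) <= widths[d] for d in range(len(query)))
    ]


# Made-up data, with attributes normalized to [0, 1].
points = [(0.1, 0.2), (0.4, 0.4), (0.5, 0.9), (0.9, 0.1), (0.45, 0.55), (0.6, 0.5)]
query = (0.5, 0.5)
k = 3

neighbors = knn(points, query, k)
widths = half_widths(neighbors, query)
box = range_search(points, query, widths)

# The box built around the KNN result never loses a neighbor.
assert all(p in box for p in neighbors)
```

Note that the box is only guaranteed to be a superset: with a hyperrectangle it can also pick up points sitting in its corners, which is why the claim is "at least all points" rather than exactly the same set.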

AW (talk)17:19, 19 June 2013