XanderCat/GFWS


Guess Factors and Wave Surfing (GFWS)

Dynamic Clustering using k-nearest-neighbor

KNN Distance Algorithms

The basic algorithms for finding nearby neighbors can be found on the Dynamic_Clustering page. The most common approach in Robocode, I would guess, is Manhattan distance: |x2-x1| + |y2-y1| + .... But what are these distance values?
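
As a concrete illustration of that formula (the method name and array representation are mine, not from any particular bot's source), a plain Manhattan distance in Java might look like:

    // Manhattan distance between two points in n-dimensional space:
    // the sum of the absolute differences along each dimension.
    public static double manhattanDistance(double[] p, double[] q) {
        double sum = 0;
        for (int i = 0; i < p.length; i++) {
            sum += Math.abs(p[i] - q[i]);
        }
        return sum;
    }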

For XanderCat, the "distance" values are based on the segment indexes. A segmenter slices some attribute into a fixed number of slices, n. The "distance" is how far the current situation's segment index is from some candidate situation's segment index. But a straight computation using segment indexes will not work well, because each segmenter can have a different number of slices n. XanderCat therefore uses a percent difference across the range n instead.

So, assume segmenter A has 10 slices (indices 0 to 9, n=10). Assume the current situation falls under segment index 4 (call this value a2), while the candidate situation falls under segment index 6 (call this value a1).

Using straight Manhattan distance, this would look like |a2-a1| or |4-6| == 2.

But to account for each segmenter having a different number of slices, XanderCat uses percentages instead.

Using percentages, this would look like |(a2-a1)/n| or |(4-6)/10| == 0.2 (a 20% difference).

Now assume there is a second segmenter B with a difference of 0.1 (a 10% difference).

Now the Manhattan distance is 0.2 + 0.1 = 0.3 (lower is a closer match).
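
A minimal Java sketch of this percent-difference distance, assuming each situation is represented by its array of segment indexes (the names here are illustrative, not XanderCat's actual API):

    // Percent-difference Manhattan distance across all segmenters.
    public static double segmentDistance(int[] currentIndexes,
                                         int[] candidateIndexes,
                                         int[] sliceCounts) {
        double distance = 0;
        for (int i = 0; i < sliceCounts.length; i++) {
            // Normalize each index difference by that segmenter's slice
            // count, so segmenters with different n all contribute on
            // the same 0..1 scale.
            distance += Math.abs(currentIndexes[i] - candidateIndexes[i])
                    / (double) sliceCounts[i];
        }
        return distance; // lower is a closer match
    }

For the worked example, segmenter A contributes |4-6|/10 = 0.2, and adding segmenter B's 0.1 gives the total of 0.3.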

While this type of distance calculation will be tested soon, initial versions of XanderCat did not use this approach exactly. Instead, they multiplied the inverse percentages together, like so:

(1-0.2)*(1-0.1) = 0.8 * 0.9 = 0.72 (higher is a closer match)
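
A hypothetical sketch of that multiplicative scoring, using the same illustrative representation as the sketch above:

    // Multiplicative "inverse percentage" match score, as used by early
    // versions of XanderCat per the description above.
    public static double matchScore(int[] currentIndexes,
                                    int[] candidateIndexes,
                                    int[] sliceCounts) {
        double score = 1;
        for (int i = 0; i < sliceCounts.length; i++) {
            double pctDiff = Math.abs(currentIndexes[i] - candidateIndexes[i])
                    / (double) sliceCounts[i];
            // One large difference drags the whole product toward zero.
            score *= 1 - pctDiff;
        }
        return score; // higher is a closer match
    }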

Is this better or worse? Honestly, as I type this, I don't know yet. This is something I will be testing out soon. The latter requires all of the segmenters' distances to remain somewhat close to be a good match, while with the former straight Manhattan approach, a large separation on one segmenter can be made up for by the others being closer together.


I think the best way to find distance values is to use the raw values you would have used to decide which segments a scan goes in. So use the raw lateral velocity, the raw distance, the raw acceleration, etc. This way you get better accuracy than the segmentation technique would have, as well as not having to worry about how fine your slices are. Most people use squared Euclidean distance, although for my gun I've found that Manhattan distance works better, and I use Manhattan in CunobelinDC (the top minibot). I tried using a multiplicative distance algorithm which did a sort of (1 + |a1 - b1|)*(1 + |a2 - b2|)*... and it worked fairly well in my gun tests but lost points in the rumble. Also, if you decide to use kNN for a gun, I hope you use a KD Tree - the speedup will make testing so much more bearable =) --Skilgannon 07:01, 7 October 2011 (UTC)
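
A minimal sketch of the raw-value approach described above, assuming a weighted Manhattan distance over raw scan attributes; the weights and the choice of attributes are illustrative assumptions, not CunobelinDC's actual values:

    // Manhattan distance on raw attribute values (e.g. lateral velocity,
    // distance, acceleration) instead of segment indexes.
    public static double rawDistance(double[] scanA, double[] scanB,
                                     double[] weights) {
        double distance = 0;
        for (int i = 0; i < scanA.length; i++) {
            // Weight each raw attribute so dimensions with different
            // natural ranges contribute comparably to the total.
            distance += weights[i] * Math.abs(scanA[i] - scanB[i]);
        }
        return distance;
    }

Tuning the weights plays the same role that choosing slice boundaries does in the segmented approach, but without the loss of precision from binning.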