Difference between revisions of "Dynamic Clustering"

From Robowiki
Consider a history of the enemy state (log, array, whatever). For each entry you have multiple data on the enemy, like distance, speed, distance to wall, etc. Now, DynamicClustering consists in categorizing those entries into clusters, so that each cluster contains "similar" entries. "Similar" means close to each other, the measure of distance being the Euclidean distance between entries: <code>sqrt(sqr(dist1 - dist2) + sqr(speed1 - speed2) + ...)</code>. If you search Google for clustering algorithms, you'll get lots of references to K-means clustering, which involves iteratively changing the centers of the clusters and reassigning each entry to its new cluster. I tried it; it's very slow and I didn't manage to make it work well enough. My approach is simpler: you just take the current entry and consider it the center of the cluster, then you find the N closest entries in the log and use them to decide where to shoot.
Dynamic clustering is a technique for finding similar items in your log. The idea is to store a "state" with each item in your log. The state can contain any data you consider valuable, such as [[Lateral Velocity|lateral velocity]] or distance. Save this along with your data. When you need data, you "cluster" it: the easiest way is to compute a "distance" between the current state and each past state. The distance can be Euclidean (<code>sqrt(sqr(dist1 - dist2) + sqr(lat1 - lat2))</code>) or something else, such as Manhattan distance. Take the ''N'' states with the lowest distances; that's your data.
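A minimal Java sketch of this selection, assuming a simple two-dimensional state (class and field names are illustrative, not from any particular bot; real bots normalize each dimension first):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NearestStates {
    // Each log entry stores a snapshot of the enemy's state.
    static class State {
        final double distance;        // normalized distance to the enemy
        final double lateralVelocity; // normalized lateral velocity
        State(double distance, double lateralVelocity) {
            this.distance = distance;
            this.lateralVelocity = lateralVelocity;
        }
    }

    // Euclidean distance: sqrt(sqr(dist1 - dist2) + sqr(lat1 - lat2))
    static double distance(State a, State b) {
        double dd = a.distance - b.distance;
        double dl = a.lateralVelocity - b.lateralVelocity;
        return Math.sqrt(dd * dd + dl * dl);
    }

    // Linear scan: take the N logged states closest to the current state.
    static List<State> nearest(List<State> log, State current, int n) {
        List<State> sorted = new ArrayList<>(log);
        sorted.sort(Comparator.comparingDouble(s -> distance(s, current)));
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    public static void main(String[] args) {
        List<State> log = new ArrayList<>();
        log.add(new State(0.1, 0.9));
        log.add(new State(0.5, 0.5));
        log.add(new State(0.9, 0.1));
        // The two logged states nearest to the current state form the cluster.
        List<State> cluster = nearest(log, new State(0.45, 0.55), 2);
        System.out.println(cluster.size()); // prints 2
    }
}
```

The sort here is only for clarity; a real gun would track the N best candidates in a single pass (or use a tree, as described below on this page).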
  
Now for the weighting of each dimension. Each dimension can be weighted by some factor. Those factors can be decided intuitively (is velocity more important than distance to wall?), by trial and error, or you can use some statistical entities to dynamically decide for you. I tried many ;), the latest being the entropy association between each dimension and the hitting GuessFactor (yes, I currently use GuessFactors in Shadow's gun ;)). Read more about it [http://www.library.cornell.edu/nr/bookcpdf/c14-4.pdf here].
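Weights fold straight into the distance calculation as per-dimension multipliers. A sketch with made-up constant weights (a dynamic scheme such as the entropy approach described above would adjust these at runtime instead):

```java
public class WeightedDistance {
    // Illustrative hand-picked weights: velocity weighted above wall distance.
    // Order: {distance, velocity, wall distance}.
    static final double[] WEIGHTS = {1.0, 2.0, 0.5};

    // Weighted Euclidean distance between two state vectors.
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = WEIGHTS[i] * (a[i] - b[i]);
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] s1 = {0.5, 0.8, 0.2};
        double[] s2 = {0.5, 0.3, 0.2};
        // Only the velocity dimension differs; its weight of 2.0 doubles the gap.
        System.out.println(distance(s1, s2)); // prints 1.0 (2.0 * 0.5)
    }
}
```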
The earlier method did this linearly, iterating through the log and computing the distance to each entry in turn. With more than 100,000 states in the log, this becomes very slow. The newer approach is a [[K-d Tree]]. [[User:Corbos|Corbos]] was the first to mention it on the wiki, and [[User:Chase-san|Chase-san]] and [[User:Simonton|Simonton]] took up the idea. As of now, [[User:Simonton|Simonton]]'s [[K-d Tree]] is one of the fastest trees available.
=== Bots using this technique ===

* [[Tron]] & [[Shadow]]: The first two bots to use this technique, which used to be called "Tron's Gun" before it was renamed.
* [[Chalk]]: The first dynamic clustering bot released with [[Open Source|source code]].
* [[DCBot]]: A simplified version of the gun used by earlier versions of [[Tron]] and [[Shadow]].
* [[Lukious]] & [[Firebird]] & [[Hydra]]: Dynamic clustering versions of [[Dookious]], [[Phoenix]] and [[WaveSerpent]], respectively.
* [[Horizon]] & [[RougeDC]] & [[YersiniaPestis]]: Newer dynamic clustering bots.
* [[X2]] & [[Ali]] & [[DrussGT]]: These bots use dynamic clustering only for their guns.
  
For a slightly simplified version of my DC gun used in [[Shadow]] and [[Tron]] see: [[DCBot]]
== See Also ==

* [[Visit Count Stats]]
* [[K-d Tree]]

Revision as of 11:22, 26 April 2009
