DrussGT/Understanding DrussGT

From Robowiki
Jump to navigation Jump to search

Understanding DrussGT (3.1.4)

This is an attempt to explain the internal mechanics of DrussGT. As far as possible, no references to actual code will be made. Instead, I will attempt to explain everything in terms of concepts and ideas, so that this turns into a learning exercise instead of one of copy-pasting code. The following expects some understanding of Visit Count Stats , KNN , Guess Factors, Wave Surfing and Robocode/Game Physics, so if you aren't up to scratch with those, this might be a bit confusing.


Movement

No-Wave fallback movement

The fallback movement system is what happens when there are no detected waves in the air. It is used primarily at the beginning of rounds before bullets have been fired, in close range combat when the bullet flight time is lower than the gun cooling time, and sometimes at the end of a battle when an enemy is low on energy and stops firing.

This movement is similar to classic Wall Smoothing Random Movement implementations such as Raiko, although without the decisions to reverse. Rather, both movement directions are tested, and the one which, when smoothed, takes DrussGT further from the enemy is chosen. The distancing angles are carefully tuned against GrubbmThree to make the best escape against RamBot-style aggressive movements.

Surfing

This is the main movement system, which is employed whenever a wave is available, which is most of the time against most bots.

Data Generation

Firstly, data about DrussGT's movements is stored into ArrayLists so that it will be available for future reference. This includes location, lateral velocity, heading, lateral direction, advancing velocity and angle from enemy. This data will be used later for determining attributes that the stats system will use to build a danger profile for fired waves.

Enemy Gunheat Tracking

When the enemy is detected as having fired, a variable which represents their gunheat is increased. The rest of the time, this variable is continually decreased by gun cooling rate. Only once this value is zero does the movement consider enemy energy drops as possible wave firing events. This helps prevent spurious waves from being fired due to wall hits which cause energy drops.

Gunheat Waves

Due to Robocode physics, all of the data which is used by the enemy to decide where to fire is available two ticks before it is possible to detect their energy drop from firing. As such, if you can predict what bullet power they fire at, it is possible to start surfing the wave before detecting it. This can help close-range fighting, because those extra two ticks can give crucial extra reaction time which can mean the difference between dodging a bullet and being hit.

Wave Tracking

Every tick, all active waves are inspected to see if they have passed the robot location. If there is no chance that the wave could still cause a bullet hit, it is removed.

Stats System

This is one of the very unique features of DrussGT. The stats system originated as a standard VCS system with a single buffer, similar to Komarious, and slowly evolved into the highly complex system it is today. Two desired characteristics were behind the development: adaptive yet accurate prediction of enemy targeting, and being able to do so quickly.

Accuracy

In order to achieve the accuracy required to unlock the potential of a high-end surfer, over 100 buffers were created in parallel, each of which has a randomly generated set of slices for its attributes. In addition, another 50 buffers are used for the regular flattener, and 20 for the tick-flattener. In order to ensure that as many possible situations are covered, each buffer is limited to only having 5 used attributes per buffer. In addition, several hand-tuned buffers with less attributes are added to ensure that some data will always be available. In order to get accurate Guess Factors, a high number of bins (171) are used. Low rolling averages are used (0.5 – 1.5) in order to keep the stats system highly adaptable to changes in the enemy targeting.

The following attributes are used for the random selection:

  • Lateral Velocity
  • Advancing Velocity
  • Bullet Flight Time
  • Time Since Direction Change
  • Time Since Deceleration
  • Acceleration
  • Lateral Distance Last 10 Ticks, ie. abs(Sum of lateral velocity for last 10 ticks)
  • Forward Wall Distance
  • Reverse Wall Distance

The random selection is performed once during coding using a quick utility I wrote, so each time DrussGT runs it will have the same performance.

Execution Speed

Obviously, to deal with so many buffers, each with its own set of attributes, and such a large number of bins, requires a great deal of running time. In fact, with a naive brute-force implementation, this system will be too slow and consume too much memory to run with the rest of the robot. Three main optimisations were implemented to overcome this problem:

  1. Stats Retrieval Indexes: instead of keeping a set of slices for each buffer, which would then require a slice calculation for every single slice in every single buffer every time a new wave was created, an index for retrieval data was created. Because for each attribute I generally have 4 sets of possible slices {empty, coarse, regular, fine}, I create a one-dimensional array which is associated with each buffer and says for each attribute which slice set is chosen, as a number from 0 to 3, 0 corresponding to empty and 3 corresponding to fine. Then, at enemy wave creation time, a two-dimensional array is created which has the correct slice index for each slice set for each attribute. From this 2D array the individual buffers can simply look up their required slice index using their fixed 1D array, instead of calculating all of them every single time.
  2. Lazy Initialisation: Because there are many areas of the space which will never be accessed, it doesn't make sense to allocate memory for all of the possible areas that could ever be written. For instance, if DrussGT is decelerating, then immediately we know that the velocity is less than 8 and the time-since-decel is 0, so there is no point in initialising those areas. By only allocating buffers as they are accessed the amount of memory is cut drastically.
  3. The final, and probably most important optimisation was the move away from keeping a set of bins, where the hit was smoothed with a profile in the set, in every buffer for every situation. Clearly, with such a low rolling average of 0.5-1.5, this is a highly inefficient way of storing where the last 2 – 3 hits were, as any hits before that are below the threshold of any realistic influence, and we are only storing at a granularity of the number of bins anyway. Instead, I now keep a circular buffer of the bin indexes of the last 2*rollingAverage + 1 hits which occurred in the buffer. At wave creation time I then take these queued bin indexes and increase the corresponding bin in my surfing buffer with a weight which rolls off the further back in the queue the index is stored, following the same decay rate as the rolling average would have given. Once this has run for all buffers I smooth all non-zero bins into a final buffer which is then surfed. This is substantially faster than simple bins because each buffer only needs a few writes to the main bins, and often multiple buffers will all increment the same pre-smoothed bin, reducing the number of bins which need to be smoothed.

Flattener

DrussGT has a standard flattener, where as each wave passes over it logs into the stat buffers, regardless of whether or not there was a bullet hit. In addition, it has two other flatteners: the 'tick-flattener', so called because a wave for it is generated every single tick, regardless of if the enemy is even firing, and the anti-bullet-shadow flattener, which adds danger to any points on the wave which are more likely to contain a bullet based on the premise that the enemy is using bullet-shadows and made sure they were in a shadow as my wave passed over them. Because bullet shadows are an interaction of movement and targeting, this is an attempt to learn the enemy targeting better through analysis of any artefacts it displays in their movement.

The flattener buffers are entirely disabled (not even logging data) until the enemy hitrate passes a weighted threshold of 9%. At this point all of them are enabled, and weighted equally with the regular dodging buffer.

Bullet Shadows

DrussGT has a standard by-the-book Bullet Shadows implementation. The fact that it uses bins instead of a continuous danger profile probably makes this implementation easier than it would have been otherwise. When the gun fires a bullet, it passes a reference of the bullet object back to the movement. The movement then plays this bullet forward through all of the enemy waves and identifies which regions of which waves would hit the bullet. Similarly, every time a wave is added all bullets are played forward to see where they will create 'shadows' on the wave. It is then assumed that these 'shadow' regions are entirely safe, because if they contain a bullet it will be taken out by bullet intersection before the enemy bullet can hit me. It uses a tick-by-tick play-it-forward method to determine when and where the intersections occur.

Surfing Algorithm

DrussGT uses go-to surfing, which is very similar to Minimum Risk Movement. While minimum risk movement simply generates goto points and evaluates them, then goes to the one with the lowest danger, DrussGT adds an additional step: seeing how dangerous travelling to that point is, rather than simply the danger of the point itself. As such, there are three steps to the actual surfing algorithm:

  1. Point generation – creating a bunch of points which are likely to be good places to evaluate
  2. Calculating the wave-hit locations – figure out where the wave will cross me if I attempt to go to each point
  3. Danger evaluation – given the wave hit locations, how dangerous does the stats system say this location is, plus some heuristics like distance-to-enemy and wave weighting.
Point Generation

Because the amount of time to run the surfing algorithm increases quadratically (for 2 wave surfing) as the number of points increase, it is important to reduce the number of points to the bare minimum. As such, the generated points must meet the following criteria:

  • Be reachable from the current location
  • Have correct distancing
  • Cover a wide range of GF values
  • Preferably be generated in an order so that all points closer than point K can be declared 'reachable' and don't need to have the wave hit location calculated (the slow part of the algorithm).

Because of these criteria, I decided to use a precise-prediction from my current location. Because it was already available, I used the precise prediction from Apollon. I then added a filter which removed points if another point closer than 7 pixels already existed. By controlling the precise prediction I can cover all reachable GF values, without attempting to generate points which were completely unreachable; it is also easy to control distancing, and the 'order' criteria above is also met.

Wave-hit Locations

This is the most critical part of the surfing mechanics, and the easiest to make mistakes in. This is what separates a go-to surfing movement from simple minimum risk movement such as that found in HawkOnFire or Shiz, and this method is also used in a melee setting in Neuromancer. The concept behind this is to treat each go-to point as a possible movement option, and calculate where the wave would hit the robot if it attempted to drive to it.

More formally, given a starting {location, velocity, heading} and an end destination as well as an enemy wave, calculate at what point along the robot's path the wave hits the robot, and which angles in the wave would cause a bullet to hit the robot according to Robocode physics.

My implementation of this method does a tick-by-tick simulation of the robot, including turning, acceleration and movement. The required destination angle and distance remaining is continuously recalculated, just as if the robot was actually performing a go-to type movement to the selected destination. As the robot moves, the enemy wave radius expands at the bullet velocity, and the simulation continues until the enemy wave intersects with the robot position. As a true-surfing analogy, each point generated can be considered as a movement option similar to each of {forward, stop, reverse}, and the intersection point/range can be equated to the true-surfing precise-prediction result. Note that this is the actual precise prediction required in a go to movement, and the precise prediction in the points generation function is just there because it happens to create a path of points in a nice shape.

I have two methods for calculating wave hit locations, one which implements the precise intersection math to calculate the width of the robot-wave intersection and is used for the first wave, and the second which just gives an intersection point, running much faster as a result, and is used for the second wave. I tried using precise intersection on the second wave, but it ran far too slowly.

Danger Evaluation

Once I know which section of the wave will hit for each given movement option, I can evaluate how dangerous the option is. I sum the bins which correspond to the hit angles, then divide by number of bins and multiply by angle width to get rid of any aliasing effects. I also scale the danger based of the damage the wave would cause and the energy differential it would create, the amount of time the wave has to hit me, and how close the enemy is to the wave-hit location.

Second Wave Speed Optimisations

In order to speed up the second wave surfing, I devised an algorithm which would allow me to not calculate the second wave dangers in a large portion of the movement options, decreasing runtime significantly. This is done by first calculating all first wave dangers, then sorting them so that they would be processed from least dangerous to most. Then, a minimum total danger comprising sum of first and second wave danger can be kept and as the list is processed, once the danger of the first wave exceeds this minimum total danger, we can safely ignore the rest of the list because they also have a danger greater than the minimum. In this way a large portion of the second-wave can be ignored.

A second round of optimizations also happens by, before calculating the precise prediction for all points on the second wave for a given first-wave intersection point, just calculate ~20 values between min and max theoretically reachable GFs (ignoring acceleration, angles etc). This quickly gives a lower bound, meaning I can skip more of the first wave points since they are already higher than the minimum first + second wave danger.

Goto Function Behaviour Tweaks

Because I only need to get to the destination by a certain time, there are certain occasions where I might have extra time to get to the destination. In these cases I test whether accelerating or decelerating will prevent me from reaching the destination on time, and if it doesn't prevent me from reaching the destination I will do one of these options. This can throw off pattern matching and KNN/VCS targeting which use things like time-since-decel to determine enemy movements.

Also, with a go-to function, at very short distances the bearing-to-destination calculation becomes undefined. As such, if the destination is less than 1 pixel away I don't try to turn.


Targeting

The targeting system uses a fairly standard KNN gun, although I have tuned it extensively to hit rumble bot as well as possible. I use two different tunings, one for static movement with a low decay rate, and one for adaptive movement with a higher decay rate. The optimal weightings are found using saved data and genetic algorithms, with the different systems tested against different test beds to optimise their scores against different populations. There is also a random gun, which shoots between the min and max precise GF, and is basically added as a fallback to prevent someone somehow managing to dodge my statistical guns.

Data Generation

Unlike the movement, the only saved data are the store variables such as enemy time-since-decel and last-velocity to calculate acceleration. I also stored an array of enemy locations so that I can have distance-last-x attributes.

Bullet Power

DrussGT uses a fairly complex bullet-power selection algorithm. Firstly, a simple base bullet power of 1.95 is chosen, unless the enemy is near or my average hitrate is above 33%, in which case a power of 2.95 is chosen. If the standard value is chosen, it is then scaled based on my energy and the energy the enemy is expected to have at the time this bullet hits if they continue to shoot every time their gunheat is 0. The values for the enemy bullet power are predicted, similarly to those in the Gunheat Waves. Additionally, if it is determined that the enemy is shooting at a lower bullet power than DrussGT, it attempts to undercut or at least match the enemy bullet power. Finally, the bullet power is rounded to the nearest value which through trial and error was determined to exploit a bug in BasicSurfer, which many robots are based off of.

Attributes

The following attributes are used in the gun:

  • Lateral Velocity
  • Advancing Velocity
  • Distance
  • Acceleration
  • Time-since-direction-change
  • Time-since-decel
  • Distance-last-10
  • Forward wall – calculated using precise max GF
  • Reverse Wall – calculated using precise min GF
  • Current GF – this is the latest GF value that they are at, according to my targeting system.
  • My expected mirror rotation at bullet hit time
  • Number of bullets shot

An important feature which is tuned in the genetic algorithms is the normalisation function that these attributes are stored through. For instance, the time-since attributes are stored as f(x) = 1/(1+K*x) so that they normalise nicely, because their boundaries are not fixed, and a value to 100 is not so different from 500 in terms of movement prediction capabilities. Similarly, number of bullets shot should increase quickly in the beginning to learn new movement quirks like multi-mode, but not so quickly in the end since the enemy movement will be fairly stable by this point.

One quirk which I found helped slightly was treating the direction as the acceleration direction, rather than the movement direction.

Storage

The data in DrussGT's gun in stored in my own Kd-Tree, although theoretically any KD-Tree which supports KNN search should work. (The advantage my tree has over the others is that it does its own memory management, specifically to try to keep data in adjacent leaves close in memory. As such, it should perform better in situations where your data is often evicted from cache between searches.) At the data point location is the minimum and maximum GF for a hit calculated using Precise Intersection. The GF is not scaled using standard asin(8/bulletVel) calculations, but rather in the positive range using the maximum forward angle achievable by the enemy determined using precise prediction, and the negative range is calculated using the reverse.

Data Retrieval & Processing

KNN Lookup

A fairly standard KNN lookup for the current state is used, however one unique feature is the use of the Manhattan distance instead of regular Euclidean distance for evaluating distances within the KD-Tree. In my testing I found that using Euclidean distance decreased my score against real-world targets considerably, although I'm not sure if this is just a result of my attribute scaling choices or whether Manhattan is better in all situations.

Kernel Density Calculation

Rather than finding the densest area of hit angles, DrussGT uses the highest area of overlapping ranges. The scans are weighted according to a Gaussian rolloff based on the distance to the target scan, with the average of the first P scans providing a 50% weight normalisation value. In this way the rolloff adapts to increasing data density. The density calculation is done using Precise Intersection data where each looked up scan creates two 'indices', one positive and one negative, with a height of the weight of the scan, and a position at the GF of the minimum and maximum GF respectively. These indices are then all sorted by position, and an integral of the indices is calculated to determine the height of the overlap at each point. The highest peak is determined by iterating through all of the scans, and the chosen GF is the midpoint of the (square) peak.

Gun Aiming

Care is taken to aim all waves from the next location the robot will be, as this is where all bullets will be fired from. Additionally, all waves are also fired from the next location the robot will be.

Virtual Gun System

Every time a bullet is fired it's respective wave is marked as containing a bullet. When the wave is ready for removal, it is checked whether the aiming angle of the Anti-Random and the Anti-Adaptive guns fell within the hit-angles calculated by the Precise Intersection. If so, the respective gun score is increased. The random gun's score is increased by the botwidth/preciseEscapeAngleArea, to prevent aliasing and keep it accurate (why do monte-carlo with a small number of samples when you can calculate it directly?)


Utilities

Enemy Bullet Power Prediction

This uses a KD-Tree to predict the enemy bullet power based on the input data of {enemyEnergy,myEnergy,distance}, which was found to be the basis for most bullet-power choosing algorithms. When a bullet power needs to be predicted, a KNN search for the min(sqrt(size), 20) nearest cases is performed. A kernel density calculation using pseudo-Gaussian kernels is then performed to find the bullet power with the highest score, which is then chosen as the predicted power.

Bullet Shielding

The Bullet Shielding implementation is based off of that in EnergyDome. It uses the same precise line-circle (ie. bullet-wave) intersection code used in my bullet shadow implementation for my movement, making it very accurate and good at shooting down marginal cases. However, because bullet shielding does not work against many bots, this mode is only continued until 50 bullet damage is received. Once this threshold is reached, the bullet shielding is disabled and the DrussGT continues with its regular movement.

FastTrig

I wrote a fast trig library which replaces most trig calls with a bunch of Multiply-Add (MA) operations. Originally this was table based, inspired by the other User:Rednaxela/FastTrig libraries, but I found that although the tables benchmarked well, in real life they caused lots of cache hits and ended up being slower than the MA approach. The constants in the MA were calculated from Chebychev equations to minimize the maximum error on the curve, or from a paper I found which had done it already. Some of these were merged back into the FastTrig page, but I think some are still only in my own code.