Robot Benchmarking Thoughts
So, it recently occurred to me that the targeting/movement challenges are currently judged on the score outcome, but it really all comes down to hitrate. The thing is, there should be less noise/variability in a measure of only hitrate. So... to facilitate this, I started writing a tool that uses the "robocode.control.*" APIs to measure hitrate without needing to modify bots to keep track of it, and I'm thinking I ought to expand this tool to gather lots of information from the battle. Anyone have any suggestions of information I should track?
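To make the idea concrete, here is a minimal sketch of the bookkeeping such a tool might do: counting shots and hits per round so that hitrate can be reported round by round as well as overall. The class and method names (HitRateTracker, recordShot, and so on) are hypothetical and not part of Robocode. How each shot and its outcome gets detected through "robocode.control.*" is deliberately left out, since that depends on how the tool reads the battle state.

```java
import java.util.Arrays;

/** Hypothetical per-round hit-rate accumulator; not part of Robocode itself. */
final class HitRateTracker {
    private final long[] fired; // bullets fired, indexed by round number
    private final long[] hit;   // bullets that hit the opponent, indexed by round number

    HitRateTracker(int roundsPerBattle) {
        fired = new long[roundsPerBattle];
        hit = new long[roundsPerBattle];
    }

    /** Record one resolved bullet; the caller is assumed to know the round and the outcome. */
    void recordShot(int round, boolean hitTarget) {
        fired[round]++;
        if (hitTarget) {
            hit[round]++;
        }
    }

    /** Hitrate for a single round, or NaN if nothing was fired in that round. */
    double hitRate(int round) {
        return fired[round] == 0 ? Double.NaN : (double) hit[round] / fired[round];
    }

    /** Hitrate over all rounds recorded so far, or NaN if nothing was fired. */
    double overallHitRate() {
        long totalFired = Arrays.stream(fired).sum();
        long totalHit = Arrays.stream(hit).sum();
        return totalFired == 0 ? Double.NaN : (double) totalHit / totalFired;
    }
}
```

Keeping the counts per round, and summing the same tracker over many battles, is what would allow a hitrate-versus-round graph rather than a single number.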
Here's an interesting pair of graphs from 500 battles of DrussGT versus Diamond
It seems that Diamond keeps its hitrate up into the 2nd round whereas DrussGT's falls sharply right away, but it doesn't fall as far and seems to recover more. I think this type of hitrate graph could be rather interesting for investigating the dynamics of surfers versus adaptive targeting...
During the first round the surfing is very predictable, making the guns learn much quicker; there is also very little data in the gun, so it acts like a very fast rolling gun.
I wonder how these are affected by when flatteners and alternate guns are enabled and disabled?
Also, keep in mind that DrussGT fires lower bullet power (or did, last time I checked).
Yep, I'm aware. It doesn't surprise me that the hitrate is much higher in the first round, though I do find it interesting that Diamond manages to keep the hitrate high for a little longer.
Yeah, that would be interesting. Well, if robots output that information in the terminal, or (better yet) track it in "AdvancedRobot.setDebugProperty(key, value)", it would be easy enough for me to make this tool keep track of it...
That is a good point, though I mostly am planning on doing hitrate analysis for comparing different versions of a bot, with the fixed bullet power of the movement/targeting challenges.
That sounds sweet, Rednaxela! I've had the same thoughts about TC scoring being very odd compared to hit rate - it (basically) gives a 100 for anything over a certain hit rate in a round, then scales down to 0 for anything under that. Not having that distortion is one thing I really like about using WaveSim.
Another thing I would suggest adding is energy ratio ((damage done + energy gained) / energy fired). That's what I key off of when using WaveSim, and to me, it seems like the best "real world" fitness measurement.
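As a rough sketch of that energy ratio, the snippet below applies the standard Robocode bullet rules: a hit deals 4 * power damage, plus 2 * (power - 1) when power is above 1, and returns 3 * power energy to the shooter. The class name EnergyRatio and the array-based input are hypothetical. Treating "energy gained" as the energy returned on a hit is an assumption about what was meant.

```java
/** Hypothetical helper for the energy-ratio metric; not part of Robocode or WaveSim. */
final class EnergyRatio {
    /** Damage dealt by a bullet of the given power under the standard Robocode rules. */
    static double bulletDamage(double power) {
        return 4 * power + (power > 1 ? 2 * (power - 1) : 0);
    }

    /** Energy returned to the shooter when a bullet of the given power hits. */
    static double energyReturned(double power) {
        return 3 * power;
    }

    /**
     * (damage done + energy gained) / energy fired.
     * powers[i] is the power of the i-th bullet and hits[i] says whether it hit.
     * Returns NaN if no energy was fired.
     */
    static double energyRatio(double[] powers, boolean[] hits) {
        double spent = 0;
        double gained = 0;
        for (int i = 0; i < powers.length; i++) {
            spent += powers[i];
            if (hits[i]) {
                gained += bulletDamage(powers[i]) + energyReturned(powers[i]);
            }
        }
        return spent == 0 ? Double.NaN : gained / spent;
    }
}
```

For example, four power-2 bullets with one hit spend 8 energy and yield 10 damage plus 6 returned energy, for a ratio of 2.0. Unlike raw hitrate, this weights each shot by its bullet power, which matters when comparing bots that fire at different powers.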
Another interesting thing to me would be which bot is approaching/retreating most, or which bot is usually occupying center field vs being pushed to the corners. It's generally advantageous APS-wise to keep as much distance as possible, but between two evenly matched bots, the bot with the lower desired distance is sometimes at a huge advantage, since it has a bigger MEA (maximum escape angle) than the bot stuck in the corner.