Pris/VersionHistory
- 0.88 - Death to MirrorMicro
- Added a new movement network input based on the Guess Factor currently being visited (i.e. if the enemy fired targeting waves every tick, this is the GF of the currently breaking wave at the moment the enemy fires); a sketch of this computation follows this list.
- This should improve the score against MirrorMicro to about 80%, up from 45%!
- Many thanks to Positive and Skilgannon for pointing this out.
- Incidentally, I implemented this without using waves...
- Added better GF0 and circular aim avoidance for the period before the NN has enough training data for surfing.
- Fixed a few misc bugs and probably added even more.
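For readers unfamiliar with the idea, here is a minimal sketch of computing the guess factor of the currently breaking wave. The EnemyWave class, its fields and the normalization are illustrative assumptions, not Pris's actual code (which, as noted above, avoids waves entirely).

```java
// Hypothetical wave bookkeeping for illustration; not Pris's actual classes or fields.
class EnemyWave {
    double sourceX, sourceY;   // enemy position when the wave was fired
    double directAngle;        // absolute bearing from the enemy to us at fire time
    double bulletVelocity;     // 20 - 3 * bulletPower
    int lateralDirection;      // sign of our lateral velocity at fire time (+1 or -1)

    /** Guess factor of the point on this wave currently breaking over our position. */
    double currentGuessFactor(double myX, double myY) {
        double offset = Math.atan2(myX - sourceX, myY - sourceY) - directAngle;
        offset = Math.atan2(Math.sin(offset), Math.cos(offset));  // normalize to [-PI, PI]
        double maxEscapeAngle = Math.asin(8.0 / bulletVelocity);  // 8.0 = max robot speed
        return Math.max(-1.0, Math.min(1.0, lateralDirection * offset / maxEscapeAngle));
    }
}
```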
- 0.86 - 2 Networks
- RoboRumble ‒ APS: 82.24% (18th), PL: 735-6 (5th), Survival: 91.42%
- Added a second network with hidden inputs for improved surfing danger values.
- Movement network training now uses an exponential moving average scheme to combine old data with new (see the sketch after this list).
- Added a "circularity" input to movement network.
- Tweaked aiming method for more precision (thanks to rsim's BulletCatcher for making me notice this); still using Gaff's older gun.
- Switched to a better bullet power strategy, replacing the ancient one that was accidentally included.
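A minimal sketch of the exponential moving average idea mentioned above; the ALPHA value and method shape are assumptions, not the actual training code.

```java
/**
 * Minimal sketch of blending an old stored training target with a new observation
 * using an exponential moving average. ALPHA is an assumed smoothing factor.
 */
final class TargetSmoother {
    private static final double ALPHA = 0.2; // assumed weight given to the newest data

    static double update(double oldTarget, double newObservation) {
        // Keep most of the accumulated value, nudging it toward the latest observation.
        return (1.0 - ALPHA) * oldTarget + ALPHA * newObservation;
    }
}
```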
- 0.84 - Neural Surfing
- RoboRumble ‒ APS: 80.77% (33rd), PL: 731-11 (7th), Survival: 89.68%
- Another dev version; this one was scoring as high as the latest (unreleased) version of Holden, so out into the rumble it goes!
- Uses the latest version of Holden's wave surfing but with a twist: danger values are generated by a neural network (a rough sketch of the idea follows this list). Many of the concepts used are similar to Gaff's Targeting, although training was (and remains) a tricky problem to solve.
- Framework updates give this version anti-ramming and lots of bugfixes/refinements since 0.82.
- Still uses the older version of Gaff's gun (i.e. the published one)
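A rough sketch of querying a network for a candidate point's danger while surfing; the NeuralNet interface, feature choices and normalization constants are placeholders, not the actual Holden/Pris code.

```java
// Hypothetical surfing-danger query; interface and inputs are assumptions.
interface NeuralNet {
    double[] evaluate(double[] inputs); // forward pass returning output activations
}

final class SurfDangerEstimator {
    private final NeuralNet net;

    SurfDangerEstimator(NeuralNet net) { this.net = net; }

    /** Estimated danger of ending up at a given guess factor on the surfed wave. */
    double danger(double guessFactor, double distance, double lateralVelocity) {
        double[] inputs = {
            guessFactor,                     // candidate point on the wave, in [-1, 1]
            Math.min(distance / 800.0, 1.0), // normalized distance to the enemy
            Math.abs(lateralVelocity) / 8.0  // normalized lateral speed
        };
        return net.evaluate(inputs)[0];      // single output read as a danger value
    }
}
```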
- 0.82 - Surfing Learner
- RoboRumble ‒ APS: 72.39% (74th), PL: 647-91 (72nd), Survival: 82.22%
- Introduced a variation on Wave Surfing that used the current state and wave danger as inputs to a Reinforcement Learning algorithm (a simplified sketch follows this list).
- Combining WS+RL needed a lot more work and didn't seem to make sense given how straightforward danger evaluation can be; CPU time might be put to better use learning other heuristics (e.g. bullet power selection).
- Retired 27-Aug-2009
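For context on the 0.82 approach described above, here is a heavily simplified sketch of learning values for discretized states and movement options from observed outcomes; the state buckets, reward shape and learning rate are assumptions, not the released 0.82 code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tabular value learner over (discretized state, movement option) pairs. */
final class SurfValueLearner {
    private static final double LEARNING_RATE = 0.1; // assumed step size
    private final Map<String, Double> values = new HashMap<>();

    private String key(int distanceBucket, int gfBucket, int moveOption) {
        return distanceBucket + ":" + gfBucket + ":" + moveOption;
    }

    /** Move the stored value toward the observed reward (e.g. -wave danger, or -1 on a hit). */
    void update(int distanceBucket, int gfBucket, int moveOption, double reward) {
        String k = key(distanceBucket, gfBucket, moveOption);
        double old = values.getOrDefault(k, 0.0);
        values.put(k, old + LEARNING_RATE * (reward - old));
    }

    /** Current estimate used when choosing between movement options. */
    double value(int distanceBucket, int gfBucket, int moveOption) {
        return values.getOrDefault(key(distanceBucket, gfBucket, moveOption), 0.0);
    }
}
```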
- 0.36c - Retrained & repackaged
- RoboRumble ‒ APS: 69.76% (91st), PL: 653-73 (69th), Survival: 80.58%
- Same as 0.36, but freshly trained and packed with care
- 0.36 - Bullet power experiment + quick fix
- Quick fix of 0.34's problem using the latest experimental version on hand
- Ranked 129 in RoboRumble -- 63.77 APS, 1537 ELO, 1786 Glicko2, 2119 battles (8-Jun-2009)
- Scored 5-10% lower against the same bots as in the benchmark, so this version may have been mis-packaged
- 0.34 - Expanded learning
- Added many new learning inputs including hit counts and movement history
- Considers more movement options
- Dropped exploration/randomness way down
- Scores 1% higher than Gaff on the benchmark test
- Ranked 167 in RoboRumble -- 62.62 APS, 1526 ELO, 1769 Glicko2, 1237 battles (7-Jun-2009)
- Forgot to comment out code specific to 1.6+ versions of Robocode and received some 0 scores from 1.5.4 clients.
- 0.20 - Development release
- First version to score in the same neighbourhood as Gaff in my 1v1 test bed.
- Ranked 99 in RoboRumble -- 69.02 APS, 1619 ELO, 1857 Glicko2, 2008 battles (5-Jun-2009)