Pris/VersionHistory
Revision as of 16:42, 27 August 2009
- 0.84 - Neural Surfing
  - Another development version; this one was scoring as high as the latest (unreleased) version of Holden, so out into the rumble it goes!
  - Uses the latest version of Holden's wave surfing, but with a twist: the danger values are generated by a neural network (a rough sketch of the idea follows this entry). Many of the concepts used are similar to Gaff's Targeting, although training was (and remains) a tricky problem to solve.
  - Framework updates give this version anti-ramming and lots of bugfixes/refinements since 0.82.
  - Still uses an older version of Gaff's gun (i.e. the published one)
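
Pris's network internals were never published, so the following is only a minimal sketch of the general idea, assuming a tiny one-hidden-layer feed-forward net: each candidate point along a surfed wave is described by a few hand-picked features, the net scores it as a danger value, and the net is trained online toward 1 wherever a bullet actually hit. The feature choice, layer sizes, and learning rate are all invented for illustration.

```java
import java.util.Random;

/** Hypothetical danger estimator for wave surfing; not Pris's actual code. */
public class DangerNet {
    private static final int INPUTS = 4;  // e.g. guess factor, distance, lateral velocity, acceleration
    private static final int HIDDEN = 8;
    private final double[][] w1 = new double[HIDDEN][INPUTS + 1]; // +1 column for the bias weight
    private final double[] w2 = new double[HIDDEN + 1];           // +1 slot for the output bias
    private final double[] hidden = new double[HIDDEN];
    private final Random rnd = new Random(42);

    public DangerNet() {
        // small random initial weights
        for (double[] row : w1)
            for (int i = 0; i < row.length; i++) row[i] = rnd.nextGaussian() * 0.1;
        for (int i = 0; i < w2.length; i++) w2[i] = rnd.nextGaussian() * 0.1;
    }

    /** Forward pass: features of one candidate wave point -> danger in (0,1). */
    public double danger(double[] f) {
        for (int h = 0; h < HIDDEN; h++) {
            double sum = w1[h][INPUTS]; // bias
            for (int i = 0; i < INPUTS; i++) sum += w1[h][i] * f[i];
            hidden[h] = Math.tanh(sum);
        }
        double out = w2[HIDDEN]; // bias
        for (int h = 0; h < HIDDEN; h++) out += w2[h] * hidden[h];
        return 1.0 / (1.0 + Math.exp(-out));
    }

    /** One online backprop step toward target (1 = a bullet hit here, 0 = safe). */
    public void train(double[] f, double target, double rate) {
        double out = danger(f);                         // also refreshes hidden[]
        double dOut = (out - target) * out * (1 - out); // sigmoid derivative
        for (int h = 0; h < HIDDEN; h++) {
            double dHid = dOut * w2[h] * (1 - hidden[h] * hidden[h]); // tanh derivative
            w2[h] -= rate * dOut * hidden[h];
            for (int i = 0; i < INPUTS; i++) w1[h][i] -= rate * dHid * f[i];
            w1[h][INPUTS] -= rate * dHid;               // hidden bias
        }
        w2[HIDDEN] -= rate * dOut;                      // output bias
    }
}
```

A surfing movement would score each reachable point with danger() and steer toward the minimum, then call train() on every point's features once the wave breaks.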
- 0.82 - Surfing Learner
  - RoboRumble ‒ APS: 72.39% (74th), PL: 647-91 (72nd), Survival: 82.22%
  - Introduced a variation on Wave Surfing which used the current state and wave danger as inputs to a Reinforcement Learning algorithm (see the sketch after this entry).
  - Combining WS+RL needed a lot more work and didn't seem to make sense given how straightforward danger evaluation can be. CPU time might be put to better use learning other heuristics (e.g. bullet power selection).
  - Retired 27-Aug-2009
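
The entry doesn't name the RL algorithm, so here is one plausible minimal reading: tabular Q-learning over a coarse state built from binned wave danger and distance, choosing among three stock surfing moves. The discretization, action set, and constants below are assumptions, not Pris's actual design.

```java
import java.util.Random;

/** Hypothetical WS+RL sketch: Q-learning picks a surfing move per wave. */
public class SurfLearner {
    enum Action { GO_CLOCKWISE, STOP, GO_COUNTER }  // candidate surfing moves

    private static final int DANGER_BINS = 5, DIST_BINS = 4;
    private static final double ALPHA = 0.2, GAMMA = 0.9, EPSILON = 0.05;
    private final double[][][] q = new double[DANGER_BINS][DIST_BINS][Action.values().length];
    private final Random rnd = new Random();

    // inputs must be normalized to [0, 1]
    private static int bin(double x, int bins) {
        return Math.min(bins - 1, (int) (x * bins));
    }

    /** Pick a movement option given normalized wave danger and enemy distance. */
    public Action choose(double danger, double distance) {
        double[] row = q[bin(danger, DANGER_BINS)][bin(distance, DIST_BINS)];
        if (rnd.nextDouble() < EPSILON)                 // rare exploration
            return Action.values()[rnd.nextInt(row.length)];
        int best = 0;
        for (int a = 1; a < row.length; a++) if (row[a] > row[best]) best = a;
        return Action.values()[best];
    }

    /** Q-update once the wave breaks; reward might be +1 dodged, -1 hit. */
    public void update(double danger, double distance, Action action,
                       double reward, double nextDanger, double nextDistance) {
        double[] row = q[bin(danger, DANGER_BINS)][bin(distance, DIST_BINS)];
        double[] next = q[bin(nextDanger, DANGER_BINS)][bin(nextDistance, DIST_BINS)];
        double maxNext = next[0];
        for (int a = 1; a < next.length; a++) maxNext = Math.max(maxNext, next[a]);
        int a = action.ordinal();
        row[a] += ALPHA * (reward + GAMMA * maxNext - row[a]);
    }
}
```

As the entry notes, the danger term alone already tells a surfer which move is safest, which is why the extra RL machinery didn't pay its way here.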
- 0.36c - Retrained & repackaged
  - RoboRumble ‒ APS: 69.76% (91st), PL: 653-73 (69th), Survival: 80.58%
  - Same as 0.36, but freshly trained and packed with care
- 0.36 - Bullet power experiment (sketched below) + quick fix
  - Quick fix of 0.34's problem using the latest experimental version on hand
  - Ranked 129 in RoboRumble -- 63.77 APS, 1537 ELO, 1786 Glicko2, 2119 battles (8-Jun-2009)
  - Scored 5-10% lower against the same bots as in the benchmark, so this version may have been mis-packaged
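
What the bullet power experiment actually tried isn't recorded here, so the snippet below only illustrates the usual heuristic family such experiments explore: trading the energy cost of firing against expected damage using distance, own energy, and the enemy's remaining energy. The damage arithmetic in the comments follows standard Robocode rules; every threshold is invented for illustration.

```java
/** Hypothetical bullet-power heuristic; not Pris's actual experiment. */
public final class PowerChooser {

    static double choosePower(double distance, double myEnergy, double enemyEnergy) {
        // closer targets are easier to hit, so spend more energy on them
        double power = distance < 150 ? 3.0 : distance < 450 ? 1.9 : 1.1;

        // don't spend more than needed to finish a low-energy enemy:
        // Robocode damage is 4*p, plus an extra 2*(p-1) when p > 1
        double killPower = enemyEnergy <= 4 ? enemyEnergy / 4 : (enemyEnergy + 2) / 6;
        power = Math.min(power, killPower);

        // conserve our own energy when running low (firing costs p energy)
        if (myEnergy < 20) power = Math.min(power, myEnergy / 10);

        // stay within the legal 0.1..3.0 range
        return Math.max(0.1, Math.min(3.0, power));
    }
}
```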
- 0.34 - Expanded learning
  - Added many new learning inputs, including hit counts and movement history
  - Considers more movement options
  - Dropped exploration/randomness way down
  - Scores 1% higher than Gaff on the benchmark test
  - Ranked 167 in RoboRumble -- 62.62 APS, 1526 ELO, 1769 Glicko2, 1237 battles (7-Jun-2009)
  - Forgot to comment out code specific to 1.6+ versions of Robocode and received some 0 scores from 1.5.4 clients (a defensive pattern that avoids this is sketched after this entry)
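
The failure mode behind those 0 scores: a bot compiled against a newer Robocode jar throws NoSuchMethodError (or NoClassDefFoundError) when an older client runs it, and a crashed bot scores nothing. One defensive pattern is to probe for the newer API reflectively before using it. The class and method names passed to the probe would be whatever calls are new; the names in the usage note below are placeholders, not specific Robocode 1.6 additions.

```java
/** Capability probe so a bot can skip newer-API calls on older clients. */
public final class ApiGuard {

    /** Returns true only if the named method exists on the running client. */
    public static boolean supports(String className, String methodName, Class<?>... argTypes) {
        try {
            Class.forName(className).getMethod(methodName, argTypes);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // older client: the class itself is missing
        } catch (NoSuchMethodException e) {
            return false; // class exists, but this method was added later
        }
    }
}
```

A bot would call something like ApiGuard.supports("robocode.AdvancedRobot", "someNewMethod") once at startup ("someNewMethod" being a placeholder) and fall back to older behaviour when it returns false.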
- 0.20 - Development release
  - First version to score in the same neighbourhood as Gaff in my 1v1 test bed.
  - Ranked 99 in RoboRumble -- 69.02 APS, 1619 ELO, 1857 Glicko2, 2008 battles (5-Jun-2009)