smart battles

Jump to navigation Jump to search
Revision as of 12 August 2012 at 20:54.
The highlighted comment was created in this revision.

smart battles

So I'm planning to implement smart battle selection this weekend. Every bot (or bot set) will get at least two battles, then I will choose battles to run (in batches since I don't want idle threads) based on trying to decrease standard error in the least amount of time. Maybe with some random battles sprinkled in as well.

I'm thinking I will choose bots with the highest value for: <math>{{stDev \over \sqrt{numBattles}} - {stDev \over \sqrt{numBattles + 1}}} \over {avgBattleTime}</math>

I think this will lead to an overall result with the highest confidence in the least amount of time.

I like testing against a test bed with an average score about the same as my RoboRumble APS. The problem with this is it includes a lot of bots with super low variance (eg, 99.9% scores), so running lots of battles against them is a waste of time. But ignoring them and using a stronger test bed risks specializing against stronger bots.

    Voidious16:52, 10 August 2012

    That looks like a good metric for choosing fast stability. Now I'm wishing I'd included variance in the LiteRumble scores...

      Skilgannon18:33, 10 August 2012
       

      Yeah, do you just store a running tally of average score? I'll need to update RoboRunner to keep scores from every individual battle, too, along with battle times.

        Voidious20:15, 10 August 2012
         

        Yeah, I do a online mean calculation, so newMean = oldMean*(n/(n+1)) + newScore/(n+1), n++

        I've actually thought quite a bit about this, and it all depends what score you're trying to stabilise. If you're trying to stabilise the PL, for instance, you need to run lots of battles for pairings at or near the 50/50 mark. If you're doing Schultz then lots of battles need to go to where a weak bot beat a strong bot. It's all about which battle has the most potential influence.

          Skilgannon07:54, 11 August 2012
           

          Got this working, just dogfooding it a bit myself before posting it since it's a pretty major change. Data files are now (gzipped) XMLs with the raw scores from every battle and everything's recalculated on the fly. (That was actually most of the work.) Comes out to about 100 kb for 3k battles.

          It runs 2 seasons vs each bot then does smart battle selection with the formula above to try to increase overall accuracy as quickly as possible. It's nice to see test runs where only 2 battles were run vs HawkOnFire. =) 5% of the time, it instead chooses randomly among the bots with fewest battles, to try to mitigate cases where the variance was randomly low in the initial battles. (I can make this configurable if/when anyone cares.)

          It won't schedule two battles vs the same bot unless the number of bots is <= the number of threads. Otherwise, you'd keep scheduling that bot until the battle finishes. I could instead estimate how many times in a row it would still be worth scheduling it, but that seems like a lot of work for a corner case.

          I think this is going to save a heck of a lot of CPU time. The XML data files will also make it easier to let you store arbitrary score data in the custom scoring stuff.

            Voidious21:50, 12 August 2012

            Though I'm still figuring out how to avoid potentially corrupting the data file if you ctrl-C your run. I'm not sure if skipping the gzipping would help or if it's just become more likely because the data files are so much larger. Maybe I need to add a keyboard option to safely exit.

              Voidious21:54, 12 August 2012