calculating confidence of an APS score

Fragment of a discussion from Talk:RoboRunner

Some results with normal battles. Diamond 1.8.16 vs 50 random bots for 10 seasons.

  • Dumb battles: took 6338.8s, 89.87 ± 0.188
  • Smart battles: took 6010.6s, 89.94 ± 0.148

Looks like smart battles got down to ~0.18 by 5 seasons. Right now I'm using a much rougher calculation to print the overall confidence between battles, for speed. I plan to improve it by caching the random samples used for the overall scores. The final score gets a much more thorough calculation.
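To make the "rough vs. thorough" distinction concrete, here is a minimal Python sketch. It is not RoboRunner's actual code. It assumes the overall score is the plain average of per-bot average scores, that the rough figure is a closed-form standard error, and that the thorough figure comes from resampling each bot's battles at random. The names `analytic_margin` and `sampled_margin` and the 1.96 multiplier (a 95% interval under a normal approximation) are all hypothetical choices for illustration.

```python
import math
import random
import statistics


def analytic_margin(scores_by_bot, z=1.96):
    """Rough margin: combine per-bot standard errors in closed form.

    Assumes the overall score is the mean of per-bot mean scores and
    that each bot has at least two battles.
    """
    variance_sum = 0.0
    for scores in scores_by_bot.values():
        # Squared standard error of this bot's mean score.
        variance_sum += statistics.variance(scores) / len(scores)
    return z * math.sqrt(variance_sum) / len(scores_by_bot)


def sampled_margin(scores_by_bot, draws=2000, z=1.96, seed=0):
    """Slower margin: resample battles and measure the spread directly."""
    rng = random.Random(seed)
    overall = []
    for _ in range(draws):
        total = 0.0
        for scores in scores_by_bot.values():
            # Bootstrap: redraw this bot's battles with replacement.
            resample = rng.choices(scores, k=len(scores))
            total += statistics.fmean(resample)
        overall.append(total / len(scores_by_bot))
    # Spread of the simulated overall scores, scaled to an interval.
    return z * statistics.stdev(overall)
```

With 10 battles per bot the two figures should land close together, with the resampled one typically a little smaller. Caching the per-bot resamples between battles, so only the bot that just fought needs new draws, would be one way to make the sampled version cheap enough to print after every battle.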

It's a slightly different calculation with the scoring groups, so maybe I only have a bug there. Or maybe there just wasn't much difference in the TCRM. Or maybe TC scores are so far from normally distributed that it throws off the calculation. Or maybe it was just a fluke, though getting the same confidence down to 3 digits seems pretty unlikely even with the same battle selection.

Voidious 19:38, 14 August 2012