calculating confidence of an APS score

Fragment of a discussion from Talk:RoboRunner

Well, the verdict is in. Looks like it was a combination of a fluke and the TCRM battles just not being particularly optimizable. I ran another 25 seasons each way and got:

  • Dumb battles: Took 2690.4s, 89.13 ± 0.362
  • Smart battles: Took 2858.8s, 89.4 ± 0.338

So this time smart battles actually took longer, but had a tighter confidence interval and were again much closer to the true average. I also tested that the groups and non-groups versions of overall confidence gave the same result for TCRM (because the groups are of equal size). I'm going to skip any fancy attempts to optimize for a more accurate overall confidence between battles, round the final confidence to 2 digits instead of 3, and get this posted.
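To illustrate why the groups and non-groups versions agree when groups are of equal size, here is a minimal Python sketch. It is not RoboRunner's actual code: it assumes each bot's confidence comes from the standard error of its per-battle scores, that the overall score is an unweighted average, and that a z-value of 1.96 is used. The function names and the sample scores are hypothetical.

```python
import math
from statistics import stdev

def standard_error(scores):
    """Standard error of the mean for one bot's per-battle scores."""
    return stdev(scores) / math.sqrt(len(scores))

def combine(std_errors):
    """Standard error of an unweighted average of independent means."""
    return math.sqrt(sum(se * se for se in std_errors)) / len(std_errors)

def flat_confidence(groups, z=1.96):
    """Non-groups version: treat all bots as one flat list."""
    all_se = [standard_error(scores) for group in groups for scores in group]
    return z * combine(all_se)

def grouped_confidence(groups, z=1.96):
    """Groups version: combine within each group, then across groups."""
    group_se = [combine([standard_error(s) for s in group]) for group in groups]
    return z * combine(group_se)

# Hypothetical per-battle scores: 2 groups of 2 bots each (equal sizes).
equal_groups = [
    [[88.0, 90.5, 89.2], [91.0, 87.5, 89.9]],
    [[70.1, 72.4, 69.8], [95.0, 94.2, 96.1]],
]

# Same bots, regrouped as 1 + 3 (unequal sizes).
unequal_groups = [
    [[88.0, 90.5, 89.2]],
    [[91.0, 87.5, 89.9], [70.1, 72.4, 69.8], [95.0, 94.2, 96.1]],
]

print(round(flat_confidence(equal_groups), 2),
      round(grouped_confidence(equal_groups), 2))
print(round(flat_confidence(unequal_groups), 2),
      round(grouped_confidence(unequal_groups), 2))
```

Under these assumptions the two values on the first printed line match, because with equal group sizes every bot ends up with the same weight either way. With unequal groups the weights differ, so the two versions can diverge.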

Voidious 03:31, 15 August 2012