calculating confidence of an APS score

Fragment of a discussion from Talk:RoboRunner

Those ± values: are they the standard error or the standard deviation?
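The distinction matters because the two differ by a factor of sqrt(n): the standard deviation describes the spread of individual battle scores, while the standard error describes the uncertainty of the averaged score and shrinks as more battles are run. A minimal sketch of the difference, using made-up scores (the `scores` list and the function names are hypothetical and not taken from RoboRunner):

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sample_stddev(xs):
    # Spread of the individual battle scores (Bessel-corrected, n - 1).
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def standard_error(xs):
    # Uncertainty of the mean score: stddev / sqrt(n).
    return sample_stddev(xs) / math.sqrt(len(xs))

# Hypothetical per-battle scores against one opponent.
scores = [70.0, 72.0, 68.0, 74.0]

print(f"mean      = {mean(scores):.2f}")
print(f"stddev    = {sample_stddev(scores):.2f}")
print(f"std error = {standard_error(scores):.2f}")
```

With four battles the standard error is half the standard deviation; with 100 battles it would be a tenth of it, so which one the ± refers to changes how the table should be read.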

The only thing I can think of testing is whether you are calculating the right number of random battles for each in the Monte Carlo method. If you were only doing one battle for each, the numbers you are getting would be the same for the standard battles as for the smart battles. It looks like the prioritisation is working well, though: Sparrow and Yngwie both have a low number of battles as well as a low error/stddev.

Skilgannon 14:16, 14 August 2012