Calculating Confidence

Jump to navigation Jump to search
Revision as of 10 December 2012 at 00:32.
The highlighted comment was created in this revision.

Calculating Confidence

@Voidious -- I'm not sure what your plan for confidence is, but I eagerly went ahead and developed my own confidence calculator. I was looking over your code for calculating confidence and was having trouble following it, so I instead went to my wife's Principles of Biostatistics book and read the chapter on Confidence Intervals. For the sake of simplicity, I will stick with 95% confidence intervals, as that is what you used in your code (that's where the 1.96 comes from) and it seems reasonable. The confidence interval for a single robot turns out to be pretty simple to calculate (in special-character-challenged terms, it is x +- 1.96 * s / sqrt(n), where x is the mean, s is the standard deviation, and n is the sample size). Where it gets more complicated is in calculating the confidence interval for groups and the overall total score.

Lets talk groups first. What I did for a group was to take the first score for each opponent, average them all, and that becomes data point 1. Then take the second score for each opponent, average them, and that becomes data point 2. I determine how many data points to use by calculating the average number of battles for an opponent in the group, rounded. This means some data points for opponents with more scores end up getting thrown away, and some data points for opponents with fewer scores don't have enough scores. For the latter, I use as many extra randomly generated scores as I need where the random score falls within the confidence interval of scores for that particular robot. Once I have all of the data points, I then use the original means for calculating a confidence interval on the collected data points.

Now for the overall total. If there is only 1 group (or no groups, depending on how you look at it), then there is nothing more to do -- use the values calculated for the 1 group. But if there are multiple groups, then what? We should probably respect that the overall total is an average of the group totals. This would end up being just like calculating the group confidence intervals, only treating the groups like the robots.

Did that make sense? How is this different from what what you have done in RoboRunner?

    Skotty00:36, 10 December 2012

    Heh, well, what I did is a little complicated, but I think it's about the best you can do for a set of bots that each have their own distributions. Basically I run 1,000 or whatever random simulations of the overall score, based on the averages / standard deviations of each individual bot's score distribution. Then I can take those "overall score" samples, supposedly generated from the same distribution as the real scores, and use them as additional samples to calculate the confidence interval of the overall score. It's a fairly basic Monte Carlo method.

      Voidious01:23, 10 December 2012
       

      I see there was a discussion about it on the RoboRunner page. I should probably go read that. Never heard of the Monte Carlo method, so I'll look into it.

        Skotty01:27, 10 December 2012
         

        I'd heard the term, but it was totally Skilgannon that knew enough to suggest it. Once I looked into it, though, it was pretty simple.

        But I also wanted to mention, I was planning to pass some object with all the confidence interval info you might need about the current battle in the new listener. I figured that was among the things you'd want in the application output, since it's among the things I print in the console version. But of course you're free to use whatever you like. :-)

          Voidious01:32, 10 December 2012