Calculating Confidence
Heh, well, what I did is a little complicated, but I think it's about the best you can do for a set of bots that each have their own distributions. Basically I run 1,000 or whatever random simulations of the overall score, based on the averages / standard deviations of each individual bot's score distribution. Then I can take those "overall score" samples, supposedly generated from the same distribution as the real scores, and use them as additional samples to calculate the confidence interval of the overall score. It's a fairly basic Monte Carlo method.
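A minimal sketch of that kind of simulation, not RoboRunner's actual code. The function names and the `(mean, std_dev, num_battles)` tuple layout are made up for illustration, and it assumes each bot's average score can be drawn from a normal distribution whose spread is that bot's standard error.

```python
import random
import statistics


def simulate_overall_scores(bot_stats, num_simulations=1000, seed=None):
    """Simulate the overall score many times.

    bot_stats is a list of (mean, std_dev, num_battles) tuples, one per
    opposing bot. This layout is hypothetical, chosen for the example.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_simulations):
        # Assumption: each bot's average score is roughly normal, with a
        # spread equal to its standard error (std_dev / sqrt(num_battles)).
        draws = [rng.gauss(mean, std_dev / n ** 0.5)
                 for mean, std_dev, n in bot_stats]
        # The overall score is the plain average over all bots.
        samples.append(sum(draws) / len(draws))
    return samples


def confidence_interval(samples, confidence=0.95):
    """Interval around the simulated mean, from the spread of the samples."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    center = statistics.mean(samples)
    margin = z * statistics.stdev(samples)
    return center - margin, center + margin
```

Calling `confidence_interval(simulate_overall_scores(stats))` gives a rough interval for the overall score; more simulations give a steadier estimate at the cost of run time.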
I see there was a discussion about it on the RoboRunner page. I should probably go read that. Never heard of the Monte Carlo method, so I'll look into it.
I'd heard the term, but it was totally Skilgannon that knew enough to suggest it. Once I looked into it, though, it was pretty simple.
But I also wanted to mention, I was planning to pass some object with all the confidence interval info you might need about the current battle in the new listener. I figured that was among the things you'd want in the application output, since it's among the things I print in the console version. But of course you're free to use whatever you like. :-)
I'll use it if it's there. I use the ScoreLog to show data from past battles, and wasn't sure if confidence information would also be available from the ScoreLog after your updates. If not, I can keep using my own confidence calculator for past data.
Hmm. Well, first off, I am pretty sure you should be using the t-distribution, not the normal distribution. Using that, I would generate a confidence interval for each individual bot. I am nearly certain that there is a way to generate a confidence interval for the mean of several values that each have their own interval. I can't remember it off the top of my head, but I vaguely recall it being something like the square root of the sum of the squares of the standard errors (not the standard deviations, since the sample size is presumably fairly small). I'll tell you if I can find it.
http://www.hilemansblog.com/?tag=root-sum-of-squares and https://www.westgard.com/lesson35.htm#6
I didn't read through them carefully (kind of busy with school), but skimming through them quickly, it appears that the square root of the sum of the variances of the individual distributions is correct.
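For reference, here is a rough sketch of the root-sum-of-squares idea applied to an overall score that is the average of per-bot averages. The function name and the `(mean, std_dev, num_battles)` tuple layout are hypothetical, and the bots' scores are assumed to be independent. Because the overall score is an average over k bots, the root sum of squared standard errors is divided by k. The critical value defaults to the normal 1.96; with few battles per bot, a larger t-distribution critical value would be passed in instead.

```python
import math


def combined_interval(bot_stats, critical_value=1.96):
    """Confidence interval for the average of per-bot average scores.

    bot_stats is a list of (mean, std_dev, num_battles) tuples
    (a hypothetical layout for this example).
    """
    k = len(bot_stats)
    overall_mean = sum(mean for mean, _, _ in bot_stats) / k
    # Squared standard error of each bot's average: variance / sample count.
    squared_errors = [sd * sd / n for _, sd, n in bot_stats]
    # Root sum of squares, scaled by 1/k because we average over k bots.
    overall_se = math.sqrt(sum(squared_errors)) / k
    margin = critical_value * overall_se
    return overall_mean - margin, overall_mean + margin
```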
I think that's correct if all of them have the same number of samples. However, with the cool new 'variance minimizer' pairings selection algorithm that isn't necessarily guaranteed. Although you may be right - could you see if your Monte Carlo gives the same results as a root-sum-of-squares, Voidious?
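One way to try that comparison with uneven sample counts is sketched below. Everything here is illustrative: the function names, the tuple layout, and the sample numbers are invented, and the Monte Carlo side assumes normally distributed per-bot averages. Under that assumption the two estimates should agree up to sampling noise, because working with each bot's own standard error (std_dev / sqrt(n)) already accounts for differing numbers of battles.

```python
import math
import random
import statistics


def rss_standard_error(bot_stats):
    """Analytic standard error of the average of per-bot averages."""
    total = sum(sd * sd / n for _, sd, n in bot_stats)
    return math.sqrt(total) / len(bot_stats)


def monte_carlo_standard_error(bot_stats, num_simulations=20000, seed=1):
    """Standard deviation of simulated overall scores."""
    rng = random.Random(seed)
    overall_scores = [
        sum(rng.gauss(mean, sd / math.sqrt(n)) for mean, sd, n in bot_stats)
        / len(bot_stats)
        for _ in range(num_simulations)
    ]
    return statistics.stdev(overall_scores)


# Hypothetical (mean, std_dev, num_battles) per bot, with uneven battle counts.
uneven = [(50.0, 10.0, 4), (70.0, 5.0, 40), (90.0, 2.0, 12)]
analytic = rss_standard_error(uneven)
simulated = monte_carlo_standard_error(uneven)
print(f"root-sum-of-squares: {analytic:.3f}  Monte Carlo: {simulated:.3f}")
```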