Thread:Talk:RoboRunner/calculating confidence of an APS score
Latest revision as of 21:26, 13 August 2012
Hey resident brainiacs - I'm displaying confidence using standard error calculations on a per bot basis in RoboRunner now. What I'm not sure of is how to calculate the confidence of the overall score.
If I had the same number of battles for each bot, then the average of all battles would equal the average of all per bot scores. So I think then I could just calculate the overall average and standard error, ignoring per bot averages, and get the confidence interval of overall score that way. But what I want is the average of the individual bot scores, each of which has a different number of battles.
Something like (average standard error / sqrt(num bots)) makes intuitive sense, but I have no idea if it's right. Or maybe sqrt(average(variance relative to per bot average)) / sqrt(num battles)?
This would also allow me to measure the benefits of the smart battle selection.
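One standard way to combine the per-bot errors is propagation of uncertainty: if each bot's average score is treated as an independent estimate with its own standard error, then the overall score (the mean of the per-bot averages) has standard error sqrt(sum of squared per-bot errors) / (number of bots). A minimal sketch under that independence assumption (function name is illustrative, not from RoboRunner):

```python
import math

def overall_score_error(per_bot_errors):
    """Standard error of the mean of per-bot average scores.

    Assumes each bot's average is an independent estimate with the
    given standard error; the variance of the mean is then the sum
    of per-bot variances divided by N^2.
    """
    n = len(per_bot_errors)
    return math.sqrt(sum(se * se for se in per_bot_errors)) / n

# With equal errors this reduces to se / sqrt(n):
# overall_score_error([2.0, 2.0, 2.0, 2.0]) == 1.0
```

This is equivalent to the RMS of the per-bot standard errors divided by sqrt(num bots), which is close to the "average standard error / sqrt(num bots)" intuition above but weights the noisier (fewer-battle) bots correctly.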