Difference between revisions of "Talk:Darkcanuck/RRServer/Ratings"
Revision as of 16:31, 26 September 2008
Battles per Pairing
I just wanted to comment on the statement, "It's uncertain how well it works with less battles or incomplete pairings." My experiment with the MC2K7 shows that separate runs of 75 battles can still differ by more than 1% for a given pairing. This affects any scoring system, and it is a fact that we have to live with. The output can only be as reliable as the input, no matter how fancy the interpolation for incomplete pairings is. The hope is that the variance will wash out when taken over 600+ pairings. --Simonton 15:25, 26 September 2008 (UTC)
- I think David Alves commented that targeting challenge scores also varied by almost 1% at 15 seasons, so I agree there's lots of evidence that more battles per pairing are needed. That would take a very, very long time in a 600+ competitor environment. You're right that as the number of competitors increases, the variations tend to cancel each other out. But at the same time, the bigger the competition, the greater the risk of a "black swan" competitor whose scores are all skewed in one direction. -- Darkcanuck 15:31, 26 September 2008 (UTC)
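
The two effects discussed in this thread can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how the rating server computes anything: the function name `rating_error` is hypothetical, per-pairing noise is assumed to be independent and Gaussian with a standard deviation of 1 percentage point (loosely motivated by the roughly 1% variation mentioned above), and a "black swan" is modelled as a fixed offset of 0.5 points applied to every pairing in the same direction.

```python
import random
import statistics


def rating_error(num_pairings, pairing_noise_sd, shared_bias=0.0,
                 trials=500, seed=0):
    """Simulate the error in a bot's score averaged over its pairings.

    Assumptions (illustrative only):
      - each pairing's measured score is off by independent Gaussian noise
        with standard deviation pairing_noise_sd (percentage points);
      - shared_bias is an offset that skews every pairing the same way.

    Returns (mean_error, spread): the average error across trials and the
    standard deviation of that error.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # Sum the error of every pairing, then average over the pairings.
        total = sum(rng.gauss(shared_bias, pairing_noise_sd)
                    for _ in range(num_pairings))
        errors.append(total / num_pairings)
    return statistics.mean(errors), statistics.pstdev(errors)


if __name__ == "__main__":
    # One pairing: the noise shows up in full.
    print("1 pairing:     ", rating_error(1, 1.0))
    # 600 pairings: independent noise largely cancels out.
    print("600 pairings:  ", rating_error(600, 1.0))
    # 600 pairings with a shared skew: the spread shrinks, the bias does not.
    print("600 + skew 0.5:", rating_error(600, 1.0, shared_bias=0.5))
```

Under these assumptions, the spread of the averaged score shrinks roughly as 1/sqrt(number of pairings), so about 0.04 points for 600 pairings. A skew shared by all of a bot's pairings is not reduced by averaging at all, which is the "black swan" risk.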