Difference between revisions of "Talk:Offline batch ELO rating system"

Revision as of 17:34, 12 August 2011

Quite interesting. I fail to see how this makes "less assumptions on results than APS" however, since I don't believe APS makes any assumptions whatsoever, being a simple averaging of all pairings a robot is involved in. I don't see this as being less biased. Similarly valid certainly, but not less biased.
One thing I think is important to note is, I'm pretty sure the fact that you're rounding the result to a win/loss/draw probably explains the *vast* majority of the difference between this and the APS/ELO/Glicko-2 on the rumble server. I suspect that APS or the iterative roborumble ELO result wouldn't be that different if they performed the same rounding.
--Rednaxela 13:12, 12 August 2011 (UTC)

APS assumes a bots strength is proportional to the proportional difference in scores.

If a bot scores 2 and the opponent scores 1 (67%/33%), it is stronger than if it scored 3 and bot B scored 2 (60%/40%). Both bots increased the score by the same amount, but the APS system assumes one is stronger than the other.

If a bot scores 100% against MyFirstRobot and 40% against DrussGT, it is stronger than if it scored 60% against MyFirstRobot and 60% against DrussGT. APS assumes that +40% against MyFirstRobot is worth more than the +20% against DrussGT, leading to king-making.

Arpard Elo original rating system inferred strength on frequency of wins/losses alone, not proportional score difference. If a bot beats another, it is only better, not a little better or a lot better. --MN 16:32, 12 August 2011 (UTC)

@@ Line 2: / Line 2: @@
 One thing I think is important to note is, I'm pretty sure the fact that you're rounding the result to a win/loss/draw probably explains the *vast* majority of the difference between this and the APS/ELO/Glicko-2 on the rumble server. I suspect that APS or the iterative roborumble ELO result wouldn't be that different if they performed the same rounding. <br/>
 --[[User:Rednaxela|Rednaxela]] 13:12, 12 August 2011 (UTC)
+:APS assumes a bots strength is proportional to the proportional difference in scores.
+:If a bot scores 2 and the opponent scores 1 (67%/33%), it is stronger than if it scored 3 and bot B scored 2 (60%/40%). Both bots increased the score by the same amount, but the APS system assumes one is stronger than the other.
+:If a bot scores 100% against MyFirstRobot and 40% against DrussGT, it is stronger than if it scored 60% against MyFirstRobot and 60% against DrussGT. APS assumes that +40% against MyFirstRobot is worth more than the +20% against DrussGT, leading to king-making.
+:Arpard Elo original rating system inferred strength on frequency of wins/losses alone, not proportional score difference. If a bot beats another, it is only better, not a little better or a lot better. --[[User:MN|MN]] 16:32, 12 August 2011 (UTC)

Difference between revisions of "Talk:Offline batch ELO rating system"

Revision as of 17:34, 12 August 2011

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools