Talk:The 2000 Club/Nano
Latest revision as of 17:59, 17 February 2010
As far as I know the ticket for this club is 2000 ELO, not Glicko-2. 2000 ELO is roughly equal to 85 APS, so I don't think LBB is qualified for this club yet. --Nat Pavasant 05:26, 17 February 2010 (UTC)
What "2000 ELO" is has drifted since moving from the old server. I'm not sure ELO makes a good long-term comparison anymore. --Rednaxela 05:39, 17 February 2010 (UTC)
I don't know what APS was 2k on the old server, but it definitely wasn't 85 (pretty sure it wasn't even 84). Komarious was about 2k in General 1v1 and is currently around 80 APS. Also, since this club didn't exist until now, we can define it however we want. =) --Voidious 05:52, 17 February 2010 (UTC)
Well, new bots can make your APS drop =) --Nat Pavasant 15:27, 17 February 2010 (UTC)
The question is... what to define it as? Neither ELO nor Glicko is long-term stable. APS isn't quite stable either... Perhaps the ratio between your bot's APS and SandboxDT's APS? That's about the most stable benchmark I can come up with without becoming too elaborate... --Rednaxela 05:55, 17 February 2010 (UTC)
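A minimal sketch of the ratio idea above, in Python. The function name and the sample numbers are hypothetical and only illustrate the arithmetic; no real APS values are implied.

```python
def relative_aps(bot_aps: float, reference_aps: float) -> float:
    """Return a bot's APS as a multiple of a fixed reference bot's APS."""
    if reference_aps <= 0:
        raise ValueError("reference APS must be positive")
    return bot_aps / reference_aps


# Hypothetical values, not taken from any rumble snapshot.
ratio = relative_aps(bot_aps=80.0, reference_aps=64.0)
print(f"relative APS: {ratio:.3f}")
```

A club threshold could then be stated as a ratio rather than an absolute score, on the assumption that the reference bot's version never changes.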
Excellent - a discussion :) Truth is, ELO ratings have dropped quite a bit in 2 years. I'm currently sitting at 1988 ELO - the highest ELO rating in any category currently. While not 2k until I get another 0.5% APS or so, I followed the lead of some other posts by looking at the score saved via the rumble archive (Glicko only) and used that. Without that archive, I have no other proof. If we want to set up a different scoring / ratio / whatever, let me know and I'll gladly withdraw my bot until such a time as everyone is satisfied. BTW, all other Club bots are now at their respective scores with Glicko scores. It seems to me that that particular score is more relevant right now. Still trying to get the undisputed PL ranking too - grrr Fuatisha :) --Miked0801 06:48, 17 February 2010 (UTC)
I'm sure that we are really in need of a new standard for scoring. I have tried to start a discussion about this many times already (in different ways), but it failed, I think because no one has any solutions. The new server has a problem with the ELO system: ratings have started to drop for an unknown reason (perhaps a wrong formula? I don't know). I can think of one solution for the server, which would solve the problem with bad results too. Why doesn't the server cache the current ranking and just store all uploads in an upload table in the database? Then, every half an hour or so, it would query all uploads for all active robots, calculate the ranking, write it to the cache, and sleep (do this via cron). But that still doesn't solve the problem with the stability of the rating... --Nat Pavasant 15:27, 17 February 2010 (UTC)
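The batch recalculation described above could look roughly like the following Python sketch. The row layout `(bot, opponent, percent_score)` is a hypothetical stand-in for the upload table, and APS is taken here as the mean of per-pairing average percentage scores, which is an assumption rather than the server's confirmed formula.

```python
from collections import defaultdict


def compute_aps(uploads):
    """Recompute APS for every bot from scratch.

    uploads: iterable of (bot, opponent, percent_score) rows
    (hypothetical layout for the upload table).
    """
    # Collect every uploaded score for each (bot, opponent) pairing.
    per_pair = defaultdict(list)
    for bot, opponent, score in uploads:
        per_pair[(bot, opponent)].append(score)

    # Average within each pairing first, then across a bot's pairings.
    per_bot = defaultdict(list)
    for (bot, _opponent), scores in per_pair.items():
        per_bot[bot].append(sum(scores) / len(scores))
    return {bot: sum(avgs) / len(avgs) for bot, avgs in per_bot.items()}


# Hypothetical uploads; a cron job would read these from the database.
uploads = [
    ("A", "B", 60.0),
    ("A", "B", 70.0),
    ("A", "C", 80.0),
    ("B", "A", 40.0),
]
ranking_cache = compute_aps(uploads)
print(ranking_cache)
```

Because the ranking is rebuilt from the stored rows on every run, a bad upload can be deleted from the table and simply disappears from the next cached ranking.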
Not as drastic as ELO, but Glicko-2 seems to have drifted down 2-5 points for most bots since October (RoboRumble 20091012 vs RoboRumble 20100215). If we're going to consider using an existing bot as an "anchor", I think we should consider doing so in the ratings themselves. David had the idea to use two bots to anchor the ratings of his (never completed) RR server, to control the rating offset and scaling.
But I don't think APS clubs are such a bad idea, either. The ratings/scores are relative, anyway. Since few people introduce new HOT bots to 1v1, APS will probably go down over time, but that's life. Latecomers get the benefit of past wisdom, tutorials, etc, so it doesn't seem that unfair to me. (Even if "80 APS Club" doesn't have the same ring to it as "2000 Club". =)) And btw, the more I think about it, I think 2k on old server was in the 80-81 range before the Big Upward Drift. Ascendant was ~2080 with 84 something APS and each ELO point was like .04 or .06 APS on average.
--Voidious 15:36, 17 February 2010 (UTC)
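The back-of-envelope estimate above can be checked with a few lines of Python. This assumes the APS-to-ELO relation is linear near the top of the table and uses 84.0 as a stand-in for "84 something"; both are simplifications, and the function name is hypothetical.

```python
def aps_at_rating(target_rating, known_rating, known_aps, aps_per_point):
    """Linearly extrapolate APS from one known (rating, APS) point."""
    return known_aps - (known_rating - target_rating) * aps_per_point


# Ascendant at ~2080 ELO and ~84 APS, with the two quoted slopes.
for slope in (0.04, 0.06):
    estimate = aps_at_rating(2000, 2080, 84.0, slope)
    print(f"{slope:.2f} APS per point: 2000 ELO is about {estimate:.1f} APS")
```

With these inputs the estimate lands at roughly 79 to 81 APS, well below 85.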
That idea of anchoring the rankings based on reference bots sounds exactly like what I'd like to see! One note is that we could use more than two reference/anchor bots to get an average scaling/offset that would be more immune to noise. I think the key, in some ways, is deciding what bots should be the anchor bots. They shouldn't be particularly atypical, and we'd probably want to have some in all size classes too. The same version would also need to stay in the rumble permanently, so it may be preferable to choose ones that are less likely to update any time soon (and if they do, we should still keep the old version in the rumble). --Rednaxela 15:50, 17 February 2010 (UTC)
Just curious what would happen if the Glicko/ELO ranking were re-calculated from the current battle values stored per bot. If the drift is being caused by truncation/imprecision over time, this would fix it. --Miked0801 16:59, 17 February 2010 (UTC)
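One way to get an averaged scaling and offset from several anchor bots is an ordinary least-squares line through the anchors. The Python sketch below assumes each anchor has a fixed reference rating agreed in advance; the bot names and numbers are hypothetical, and this is not a description of any existing server's method.

```python
def fit_anchor_transform(raw, reference):
    """Least-squares fit of reference ~= scale * raw + offset over anchors.

    raw, reference: dicts mapping anchor bot name -> rating.
    """
    names = sorted(set(raw) & set(reference))
    if len(names) < 2:
        raise ValueError("need at least two anchor bots")
    xs = [raw[n] for n in names]
    ys = [reference[n] for n in names]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("anchor ratings must not all be equal")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    scale = cov / var_x
    offset = mean_y - scale * mean_x
    return scale, offset


# Hypothetical anchors: agreed reference ratings vs. today's drifted ratings.
reference = {"AnchorA": 1600.0, "AnchorB": 1800.0, "AnchorC": 2000.0}
raw = {"AnchorA": 1490.0, "AnchorB": 1670.0, "AnchorC": 1850.0}

scale, offset = fit_anchor_transform(raw, reference)
corrected = scale * 1760.0 + offset  # any other bot's drifted rating
print(f"scale={scale:.4f} offset={offset:.2f} corrected={corrected:.1f}")
```

With exactly two anchors this reduces to the line through both points; extra anchors average out noise in any single anchor's rating.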
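As a rough illustration of recalculating from stored results, the Python sketch below fits Elo-style ratings to per-pairing average scores by repeated sweeps until the standard logistic expectation matches the stored scores. The data layout, starting rating, step size, and sweep count are all assumptions; the rumble server's actual formula is not shown in this thread.

```python
import math


def expected_score(rating_a, rating_b):
    """Standard Elo expectation for A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def recompute_ratings(pair_scores, base=1600.0, k=16.0, sweeps=1000):
    """Fit ratings from scratch.

    pair_scores: dict mapping (bot_a, bot_b) -> bot_a's average score
    fraction in (0, 1), one entry per unordered pairing (assumed layout).
    Scores of exactly 0 or 1 would push ratings apart without bound.
    """
    bots = sorted({bot for pair in pair_scores for bot in pair})
    ratings = {bot: base for bot in bots}
    counts = {bot: 0 for bot in bots}
    for a, b in pair_scores:
        counts[a] += 1
        counts[b] += 1
    for _ in range(sweeps):
        delta = {bot: 0.0 for bot in bots}
        for (a, b), score in pair_scores.items():
            error = score - expected_score(ratings[a], ratings[b])
            delta[a] += error
            delta[b] -= error
        # Average the error per bot so the step stays stable with many pairings.
        for bot in bots:
            ratings[bot] += k * delta[bot] / counts[bot]
    return ratings


# Hypothetical stored pairing: A averages 64% of the score against B.
ratings = recompute_ratings({("A", "B"): 0.64})
gap = ratings["A"] - ratings["B"]
print(f"gap={gap:.2f} expected={400 * math.log10(0.64 / 0.36):.2f}")
```

Since nothing carries over from earlier incremental updates, any accumulated rounding error would not survive such a recalculation; whether rounding is really the cause of the drift is a separate question.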