Talk:Darkcanuck/RRServer

From Robowiki
Revision as of 04:39, 26 September 2008 by Darkcanuck (talk | contribs) (More on Glicko, APS, random conjecture)
Jump to navigation Jump to search

Fire away...

Just a suggestion for an additional check. I have never seen score a bot more than 8000 points, so this could be checked too. When examining the results that messed up the original roborumble rating beyond repair, I saw results of 20000 against 16000 (Thats what you get when running OneOnOne with MELEE=YES). For the time being I let my client running (unattended) for ABC's server, as I don't really have the time for bughunting. Your effort however seems promising. Good luck. -- GrubbmGait

  • Thanks! That's a good check, will be combining that with the survival >=35 (also your suggestion I think) once I rearrange the error handling and failure output to the client. Then I'll look into ELO... --Darkcanuck
  • Your checks have both been implemented. -- Darkcanuck

Looking very nice! I have a couple questions and thoughts I thought I'll mention. So what is this "Ideal" column in the results mean? One thought I had about ratings, is perhaps it would be best to make the APS fill missing pairings with Glicko-based estimates? I'm thinking that would give the best long term stability/accuracy once pairings are complete while having something a more meaningful before pairings are complete before the pairings are complete. --Rednaxela 01:18, 26 September 2008 (UTC)

Thanks! I've just posted a bit more about ratings here. The "Ideal" column is my attempt to reverse calculate a rating based on a bot's APS. I just inverted the Glicko formula for "E" (expected probability) to yield a rating given if given E (i.e. APS) and a competitor's rating and RD. For the latter two I used the defaults (1500 and 350) so theoretically if the APS represents the score vs an average bot (and there's a uniform distribution?) then the rating might converge to the "ideal" value. But I have no idea if it works, just wanted to see how close it might be. I'm not sure you could fill in the pairings using Glicko + APS -- the reason systems like Glicko exist is to get around the problem of incomplete pairings, so the Glicko rating should be enough in itself. If it's accurate, that is -- we'll see once the ratings catch up to the pairings already submitted... -- Darkcanuck 03:39, 26 September 2008 (UTC)