Talk:Darkcanuck/RRServer

From Robowiki
Revision as of 03:18, 26 September 2008 by Rednaxela (talk | contribs)
Jump to navigation Jump to search

Fire away...

Just a suggestion for an additional check. I have never seen score a bot more than 8000 points, so this could be checked too. When examining the results that messed up the original roborumble rating beyond repair, I saw results of 20000 against 16000 (Thats what you get when running OneOnOne with MELEE=YES). For the time being I let my client running (unattended) for ABC's server, as I don't really have the time for bughunting. Your effort however seems promising. Good luck. -- GrubbmGait

  • Thanks! That's a good check, will be combining that with the survival >=35 (also your suggestion I think) once I rearrange the error handling and failure output to the client. Then I'll look into ELO... --Darkcanuck
  • Your checks have both been implemented. -- Darkcanuck

Looking very nice! I have a couple questions and thoughts I thought I'll mention. So what is this "Ideal" column in the results mean? One thought I had about ratings, is perhaps it would be best to make the APS fill missing pairings with Glicko-based estimates? I'm thinking that would give the best long term stability/accuracy once pairings are complete while having something a more meaningful before pairings are complete before the pairings are complete. --Rednaxela 01:18, 26 September 2008 (UTC)

You cannot post new threads to this discussion page because it has been protected from new threads, or you do not currently have permission to edit.

Contents

Thread titleRepliesLast modified
retiring ELO column616:51, 17 February 2012
FatalFlaw's uploads have suspicious APS for Tomcat005:58, 16 February 2012
kidmumu uploads318:16, 1 February 2012
Feature Request: average APS diff in bots compare616:55, 17 November 2011
Performance123:48, 13 November 2011

retiring ELO column

Now that everyone's ELO rating is subzero in General 1v1 =), is it maybe time to retire it altogether?

Voidious23:07, 14 February 2012

I'm all for it =) Although, doesn't the LRP depend on ELO data? Maybe shift that over to Glicko data instead? And if there was some way to make the LRP show the 'expected' option by default... that would make my day =)

Skilgannon09:55, 15 February 2012
 

I'd also support removal of ELO from the rumble, and replacing it with Glicko or Glicko2 in the places that use it (LRP).

Rednaxela17:50, 15 February 2012
 

I also agree

Jdev18:53, 15 February 2012
 

Yeah, ELO doesn't do much anymore. So agreed as well.

Chase-san20:51, 15 February 2012
 

Elo is working fine, even with negative scores, but keeping both Elo and Glicko-2 is redundant. So, removing one of them is fine by me.

MN21:00, 15 February 2012
 

Luckily we still have the music . . .

GrubbmGait16:51, 17 February 2012
 

FatalFlaw's uploads have suspicious APS for Tomcat

FatalFlaw's uploads have suspicious APS for Tomcat:

  • lxx.Tomcat 3.55 VS voidious.mini.Komarious 1.88
  • lxx.Tomcat 3.55 VS "baal.nano.N 1.42
  • lxx.Tomcat 3.55 VS gf.Centaur.Centaur 0.6.7

Darkcanuck, can you rollback all his uploads?

Jdev05:58, 16 February 2012

kidmumu uploads

Results from kidmumu uploads don´t come close to my uploads. Is there something wrong?

MN22:03, 29 January 2012

I haven't had a chance to check if this could affect mn.Combat, but my #1 guess would be that perhaps it's a java version issue (i.e. kidmumu is using Java 5 and Combat requires Java 6?).

Failing that, I'd have to think that kidmumu's client may be skipping turns.

Rednaxela20:21, 31 January 2012
 

[Combat vs Corners]

[Combat vs MyFirstRobot]

[Combat vs TrackFire]

Probably a Java version issue. I´ll downgrade to 1.5 in future versions. But didn´t check other bots scores.

MN17:56, 1 February 2012
 

I'm sure there are lots of bots that require Java 6, right? We might want to have Darkcanuck rollback all his uploads until we can get kidmumu onto Java 6.

Voidious18:16, 1 February 2012
 

Feature Request: average APS diff in bots compare

I find that until all pairings is done it's very useful to know current avarage difference in APS between two versions - after about 100 random battles this number says fairly exactly is newer version better, than older.
Darkcanuck, can you schedule to add row for columns "% Score", "% Survival" in section "+/- Difference" in bots compare page with avarage value of corresponds columns? I think, there're work for 1-2 hours maximum

Jdev11:12, 17 November 2011

I think this is already covered by the 'Common % Score (APS)' and 'Common % Survival', the lowest two lines in the top-table. At least I use it to check if my changes have a positive (or negative) result when the pairings are not complete yet.

GrubbmGait13:10, 17 November 2011
 

No. May be i wrote not clear.
I mean, that i want to know average difference in pairings between 2 versions. According to my tests, this number stabilizes mach faster, than APS. And more, Common % Score does not make sense, because while there only 1 battle in every pairing it's exactly equals to APS and in another case, there may be 10 battles against Walls and 1 battle against Druss.

Jdev13:32, 17 November 2011
 

As far as I know, when your new version has for example 100 pairings, you will see the average APS for that 100 pairings. AND for your older version you will also see the APS for that 100 pairings. And you are right, this indicates much more reliable what your final score will be (relative to your older version) than plain APS. The one who can really answer this question is Darkcanuck.

GrubbmGait13:46, 17 November 2011
 

Wow, if things is like you say, it's really what i want, thanks:)

Jdev13:53, 17 November 2011
 

The common %score is calculated just like APS, but only for pairings that the old and new versions have in common. That makes it easier to compare two versions when the new one is still missing many pairings, or in the case where the old bot may have pairings against a lot of retired bots (and may be missing scores vs newer bots). I think that's what you're looking for...

Darkcanuck16:51, 17 November 2011
 

Yes, thank you:)

Jdev16:55, 17 November 2011
 

Performance

Can you turn on .htaccess browser caching for results.

#Caching
ExpiresActive on
ExpiresByType image/gif "access plus 1 year"

Other performance enhancing things you can do are: Set specific size for the images, inline or via CSS. CSS would be easier. This would speed up the page loading and be also less annoying while all the requests are going through (having default sized images deforming the table before they load). Minify the HTML/CSS/JS (less to send).

Not doing/doable for known reasons: Serve identical files from the same url. (flag images)

Chase-san22:31, 11 October 2011

Just added the cache expiry directives -- let me know if that helps. The minification isn't necessary, my server already sends all files gzip'd, so the performance enhancement would be minimal at best.

Darkcanuck23:48, 13 November 2011