Talk:Darkcanuck/RRServer/KnownIssues

Survival count should total 35

I want to confirm what has already been mentioned: survival count should not necessarily total 35. It appears that when bots die on the same tick, they both get credited with a first place finish. --Simonton 22:49, 27 September 2008 (UTC)

  • There was only one unmatching battle (17+14) in the more than 5000 results I sent to this server. But indeed, 34 and 36 should be accepted too, in my opinion. I think that battles against rambots have (or used to have) the highest chance of producing these non-35 results. --GrubbmGait 23:16, 27 September 2008 (UTC)

The current check is that the survival total is at least 35, which should take ties into account. But the results that are getting stuck on my clients have first-place survival totals below 35. One client has 24-8 (32 total) for Cephalosporin vs UrChicken2. Two of my clients (not physically here, checked on them yesterday) had about a dozen battles involving RougeDC Classic, also with totals below 34. What to do? --Darkcanuck 23:55, 27 September 2008 (UTC)

Well, could you try running normal Robocode (non-rumble) with the pairings that show this problem in the rumble? If you can't reproduce it that way, then it sounds like it might be a bug in the rumble client. If you can reproduce it, watching the battles should make it evident what's happening (perhaps use the replay feature? Not sure how well that works though). I can't seem to see anything at all like that happening here. --Rednaxela 00:02, 28 September 2008 (UTC)

I only see those clients once every few weeks, so it will be awhile. They're running 1.5.4 with the latest version of Java on Windows 2000. It's only ~12 battles out of 1500+, so a very low occurrence. If I turn off the survival check, you'll see them come through... --Darkcanuck 00:19, 28 September 2008 (UTC)

I would not accept results lower than 34 or higher than 40. This check is mainly there to intercept results from wrongly set up clients, as those can be devastating for the rankings. A rare glitch in Robocode or the client can easily be repaired (when noticed). --GrubbmGait 01:16, 28 September 2008 (UTC)
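As an illustration of the bounds suggested above, here is a minimal sketch of such a check. It is not the server's actual code: the function name and parameters are hypothetical, and only the 34 and 40 limits come from this discussion.

```python
def survival_total_ok(firsts_a, firsts_b, min_total=34, max_total=40):
    """Return True if the combined first-place count looks plausible.

    A 35-round battle normally totals 35, but ties (both bots dying on
    the same tick) can push the total higher, and rare glitches can
    leave it one short. The bounds are the ones proposed in this thread.
    """
    total = firsts_a + firsts_b
    return min_total <= total <= max_total


# Totals reported in this thread:
print(survival_total_ok(17, 14))  # 31 in total, rejected
print(survival_total_ok(24, 8))   # 32 in total, rejected
print(survival_total_ok(18, 17))  # 35 in total, accepted
```

With these limits the stuck 24-8 result would still be rejected, so it would have to be repaired by hand if it turns out to be a genuine battle.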

HTTP response code: 503

Hmm, I started getting a bunch of "HTTP response code: 503" errors when uploading results. Anyone know why? --Rednaxela 18:09, 2 December 2008 (UTC)

I've got the same problem, and sometimes also when downloading the ratings page --lestofante 19:04, 2 December 2008 (UTC)

Yeah, it also happens here when the client tries to download the ratings. One thing I've noticed is that downloading the rating pages sometimes gives a 503 and sometimes a 500, but uploading always gives a 503. Anyway, right now I'm running my RR client with upload/download disabled in order to build up battles to upload in bulk when uploading is working again. --Rednaxela 21:06, 2 December 2008 (UTC)

Hmm, the server does not look happy -- some sort of high load condition, I'll see what I can do. --Darkcanuck 04:04, 3 December 2008 (UTC)

ATTENTION! Someone has a bad participants list -- almost half the bots in the rumble have been removed! This is a very slow operation, especially when other clients are trying to add them back in. I've disabled bot removal for now while I investigate the source... --Darkcanuck 04:43, 3 December 2008 (UTC)

Mea culpa, yesterday I ran the rumble for at least 20 minutes 5 hour with an empty participants list before I saw, understood and fixed my client's problem... next time I will play with this option with the Upload flag set to false! Now everything is running and I have over 2000 battles to upload --lestofante 11:29, 3 December 2008 (UTC)

Ok, I've re-enabled your username + IP so you can upload again. 2000 battles, eh? Just make sure you're using one of the approved clients (1.5.4 or 1.6.0). Participant removals are still disabled just in case since the participant list can't change anyway.

Retired Bots

In details pages like this, bots that are 'retired' like Cunobelin 0.2.1 are showing up (see ELO score 0 bots on details pages). Did this accidentally happen when you disabled removal/retiring of bots temporarily? (Also, would it be possible to make details pages for retired bots viewable with the right URL? :)) --Rednaxela 16:52, 3 December 2008 (UTC)

Thanks for pointing that out -- I was trying to figure out why the rumble pairings weren't going back to normal. I'm not quite sure how those bots got reactivated, need to think on that a bit. (Could be the economic crisis has affected their retirement plans?) It doesn't look like any rumble clients are trying to remove them though -- probably due to the failure to download the participants list. I'll see what I can do to remove them. --Darkcanuck 02:31, 4 December 2008 (UTC)
Hm, I think it's a bug in the reactivation process... if both participants in a pairing have been retired, reactivating either one also reactivates the pairing, ugh. --Darkcanuck 02:48, 4 December 2008 (UTC)
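To make the rule behind that bug concrete, here is a small sketch in which a pairing only comes back when both of its participants are active. The data model (a set of active bot names and pairings stored as frozensets) is purely an assumption for illustration and says nothing about the server's real schema.

```python
def reactivate_bot(bot, active_bots, pairings):
    """Mark `bot` active and return the pairings that may be reactivated.

    A pairing is eligible only if BOTH of its participants are active
    after the change, which avoids reviving a pairing whose other half
    is still retired.
    """
    active_bots.add(bot)
    return {
        pairing
        for pairing in pairings
        if bot in pairing and all(b in active_bots for b in pairing)
    }


# Hypothetical example: A is active, B and C are retired.
active = {"A"}
pairings = {frozenset({"A", "B"}), frozenset({"B", "C"})}
revived = reactivate_bot("B", active, pairings)
# Only the A-B pairing is eligible; B-C stays retired because C is.
```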

Performance

I'm uploading many battles that I ran off-line, but the server is too slow. In 29 minutes I've uploaded only 262 battles, about 9 per minute!! Are you perhaps updating the whole scoring system at every battle upload? --lestofante 13:30, 4 December 2008 (UTC)

The scoring is calculated for each upload yes. If it wasn't the results page and such wouldn't be nice and live and such. Apparently Darkcanuck is planning some things that may improve performance such as getting rid of table locking in favor of transactions, but really, I do think the roughly 2 seconds per battle that you describe is very acceptable. After all, running a battle takes considerably longer than 2 seconds in almost all cases, so it's not like it's that huge an impact on the total number of battles uploaded. --Rednaxela 13:47, 4 December 2008 (UTC)

I think the scoring system should be updated only when the scoring page or something similar is requested, so we update only WHAT we need WHEN we need it, saving a lot of time and resources. Pay attention: the upload rate is about 60 sec / 9 battles = about 6 sec/battle, not 2... acceptable? Let's see.

  1. 1 run of roborumble.sh = 1 iteration
  2. 1 iteration = 10 battles
  3. 6 sec/battle * 10 battles = 60 seconds of upload time per iteration
  4. 150-200 seconds = 1 iteration on an AMD64 X2 4000+, Ubuntu, OpenJDK as the virtual machine, UPLOAD=NOT, DOWNLOAD=NOT, including roborumble loading time (measured with the Unix "time" command)

Result: upload time is about 1/4 of my rumble running time in the best case. Add the ratings download time and the new-bot check, and running the rumble on-line takes about DOUBLE the time of running it off-line (300-360 sec, measured with the Unix "time" command). IMHO that's not acceptable. If you have Unix (Macintosh is Unix) you can measure your own execution time (I don't know how to do it on Windows). It's easy: time -p ./roborumble.sh; the result is the first line, expressed in seconds. --lestofante 14:30, 4 December 2008 (UTC)
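The arithmetic in this estimate can be reproduced with a few lines. This is only a sketch using the figures quoted in the thread (262 battles in 29 minutes, 10 battles per iteration, 150-200 seconds per off-line iteration); the function names are made up for illustration.

```python
def seconds_per_battle(battles, minutes):
    """Average upload time per battle, in seconds."""
    return minutes * 60 / battles


def upload_share(battle_seconds, upload_seconds):
    """Fraction of one iteration's wall-clock time spent uploading."""
    return upload_seconds / (battle_seconds + upload_seconds)


per_battle = seconds_per_battle(battles=262, minutes=29)  # roughly 6.6 s
per_iteration = per_battle * 10                           # roughly 66 s

# Share of time spent uploading for the slow (200 s) and fast (150 s)
# off-line iteration times quoted above.
share_slow = upload_share(200, per_iteration)  # about a quarter
share_fast = upload_share(150, per_iteration)  # closer to a third
```

So the "about 1/4" figure holds for the 200-second iterations; for 150-second iterations the upload share is somewhat higher.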

Whoops, I read it as "262 in 9 minutes". Still 6 seconds is acceptable I think. Certainly far from ideal, but acceptable enough. I believe the real fix is making the RR client upload in the background while battles are running. And... er.... updating the scoring system when loading the scoring page would... often take a few minutes or longer if the scoring page hasn't been viewed in a while I believe. I'm quite sure updating it incrementally is a necessity. --Rednaxela 15:02, 4 December 2008 (UTC)

The background upload is a great idea! Now I will try to implement it: a little Java program that grabs the 1vs1result.txt (roborumble must have UPLOAD=NOT), copies it into RAM, clears the original, and uploads the copy from RAM in the background. We will see. --lestofante 15:59, 4 December 2008 (UTC)

Haha, well, I kind of think it would be easier to just implement it as a patch to the normal roborumble client that just moves the upload process into another thread that runs in the background. Plus as a patch to the existing RR client, I think it's likely it would be adopted for the next official release of Robocode ;) --Rednaxela 18:11, 4 December 2008 (UTC)
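The idea of moving uploads onto a background thread can be sketched as follows. This is not the RoboRumble client's code (which is Java) and not lestofante's uploader; the class name and the `upload_fn` callable are hypothetical stand-ins for whatever actually sends a result to the server.

```python
import queue
import threading


class BackgroundUploader:
    """Uploads results on a worker thread so battles are not blocked."""

    def __init__(self, upload_fn):
        self._upload_fn = upload_fn  # assumed: callable taking one result
        self._queue = queue.Queue()
        self.failed = []             # results whose upload raised an error
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, result):
        """Queue a result and return immediately."""
        self._queue.put(result)

    def _run(self):
        while True:
            result = self._queue.get()
            if result is None:       # sentinel: stop the worker
                return
            try:
                self._upload_fn(result)
            except Exception:
                # Keep the result so it can be retried later.
                self.failed.append(result)

    def close(self):
        """Finish all queued uploads, then stop the worker."""
        self._queue.put(None)
        self._worker.join()
```

The battle loop would call `submit()` after each battle and `close()` once at the end of the iteration, so slow uploads only delay shutdown, not the next battle.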

I agree that 2 seconds would be nicer... Think of it as a break to let your processor cool off a bit? If you look elsewhere in these pages, scoring used to be batched but performance degraded dramatically as the database grew. It's faster (and more reliable) when scoring is done on each upload -- only the two bots uploaded get updated, but the Elo algorithm requires a lot of data to do so. Right now there's a scaling problem where many simultaneous uploads cause each one to slow down, and I'm planning to address this soon. But if lots of clients are recovering from the past two days' problems (I took the server offline for several hours last night) then we have a higher volume of uploads right now too. --Darkcanuck 16:08, 4 December 2008 (UTC)

I think most clients should be done recovering by now. I noticed the server refusing uploads for a bit last night and my client was easily back to normal long before I woke up. --Rednaxela 18:11, 4 December 2008 (UTC)

I've implemented Rednaxela's idea (see Talk:RoboRumble/Development#Background_Uploader), and in about 5 hours I've duplicated my month's upload... wow! And no side effects ^^" --lestofante 01:11, 5 December 2008 (UTC)

Very cool! But can you do like Rednaxela suggested and make this another thread in the rumble client? You guys are determined to push my poor server to its limits... :) --Darkcanuck 01:29, 5 December 2008 (UTC)
Well don't worry, I won't be using background upload on my laptop here... at least not until I get an external cooling fan to boost airflow... because if it's maintained at high CPU use with no chances to cool off... it gets up past 93C sometimes which I call iffy when the processor is only designed to operate up to 95C (and the builtin auto-shutoff limit is at 105C) --Rednaxela 03:11, 5 December 2008 (UTC)
Another performance tip: I've had several hard drive failures/corruptions on machines I've used as clients which run continuously. My guess is that it's provoked by the constant disk writes caused by bots writing to their data directory at the end of each round. So I now have my robotcache dir mounted on a ramdisk -- it's much quieter and battles run faster too. But I do put in a nice delay between iterations for cooling, to keep the CPU under 80. --Darkcanuck 04:04, 5 December 2008 (UTC)
Ok, look at Talk:RoboRumble/Development#Background_Uploader for the latest really working version and usage instructions. Now I'm running (and uploading) 1 iteration of 10 battles in 150 seconds (p.s. total CPU use is under 60% with amule + azureus + firefox + eclipse + compiz). Next I'm going to implement a patch for the 1.6.0 code. There is only one class to modify (UploaderResult), but my SVN client doesn't want to run, and manually importing Robocode's code into Eclipse gives me some path problems (help me please, see Talk:Robocode/Developers_Guide_for_building_Robocode) --lestofante 23:14, 5 December 2008 (UTC)

With the latest server revision it is no longer even necessary to run the script, because upload speed has increased considerably --lestofante 12:53, 6 December 2008 (UTC)

I hope this is the right place for this considering the time since the last post... Are there any ideas in the works to improve scalability of the server to accept more concurrent users submitting battle results? It works just fine when there are only the normal 2-5 clients working, and even works well when I add another 44 clients. However, the one time I unleashed 80 clients the upload process was very very slow for each individual client and slowed the response time of the web server greatly. Perhaps accepting uploads immediately and adding them to a queue to be processed? This would cause the rankings to no longer be real-time, however if you could accept 10+ battles per second then that point may be moot. I also understand due to participation and available time that this is likely not a high priority. --bwbaugh 12:44, 12 June 2011 (UTC)

While I believe that queueing the requests and using a daemon to process the results could help, there are also many problems that may arise. What if the daemon can't process the input fast enough? (That is likely to be the case here if you fire up all your clients.) The queue may grow to hundreds of thousands of requests, and that wouldn't help the situation. Also, queueing in the database adds more work for the disk, while if you cache in memory, it may run out quickly. I don't know if the server is currently running on a dedicated box (or VPS, for that matter) or on cloud hosting, but I think the server is already stretched to its limit. The RoboRumble server is very database-write intensive, which makes it disk-intensive. (Replacing the disk with SSDs in RAID-0 might help, but that might make the CPU the bottleneck, and there is the cost of the SSDs.)

I am not sure, but Darkcanuck, if you replace Apache with nginx or lighttpd (I prefer nginx) with php-fpm, would it work slightly faster due to lower amount of threads/processes? --Nat Pavasant 14:45, 12 June 2011 (UTC)

Proofing against add/remove wars

I was just thinking, in order to proof against add/remove wars in the future, maybe it would be good to make the server check the participants list itself to verify the addition/removal? It may make addition/removal of a bot to/from the rumble slightly slower, but those two operations are rare under normal conditions. --Rednaxela 22:54, 4 December 2008 (UTC)

I was thinking along similar lines too, but I would make it a periodic check. Say every hour, the server gets the new participant list and does the adds/removes directly, preventing clients from doing so. Also, the list could be obtained from the server so we can easily point to new locations or handle cases when the wiki is unavailable. --Darkcanuck 01:29, 5 December 2008 (UTC)
Maybe have it also double-check the participants list again whenever a client sends results for a new bot? I'm saying that just because I think it's nice to be able to see your bots show up in the results as soon as the first battles are uploaded. --Rednaxela 03:11, 5 December 2008 (UTC)
True. This feature is not high on my list yet -- I want to fix the locking first, then look at the details comparison stuff (including retired bot details). --Darkcanuck 04:04, 5 December 2008 (UTC)
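A server-side check along these lines could be sketched as a set difference between the bots the server knows and the freshly fetched list. Everything here is an assumption for illustration: the function name, the bot-name strings, and in particular the `max_removal_fraction` guard, which is not something proposed in this thread but is one way to avoid a repeat of the empty-list incident described above.

```python
def plan_participant_sync(current, latest, max_removal_fraction=0.1):
    """Work out which bots to add and remove after fetching the list.

    `current` is what the server has, `latest` is the fetched list.
    Raises ValueError if the fetched list would remove a suspiciously
    large share of the rumble (for example an empty or truncated list).
    """
    current, latest = set(current), set(latest)
    to_add = sorted(latest - current)
    to_remove = sorted(current - latest)
    if current and len(to_remove) > max_removal_fraction * len(current):
        raise ValueError(
            f"refusing to remove {len(to_remove)} of {len(current)} bots"
        )
    return to_add, to_remove


# Hypothetical example: one bot replaced out of ten.
current = [f"bot{i} 1.0" for i in range(10)]
latest = current[1:] + ["newbot 0.1"]
to_add, to_remove = plan_participant_sync(current, latest)
```

The same function could be run both on the hourly schedule and on demand when results arrive for an unknown bot, as suggested above.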

Sort order on "Glicko-2 (RD)"

Really minor thing, but I noticed the "Glicko-2 (RD)" column doesn't sort correctly - eg, for descending, all 3 digit numbers list before all 4 digit numbers. Super minor, but figure it's worth letting ya know. Rankings page looks great, btw, nice work. --Voidious 00:02, 5 January 2009 (UTC)

Just to note, a fair number of columns are like that not just the Glicko-2 one. Also in the bot details pages. --Rednaxela 00:42, 5 January 2009 (UTC)

Thanks for the heads-up. Looks like tablesorter gets confused by the parentheses around the RD value and possibly the class attributes for high/low scores... --Darkcanuck 05:49, 5 January 2009 (UTC)

Just a reminder about this ;-) It seems all the columns are being sorted as text, not numbers. For instance, when sorting highest to lowest, 900 comes before 2000. --Skilgannon 16:00, 30 April 2009 (UTC)

Ah, good catch. I fixed this problem on every table *but* the main rankings during the last update. Should work for the main tables now too. --Darkcanuck 03:43, 1 May 2009 (UTC)
Thanks =) It still seems to sort by text on the Glicko-2 column though... I'm not sure how easy the fix would be because of the RD value. Can you set custom comparators? Or is this nothing like Java? :-) --Skilgannon 10:31, 1 May 2009 (UTC)
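The difference between text and numeric ordering, and the custom-comparator idea, can be shown with a short sketch. The rankings page itself uses the JavaScript tablesorter plugin, so this is only an illustration of the principle; the cell format "rating (RD)" is assumed from the column name, and the helper name is hypothetical.

```python
import re

_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def numeric_key(cell):
    """Sort key: the first number in the cell, ignoring a trailing '(RD)'."""
    match = _NUMBER.search(cell)
    return float(match.group()) if match else float("-inf")


cells = ["900 (45)", "2000 (30)", "1650.5 (52)"]

# Plain text sort compares character by character, so "900" beats "2000".
as_text = sorted(cells, reverse=True)

# Sorting on the extracted number gives the expected ranking order.
as_numbers = sorted(cells, key=numeric_key, reverse=True)
```

In tablesorter terms the equivalent would be a custom parser for that column that strips the parenthesised RD before comparing.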

Dropping of ELO and Glicko rating

I don't know where the discussion was located, so I'm posting it here. Maybe the drop in ELO and Glicko ratings is due to imprecision in the calculation? Because your current system and DB only store them in an INT(5) field, not a floating-point field? --Nat Pavasant 15:01, 25 May 2010 (UTC)

Not yet.  I'm still trying to 2k it :) --Miked0801 17:18, 12 June 2011 (UTC)