Talk:LiteRumble

From Robowiki

Contents

Thread title | Replies | Last modified
API | 14 | 15:34, 26 March 2013
links in new tab | 1 | 22:02, 25 March 2013
1.8.1.0 | 0 | 07:33, 25 March 2013
Down (for a few hours) | 1 | 08:17, 24 March 2013
MOAR CLIENTS | 8 | 06:10, 24 March 2013
LiteRumble 1.7.4.2 superpack | 4 | 20:45, 22 March 2013
Vote Ranking | 11 | 20:39, 22 March 2013
non-Robocode LiteRumble usage | 2 | 19:50, 22 March 2013
Contributor stats | 1 | 18:06, 22 March 2013
Individual Battle Scores | 4 | 06:42, 14 February 2013
server load? | 1 | 19:47, 12 February 2013
NPP over 100? | 1 | 12:28, 13 January 2013
running gigarumble | 10 | 17:09, 11 January 2013
CSS | 9 | 05:47, 29 October 2012
Client Version | 1 | 21:40, 23 August 2012
Prettifying + Bot Comparisons | 2 | 16:28, 25 July 2012
Lost Pairings | 2 | 17:31, 9 July 2012
Trying New Rumbles | 3 | 08:39, 8 June 2012
Nice work and some thoughts | 18 | 13:57, 1 June 2012
Problem Bot Index | 7 | 13:49, 1 June 2012

API

Ok, I have a simple API up. Just add &api=True to any Rankings or BotDetails page and it will return in nice easy-to-parse JSON-ish format. I say JSON-ish because real JSON puts double-quotes around everything, whereas I'm lazy and don't feel like doing a ton of double-quote escaping, and besides, we don't have commas or colons in our data so there isn't any risk.

If you don't want the entire pairings detail from the BotDetails, add &limit=0 to the page and it will leave them out.

Some usage examples:

http://literumble.appspot.com/Rankings?game=nanorumble&api=True

http://literumble.appspot.com/BotDetails?game=nanorumble&name=sheldor.jk.Yatagan%201.0.5&api=True

http://literumble.appspot.com/BotDetails?game=nanorumble&name=sheldor.jk.Yatagan%201.0.5&api=True&limit=0

Of course, I'd rather you call Rankings once than call BotDetails with &limit=0 985 times, because although it doesn't generate the pairings JSON, it still has to pull all the data in, which adds up for lots of requests. I'd ask that if you need data for more than 3 bots, you use Rankings instead.

If there's anything else you'd like me to add to the API just ask, if it's already saved in my data it should be easy to whip something up to return it.

Skilgannon22:57, 24 March 2013
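The example URLs above can be built programmatically. The following is a minimal sketch, not part of LiteRumble itself: the helper names `api_url` and `fetch` are made up for illustration, and `fetch` assumes the server returns strictly valid JSON (which was not the case when the API was first announced).

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://literumble.appspot.com"


def api_url(page, game, name=None, limit=None):
    """Build an API URL for the Rankings or BotDetails page (hypothetical helper)."""
    params = [("game", game)]
    if name is not None:
        params.append(("name", name))
    params.append(("api", "True"))
    if limit is not None:
        params.append(("limit", str(limit)))
    # quote_via=quote encodes spaces as %20, matching the example URLs.
    return f"{BASE_URL}/{page}?{urlencode(params, quote_via=quote)}"


def fetch(url):
    """Download and decode one response; assumes the body is valid JSON."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `api_url("Rankings", "nanorumble")` reproduces the first example URL; passing `name=` and `limit=0` reproduces the third.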

Sweet! I'll take a look at updating RR API clients in the next couple days. (That's @roborumble on Twitter, Category:Archived RoboRumble Rankings, RumbleStats, and BedMaker.)

I'll let you know how it goes or if I hit any snags. Hopefully the JSON libs I use don't mind JSON-ish. ;)

Voidious23:16, 24 March 2013
 

Well, the Perl JSON library is indeed unhappy about the lack of quoting, but I think I've got it covered:

$content =~ s/([\{,]\s*)([^:\{,]*):/$1\"$2\":/g;
$content =~ s/:([^\[][^,\}]*)([,\}]*)/:\"$1\"$2/g;

=)

Voidious00:57, 25 March 2013
 

Cool, got it working with archived rankings, eg: RumbleArchives:RoboRumble 20130324. This checks back hourly when rumbles don't have complete pairings, but that may be forever without priority battles so I'll kill it in a couple days. =)

Only useful thing that I found missing so far is that Darkcanuck's results came back sorted and had a "rank" field. Seems like you have logic for sorting in there so it'd be nice if you could honor the "order" param like you do for the normal pages, but not a big deal (/shamefully hides his bubble sort). The quote thing wasn't a huge deal for me but that might also screw some people up since I imagine most clients use a JSON parser instead of doing it manually.

Voidious02:46, 25 March 2013

And @roborumble, yay: [1]

I'll look at RumbleStats and BedMaker tomorrow. Will be nice that other people can actually use BedMaker now. =) If anyone's interested, I had to make one more tweak to my regex to properly quote the &limit=0 format:

$content =~ s/([\{,]\s*)([^:\{,]*):/$1\"$2\":/g;
$content =~ s/:([^\[][^,\}\n]*)([,\}]*)/:\"$1\"$2/g;
Voidious03:29, 25 March 2013
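For anyone not using Perl, the two substitutions above translate fairly directly to Python. This is only a sketch of the same idea: the sample strings are hypothetical (the real field names may differ), and every value comes out as a string because the regex quotes numbers too.

```python
import json
import re


def quote_jsonish(content):
    """Add double quotes around unquoted keys and scalar values."""
    # Keys: the text between '{' or ',' and the next ':'.
    content = re.sub(r'([{,]\s*)([^:{,]*):', r'\1"\2":', content)
    # Values: the text after ':' that does not open a list,
    # up to the next ',', '}' or newline.
    content = re.sub(r':([^\[][^,}\n]*)([,}]*)', r':"\1"\2', content)
    return content


# Hypothetical samples; the real API fields are not shown in this thread.
flat = "{name:sheldor.jk.Yatagan 1.0.5,APS:75.2}"
nested = "{name:x,pairings:[{name:y,APS:50}]}"

parsed_flat = json.loads(quote_jsonish(flat))
parsed_nested = json.loads(quote_jsonish(nested))
```

Like the Perl original, this relies on the data containing no commas or colons.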
 

OK, you've convinced me, I've added the quotes :-p I've also added "rank" to both Rankings and BotDetails, and the sort works too.

Skilgannon07:02, 25 March 2013
 

Nice, thanks! I turned off my quoting regex this morning and it parsed fine.

Voidious21:18, 25 March 2013
 

I didn't think adding quotes would be that hard if you don't have to escape anything.

Chase05:08, 25 March 2013
 

I would like a query which returns the full pairing matrix, with scores from all pairings. Number of battles from all pairings would also be nice.

Probably the costliest query anyone could ask.

MN15:14, 25 March 2013

Unfortunately, that would take more memory than I have available in a frontend, and frontends only give me 10 seconds to respond or the process gets killed. Besides, all I'd be doing in the background is doing a Rankings query to get all N of the bot names, then N x BotDetails queries to get all of the pairwise data, so it's nothing that you wouldn't be able to do just as effectively with the tools you already have.

If you do something like this, I'd ask that you put a half-second delay between all of the BotDetails queries just to leave spare capacity for client uploading.

But I agree, it would be quite interesting to see what trends etc could be found in the data.

Skilgannon15:22, 25 March 2013
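A rough sketch of the client-side approach described above: one Rankings query for the bot names, then one BotDetails query per bot with a half-second pause in between. `fetch_details` is a hypothetical caller-supplied function (for example, something that downloads and parses one BotDetails API response); nothing here is taken from the LiteRumble code.

```python
import time


def collect_pairings(bot_names, fetch_details, delay=0.5, sleep=time.sleep):
    """Fetch details for every bot, pausing between requests.

    fetch_details(name) is supplied by the caller and returns whatever
    the BotDetails API gives back for that bot.
    """
    matrix = {}
    for index, name in enumerate(bot_names):
        if index > 0:
            sleep(delay)  # leave spare capacity for client uploads
        matrix[name] = fetch_details(name)
    return matrix
```

Passing `sleep` as a parameter is only there so the loop can be exercised without actually waiting.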
 

It would be used mostly to try different ranking systems over the scores, and to build custom priority battles systems.

MN15:40, 25 March 2013
 

For building batch rankings systems I can understand, but it would take too long to do priority battles, which need to be run much more regularly.

Skilgannon15:44, 25 March 2013
 

Hmm, getting an error from API calls for Rankings, but not from the normal pages: [1]

Voidious15:27, 26 March 2013

Ah, thanks. Flags are up now, and the API returns the flag code as well, which is how that error crept in (index array out of bounds, now that I have another element).

Skilgannon15:32, 26 March 2013
 

Cool, I'll get the flags into the archived rankings next time. Thx for the quick fix.

Voidious15:34, 26 March 2013
 
 

links in new tab

I know this is about as low priority as it gets, but... The always opening a new tab when I select a rumble annoys me. =) I'm pretty conscious of ctrl or shift clicks if I want a new window or tab, so I'd rather just keep control of that myself.

Voidious21:19, 25 March 2013

I'm also starting to get annoyed by it. I thought it might improve usability by not having to click 'back' the whole time. It is now gone.

Skilgannon22:02, 25 March 2013
 

1.8.1.0

Please switch clients over to 1.8.1.0 to take advantage of priority battles for mini, micro and nano as well as fix the skipped turns issues.

1.7.4.2 will be disabled in the next few days.

Skilgannon07:33, 25 March 2013

Down (for a few hours)

I was playing around with the backend and restarted it a few times, which chowed my new (increased) daily quota pretty much instantly. So just hang in there for another hour and a half and it will be back to normal, I promise.

Skilgannon06:44, 24 March 2013

Aaaand it's back.

Skilgannon08:17, 24 March 2013
 

MOAR CLIENTS

Guys, if anybody has some spare CPU-power lying around, could you point it at the 1v1 and melee rumbles? It seems that I was getting bad stability in the mini/micro/nano rumbles, and I've just fixed that, but we have a bunch of pairings which need to be filled in. I also want to see how it handles a higher load, so I know whether it is worth upgrading to paid. Thanks!

Skilgannon21:12, 20 March 2013

Sure, I'll go from 1 to 4 in a couple hours. ;)

Voidious21:14, 20 March 2013
 

Seems to handle my four clients ok so far, running mostly/all nano battles too.

Edit: Oops, not all nano battles. But they're running pretty fast. :-)

Voidious23:33, 20 March 2013
 

Ok, looks like we managed to overload it. ;)

Voidious03:15, 21 March 2013
 

That's from hitting the Read quota. That hasn't happened before, it might have something to do with me clearing the memcache so that I could enable priority battles for mini/micro/nano. The reset is in an hour, let's see what happens ;-)

Skilgannon07:19, 21 March 2013

Are there priority battles happening for the lower weight classes? Even Mini 1v1 mostly doesn't have full pairings. Was it just way behind or is it not working correctly?

Voidious21:59, 23 March 2013
 

The client doesn't accept priority battles from the lower weight classes. I've filed a bug and Fnl has fixed it, so as soon as 1.8.1 is released we'll be switching to that.

https://sourceforge.net/p/robocode/bugs/355/

Skilgannon06:10, 24 March 2013
 

I've made the plunge, LiteRumble is now on the paid tier. So, open the taps and let's see what this puppy can do!

Skilgannon08:30, 21 March 2013
 

Sweet! Let me know if there's somewhere (eg PayPal) I can drop a donation. :-)

Also it's probably time we update a bunch of wiki pages to note the LiteRumble as the main rumble server.

Voidious14:13, 21 March 2013
 

LiteRumble 1.7.4.2 superpack

I created a fresh "superpack" pointing at Skilgannon's rumble server: LiteRumble 1.7.4.2 superpack (31.5 MB) ... We can put this somewhere more prominent if we want to make the transition more official.

I fixed the participant list links to darkcanuck.net to point to Rednaxela's archive instead and downloaded most bots for all the rumbles in the above zip. There are still some stragglers with broken links that we should fix on the participants pages. I also set all the configs to ITERATE=NOT and updated the shell scripts to loop, which is my preferred setup. Just change your name at the top of roborumble/*.txt before running.

I don't know how many clients Skilgannon's server can handle - last I heard it was just a few. For now I'm just running 1 client for General 1v1.

Voidious18:45, 16 March 2013

Thanks. My client is in my university lab and I need the processing power during the day for the next few months at least, so feel free to run as many as you like. It should also have a more graceful quota-exceeded behaviour than it used to, but we'll actually have to hit that before I can be sure.

Skilgannon21:05, 16 March 2013
 

If you remove the CPU constant from the robocode.properties file, won't it automatically recalculate it? If so, it might be a good idea to remove the CPU constant from the robocode.properties file in the superpack. Otherwise, anyone who runs the superpack without thinking about updating the CPU constant will be running with the CPU constant packaged in the superpack.

Skotty20:28, 22 March 2013
 

Ah, good call. I'm not sure it works that way but I'll test it. I'll also shore up the missing bots.

Voidious20:29, 22 March 2013
 

Alternatively what about just outright removing robocode.properties? That's what I always did when I made superpacks.

Rednaxela20:45, 22 March 2013
 

Vote Ranking

I gotta say, I love the Vote ranking. Gives you a completely different perspective on the rumble.

Chase11:48, 22 March 2013

Yeah, I really like it as well. I'm thinking of doing away with (A)NPP though, it seems a bit redundant, and it uses tons of memory to calculate.

Skilgannon12:09, 22 March 2013
 

I have no problem with that. But I think we should get some other opinions on it.

Chase12:20, 22 March 2013
 

ANPP is actually the one I love =), but I'm fine with dropping it if you want. I find it pretty meaningful in the GigaRumble.

Voidious14:20, 22 March 2013
 

Hmm. My main concern with it is that it requires me to do a full nxn grid of scores, which is taking up a full GB of RAM on the backend, and causing soft-kills. If I could somehow implement it as an incremental score, which is updated along with APS, I wouldn't mind it so much.

At least with the KNN-PBI I only need sqrt(rumblesize) in memory at once.

Skilgannon14:26, 22 March 2013
 

Pretty sure you could do it without an nxn grid. You could do min/max for each bot just loading one row (n) at a time. Then ANPP vs each bot the same way. Then each bot's avg with another pass one row at a time. Unless even that process requires nxn with the code/data model. I certainly don't see any way to do it drastically quicker, like on par with APS.

It's still probably the most CPU intensive and least useful of the rankings, so I totally agree it's a good candidate for the axe.

Voidious15:59, 22 March 2013
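The row-at-a-time idea can be sketched as follows. This is an illustration only: `load_row` is a hypothetical function returning one bot's scores, and the normalisation used here (rescaling each score into the lowest-to-highest range achieved against that opponent) is an assumption about how ANPP is defined, not something confirmed in this thread. The per-opponent and per-bot averaging steps are folded into a single second pass.

```python
def streaming_anpp(bot_names, load_row):
    """Compute a normalised average per bot while holding one row at a time.

    load_row(bot) returns {opponent: score} for that bot (hypothetical).
    """
    lowest, highest = {}, {}
    # Pass 1: lowest and highest score achieved against each opponent.
    for bot in bot_names:
        for opponent, score in load_row(bot).items():
            lowest[opponent] = min(score, lowest.get(opponent, score))
            highest[opponent] = max(score, highest.get(opponent, score))
    # Pass 2: rescale each score into that opponent's range, then average.
    anpp = {}
    for bot in bot_names:
        row = load_row(bot)
        total = 0.0
        for opponent, score in row.items():
            spread = highest[opponent] - lowest[opponent]
            # With a single score against an opponent there is no range yet.
            total += 100.0 * (score - lowest[opponent]) / spread if spread else 100.0
        anpp[bot] = total / len(row) if row else 0.0
    return anpp
```

Memory use is two dictionaries of size n plus one row, at the cost of reading every row twice.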
 

I think incrementally updated ANPP could actually be very fast most of the time if you had a table to cache the highest and lowest scores against each bot. Most of the time, the bot holding the highest/lowest score against another won't change, so those times only the bot with the newly submitted score would be affected. It could also be made less resource intensive by not including a bot in ANPP until its pairings are complete, further reducing how often the min/max score against it changes. If the highest/lowest score changes, it affects the resulting ANPP score of all other robots, but that update could be done with a low memory footprint I'd think.

Rednaxela16:10, 22 March 2013
 

Yeah, I was trying to think that through... Agree that obviously min/max change is fairly rare and that's a fast case if it doesn't change. If min/max does change, you need to recalculate everyone's ANPP vs that bot. Then you need to update everyone's overall ANPP, but that doesn't need to be a complete recalculation, just (((overall * numBots) - oldScore + newScore) / numBots).

Voidious16:16, 22 March 2013
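The incremental update in the last sentence, written out as a small function (the name `update_overall` is just for illustration):

```python
def update_overall(overall, num_bots, old_score, new_score):
    """Replace one pairing's contribution to an average without a full recalculation."""
    return ((overall * num_bots) - old_score + new_score) / num_bots


# Worked example: four pairings averaging 50, one of which changes from 40 to 60.
scores = [40, 50, 50, 60]
overall = sum(scores) / len(scores)
updated = update_overall(overall, len(scores), old_score=40, new_score=60)
```

This gives the same result as averaging the new list [60, 50, 50, 60] from scratch.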
 

Is Vote determined by how many opponents a robot gets top score against? I vaguely recall something like that. Kind of interesting to see the shielders near the top when sorted that way.

Skotty19:27, 22 March 2013
 

Yeah, my next comment is that LiteRumble could use some info on what all these crazy rankings are. :-)

Vote is what % of bots you are the best against. (Each opponent "votes" for its worst matchup.)

Voidious19:29, 22 March 2013
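As a sketch of that definition: each opponent "votes" for the bot that scores highest against it, and a bot's Vote is its share of the votes. The data layout, the tie-break (by bot name) and the choice of denominator (the number of opponents casting a vote) are assumptions made for this example, not taken from the LiteRumble code.

```python
def vote_ranking(scores):
    """scores[bot][opponent] is bot's score against opponent (assumed layout)."""
    votes = {bot: 0 for bot in scores}
    opponents = sorted({opp for row in scores.values() for opp in row})
    for opponent in opponents:
        # Every bot that has a result against this opponent is a candidate.
        candidates = [(row[opponent], bot) for bot, row in scores.items() if opponent in row]
        _, best_bot = max(candidates)  # ties fall to the later name (assumption)
        votes[best_bot] += 1
    voters = len(opponents) or 1  # avoid dividing by zero on an empty rumble
    return {bot: 100.0 * count / voters for bot, count in votes.items()}
```

With three bots where A holds the top score against both B and C, and B holds the top score against A, A gets two of the three votes and B gets one.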
 

I like the Vote ranking. It finally gives exploitative bots (like Epeeist) the recognition that they deserve.

Sheldor20:31, 22 March 2013
 

I like it too - Diamond is #1. ;) Of course I presume EnergyDome steals quite a bit of score from DrussGT.

Voidious20:39, 22 March 2013
 

non-Robocode LiteRumble usage

Been thinking I'll probably try a LiteRumble instance when I get to creating a BerryBots Rumble. I was wondering if you could say, at a high level, if there's much Robocode-specific about it? Just looking for some general info, I'm happy to tackle the details myself when I get into setting it up.

I know obviously the protocol is whatever the RR client uses, and the scoring is sorta Robocode-centric. But that shouldn't be too hard to adapt if I need to. I guess the next thing that comes to mind is any CodeSize related stuff? I'm starting on the control API stuff soon, which would enable writing a rumble client, and someone posted the first public user-created bot on the forums today (woohoo!), so I might experiment with it soonish.

Thanks!

Voidious19:27, 22 March 2013

It doesn't have anything Robocode dependent, and doesn't even know that the roborumble is more related to the mini/micro/nano than, say, the meleerumble. The homepage does a manual categorisation based on whether they are 1v1, teams, melee etc., but that's as far as it goes. You could even create, say, a 'berryrumble' on the same server if you wanted.

Skilgannon19:42, 22 March 2013
 

Awesome, thanks!

Voidious19:50, 22 March 2013
 

Contributor stats

Aw, I saw in the diff you removed the "in case I ever get" line and thought maybe that meant you'd added it. ;) Though I'd probably still be a ways behind you for a while...

Voidious18:04, 22 March 2013

I wasn't keeping track of who anything was from, so it would start from scratch =)

I'm thinking of something like "who contributed in the last hour", or something along those lines, just so that it's easy to see if anything needs running.

Skilgannon18:06, 22 March 2013
 

Individual Battle Scores

Is there any way to see individual battle scores in the LiteRumble? I didn't see any way to do that. It is the one thing I would really miss about Darkcanuck's server if we only had LiteRumble in the future. I find individual battle scores to be of great use from time to time (e.g. if you have a score of 62, it makes a big difference if the individual scores are 61, 63, 62 or 86, 88, 12). And what client gave that score is also of use. It can help identify Robocode version issues. For example, just go look at the history of battles for Krabby2, and it becomes obvious something broke in Krabby2 after Robocode 1.7.3.0+ came into use.

All that aside, if Darkcanuck's server ever goes away, LiteRumble as it is would be far preferable to no rumble at all.

Skotty20:48, 12 February 2013

Unfortunately it only keeps the averaged result and the number of battles. If I changed it to keep pairings as well I suspect that each bot entry might exceed 1MB, which would break Google App Engine rules. I've considered the possibility of keeping 'min' and 'max' though, would that satisfy you?

Skilgannon08:14, 13 February 2013

That would be nice. Or maybe some kind of confidence or standard deviation or something. Not a big deal, but I would use it if it were there.

Skotty06:42, 14 February 2013
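To illustrate why min/max or a standard deviation would help, here are the two example score sets from the first post in this thread summarised with Python's statistics module. The `summarize` helper is made up for this example and says nothing about how LiteRumble stores its data.

```python
from statistics import mean, pstdev


def summarize(scores):
    """Summary values a server could keep instead of every individual battle."""
    return {
        "mean": mean(scores),
        "min": min(scores),
        "max": max(scores),
        "stdev": pstdev(scores),  # population standard deviation
    }


steady = summarize([61, 63, 62])
erratic = summarize([86, 88, 12])
```

Both sets average 62, but the first has a standard deviation below 1 while the second is above 35, which is exactly the difference the averaged result hides.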
 

If you persist each battle independently, without associating it with other records (no foreign key constraints), then the only limit is total database size.

MN15:03, 13 February 2013
 

It would also increase total database writes, which is my current bottleneck.

Skilgannon15:06, 13 February 2013
 

server load?

Any idea how much load your server can handle now? Just curious... Still thinking about the feasibility of switching to a new rumble server.

Voidious17:42, 12 February 2013

If I upgrade to paid it would probably handle as much as Darkcanuck's server. As it is it is mostly limited by number of database writes, probably 4 1v1 clients and 2 melee clients at i5/i7 execution speeds would be the limit. If I/we upgrade I could also tune the caching so that it loses less data.

I'm actually looking at doing a Java implementation just because the number crunching is so much faster. Also now that I've done the Python one it would just be a translation and not an entire rewrite.

Skilgannon19:47, 12 February 2013
 

NPP over 100?

Does it mean you have a bug if I have NPP over 100 against some bots? [1] Or are the ranges only recalculated at some interval, so maybe I scored higher than the previous max score against that bot?

Voidious05:26, 13 January 2013

Because the different halves of each pairing are stored separately, each might have a slightly different idea of what its score is against the other, because one of the scores can be dropped from cache before it is written to disk. Now that I'm not trying to complete pairings for Melee and Roborumble I might as well reduce the caching to less dangerous levels. I've also considered fixing this in my batch processing by taking averages, but never got around to it.

Skilgannon12:15, 13 January 2013
 

running gigarumble

Does someone have a roborumble.txt for the gigarumble available? Or can someone run the gigarumble? I want to know if I finally leave the last spot now that I have entered the top-10 in the normal rumble.

GrubbmGait20:13, 10 January 2013

Sure, I'll post my config this evening when I'm home (like 3-4 hours from now). I think I'm the only one that was running it. I was partly just experimenting with LiteRumble, but then I really liked GigaRumble. =)

Voidious20:54, 10 January 2013
 

I can run it if someone tells me how to configure roborumble.txt.

I didn't even know it existed, much less having a bot in it.

MN21:00, 10 January 2013
 

Ok, given the skyrocketing level of interest in the GigaRumble ;), I SSH'd home and uploaded my gigarumble.txt: [1]

Voidious21:10, 10 January 2013
 

I've got one (fairly weak) client going now. Also note that you need to use 1.7.4.2 for LiteRumble - Skilgannon has it rejecting all other Robocode versions.

(Also I fixed the wiki, as you may have noticed.)

Voidious03:06, 11 January 2013
 

I'm also running 2 clients on an i7, but I started them at the last moment and did not verify that results uploading works.

Jdev04:33, 11 January 2013
 

Oh, now I remember why I only had 1 client configured - Skilgannon asked me not to run more than that because he was already close to max load with his LiteRumble clients. Not sure if that is still true. GigaRumble clients are much slower, anyway, so it may not be much of an issue.

Voidious04:58, 11 January 2013

Sorry, I can stop them only 8-10 hours later...

Jdev05:05, 11 January 2013
 

Leave them running - I'm not running my other clients so it is fine.

Skilgannon06:18, 11 January 2013
 

Congrats, GrubbmGait! [1] =)

Voidious15:37, 11 January 2013
 

Thanks! It is a nice jump from #30 to #25. Improvement against all except 2 bots, I feel like Dookious is in reach in the normal rumble. ;-) And thanks for running your client, I have not been home yet.

GrubbmGait17:09, 11 January 2013
 

CSS

It is a relatively minor issue. But it is driving me crazy.

I think the page would be better served with at least some minor CSS. Here I have some CSS-1 only. All browsers (except text based ones) implement it exactly the same. If you set cellspacing="1" on the table html, it should look very similar (visually) to the normal roborumble page.

Replace the bgcolor attribute tags with class="even" or class="odd", depending.

body {
	font-family: helvetica, arial, sans-serif;
}
table {
	background-color: #D3D3D3; /* LightGrey */
}
table th {
	background-color: #EEEEEE;
}
table th a {
	color: black !important; /* !important, this keeps it from having odd colors on visited links */
	text-decoration: none;
}
table td, table th {
	padding: 2px 4px;
}
table th {
	padding-right: 1em;
}
tr.odd {
	background-color: #F8F8F8;
}
tr.even {
	background-color: #FFFFFF; /* White */
}

If you are okay with CSS2, all of the html formatting attribute tags can be hacked out. (Just let me know).

Chase-san22:31, 24 October 2012

I don't really have time for editing this right now, but if you fork and edit I'm happy to merge your pull request =) The latest code is up: https://bitbucket.org/jkflying/literumble/overview I apologise in advance for my terrible coding style, this was a "learn python and Google AppEngine" project for me. If I was starting from scratch I would have made several changes in the structure and how much was abstracted, but this grew slowly and is very hackish inside.

Skilgannon14:34, 25 October 2012

Pull request sent.

Chase-san17:14, 27 October 2012
 

Cool, thanks. I've merged it in and uploaded. I had to fix a few small bugs - html_header should be structures.html_header, and I needed to add style.css to app.yaml but otherwise perfect. Thanks, and the changes are now live =)

Skilgannon18:21, 27 October 2012
 

Ahh okay, I have never done python before, so I was mostly just guessing where everything was supposed to go. I know only the basics of how it is 'supposed' to work. That and google. I used google a lot.

Chase-san02:41, 28 October 2012
 

Actually there is another bug which is completely my fault. Can't sort compare by APS or Survival since I renamed the columns to be APS (A) and so forth, where it expects A APS.

Also, reading your logs, you fixed a bit more than that. I tried to fix most of the html problems, but since I couldn't really test, I was fixing it blind.

Chase-san02:53, 28 October 2012
 

fixed!

Skilgannon08:31, 28 October 2012
 

It looks much better now. To me anyway.

Chase-san05:47, 29 October 2012
 

Although I'm not sure why it needs to be CSS instead of plain HTML. It's not like the HTML adds much code size overhead, and surely it prevents the browser from having to look up all of the cell values each time. bgcolor=F8F8F8 vs class="even" isn't much different, and these pages are completely dynamic so it's not like having an abstracted reference makes it any easier to maintain. If it can be done without CSS I'd prefer it that way.

Skilgannon14:54, 25 October 2012
 

Almost all the old html formatting tags and attributes have been deprecated since 1999 (I think <b> and <em> were still valid).

You could go and edit your code/html if you wanted to alter the appearance. But it is very likely it is easier to edit a CSS file, even if your html is dynamically generated.

I'll see about forking and editing it.

Chase-san15:05, 27 October 2012
 

Client Version

I've updated the code to reject clients other than 1.7.4.2, as it has quite a few improvements for the client over 1.7.3.[0|2]. These are:

  • Timeouts for slow connections, instead of hanging the client
  • GZip http compression for retrieving the ratings file
  • Always running a bot that isn't in the ratings file if it shows up in the participants list - even if there are priority battles waiting. 1.7.3.0 would always run bots below 2000 battles, but this doesn't help, for instance, in the Gigarumble, while 1.7.3.2 ignored the 2000 battles limit and when no priority battles were available would run random battles until it stumbled on the new entry. This sometimes caused many hours wait before the server became aware that a new bot was available, now it is immediate.

I also fixed a bug in my code which prevented the priority battles from working correctly after all of the pairings were complete, instead just giving random battles, so there should be much faster post-pairings-complete stabilisation, and a much more even distribution of battles from pairing to pairing.

Skilgannon06:42, 23 August 2012

Cool, I updated this morning. Seems like Diamond likes 1.7.4.2 - Diamond 1.8.22 is a pretty minor change that had no impact in the RoboRumble, and I guess it could still come down quite a bit, but +1.25 after 100 battles is a pretty strong start. [1] DrussGT is up from 2.8.2 to 2.8.4 with the client version switch, too. [2] I wonder what's up with that? I guess we should wait a bit before thinking about it too much.

Voidious21:40, 23 August 2012
 

Prettifying + Bot Comparisons

I've added a simple color scheme as well as the comparison page. I still need to add something for easily selecting previous versions of the same bot, but most of the work is there. Take a peek!

Skilgannon16:41, 24 July 2012

Nice, looks good!

Voidious17:01, 24 July 2012
 

I've added a quick way to get to comparisons of previous versions from the details page, as well as details stats on the bot's scores.

Skilgannon16:28, 25 July 2012
 

Lost Pairings

I recently made a transition over from using gzipped pickled Python objects to gzipped json'ed Python dictionaries, and somehow managed to run into an error with the main Rumble object, which was deleted each time it was uploaded to. After that all the uploads that happened had their respective bots pairings deleted. I managed to catch it fairly quickly, but about half the bots lost a good portion of their battles, so it's back to the waiting game again. Fortunately after the changes the server uses less memory, so I can do more aggressive caching and have less accidental out of memory shutdowns =)

I also need to whip up a bot comparisons page to compare versions... this or next week I think.

Skilgannon16:59, 8 July 2012

Are you still pushing the limit on how many clients you can support? I've now got a heck of a lot more RoboRumble firepower, so let me know if you want me to point some of it at your server.

Voidious17:10, 9 July 2012
 

I was running Melee clients at the same time, so I've stopped those now and they're running 1v1. That puts me at 6 clients, which I think is all I can handle. Perhaps I'll upgrade to paid someday when I get more annoyed with the limits =)

Skilgannon17:31, 9 July 2012
 

Trying New Rumbles

Do you mind if I try submitting results for some different rumble configs? I could setup my own instance of this at home or something if you'd rather. I'm thinking of finally seeing what a StrongestBotsRumble might look like, or a PerceptualRumble.

Voidious18:55, 6 June 2012

No problem. Just don't put more than one client on at the moment, or it will probably go over write quota. I'm currently working my way up through all the pairings of the roborumble with 4 clients on an i5, and filling out pairings for the first time is probably the worst as far as writes go because every single bot gets writes evenly, so caching between writes doesn't help particularly. Because of this I currently have caching on quite aggressively, I will lower it in a while. The problem with excessive caching is that occasionally the changes made in the bots get evicted from memcache and the frontend instance gets shut down/cycled, so some battles might be lost. Don't worry, everything will stabilise in the end, and it should be robust to any problems with lost battles on one pairing but not the other, etc.

If it does hit write quota it will reset at 07h00 UTC. You will start getting messages from your client that writes are failing, again don't worry, they will get written once the quota rolls around if they are still available in memcache or instance memory. Batch rankings (currently just Vote, ie. BestBot) get recalculated at 22h00 UTC.

Here's a sample client config: roborumble.txt

Enjoy!

Skilgannon00:07, 7 June 2012
 

Btw, what's your goal with filling out the 1v1 pairings? Just as a test to see how they compare to Darkcanuck's, or are you planning to try and keep this instance up to date with new RoboRumble activity?

Voidious18:54, 7 June 2012
 

Once they're full I want to try out some different scoring mechanisms - in particular I want to try out my Average Normalised Pairs Percentage. I also want to see what the KNN-PBI looks like for the main rumble. I'm not sure I want to keep a ~900 bot rumble permanently updated - it would eat into the free quota quite a bit. The same with the melee rumble. They are better suited for Darkcanuck's server IMO.

I think it would be more interesting to have a few others, like the PerceptualRumble or TripleDuel/TwinMelee, but I don't want to run any rumbles with more than 100-200 bots on a long term basis. They are just too slow to stabilise, and chances are that the majority of the bots have been abandoned anyways. Perhaps the next step is to write an app that serves the participants list as a FIFO - automatically kicking out old bots as new ones are entered (although not counting versions as new bots).

Skilgannon08:04, 8 June 2012
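A minimal sketch of what such a FIFO participants list might look like. Everything here is hypothetical: the class name, the capacity parameter, and the assumption that each entry has the form "package.BotName version" with a single space before the version.

```python
from collections import OrderedDict


class FifoParticipants:
    """Keeps at most `capacity` bots; the oldest bot is dropped first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._bots = OrderedDict()  # bot name -> current version, oldest first

    def enter(self, participant):
        name, _, version = participant.partition(" ")
        if name in self._bots:
            # A new version of a known bot keeps its place in the queue.
            self._bots[name] = version
            return
        self._bots[name] = version
        if len(self._bots) > self.capacity:
            self._bots.popitem(last=False)  # kick out the oldest bot

    def listing(self):
        return [f"{name} {version}" for name, version in self._bots.items()]
```

Updating a bot to a new version neither evicts anything nor resets the bot's age, which is one reading of "not counting versions as new bots".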
 

Nice work and some thoughts

Nice work!

Not using JavaScript sorting made it easier to link sorted tables from other pages. Also, with a melee database reset, getting rid of battles with retired bots may change the rankings. And you put %wins scoring. =D

But I miss some kind of Condorcet ranking. PL was the only one we had, and the one Combat was doing best.

Also miss some kind of statistical ranking. Elo was what we had and allowed fun statistics like problem bot index, specialization index and that non-working JavaScript diagram. Mirror bots and ram bots will lose some of their appeal without those statistics.

I tried to raise a RoboRumble server in App Engine a long time ago, but they didn't allow me into the free tier. :(

MN17:57, 27 May 2012

My %Wins is a bit of a cheat. It is just 1 point per PL win, divided by bots in rumble. I prefer it to PL because it is not dependent on the number of bots in the rumble. So if Combat was doing well in PL, it should do well in %Wins.

I'm still not using my Backend for anything, so I was thinking that once a day I could use it to generate some sort of pseudo-problembot stats stuff. ELO/Glicko is nice, but it is really designed for being good approximations when pairings are missing. In our case, the pairings are fairly easy to fill, so that isn't a problem; APS tends to converge to the same ranking order, and it isn't full of voodoo that makes it difficult to comprehend. It is also possible to correct APS easily if results get lost due to being in memcache =)

One ranking idea I had was doing a The Best Bot calculation (get a point for being the best against any competitor). It would increase my number of database writes in the only reasonably robust/non-batch way I can think of, which is what is holding me back at the moment. I could use a Backend for calculating it once or twice a day, I guess, or make it expire once every 6 hours and be triggered by a page load. It needs n*n runtime. Maybe I can fit it into the regular rankings calculations.

The hardest part is getting the rumble to stay in the free tier. I think it will be limited to about 6 melee clients in total, or maybe 12 1v1 clients instead (fewer pairings per battle in 1v1).

Skilgannon22:58, 27 May 2012

The last time I checked, App Engine offered about 1GB database in the free tier. Which is enough to store all pairings and all uploaded battles, as long as you delete data from retired bots once in a while.

As for the amount of clients the server can handle, it should not really be an issue, since there are usually 3 to 4 simultaneous clients at most.

If you want to use some batch processing, adding a Ranked Pairs ranking would make my day. It has O(n^4) complexity, but I think it can still fit inside the 10-minute window from cron, so no need for a backend.

MN01:05, 28 May 2012

The problem isn't so much total storage space, but that I'm limited at 50k writes per day. Each bot counts as 2 writes, so effectively I have 25k updates I can do. I've figured out a caching scheme so that each melee battle comes out to 10 updates (1 per bot) instead of 45 updates (1 per pairing). I also need to update the total rumble battles count and the user upload count, so that leaves a bit of overhead, meaning I can have ~2000 melee battles per day uploaded.
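
The arithmetic above can be checked with a short sketch. The quota, the 2 writes per bot and the 10 bots per battle come from the post; the constant and function names are hypothetical, and counting the battle counter and the user upload counter as 2 extra updates per battle is an assumption.

```python
DAILY_WRITE_QUOTA = 50000      # free-tier datastore writes per day (from the post)
WRITES_PER_BOT_UPDATE = 2      # each bot update costs 2 writes (from the post)
BOTS_PER_MELEE_BATTLE = 10     # hardcoded melee battle size (from the post)


def pairings_per_battle(n_bots):
    """Number of distinct pairings among n_bots, i.e. n choose 2."""
    return n_bots * (n_bots - 1) // 2


def max_melee_battles_per_day(overhead_updates_per_battle=0):
    """Battles per day that fit in the quota with one update per bot."""
    updates_per_day = DAILY_WRITE_QUOTA // WRITES_PER_BOT_UPDATE
    updates_per_battle = BOTS_PER_MELEE_BATTLE + overhead_updates_per_battle
    return updates_per_day // updates_per_battle


no_overhead = max_melee_battles_per_day(0)
# Assumption: 2 extra updates per battle (battle count and user upload count).
with_overhead = max_melee_battles_per_day(2)
```

With no overhead this gives 2500 battles per day, and with the assumed 2 extra updates per battle it gives 2083, in line with the "~2000" estimate. Updating per pairing instead would cost `pairings_per_battle(10)`, which is 45 updates per battle.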

I'll see what ratings systems are feasible...

Skilgannon17:25, 28 May 2012

Batch updates are more useful in a limited environment like that. Maybe it's time for a refactoring in the upload protocol (1 update per batch upload), even if it breaks backward compatibility.

MN17:48, 28 May 2012

I've actually figured out a sort of temporary caching between requests where I wait for bots to accumulate a certain number of pairings before pushing them to disk. I don't think it's necessary to re-work the rumble upload protocol yet. One thing I would like the rumble to tell me is how many bots are in a melee battle though. Right now I just have it hardcoded at 10. It would help with my caching if I knew how many they were uploading per battle.

Skilgannon20:36, 28 May 2012
 
 
 
 

Hey that's neat! A quick and lightweight rumble setup could be really useful for tournaments and experiments. You just need a participants list URL, you make up a rumble name, and everything just works? Makes me want to try some new divisions. :-) Though that never seems to gain momentum...

What is App Engine pricing like? I'll take a look. I'd certainly be willing to pitch in some for Robocode related stuff if we needed more horsepower.

Voidious18:19, 28 May 2012

There is one division I would like to see. A twin melee rumble (like 5 teams of 2 bots each). Joining concepts of both melee and team/twin.

MN19:18, 28 May 2012
 

I like the idea, but I think it would be so crowded that it would pretty much reduce to melee strategy. (Having to fight off 8 other bots with 1 ally out there somewhere is not much different than fighting off 9 other bots.) Maybe 3 teams of 3? I've thought about MegaBot TwinDuel for a while...

Voidious19:40, 28 May 2012

I think megabot TwinDuel would be awesome! Although it might reduce to wavesurfing quite quickly. I'd also be interested in a TriDuel - a 3 vs 3. I think having that extra bot will completely change the dynamics compared to twinduel, and make surfing much harder.

Skilgannon20:21, 28 May 2012

Maybe split teamrumble into categories?

5 bots (teamrumble bots)

4 bots (DeltaSquad)

3 bots

2 bots (twin duel bots)

Teams with fewer bots can compete in categories with more bots but not the opposite.

MN20:56, 28 May 2012
 

Imagine 2 melee bots using minimum risk movement (dominant in melee), and 2 bots using provocative movement (dominant in twin duel). The 2 melee bots will each be in a different corner, but the 2 with provocative movement will be in the same corner, ganking the lonely melee bot. But at the same time, 3 bots close together become tasty targets for swarm targeting from the other 3 teams.

There must be a balance between minimum risk and provocative movement, or a third undiscovered strategy. Maybe there is still room for innovation.

MN20:35, 28 May 2012

Sure, but that's assuming both bots on both of those teams survive to the final stages of the round, which seems unlikely. And even if both bots on one team survive that long, I think how much energy they've retained from the "pure melee" early stage of the round will be the most important factor. Maybe on a bigger field than 1000x1000, and/or with 3-4 teams instead of 5?

Voidious21:10, 28 May 2012

It assumes ganks in the middle of a battle weaken "pure melee" strategies somewhat. Although not in the same way as in twin duel.

With 3 teams, I believe "shooting the team with lowest energy" 2x1 strategy will dominate. One team is eliminated almost on luck, and the battle is decided between the remaining 2. It happens in most 3 player games.

There is a catch though, since the API doesn't tell you which bots from the opponents belong to the same team. Which is not a problem in either meleerumble or teamrumble. But estimating it in team melee might be worth it. This alone may change the game significantly... or not.

A bigger battlefield or 4 teams seems nice. I thought of 5 teams of 2 bots each to keep the 10 bots total from meleerumble/teamrumble, and 2 bots per team from twin duel. And see strategies from all 3 divisions clashing against each other.

MN21:50, 28 May 2012
 
 

Any of these divisions sounds pretty interesting to me. I think the main hurdle is just getting that first person to write up a 3x3 team or add TwinMelee support to one of their bots. =) Nobody wants to commit the time if nobody else is going to compete, but if someone just does it, I bet others would follow suit...

I'm kind of caught up in my Diamond refactor right now, but maybe I'll make time for something fun soon. ;) Or try running a PerceptualRumble client just for kicks.

Voidious21:40, 29 May 2012

Combat can perform okayish in almost any battle setup, having to change the packaging only. With the exception of handicapped setups like restricted codesize, PerceptualRumble or ExtendsRobot.

MN19:12, 30 May 2012
 

Hmm... all of those divisions do sound interesting to me too. Now it has me thinking about how best to adapt the LunarTwins/Polylunar strategy to slightly different formats...

Rednaxela13:57, 1 June 2012
 

Yeah, my thoughts are that something like this would be perfect for school/lab/office tournaments. Just give it a new name in the client, set up a participants list somewhere and away you go.

In the free tier I'm not really going to run out of disk space any time soon, a rumble of 300 bots comes out at around 2MB, it's the database writes which are the killer. From what I can tell, App Engine pricing starts at $2.10 a week for the minimum paying tier. That gets you quite a bit more quota than the free tier, which probably should be enough for everything, pretty much forever, without crossing that $2.10 limit. For now I'm going to see how much I can push the free tier, though.

I still have a bunch of optimisations I need to make - like not pulling all of the rumble data into memory just to serve the rankings page (it's all cached, doesn't affect my quota, just speed) - which should make it more snappy both on the main rankings pages and on the RatingDetails page the RR client queries occasionally.

A hidden feature: if you add timing=1 as an argument into your GET for any of the pages it summarises the timing breakdown for CPU usage at the bottom of the page and lets you know how many bots were pulled from cache vs. from the datastore.

Skilgannon20:18, 28 May 2012
 
 

Problem Bot Index

I'm not sure, but I think PBI with Elo/Glicko was based on some magical formula between the ratings. Maybe an APS-based measure would just be the average score your neighbors get against that bot. So like if you're ranked #25, the average score against bot B for ranks 15-35 is your expected score.

Of course, if you're #1, you can only go from 2-11, but that's probably still useful info. And in that case (or in every case), you could shift everything so your average PBI is still 0.
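
The neighbour-average idea could be sketched roughly as follows. This is an illustration only: the data layout, the function names and the default window of 10 ranks on each side are hypothetical, and excluding the bot itself from its own neighbourhood is an assumption. The sign convention (real score minus expected score) is also an assumption.

```python
def neighbour_expected_score(ranking, scores, bot, opponent, window=10):
    """Average score that bots ranked near `bot` get against `opponent`.

    ranking: list of bot names, best first.
    scores[a][b]: bot a's percent score (0-100) against bot b.
    Returns None if no neighbour has a pairing against `opponent`.
    """
    i = ranking.index(bot)
    lo = max(0, i - window)                  # clipped at the top of the table
    hi = min(len(ranking), i + window + 1)   # clipped at the bottom
    neighbours = [
        r for r in ranking[lo:hi]
        if r != bot and r != opponent and opponent in scores.get(r, {})
    ]
    if not neighbours:
        return None
    return sum(scores[r][opponent] for r in neighbours) / len(neighbours)


def neighbour_pbi(ranking, scores, bot, opponent, window=10):
    """Real score minus neighbour-based expected score (assumed sign)."""
    expected = neighbour_expected_score(ranking, scores, bot, opponent, window)
    if expected is None:
        return None
    return scores[bot][opponent] - expected


ranking = ["a", "b", "c", "x"]
scores = {"a": {"x": 80.0}, "b": {"x": 70.0}, "c": {"x": 60.0}, "x": {}}
pbi_a = neighbour_pbi(ranking, scores, "a", "x", window=1)
pbi_b = neighbour_pbi(ranking, scores, "b", "x", window=1)
```

For the #1 bot the window is simply clipped, so only lower-ranked neighbours contribute, as described above. Shifting each bot's values so that its average PBI is 0 would be a separate step.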

Voidious21:17, 30 May 2012

PBI is the difference between expected score and real score. The expected score is based on difference between ratings.

Zero difference is 50% expected score
-800 difference is 1 to 20 odds, which is 1/(20+1) or 4.76% expected score
-1600 difference is 1 to 20^2 odds or 0.25% expected score
-Infinite difference is 0% expected score
MN23:23, 30 May 2012
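
The points listed above are consistent with a logistic curve in which every 800 rating points multiply the odds by 20. A minimal sketch under that assumption follows; the base of 20 and the scale of 800 are inferred from the listed values only, and the function names are hypothetical.

```python
def expected_score(rating_diff):
    """Expected score (0..1) for a bot whose rating is rating_diff above its opponent's.

    Odds are 20 ** (rating_diff / 800), so the expected score is
    odds / (odds + 1), written here as 1 / (1 + 20 ** (-rating_diff / 800)).
    """
    return 1.0 / (1.0 + 20.0 ** (-rating_diff / 800.0))


def pbi(real_score, rating_diff):
    """Difference between real and expected score; the sign is an assumption."""
    return real_score - expected_score(rating_diff)


even = expected_score(0)          # equal ratings
behind = expected_score(-800)     # 1 to 20 odds
far_behind = expected_score(-1600)  # 1 to 20^2 odds
```

Under these assumptions a rating gap of 0 gives 0.5, -800 gives 1/21 and -1600 gives 1/401, matching the figures in the list.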
 

Yeah, I was trying to think of a way of handling this elegantly without having to resort to a KNN type lookup or doing a whole ELO calculation. I was thinking something like:

Expected_for_bot_a = (bot_a_APS + (100 - bot_b_APS))/2

Eg: If BotA has APS of 70% and BotB 30% it predicts the 70%, 30% which seems intuitive to me. If BotA has APS of 80% and BotB 80% it predicts the 50%, 50% perfectly. If BotA has APS of 80% and BotB 60% it predicts 60%, 40%, which seems OK.

I think the trouble with this is that it assumes that there is a linear relationship between average score and pairwise score. I think it is more of a sigmoidal relationship, because once you have taken out the low hanging fruit there is less increase to draw from. Because of this I think a modified version of the above formula, something like: Expected_for_bot_a = ((bot_a_APS^Q + (100 - bot_b_APS)^Q)/2)^(1/Q) for some magic value of Q would probably be a better fit.
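
Both formulas above are easy to try out. The sketch below transcribes them directly; the function names are hypothetical, and no value of Q is proposed in the post, so the value used in the example is arbitrary.

```python
def expected_linear(aps_a, aps_b):
    """Expected percent score for bot A from the two APS values (linear form)."""
    return (aps_a + (100.0 - aps_b)) / 2.0


def expected_power_mean(aps_a, aps_b, q):
    """Generalised (power) mean form; q must be non-zero, and q == 1 is the linear form."""
    return ((aps_a ** q + (100.0 - aps_b) ** q) / 2.0) ** (1.0 / q)


linear_70_30 = expected_linear(70.0, 30.0)
linear_80_80 = expected_linear(80.0, 80.0)
linear_80_60 = expected_linear(80.0, 60.0)

# Arbitrary q, purely for illustration.
a_vs_b = expected_power_mean(80.0, 60.0, 2.0)
b_vs_a = expected_power_mean(60.0, 80.0, 2.0)
```

One thing worth noticing when experimenting: for q other than 1, the two expected scores of a pairing no longer have to add up to 100 (with q = 2 and APS values of 80 and 60 they come to roughly 63.2 and 44.7), so some renormalisation might be needed.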

I've added a simple 'Vote' rankings page, where each bot votes for their worst pairing. The majority of bots don't get anything, predictably, but this is interesting for use in comparing who does the best. Again, this is a winner takes all ranking, so makes no differentiation between the bot that got 79.9% and 50% against another, where the worst pairing was 80%, and this makes me uncomfortable as there is clearly lost information. Perhaps I should change it so that every bot gets a vote of weight 100*pair%/worst pair%, but I'll leave it as it is for a day or so.
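
A rough sketch of the winner-takes-all vote described above, not the code behind the actual page. The data layout and function name are hypothetical, and reporting the result as a percentage of all votes cast is an assumption.

```python
from collections import Counter


def vote_ranking(scores):
    """Each bot casts one vote for the opponent it scores lowest against.

    scores[a][b]: bot a's percent score (0-100) against bot b.
    Returns each bot's share of the votes as a percentage.
    """
    votes = Counter()
    for bot, pairings in scores.items():
        if not pairings:
            continue  # a bot with no pairings cannot vote
        # Ties go to whichever opponent comes first in the dict (arbitrary choice).
        worst_opponent = min(pairings, key=pairings.get)
        votes[worst_opponent] += 1
    total = sum(votes.values())
    if total == 0:
        return {bot: 0.0 for bot in scores}
    return {bot: 100.0 * votes[bot] / total for bot in scores}


example = {
    "A": {"B": 60.0, "C": 70.0},
    "B": {"A": 40.0, "C": 55.0},
    "C": {"A": 30.0, "B": 45.0},
}
shares = vote_ranking(example)
```

In this example B and C both vote for A, A votes for B, and C receives nothing, which shows how most bots end up with no votes at all.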

The batch pairings get updated once an hour for any rumble which has had battles since the last batch run.

Skilgannon09:40, 31 May 2012

If you try to figure out a sigmoidal relationship, you will eventually end with the same logistic distribution used in Elo and Glicko.

MN19:43, 31 May 2012
 

Thinking on this more, I actually really like the KNN idea. It's the only one that really tells you "you can and should be doing better against this bot", as opposed to "this bot might just have a weird score profile". (RamBots are the perfect/extreme example of this - they can show up as Problem Bots even if you're doing well against them.)

I know when I'm trying to figure out who I could do better against, I don't look at PBI, I compare to DrussGT. ;) I understand it would be a lot of calculations, but it should still be simple to code up, and it's all just basic math operations.

Voidious14:16, 31 May 2012
 

Another thought is, if you already have the best score vs any bot, a useful number might be that score minus your score. Calling it "PBI" would be a misnomer, but it tells you how much room you have to improve.
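
As a small illustration of this "room to improve" number, assuming a hypothetical nested dict of percent scores and a hypothetical function name:

```python
def improvement_room(scores, bot, opponent):
    """Best score anyone gets against `opponent`, minus `bot`'s own score.

    scores[a][b]: bot a's percent score (0-100) against bot b.
    """
    best = max(
        pairings[opponent]
        for pairings in scores.values()
        if opponent in pairings
    )
    return best - scores[bot][opponent]


scores = {
    "a": {"x": 80.0},
    "b": {"x": 65.0},
    "x": {"a": 20.0, "b": 35.0},
}
room_a = improvement_room(scores, "a", "x")
room_b = improvement_room(scores, "b", "x")
```

The bot that already holds the best score against an opponent gets 0, and everyone else gets the gap to that score.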

Voidious14:21, 31 May 2012
 

If you look at the site, you might just notice the errors ;-) That's because I ran out of Datastore Read quota. I think it's because of the batch rankings - before them I never even got to 20% of read quota. So I've changed batch rankings to every 6 hours, so in about 17 hours the quota will reset and we can see how it works =)

Since I'm only doing updates once every 6 hours I should have lots of quota for long, tedious calculations. So I'll whip up a KNN-based PBI over the next few days to see how it does. Any ideas on how to calculate K? How about sqrt(participants)?

It seems we have similar ideas about 'max improvement indexes'. Thinking further on my comment above about my %pair/(%worst pair) idea, I'm thinking about an interesting new ranking system that I'd like to call 'Average Normalised Percentage Pairs' or ANPP. Each bot normalises all of their pairings by subtracting the min score and then dividing by (max - min). Your score is then calculated as the average, over all enemies, of (100 - the enemy's normalised score against you). Thus, if the best anybody does is 75% against a rambot, and the worst anybody does is 30%, 30% will be treated as 0 and 75% treated as 100%. This would make it very easy to see problembots, as if your NPP against them is less than your average NPP, you should focus on them more. Thus, the worst bot against everybody would get 0%, and the best bot against everybody would get 100%.
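
One possible reading of the ANPP description, sketched under stated assumptions: the data layout and function names are hypothetical, and mapping a bot whose pairings are all equal (max equals min) to 50 is an arbitrary choice that the post does not cover.

```python
def normalised_pairings(scores):
    """Rescale each bot's own pairing scores so its min maps to 0 and its max to 100.

    scores[a][b]: bot a's percent score (0-100) against bot b.
    """
    normalised = {}
    for bot, pairings in scores.items():
        if not pairings:
            normalised[bot] = {}
            continue
        lo, hi = min(pairings.values()), max(pairings.values())
        span = hi - lo
        normalised[bot] = {
            # Assumption: a flat score profile (span == 0) maps to 50.
            opp: (100.0 * (s - lo) / span if span else 50.0)
            for opp, s in pairings.items()
        }
    return normalised


def anpp(scores):
    """Average over enemies of (100 - enemy's normalised score against you)."""
    normalised = normalised_pairings(scores)
    result = {}
    for bot in scores:
        npps = [
            100.0 - normalised[enemy][bot]
            for enemy in scores
            if enemy != bot and bot in normalised[enemy]
        ]
        result[bot] = sum(npps) / len(npps) if npps else None
    return result


example = {
    "A": {"B": 70.0, "C": 80.0},
    "B": {"A": 30.0, "C": 60.0},
    "C": {"A": 20.0, "B": 40.0},
}
ranking = anpp(example)
```

In this 3-bot example A is every enemy's worst pairing and gets 100, C is every enemy's best pairing and gets 0, and B lands at 50, which matches the behaviour described for the best and worst bots.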

Skilgannon14:57, 31 May 2012
 

Just thought I'd say... I rather do like the notion of a KNN-based "expected score" system. The sigmoidal relationship given by Elo/Glicko is a reasonable fit for predicting score based on each bot's overall rating, but it does really miss the sort of interesting subtleties/patterns that a system that considers multiple axes of strength would.

Rednaxela13:49, 1 June 2012
 