View source for Talk:LiteRumble
Contents
Thread title | Replies | Last modified |
---|---|---|
API | 14 | 15:34, 26 March 2013 |
links in new tab | 1 | 22:02, 25 March 2013 |
1.8.1.0 | 0 | 07:33, 25 March 2013 |
Down (for a few hours) | 1 | 08:17, 24 March 2013 |
MOAR CLIENTS | 8 | 06:10, 24 March 2013 |
LiteRumble 1.7.4.2 superpack | 4 | 20:45, 22 March 2013 |
Vote Ranking | 11 | 20:39, 22 March 2013 |
non-Robocode LiteRumble usage | 2 | 19:50, 22 March 2013 |
Contributor stats | 1 | 18:06, 22 March 2013 |
Individual Battle Scores | 4 | 06:42, 14 February 2013 |
server load? | 1 | 19:47, 12 February 2013 |
NPP over 100? | 1 | 12:28, 13 January 2013 |
running gigarumble | 10 | 17:09, 11 January 2013 |
CSS | 9 | 05:47, 29 October 2012 |
Client Version | 1 | 21:40, 23 August 2012 |
Prettifying + Bot Comparisons | 2 | 16:28, 25 July 2012 |
Lost Pairings | 2 | 17:31, 9 July 2012 |
Trying New Rumbles | 3 | 08:39, 8 June 2012 |
Nice work and some thoughs | 18 | 13:57, 1 June 2012 |
Problem Bot Index | 7 | 13:49, 1 June 2012 |
Ok, I have a simple API up. Just add &api=True to any Rankings or BotDetails page and it will return in nice easy-to-parse JSON-ish format. I say JSON-ish because real JSON puts double-quotes around everything, whereas I'm lazy and don't feel like doing a ton of double-quote escaping, and besides, we don't have commas or colons in our data so there isn't any risk.
If you don't want the entire pairings detail from the BotDetails, add &limit=0 to the page and it will leave them out.
Some usage examples:
http://literumble.appspot.com/Rankings?game=nanorumble&api=True
http://literumble.appspot.com/BotDetails?game=nanorumble&name=sheldor.jk.Yatagan%201.0.5&api=True
Of course, I'd rather you call Rankings once than call BotDetails with &limit=0 985 times, because although it doesn't generate the JSON, it still has to pull all the data in, which adds up for lots of requests. I'd ask that if you are getting more than 3, rather use the Rankings.
If there's anything else you'd like me to add to the API just ask, if it's already saved in my data it should be easy to whip something up to return it.
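A minimal Python sketch of calling the API as described above. The helper names `api_url` and `fetch_json` are made up for illustration; the host and the `game`, `name`, `limit` and `api` parameters come from the post. The shape of the response is not documented here, so the sketch just returns whatever parses, and `json.loads` will only accept it if the server output is strict JSON (the unquoted JSON-ish form would need quoting first).

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://literumble.appspot.com"  # host taken from the usage examples


def api_url(page, game, name=None, limit=None):
    """Build an API URL for a Rankings or BotDetails page (hypothetical helper)."""
    params = {"game": game}
    if name is not None:
        params["name"] = name    # bot name plus version, e.g. "sheldor.jk.Yatagan 1.0.5"
    if limit is not None:
        params["limit"] = limit  # limit=0 leaves the pairings out of BotDetails
    params["api"] = "True"
    # quote_via=quote encodes spaces as %20, matching the example URLs
    return f"{BASE_URL}/{page}?{urlencode(params, quote_via=quote)}"


def fetch_json(url, timeout=10):
    """Download one API page and parse it; assumes the body is strict JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(api_url("Rankings", "nanorumble"))` once is the polite way to get data for many bots, per the request above.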
Sweet! I'll take a look at updating RR API clients in the next couple days. (That's @roborumble on Twitter, Category:Archived RoboRumble Rankings, RumbleStats, and BedMaker.)
I'll let you know how it goes or if I hit any snags. Hopefully the JSON libs I use don't mind JSON-ish. ;)
Well, the Perl JSON library is indeed unhappy about the lack of quoting, but I think I've got it covered:
$content =~ s/([\{,]\s*)([^:\{,]*):/$1\"$2\":/g;
$content =~ s/:([^\[][^,\}]*)([,\}]*)/:\"$1\"$2/g;
=)
Cool, got it working with archived rankings, eg: RumbleArchives:RoboRumble 20130324. This checks back hourly when rumbles don't have complete pairings, but that may be forever without priority battles so I'll kill it in a couple days. =)
Only useful thing that I found missing so far is that Darkcanuck's results came back sorted and had a "rank" field. Seems like you have logic for sorting in there so it'd be nice if you could honor the "order" param like you do for the normal pages, but not a big deal (/shamefully hides his bubble sort). The quote thing wasn't a huge deal for me but that might also screw some people up since I imagine most clients use a JSON parser instead of doing it manually.
And @roborumble, yay: [1]
I'll look at RumbleStats and BedMaker tomorrow. Will be nice that other people can actually use BedMaker now. =) If anyone's interested, I had to make one more tweak to my regex to properly quote the &limit=0 format:
$content =~ s/([\{,]\s*)([^:\{,]*):/$1\"$2\":/g;
$content =~ s/:([^\[][^,\}\n]*)([,\}]*)/:\"$1\"$2/g;
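For anyone parsing from Python instead, the two Perl substitutions above translate roughly as follows. This is a sketch: `quote_jsonish` is a made-up name and the sample strings are invented, not real server output. Like the Perl version, it relies on the data containing no commas or colons, and every scalar comes out as a string.

```python
import json
import re


def quote_jsonish(text):
    """Add double quotes to unquoted keys and scalar values (hypothetical helper)."""
    # Pass 1: quote each key, i.e. the text between '{' or ',' and the next ':'.
    text = re.sub(r'([{,]\s*)([^:{,]*):', r'\1"\2":', text)
    # Pass 2: quote each scalar value; values that start with '[' are lists and are skipped.
    text = re.sub(r':([^\[][^,}\n]*)([,}]*)', r':"\1"\2', text)
    return text


sample = "{name:foo.Bar 1.0,APS:55.2}"  # invented input for illustration
parsed = json.loads(quote_jsonish(sample))
print(parsed["name"], parsed["APS"])
```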
OK, you've convinced me, I've added the quotes :-p I've also added "rank" to both Rankings and BotDetails, and the sort works too.
I would like a query which returns the full pairing matrix, with scores from all pairings. Number of battles from all pairings would also be nice.
Probably the costliest query anyone could ask.
Unfortunately, that would take more memory than I have available in a frontend, and frontends only give me 10 seconds to respond or the process gets killed. Besides, all I'd be doing in the background is doing a Rankings query to get all N of the bot names, then N x BotDetails queries to get all of the pairwise data, so it's nothing that you wouldn't be able to do just as effectively with the tools you already have.
If you do something like this, I'd ask that you put a half-second delay between all of the BotDetails queries just to leave spare capacity for client uploading.
But I agree, it would be quite interesting to see what trends etc could be found in the data.
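A rough Python sketch of the client-side approach described above: one Rankings query for the bot names, then one BotDetails query per bot with a half-second pause between them. `collect_pairings` and its `fetch_details` argument are hypothetical; the actual HTTP call is left to the caller so the throttling logic can be shown on its own.

```python
import time


def collect_pairings(bot_names, fetch_details, delay=0.5, sleep=time.sleep):
    """Fetch BotDetails for every bot, pausing between requests.

    fetch_details(name) is supplied by the caller (for example a function that
    downloads and parses the BotDetails API page for that bot).
    """
    matrix = {}
    for i, name in enumerate(bot_names):
        if i > 0:
            sleep(delay)  # half-second gap to leave capacity for client uploads
        matrix[name] = fetch_details(name)
    return matrix
```

For a rumble of around 1000 bots this takes upwards of eight minutes, which is fine for batch ranking experiments.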
It would be used mostly to try different ranking systems over the scores, and to build custom priority battles systems.
For building batch rankings systems I can understand, but it would take too long to do priority battles, which need to be run much more regularly.
I know this is about as low priority as it gets, but... The always opening a new tab when I select a rumble annoys me. =) I'm pretty conscious of ctrl or shift clicks if I want a new window or tab, so I'd rather just keep control of that myself.
Please switch clients over to 1.8.1.0 to take advantage of priority battles for mini, micro and nano as well as fix the skipped turns issues.
1.7.4.2 will be disabled in the next few days.
I was playing around with the backend and restarted it a few times, which chowed my new (increased) daily quota pretty much instantly. So just hang in there for another hour and a half and it will be back to normal, I promise.
Guys, if anybody has some spare CPU-power lying around, could you point it at the 1v1 and melee rumbles? It seems that I was getting bad stability in the mini/micro/nano rumbles, and I've just fixed that, but we have a bunch of pairings which need to be filled in. I also want to see how it handles a higher load, so I know whether it is worth upgrading to paid. Thanks!
Seems to handle my four clients ok so far, running mostly/all nano battles too.
Edit: Oops, not all nano battles. But they're running pretty fast. :-)
Are there priority battles happening for the lower weight classes? Even Mini 1v1 mostly doesn't have full pairings. Was it just way behind or is it not working correctly?
The client doesn't accept priority battles from the lower weight classes. I've filed a bug and Fnl has fixed it, so as soon as 1.8.1 is released we'll be switching to that.
I've made the plunge, LiteRumble is now on the paid tier. So, open the taps and let's see what this puppy can do!
Sweet! Let me know if there's somewhere (eg PayPal) I can drop a donation. :-)
Also it's probably time we update a bunch of wiki pages to note the LiteRumble as the main rumble server.
I created a fresh "superpack" pointing at Skilgannon's rumble server: LiteRumble 1.7.4.2 superpack (31.5 MB) ... We can put this somewhere more prominent if we want to make the transition more official.
I fixed the participant list links to darkcanuck.net to point to Rednaxela's archive instead and downloaded most bots for all the rumbles in the above zip. There are still some stragglers with broken links that we should fix on the participants pages. I also set all the configs to ITERATE=NOT and updated the shell scripts to loop, which is my preferred setup. Just change your name at the top of roborumble/*.txt before running.
I don't know how many clients Skilgannon's server can handle - last I heard it was just a few. For now I'm just running 1 client for General 1v1.
Thanks. My client is in my university lab and I need the processing power during the day for the next few months at least, so feel free to run as many as you like. It should also have a more graceful quota-exceeded behaviour than it used to, but we'll actually have to hit that before I can be sure.
If you remove the CPU constant from the robocode.properties file, won't it automatically recalculate it? If so, might be a good idea to remove the CPU constant from the robocode.properties file in the superpack. Otherwise, anyone who runs the superpack without thinking about updating the CPU constant will be running with the CPU constant packaged in the superpack.
Ah, good call. I'm not sure it works that way but I'll test it. I'll also shore up the missing bots.
Alternatively what about just outright removing robocode.properties? That's what I always did when I made superpacks.
I gotta say, I love the Vote ranking. Gives you a completely different perspective on the rumble.
Yeah, I really like it as well. I'm thinking of doing away with (A)NPP though, it seems a bit redundant, and it uses tons of memory to calculate.
ANPP is actually the one I love =), but I'm fine with dropping it if you want. I find it pretty meaningful in the GigaRumble.
Hmm. My main concern with it is that it requires me to do a full nxn grid of scores, which is taking up a full GB of RAM on the backend, and causing soft-kills. If I could somehow implement it as an incremental score, which is updated along with APS, I wouldn't mind it so much.
At least with the KNN-PBI I only need sqrt(rumblesize) in memory at once.
Pretty sure you could do it without nxn grid. You could do min/max for each bot just loading one row (n) at a time. Then ANPP vs each bot the same way. Then each bot's avg with another pass one row at a time. Unless even that process requires nxn with the code/data model. I certainly don't see anyway to do it drastically quicker, like on par with APS.
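The row-at-a-time idea can be sketched in Python like this. It is an illustration only: the thread never spells out the ANPP formula, so the normalisation used here (each score scaled into the min/max range of scores achieved against that opponent, times 100) is an assumption, as are the function name `anpp_by_rows` and the row format. Only the two min/max tables and one row are held in memory at a time.

```python
def anpp_by_rows(load_rows):
    """Compute an ANPP-style average per bot in two streaming passes.

    load_rows() must return a fresh iterator of (bot, {opponent: score}) rows,
    where score is the bot's score against that opponent.
    """
    lo, hi = {}, {}
    # Pass 1: lowest and highest score anyone achieved against each opponent.
    for _bot, scores in load_rows():
        for opp, score in scores.items():
            lo[opp] = min(lo.get(opp, score), score)
            hi[opp] = max(hi.get(opp, score), score)

    # Pass 2: normalise each score into that opponent's range, then average per bot.
    result = {}
    for bot, scores in load_rows():
        total = 0.0
        for opp, score in scores.items():
            spread = hi[opp] - lo[opp]
            if spread:
                total += 100.0 * (score - lo[opp]) / spread
            else:
                total += 100.0  # everyone tied against opp; arbitrary choice
        result[bot] = total / len(scores) if scores else 0.0
    return result
```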
It's still probably the most CPU intensive and least useful of the rankings, so I totally agree it's a good candidate for the axe.
I think incrementally updated ANPP could actually be very fast most of the time if you had a table to cache the highest and lowest scores against each bot. Most of the time, the bot holding the highest/lowest score against another won't change, so those times only the bot with the newly submitted score would be affected. It could also be made less resource intensive by not including a bot in ANPP until its pairings are complete, further reducing how often the min/max score against it changes. If the highest/lowest score changes, it affects the resulting ANPP score of all other robots, but that update could be done with a low memory footprint I'd think.
Yeah, I was trying to think that through... Agree that obviously min/max change is fairly rare and that's a fast case if it doesn't change. If min/max does change, you need to recalculate everyone's ANPP vs that bot. Then you need to update everyone's overall ANPP, but that doesn't need to be a complete recalculation, just (((overall * numBots) - oldScore + newScore) / numBots).
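The overall-average update at the end of that post is a plain running-mean replacement. As a small Python check (the function name is made up):

```python
def update_overall(overall, num_bots, old_score, new_score):
    """Replace one term of an average without recomputing the whole sum."""
    return ((overall * num_bots) - old_score + new_score) / num_bots


# Example: four per-opponent scores averaging 50; one of them moves from 40 to 60.
print(update_overall(50.0, 4, 40.0, 60.0))
```

This matches a full recalculation: the scores 40, 50, 50, 60 average 50, and after the change 60, 50, 50, 60 average 55.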
Is Vote determined by how many opponents a robot gets top score against? I vaguely recall something like that. Kind of interesting to see the shielders near the top when sorted that way.
Yeah, my next comment is that LiteRumble could use some info on what all these crazy rankings are. :-)
Vote is what % of bots you are the best against. (Each opponent "votes" for its worst matchup.)
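In Python, that definition could look roughly like the sketch below. The name `vote_ranking`, the data layout, the tie handling (first bot seen wins) and the choice of denominator (number of bots casting a vote) are all assumptions; only the "each opponent votes for its worst matchup" rule comes from the post.

```python
from collections import Counter


def vote_ranking(scores):
    """scores[a][b] is bot a's score against bot b; returns Vote % per bot."""
    best = {}  # opponent -> (best score seen against it, bot that achieved it)
    for bot, row in scores.items():
        for opp, score in row.items():
            if opp not in best or score > best[opp][0]:
                best[opp] = (score, bot)  # opp currently "votes" for this bot
    votes = Counter(winner for _score, winner in best.values())
    voters = len(best)
    if not voters:
        return {}
    return {bot: 100.0 * votes[bot] / voters for bot in scores}
```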
I like the Vote ranking. It finally gives exploitative bots (like Epeeist) the recognition that they deserve.
I like it too - Diamond is #1. ;) Of course I presume EnergyDome steals quite a bit of score from DrussGT.
Been thinking I'll probably try a LiteRumble instance when I get to creating a BerryBots Rumble. I was wondering if you could say, at a high level, if there's much Robocode-specific about it? Just looking for some general info, I'm happy to tackle the details myself when I get into setting it up.
I know obviously the protocol is whatever the RR client uses, and the scoring is sorta Robocode-centric. But that shouldn't be too hard to adapt if I need to. I guess the next thing that comes to mind is any CodeSize related stuff? I'm starting on the control API stuff soon, which would enable writing a rumble client, and someone posted the first public user-created bot on the forums today (woohoo!), so I might experiment with it soonish.
Thanks!
It doesn't have anything Robocode dependant, and doesn't even know that the roborumble is more related to the mini/micro/nano than, say, the meleerumble. The homepage does a manual categorisation based on whether they are 1v1, teams, melee etc., but that's as far as it goes. You could even create, say, a 'berryrumble' on the same server if you wanted.
Aw, I saw in the diff you removed the "in case I ever get" line and thought maybe that meant you'd added it. ;) Though I'd probably still be a ways behind you for a while...
Is there any way to see individual battle scores in the LiteRumble? I didn't see any way to do that. It is the one thing I would really miss about Darkcanuck's server if we only had LiteRumble in the future. I find individual battle scores to be of great use from time to time (e.g. if you have a score of 62, it makes a big difference if the individual scores are 61, 63, 62 or 86, 88, 12). And what client gave that score is also of use. It can help identify Robocode version issues. For example, just go look at the history of battles for Krabby2, and it becomes obvious something broke in Krabby2 after Robocode 1.7.3.0+ came into use.
All that aside, if Darkcanuck's server ever goes away, LiteRumble as it is would be far preferable to no rumble at all.
Unfortunately it only keeps the averaged result and the number of battles. If I changed it to keep pairings as well I suspect that each bot entry might exceed 1MB, which would break Google App Engine rules. I've considered the possibility of keeping 'min' and 'max' though, would that satisfy you?
If you persist each battle independently, without associating it with other records (no foreign key constraints), then the only limit is total database size.
Any idea how much load your server can handle now? Just curious... Still thinking about the feasibility of switching to a new rumble server.
If I upgrade to paid it would probably handle as much as Darkcanuck's server. As it is it is mostly limited by number of database writes, probably 4 1v1 clients and 2 melee clients at i5/i7 execution speeds would be the limit. If I/we upgrade I could also tune the caching so that it loses less data.
I'm actually looking at doing a Java implementation just because the number crunching is so much faster. Also now that I've done the Python one it would just be a translation and not an entire rewrite.
Does it mean you have a bug if I have NPP over 100 against some bots? [1] Or are the ranges only recalculated at some interval, so maybe I scored higher than the previous max score against that bot?
Because the different halves of each pairing are stored separately, each might have a slightly different idea of what its score is against the other, because one of the scores can be dropped from cache before it is written to disk. Now that I'm not trying to complete pairings for Melee and Roborumble I might as well reduce the caching to less dangerous levels. I've also considered fixing this in my batch processing by taking averages, but never got around to it.
Does someone have a roborumble.txt for the gigarumble available? Or can someone run the gigarumble? I want to know if I finally leave the last spot now that I have entered the top-10 in the normal rumble.
Sure, I'll post my config this evening when I'm home (like 3-4 hours from now). I think I'm the only one that was running it. I was partly just experimenting with LiteRumble, but then I really liked GigaRumble. =)
I can run it if someone tells me how to configure roborumble.txt.
I didn't even know it existed, much less having a bot in it.
Ok, given the skyrocketing level of interest in the GigaRumble ;), I SSH'd home and uploaded my gigarumble.txt: [1]
I've got one (fairly weak) client going now. Also note that you need to use 1.7.4.2 for LiteRumble - Skilgannon has it rejecting all other Robocode versions.
(Also I fixed the wiki, as you may have noticed.)
I'm also running 2 clients on an i7, but I set them up at the last moment and haven't checked that result uploading works.
Oh, now I remember why I only had 1 client configured - Skilgannon asked me not to run more than that because he was already close to max load with his LiteRumble clients. Not sure if that is still true. GigaRumble clients are much slower, anyway, so it may not be much of an issue.
Thanks! It is a nice jump from #30 to #25. Improvement against all except 2 bots, I feel like Dookious is in reach in the normal rumble. ;-) And thanks for running your client, I have not been home yet.
It is a relatively minor issue. But it is driving me crazy.
I think the page would be better served with at least some minor CSS. Here I have some CSS-1 only. All browsers (except text based ones) implement it exactly the same. If you set cellspacing="1" on the table html, it should look very similar (visually) to the normal roborumble page.
Replace the bgcolor attribute tags with class="even" or class="odd", depending.
body {
    font-family: helvetica, arial, sans-serif;
}
table {
    background-color: #D3D3D3; /* LightGrey */
}
table th {
    background-color: #EEEEEE;
}
table th a {
    color: black !important; /* !important, this keeps it from having odd colors on visited links */
    text-decoration: none;
}
table td, table th {
    padding: 2px 4px;
}
table th {
    padding-right: 1em;
}
tr.odd {
    background-color: #F8F8F8;
}
tr.even {
    background-color: #FFFFFF; /* White */
}
If you are okay with CSS2, all of the html formatting attribute tags can be hacked out. (Just let me know).
I don't really have time for editing this right now, but if you fork and edit I'm happy to merge your pull request =) The latest code is up: https://bitbucket.org/jkflying/literumble/overview I apologise in advance for my terrible coding style, this was a "learn python and Google AppEngine" project for me. If I was starting from scratch I would have made several changes in the structure and how much was abstracted, but this grew slowly and is very hackish inside.
Cool, thanks. I've merged it in and uploaded. I had to fix a few small bugs - html_header should be structures.html_header, and I needed to add style.css to app.yaml but otherwise perfect. Thanks, and the changes are now live =)
Ahh okay, I have never done Python before, so I was mostly guessing where everything was supposed to go. I know only the basics of how it is 'supposed' to work. That and Google. I used Google a lot.
Actually there is another bug which is completely my fault. Can't sort compare by APS or Survival since I renamed the columns to be APS (A) and so forth, where it expects A APS.
Also, reading your logs, you fixed a bit more than that. I tried to fix most of the HTML problems, but since I couldn't really test, I was fixing it blind.
Although I'm not sure why it needs to be CSS instead of plain HTML. It's not like the HTML adds much code size overhead, and surely it prevents the browser from having to look up all of the cell values each time. bgcolor=F8F8F8 vs class="even" isn't much different, and these pages are completely dynamic so it's not like having an abstracted reference makes it any easier to maintain. If it can be done without CSS I'd prefer it that way.
Almost all the old html formatting tags and attributes have been deprecated since 1999 (I think <b> and <em> were still valid).
You could go and edit your code/html if you wanted to alter the appearance. But it is very likely it is easier to edit a CSS file, even if your html is dynamically generated.
I'll see about forking and editing it.
I've updated the code to reject clients other than 1.7.4.2, as it has quite a few improvements for the client over 1.7.3.[0|2]. These are:
- Timeouts for slow connections, instead of hanging the client
- GZip http compression for retrieving the ratings file
- Always running a bot that isn't in the ratings file if it shows up in the participants list - even if there are priority battles waiting. 1.7.3.0 would always run bots below 2000 battles, but this doesn't help, for instance, in the Gigarumble, while 1.7.3.2 ignored the 2000 battles limit and when no priority battles were available would run random battles until it stumbled on the new entry. This sometimes caused many hours wait before the server became aware that a new bot was available, now it is immediate.
I also fixed a bug in my code which prevented the priority battles from working correctly after all of the pairings were complete, instead just giving random battles, so there should be much faster post-pairings-complete stabilisation, and a much more even distribution of battles from pairing to pairing.
Cool, I updated this morning. Seems like Diamond likes 1.7.4.2 - Diamond 1.8.22 is a pretty minor change that had no impact in the RoboRumble, and I guess it could still come down quite a bit, but +1.25 after 100 battles is a pretty strong start. [1] DrussGT is up from 2.8.2 to 2.8.4 with the client version switch, too. [2] I wonder what's up with that? I guess we should wait a bit before thinking about it too much.
I've added a simple color scheme as well as the comparison page. I still need to add something for easily selecting previous versions of the same bot, but most of the work is there. Take a peek!
I recently made a transition over from using gzipped pickled Python objects to gzipped json'ed Python dictionaries, and somehow managed to run into an error with the main Rumble object, which was deleted each time it was uploaded to. After that all the uploads that happened had their respective bots pairings deleted. I managed to catch it fairly quickly, but about half the bots lost a good portion of their battles, so it's back to the waiting game again. Fortunately after the changes the server uses less memory, so I can do more aggressive caching and have less accidental out of memory shutdowns =)
I also need to whip up a bot comparisons page to compare versions... this or next week I think.
Are you still pushing the limit on how many clients you can support? I've now got a heck of a lot more RoboRumble firepower, so let me know if you want me to point some of it at your server.
I was running Melee clients at the same time, so I've stopped those now and they're running 1v1. That puts me at 6 clients, which I think is all I can handle. Perhaps I'll upgrade to paid someday when I get more annoyed with the limits =)
Do you mind if I try submitting results for some different rumble configs? I could setup my own instance of this at home or something if you'd rather. I'm thinking of finally seeing what a StrongestBotsRumble might look like, or a PerceptualRumble.
No problem. Just don't put more than one client on at the moment, or it will probably go over write quota. I'm currently working my way up through all the pairings of the roborumble with 4 clients on an i5, and filling out pairings for the first time is probably the worst as far as writes go because every single bot gets writes evenly, so caching between writes doesn't help particularly. Because of this I currently have caching on quite aggressively, I will lower it in a while. The problem with excessive caching is that occasionally the changes made in the bots get evicted from memcache and the frontend instance gets shut down/cycled, so some battles might be lost. Don't worry, everything will stabilise in the end, and it should be robust to any problems with lost battles on one pairing but not the other, etc.
If it does hit write quota it will reset at 07h00 UTC. You will start getting messages from your client that writes are failing, again don't worry, they will get written once the quota rolls around if they are still available in memcache or instance memory. Batch rankings (currently just Vote, ie. BestBot) get recalculated at 22h00 UTC.
Here's a sample client config: roborumble.txt
Enjoy!
Btw, what's your goal with filling out the 1v1 pairings? Just as a test to see how they compare to Darkcanuck's, or are you planning to try and keep this instance up to date with new RoboRumble activity?
Once they're full I want to try out some different scoring mechanisms - in particular I want to try out my Average Normalised Pairs Percentage. I also want to see what the KNN-PBI looks like for the main rumble. I'm not sure I want to keep a ~900 bot rumble permanently updated - it would eat into the free quota quite a bit. The same with the melee rumble. They are better suited for Darkcanuck's server IMO.
I think it would be more interesting to have a few others, like the PerceptualRumble or TripleDuel/TwinMelee, but I don't want to run any rumbles with more than 100-200 bots on a long term basis. They are just too slow to stabilise, and chances are that the majority of the bots have been abandoned anyways. Perhaps the next step is to write an app that serves the participants list as a FIFO - automatically kicking out old bots as new ones are entered (although not counting versions as new bots).
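The FIFO participants idea could be sketched in Python as below. Everything here is hypothetical (the class name, the capacity parameter, and the assumption that an entry is "name version" separated by the last space, as in "sheldor.jk.Yatagan 1.0.5"). In this sketch a new version of a known bot replaces the old one in place and keeps its position in the queue, which is one reading of "not counting versions as new bots".

```python
from collections import OrderedDict


class FifoParticipants:
    """Fixed-size participants list that evicts the oldest bot first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._bots = OrderedDict()  # base name -> version, oldest entry first

    def add(self, entry):
        name, version = entry.rsplit(" ", 1)
        if name in self._bots:
            self._bots[name] = version  # new version of a known bot keeps its place
            return
        self._bots[name] = version
        while len(self._bots) > self.capacity:
            self._bots.popitem(last=False)  # kick out the oldest bot

    def entries(self):
        return [f"{name} {version}" for name, version in self._bots.items()]
```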
Nice work!
Not using JavaScript sorting made it easier to link sorted tables from other pages. Also, with a melee database reset, getting rid of battles with retired bots may change the rankings. And you put %wins scoring. =D
But I miss some kind of Condorcet ranking. PL was the only one we had, and the one Combat was doing best.
Also miss some kind of statistical ranking. Elo was what we had and allowed fun statistics like problem bot index, specialization index and that non-working JavaScript diagram. Mirror bots and ram bots will lose some of their appeal without those statistics.
I tried to raise a RoboRumble server in App Engine a long time ago, but they didn't allow me into the free tier. :(
My %Wins is a bit of a cheat. It is just 1 point per PL win, divided by bots in rumble. I prefer it to PL because it is not dependant on the number of bots in the rumble. So if Combat was doing well in PL, it should do well in %Win.
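As a Python sketch of that %Wins calculation: one point per PL win, divided by the number of bots in the rumble. The function name and data layout are made up, and treating a "PL win" as a pairing score above 50 is an assumption not stated in the post.

```python
def percent_wins(scores, win_threshold=50.0):
    """scores[a][b] is bot a's score against bot b; returns %Wins per bot."""
    num_bots = len(scores)  # divides by bots in the rumble, as described
    return {
        bot: 100.0 * sum(1 for s in row.values() if s > win_threshold) / num_bots
        for bot, row in scores.items()
    }
```

Because the denominator is the rumble size rather than a raw point count, the value stays comparable between rumbles of different sizes.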
I'm still not using my Backend for anything, so I was thinking that once a day I could use it to generate some sort of pseudo-problembot stats stuff. ELO/Glicko is nice, but it is really designed for being good approximations when pairings are missing. In our case, the pairings are fairly easy to fill, so that isn't a problem; APS tends to converge to the same ranking order, and it isn't full of voodoo that makes it difficult to comprehend. It is also possible to correct APS easily if results get lost due to being in memcache =)
One ranking idea I had was doing a The Best Bot calculation (get a point for being the best against any competitor). It would increase my number of database writes in the only reasonably robust/non-batch way I can think of, which is what is holding me back at the moment. I could use a Backend for calculating it once or twice a day, I guess, or make it expire once every 6 hours and be triggered by a page load. It needs n*n runtime. Maybe I can fit it into the regular rankings calculations.
The hardest part is getting the rumble to stay in the free tier. I think it will be limited to about 6 melee clients in total, or maybe 12 1v1 clients instead (less pairings per battle in 1v1).
The last time I checked, App Engine offered about 1GB database in the free tier. Which is enough to store all pairings and all uploaded battles, as long as you delete data from retired bots once in a while.
As for the amount of clients the server can handle, it should not really be an issue, since there are usually 3 to 4 simultaneous clients at most.
If you want to use some batch processing, adding a Ranked Pairs ranking would make my day. It has O(n^4) complexity, but I think it can still fit inside the 10-minute window from cron, so no need for a backend.
The problem isn't so much total storage space, but that I'm limited at 50k writes per day. Each bot counts as 2 writes, so effectively I have 25k updates I can do. I've figured out a caching scheme so that each melee battle comes out to 10 updates (1 per bot) instead of 45 updates (1 per pairing). I also need to update the total rumble battles count and the user upload count, so that leaves a bit of overhead, meaning I can have ~2000 melee battles per day uploaded.
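The quota arithmetic above can be checked with a few lines of Python. The 50k writes, 2 writes per update, and 10 vs 45 updates per melee battle come from the post; the per-battle overhead of 2 updates (one for the rumble battle count, one for the user upload count) is a guess, chosen because the post only says "a bit of overhead".

```python
def melee_battles_per_day(daily_writes=50_000, writes_per_update=2,
                          updates_per_battle=10, overhead_updates=2):
    """Rough upper bound on melee battles per day under the write quota."""
    updates_per_day = daily_writes // writes_per_update  # 25k updates
    return updates_per_day // (updates_per_battle + overhead_updates)


print(melee_battles_per_day())                       # with per-bot caching
print(melee_battles_per_day(updates_per_battle=45))  # one update per pairing
```

With these assumptions the cached scheme allows 2083 battles a day, in line with the "~2000" quoted, against 531 without the caching.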
I'll see what ratings systems are feasible..
Batch updates are more useful in a limited environment like that. Maybe it's time for a refactoring in the upload protocol (1 update per batch upload), even if it breaks backward compatibility.
I've actually figured out a sort of temporary caching between requests where I wait for bots to accumulate a certain number of pairings before pushing them to disk. I don't think it's necessary to re-work the rumble upload protocol yet. One thing I would like the rumble to tell me is how many bots are in a melee battle though. Right now I just have it hardcoded at 10. It would help with my caching if I knew how many they were uploading per battle.
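A toy Python version of that kind of write coalescing, for illustration only. The class, its names, and the threshold of 10 are invented; the real LiteRumble code holds pending data in memcache and instance memory, which this sketch does not attempt to model. As with the real thing, anything still pending is lost if the process goes away before the threshold is reached.

```python
class PairingBuffer:
    """Hold pairings per bot in memory and write them out in one go."""

    def __init__(self, flush, threshold=10):
        self.flush = flush          # callable(bot, pairings) that does the real write
        self.threshold = threshold  # pairings to accumulate before writing
        self.pending = {}           # bot -> {opponent: score} not yet written

    def add(self, bot, opponent, score):
        bucket = self.pending.setdefault(bot, {})
        bucket[opponent] = score
        if len(bucket) >= self.threshold:
            # One datastore write covers every pairing collected so far.
            self.flush(bot, self.pending.pop(bot))
```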
Hey that's neat! A quick and lightweight rumble setup could be really useful for tournaments and experiments. You just need a participants list URL, you make up a rumble name, and everything just works? Makes me want to try some new divisions. :-) Though that never seems to gain momentum...
What is App Engine pricing like? I'll take a look. I'd certainly be willing to pitch in some for Robocode related stuff if we needed more horsepower.
There is one division I would like to see. A twin melee rumble (like 5 teams of 2 bots each). Joining concepts of both melee and team/twin.
I like the idea, but I think it would be so crowded that it would pretty much reduce to melee strategy. (Having to fight off 8 other bots with 1 ally out there somewhere is not much different than fighting off 9 other bots.) Maybe 3 teams of 3? I've thought about MegaBot TwinDuel for a while...
I think megabot TwinDuel would be awesome! Although it might reduce to wavesurfing quite quickly. I'd also be interested in a TriDuel - a 3 vs 3. I think having that extra bot will completely change the dynamics compared to twinduel, and make surfing much harder.
Maybe split teamrumble into categories?
5 bots (teamrumble bots)
4 bots (DeltaSquad)
3 bots
2 bots (twin duel bots)
Teams with fewer bots can compete in categories with more bots but not the opposite.
Imagine 2 melee bots using minimum risk movement (dominant in melee), and 2 bots using provocative movement (dominant in twin duel). The 2 melee bots will be each on a different corner, but the 2 with provocative movement will be on the same corner ganking on the lonely melee bot. But at the same time, 3 bots close together become tasty targets for swarm targeting from other 3 teams.
There must be a balance between minimum risk and provocative movement, or a third undiscovered strategy. Maybe there is still room for innovation.
Sure, but that's assuming both bots on both of those teams survive to the final stages of the round, which seems unlikely. And even if both bots on one team survive that long, I think how much energy they've retained from the "pure melee" early stage of the round will be the most important factor. Maybe on a bigger field than 1000x1000, and/or with 3-4 teams instead of 5?
It assumes ganks in the middle of a battle weakens "pure melee" strategies somewhat. Although not in the same way as in twin duel.
With 3 teams, I believe "shooting the team with lowest energy" 2x1 strategy will dominate. One team is eliminated almost on luck, and the battle is decided between the remaining 2. It happens in most 3 player games.
There is a catch though, since the API doesn't tell you which of the opponent bots belong to the same team. That is not a problem in either meleerumble or teamrumble, but estimating it in team melee might be worthwhile. This alone may change the game significantly... or not.
A bigger battlefield or 4 teams seems nice. I thought of 5 teams of 2 bots each to keep the 10 bots total from meleerumble/teamrumble, and 2 bots per team from twin duel. And see strategies from all 3 divisions clashing against each other.
Any of these divisions sounds pretty interesting to me. I think the main hurdle is just getting that first person to write up a 3x3 team or add TwinMelee support to one of their bots. =) Nobody wants to commit the time if nobody else is going to compete, but if someone just does it, I bet others would follow suit...
I'm kind of caught up in my Diamond refactor right now, but maybe I'll make time for something fun soon. ;) Or try running a PerceptualRumble client just for kicks.
Hmm... all of those divisions do sound interesting to me too. Now it has me thinking about how best to adapt the LunarTwins/Polylunar strategy to a bit different formats...
Yeah, my thoughts are that something like this would be perfect for school/lab/office tournaments. Just give it a new name in the client, set up a participants list somewhere and away you go.
In the free tier I'm not really going to run out of disk space any time soon, a rumble of 300 bots comes out at around 2MB, it's the database writes which are the killer. From what I can tell, App Engine pricing starts at $2.10 a week for the minimum paying tier. That gets you quite a bit more quota than the free tier, which probably should be enough for everything, pretty much forever, without crossing that $2.10 limit. For now I'm going to see how much I can push the free tier, though.
I still have a bunch of optimisations I need to make - like not pulling all of the rumble data into memory just to serve the rankings page (it's all cached, doesn't affect my quota, just speed) - which should make it more snappy both on the main rankings pages and on the RatingDetails page the RR client queries occasionally.
A hidden feature: if you add timing=1 as an argument to your GET for any of the pages, it summarises the timing breakdown for CPU usage at the bottom of the page and lets you know how many bots were pulled from cache vs. from the datastore.
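A minimal sketch of adding that argument from a script, assuming the page URLs look like the Rankings examples earlier on this page. The helper name `with_timing` is made up for illustration and is not part of LiteRumble:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_timing(url):
    """Return url with timing=1 in its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep the existing arguments, dropping any previous timing value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "timing"]
    query.append(("timing", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_timing("http://literumble.appspot.com/Rankings?game=nanorumble"))
```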
I'm not sure, but I think PBI with Elo/Glicko was based on some magical formula between the ratings. Maybe an APS-based measure would just be the average score your neighbors get against that bot. So like if you're ranked #25, the average score against bot B for ranks 15-35 is your expected score.
Of course, if you're #1, you can only go from 2-11, but that's probably still useful info. And in that case (or in every case), you could shift everything so your average PBI is still 0.
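A rough sketch of that neighbour-average idea, not what the server does. The data layout (a `ranking` list, best first, and a `scores` dict keyed by `(bot, opponent)`) and the function name are assumptions for illustration. The window is clipped at the ends of the table, so the #1 bot with `k=10` averages ranks 2-11:

```python
def neighbour_expected_score(ranking, scores, me, opponent, k=10):
    """Average score that bots within k ranks of `me` get against `opponent`.

    ranking: list of bot names, best first.
    scores:  dict mapping (bot, opponent) -> bot's percentage score.
    Returns None if no neighbour has a pairing against `opponent`.
    """
    i = ranking.index(me)
    lo, hi = max(0, i - k), min(len(ranking), i + k + 1)
    neighbours = [b for b in ranking[lo:hi]
                  if b not in (me, opponent) and (b, opponent) in scores]
    if not neighbours:
        return None
    return sum(scores[(b, opponent)] for b in neighbours) / len(neighbours)


ranking = ["A", "B", "C", "D", "E"]
scores = {("A", "E"): 80, ("B", "E"): 70, ("C", "E"): 60, ("D", "E"): 50}
expected = neighbour_expected_score(ranking, scores, "B", "E", k=1)
# Sign convention (real minus expected) is an assumption here.
print(expected, scores[("B", "E")] - expected)
```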
PBI is the difference between expected score and real score. The expected score is based on difference between ratings.
- Zero difference is 50% expected score
- -800 difference is 1 to 20 odds, which is 1/(20+1) or 4.76% expected score
- -1600 difference is 1 to 20^2 odds or 0.25% expected score
- -Infinite difference is 0% expected score
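The list above is consistent with a logistic curve in which every 800 rating points multiplies the odds by 20. A small sketch under that assumption (the function name is illustrative, and this is not taken from the rumble server code):

```python
def elo_expected_score(rating_diff):
    """Expected score fraction for a rating difference (own minus opponent's).

    Assumes the scale described above: -800 points means 1-to-20 odds.
    """
    return 1.0 / (1.0 + 20.0 ** (-rating_diff / 800.0))


for diff in (0, -800, -1600):
    print(diff, round(100 * elo_expected_score(diff), 2))
```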
Yeah, I was trying to think of a way of handling this elegantly without having to resort to a KNN type lookup or doing a whole ELO calculation. I was thinking something like:
Expected_for_bot_a = (bot_a_APS + (100 - bot_b_APS))/2
Eg: If BotA has APS of 70% and BotB 30% it predicts the 70%, 30% which seems intuitive to me. If BotA has APS of 80% and BotB 80% it predicts the 50%, 50% perfectly. If BotA has APS of 80% and BotB 60% it predicts 60%, 40%, which seems OK.
I think the trouble with this is that it assumes that there is a linear relationship between average score and pairwise score. I think it is more of a sigmoidal relationship, because once you have taken out the low hanging fruit there is less increase to draw from. Because of this I think a modified version of the above formula, something like:
Expected_for_bot_a = ((bot_a_APS^Q + (100 - bot_b_APS)^Q)/2)^(1/Q)
for some magic value of Q would probably be a better fit.
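Both formulas can be tried out with one small function, since Q = 1 reduces the second formula to the first. This is only a sketch for experimenting. The function and parameter names are made up, and no particular value of Q is suggested by the thread:

```python
def expected_aps_score(aps_a, aps_b, q=1.0):
    """Expected percentage score for bot A vs bot B from their APS values.

    q=1 is the plain average; other q values give the power-mean variant.
    """
    return ((aps_a ** q + (100.0 - aps_b) ** q) / 2.0) ** (1.0 / q)


print(expected_aps_score(70, 30))        # linear case
print(expected_aps_score(80, 60))        # linear case
print(expected_aps_score(80, 80, q=2))   # power-mean case
```

One thing to watch when fitting Q: for Q other than 1 the two bots' predictions no longer have to sum to 100. For example, 80 vs 80 with Q = 2 gives about 58.3 for each side.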
I've added a simple 'Vote' rankings page, where each bot votes for their worst pairing. The majority of bots don't get anything, predictably, but this is interesting for use in comparing who does the best. Again, this is a winner takes all ranking, so makes no differentiation between the bot that got 79.9% and 50% against another, where the worst pairing was 80%, and this makes me uncomfortable as there is clearly lost information. Perhaps I should change it so that every bot gets a vote of weight 100*pair%/worst pair%, but I'll leave it as it is for a day or so.
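A sketch of both tallies, under one reading of the description: a voter's "worst pairing" is the opponent that scores highest against it, and pair% is each opponent's score against the voter. The `scores` layout and the function name are assumptions, not the server's code:

```python
from collections import defaultdict


def vote_ranking(scores, weighted=False):
    """Tally votes. scores maps (bot, opponent) -> bot's percentage score."""
    against = defaultdict(dict)  # voter -> {opponent: opponent's score vs voter}
    for (bot, opponent), pct in scores.items():
        against[opponent][bot] = pct
    votes = defaultdict(float)
    for voter, opp_scores in against.items():
        worst = max(opp_scores.values())
        if worst == 0:
            continue  # nobody scores against this voter, so no vote is cast
        for opp, pct in opp_scores.items():
            if weighted:
                votes[opp] += 100.0 * pct / worst  # proposed weighted vote
            elif pct == worst:
                votes[opp] += 1.0  # winner takes all (ties each get a vote)
    return dict(votes)


scores = {("A", "B"): 60, ("B", "A"): 40, ("A", "C"): 70,
          ("C", "A"): 30, ("B", "C"): 55, ("C", "B"): 45}
print(vote_ranking(scores))
print(vote_ranking(scores, weighted=True))
```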
The batch pairings get updated once an hour for any rumble which has had battles since the last batch run.
If you try to figure out a sigmoidal relationship, you will eventually end with the same logistic distribution used in Elo and Glicko.
Thinking on this more, I actually really like the KNN idea. It's the only one that really tells you "you can and should be doing better against this bot", as opposed to "this bot might just have a weird score profile". (RamBots are the perfect/extreme example of this - they can show up as Problem Bots even if you're doing well against them.)
I know when I'm trying to figure out who I could do better against, I don't look at PBI, I compare to DrussGT. ;) I understand it would be a lot of calculations, but it should still be simple to code up, and it's all just basic math operations.
Another thought is, if you already have the best score vs any bot, a useful number might be that score minus your score. Calling it "PBI" would be a misnomer, but it tells you how much room you have to improve.
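That "room to improve" number is cheap to compute once the best score against each bot is known. A minimal sketch, with an assumed `scores` dict keyed by `(bot, opponent)` and an illustrative function name:

```python
def improvement_headroom(scores, me):
    """For each opponent: best score anyone gets against it, minus `me`'s score."""
    best = {}
    for (bot, opp), pct in scores.items():
        best[opp] = max(best.get(opp, pct), pct)
    return {opp: best[opp] - pct
            for (bot, opp), pct in scores.items() if bot == me}


scores = {("A", "B"): 60, ("B", "A"): 40, ("A", "C"): 70,
          ("C", "A"): 30, ("B", "C"): 55, ("C", "B"): 45}
print(improvement_headroom(scores, "B"))
```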
If you look at the site, you might just notice the errors ;-) That's because I ran out of Datastore Read quota. I think it's because of the batch rankings - before them I never even got to 20% of read quota. So I've changed batch rankings to every 6 hours, so in about 17 hours the quota will reset and we can see how it works =)
Since I'm only doing updates once every 6 hours I should have lots of quota for long, tedious calculations. So I'll whip up a KNN-based PBI over the next few days to see how it does. Any ideas on how to calculate K? How about sqrt(participants)?
It seems we have similar ideas about 'max improvement indexes'. Thinking further on my comment above about my %pair/(%worst pair) idea, I'm thinking about an interesting new ranking system that I'd like to call 'Average Normalised Percentage Pairs' or ANPP. Each bot normalises all of their pairings by subtracting the min score and then dividing by (max - min). Your score is then calculated as the average of your pairing against each (100 - enemy normalised score). Thus, if the best anybody does is 75% against a rambot, and the worst anybody does is 30%, 30% will be treated as 0 and 75% treated as 100%. This would make it very easy to see problembots, as if your NPP against them is less than your average NPP, you should focus on them more. Thus, the worst bot against everybody would get 0%, and the best bot against everybody would get 100%.
Just thought I'd say... I rather do like the notion of a KNN-based "expected score" system. The sigmoidal relationship given by Elo/Glicko is a reasonable fit for predicting score based on each bot's overall rating, but it does really miss the sort of interesting subtleties/patterns that a system that considers multiple axes of strength would.
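A sketch of ANPP under one reading of the description: for each opponent, everybody's scores against it are rescaled so the worst becomes 0 and the best becomes 100, and a bot's ANPP is the average of its rescaled scores. The `scores` layout, the function name, and the handling of the max = min case are assumptions:

```python
def anpp(scores):
    """Average Normalised Percentage Pairs for every bot.

    scores maps (bot, opponent) -> bot's percentage score vs opponent.
    """
    by_opponent = {}
    for (bot, opp), pct in scores.items():
        by_opponent.setdefault(opp, {})[bot] = pct
    per_bot = {}  # bot -> list of normalised pairing scores (NPP)
    for opp, col in by_opponent.items():
        lo, hi = min(col.values()), max(col.values())
        if hi == lo:
            continue  # assumption: skip opponents everyone scores the same against
        for bot, pct in col.items():
            per_bot.setdefault(bot, []).append(100.0 * (pct - lo) / (hi - lo))
    return {bot: sum(v) / len(v) for bot, v in per_bot.items()}


# The rambot example from the post: best score 75%, worst 30%.
scores = {("A", "Ram"): 75, ("B", "Ram"): 30, ("C", "Ram"): 52.5}
print(anpp(scores))
```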