Talk:RoboRunner

From Robowiki
Jump to navigation Jump to search

Contents

Thread titleRepliesLast modified
smart battles916:15, 13 August 2012
priorities709:24, 1 August 2012
Congrats!617:21, 28 July 2012
First page
First page
Next page
Next page
Last page
Last page

smart battles

So I'm planning to implement smart battle selection this weekend. Every bot (or bot set) will get at least two battles, then I will choose battles to run (in batches since I don't want idle threads) based on trying to decrease standard error in the least amount of time. Maybe with some random battles sprinkled in as well.

I'm thinking I will choose bots with the highest value for: <math>{{stDev \over \sqrt{numBattles}} - {stDev \over \sqrt{numBattles + 1}}} \over {avgBattleTime}</math>

I think this will lead to an overall result with the highest confidence in the least amount of time.

I like testing against a test bed with an average score about the same as my RoboRumble APS. The problem with this is it includes a lot of bots with super low variance (eg, 99.9% scores), so running lots of battles against them is a waste of time. But ignoring them and using a stronger test bed risks specializing against stronger bots.

Voidious16:52, 10 August 2012

That looks like a good metric for choosing fast stability. Now I'm wishing I'd included variance in the LiteRumble scores...

Skilgannon18:33, 10 August 2012
 

Yeah, do you just store a running tally of average score? I'll need to update RoboRunner to keep scores from every individual battle, too, along with battle times.

Voidious20:15, 10 August 2012
 

Yeah, I do a online mean calculation, so newMean = oldMean*(n/(n+1)) + newScore/(n+1), n++

I've actually thought quite a bit about this, and it all depends what score you're trying to stabilise. If you're trying to stabilise the PL, for instance, you need to run lots of battles for pairings at or near the 50/50 mark. If you're doing Schultz then lots of battles need to go to where a weak bot beat a strong bot. It's all about which battle has the most potential influence.

Skilgannon07:54, 11 August 2012

Yeah, for sure you would focus on different battles to optimize other rankings. I'm not sure I need to add a "focus on win/loss" flag to RoboRunner, since you'd probably just test against your toughest matchups if that's what you were working on. It does support smart battles for all the score types, though (eg survival, bullet damage).

If we do implement this type of smart battle selection in a rumble system, maybe we could have a client side setting for what you're interested in optimizing. =) I guess to start it would just be APS vs win/loss, but it could include Schultz or Vote at some point.

Voidious15:47, 13 August 2012
 

Got this working, just dogfooding it a bit myself before posting it since it's a pretty major change. Data files are now (gzipped) XMLs with the raw scores from every battle and everything's recalculated on the fly. (That was actually most of the work.) Comes out to about 100 kb for 3k battles.

It runs 2 seasons vs each bot then does smart battle selection with the formula above to try to increase overall accuracy as quickly as possible. It's nice to see test runs where only 2 battles were run vs HawkOnFire. =) 5% of the time, it instead chooses randomly among the bots with fewest battles, to try to mitigate cases where the variance was randomly low in the initial battles. (I can make this configurable if/when anyone cares.)

It won't schedule two battles vs the same bot unless the number of bots is <= the number of threads. Otherwise, you'd keep scheduling that bot until the battle finishes. I could instead estimate how many times in a row it would still be worth scheduling it, but that seems like a lot of work for a corner case.

I think this is going to save a heck of a lot of CPU time. The XML data files will also make it easier to let you store arbitrary score data in the custom scoring stuff.

Voidious21:50, 12 August 2012

Though I'm still figuring out how to avoid potentially corrupting the data file if you ctrl-C your run. I'm not sure if skipping the gzipping would help or if it's just become more likely because the data files are so much larger. Maybe I need to add a keyboard option to safely exit.

Voidious21:54, 12 August 2012
 

Just a quick node. Maybe you know that already but you can add a shutdownhook to the runtime thread. This would catch CTRL-C and you can clearly shutdown the gzipping. Not sure if that is what you looking for.

Wompi22:08, 12 August 2012
 

Cool, yeah, that might do the trick. I'm trying just doing a fresh save of the score data in the shutdown hook and I'll see if I can ever replicate the problem.

Voidious22:20, 12 August 2012
 

Still getting the feel for how many seasons to run with smart battles. It indeed seems to be much more accurate in less time, but I'm not sure to what degree I should:

  • Run less seasons because it's more accurate per number of battles.
  • Run the same number of seasons, since it will run faster and still be more accurate.
  • Run more seasons, in about the same amount of time as before, but with much more accuracy.

I guess it partly depends on how patient you were being before this feature. =)

Edit: Part of the dilemma is that this focuses on accuracy per time, not per number of battles. So maybe with a certain test bed, you don't gain accuracy in 10 seasons vs traditional battle selection, but it completes in 25% less time and gives the same accuracy. So you could up it to 12 seasons to do better on both time and accuracy.

Voidious16:10, 13 August 2012
 

priorities

At this point, this tool does everything I need and I'm really happy with it, so if anyone wants to offer feedback as far as features or prioritizing the to-do's, let me know. =) I'll probably bang out some of the more important stuff in the next week or so (like letting you configure JVM arguments), and the option for dynamically loading battle listeners for custom scoring sounds really cool, so I might tinker with that soon too.

Voidious00:57, 30 July 2012

Hi mate. I got a little intimate with your code and finally figured out how it works :). I wrote a dynamic class loader that can load classes from a specific directory/jar. The classes will only be loaded if they provide a certain interface. So far so good. After this i was digging through the code and was looking for a good point to use these classes. Unfortunately it looks like, there is no good way to pass classes between the 'BattleProcess' and the 'BattleRunner'. I tried to redirect the 'System.out/in' of the BattleProcess to Serialization streams but this is not working as i now know. I guess object serialization over temp files is nothing that you are fond of, neither to speak of RMI. The other idea that came to me, would it be possible to map the events of the BattleProcess BattleListener (on...()) to strings, then pass it over the in/out stream to the BattleRunner and rebuild the events there. In my opinion this would have the advantage that you can pass the events to the user made score class and would have no need to do all the score parsing within your code. If the user class decide it has no need for the event it will simply be ignored.

Hmm i have right now a hard time to explain this :). Lets give you a scenario.

I write a score class for the PatternChallenge. The score class interface has a getName() method and this name has to be in the 'pattern.rrc' to. The class will be loaded RoboRunner reads the "rrc" file looks for the available score classes and find my PatternChallenge class. Now you can register this class on the BattleRunner (similar to the BattleResultHandler you have). The score interface has, lets say onBattleCompleted(..) implemented and you pass all the events (in this case just one) to my score class. There i can read the damage fields and calculate my score and if i want to print the results to the console i can do this as well (no work for you so far :)). If the score interface provides a toString() method i could use this to provide a output string for the data file. The only thing you had to do, would be to get this string and write it to the data file at the end of everything. I'm sure i missed something but so far as i see it, could you get rid of all the hard coded score you have right now.

Well, i hope it makes at least a little bit of sense what i have said. If you think i'm wrong on one/all points let me know, i'm not offended at all by it.

Anyway enough mumbling for today :)

Take Care

Wompi19:38, 30 July 2012
 

Cool! Well, I have a few thoughts on how all this could tie together:

  • Instead of (or in addition to) RoboRunner/BattleRunner dynamically loading the listener/scoring class, I think we should pass a flag to BattleProcess that tells it the name of the listeners to load and attach to Robocode engine.
  • I think it would be good if the listener interface extends IBattleListener, or includes one, so you can just attach it to the RobocodeEngine (addBattleListener) and have it listen to the events it wants.
  • Then I guess it would need some setup to pass its output back to BattleRunner so we can store it and print it. I'm fine with printing to stdout or writing to temp files or whatever. I guess if we load the interface on the RoboRunner side, too, it could also have a method that runs after each battle to print whatever it wants from the data file.
  • I don't think it's reasonable for BattleProcess to always listen to all events and pass all that data back for every battle. If you look at IBattleListener, it's possible to listen to every detail about every single turn in the battle. That's a lot of extra processing if you're not using it. =)

Does most of that make sense? Thanks for getting the ball rolling on this! I think it'd be a really exciting feature. Even if nobody but us uses it. =)

Voidious19:58, 30 July 2012
 

Hmm ...

  • Is there another way then dynamic loading a class, if the program does not know about it? Maybe including the score class directory in the class path and making the challenge name the fully package name but this would still need the class loader part.
  • I was starting with the interface to be IBattleListener but i could not get the event classes within BattleRunner and therefore i mapped it to the same methods but with different parameter objects.
  • This sounds interesting. I was playing with this but had to face some issues that i could not solve. Loading the same class in different environments but not using all methods equally would be very inconsistent (not to say bad style :)) i guess. The user is probably not aware that the class has no idea where the events are processed and would put his output stuff just within the on..() methods - but never got a result, because it works in a different environment. And making two different classes (one for BattleRunner one for BattleProcess) would be not very user friendly and increases the probability to doing something wrong.

If you have no problem with temp files i guess this would be a good way to solve the issues. This way you can load the score class (should be extend BattleAdaptor) and RoboRunner can check if a certain method is overloaded (translates to - is he interested in this information). This information could be flagged to the BattleProcess and he can use it to process the needed events. If you use temp files you have the possibility to serialize almost every event to the file - pass the temp file name to BattleRunner, restore the Events and pass them to the score class. I cannot point my finger on it, but something tells me that there is something wrong with this approach :)

  • yep you are right :) - i was not fully aware of the cascading level of the onTurn..() events and this could lead to some issues with the temp files to i guess. If you, lets say, just interested in the energy level of all bots, it would certainly not make sense to save the whole turn event cascade. Maybe you have a idea to overcome this.

Hehe, thats quite a point you got there :). But i hope it will pay off somehow, especially if i look at the time i have spend to write output classes to get some data visualized through GnuPlot. I easily can see some nice GUI statistic diagrams or movement plots for later runs and that really excites me :).

Take Care

Edit: Another incredible easy to use IPC would be to use named pipes. But this would put the Windows user out of business until someone is willing to write a JNI adaptor, or find another way to establish a named pipe there.

Wompi08:53, 31 July 2012
 

So I guess there's two major things being weighed here:

  • User code running in 1 vs 2 places - Having the user code running on just the RoboRunner side of things may avoid some programming pitfalls if someone tries to store state between the battle listener and the score output.
  • Having to flatten the battle events for post-processing - If the user code is not in the BattleProcess, we need to figure out what events to listen to, log them, and pass them back to the other side for post-processing after the battle.

I guess I have a pretty strong preference for having user code in the battle listener itself instead of processing and transferring all the desired events. Figuring out which methods to listen to, serializing all the events, then processing them on the other side just seems like a lot of unnecessary work, and possibly error prone. A big note in the Javadoc that the listener methods should be idempotent, or using separate interfaces both seem like OK options to me.

I get the impression you'd rather make the other trade-off. =) The main thing I'm not sure of is whether reflection can figure out which methods you actually override. All of them would be overridden by BattleAdaptor, so I'm just not sure we can tell the difference. You could end up with some big temp files if you listen to onTurnEnded, but I don't think processing time would be much compared to running the battle itself.

So I guess what I'm imagining is something like:

  • RoboRunner finds the custom listeners (command line argument and/or in challenge file). It loads an instance to process scoring output and passes the listener names to BattleProcess, which also loads them.
  • BattleProcess sets some object on the listening class, which the listener can use to store custom values. (Eg, "skipped_turns" = 50, or "score_snapshots = {100, 150, 250, 575}".) Maybe an XML or JSON object.
  • BattleProcess loads the battle listener and attaches it to the Robocode Engine, and runs the battle. The listener processes things on the fly and stores data in the data object.
  • The values stored by the listener would be output by BattleProcess, read by BattleRunner, and stored in the bot's data file. (With XML or JSON, converting to/from ASCII like this would be pretty easy.)
  • The scoring method would take the score data for that bot set and/or battle and display whatever it wants.
Voidious16:30, 31 July 2012
 

If you're using some sort of IPC, why not TCP? Then it opens the option of running remote battle runners.

Skilgannon19:16, 31 July 2012

This could actually work really smoothly. By default, it spins up the processes as now, but passing a port number to each process and communicating over TCP/IP. The data sent / received could remain the same. Then we could add command line arguments for:

  1. Launching Robocode processes and doing nothing, just listening for commands.
  2. Accepting a list of host:port of additional processes. In addition to the normal processes, launch a thread for each remote process.

So on your extra machine, you do #1, and on your primary machine you do #2, and voila!

Edit: Except for copying the necessary bot JARs. That would be a little more complicated.

Voidious19:45, 31 July 2012
 

Well :), of course TCP would be the obvious choice for IPC, but i think you bring a whole new bunch of complexity into the program and i'm not sure if it is worth the struggle.

Beside copying the bot JARs, copying the user score classes, configuring the robocode path on every extra machine there are some other more technically issues to consider. Of course if done right it would be a very nice and strong feature, beyond question.

Using the scenario you described, with JSON, sound quite interesting, maybe i should reconsider my concerns about having the user classes running on two different places. I'm sure i'm nitpicking to much on that point.

Sidenode: It is possible with reflection to check if a method is overloaded just by doing

myBattleAdaptor.getClass().getMethod("onBattleCompleted",BattleCompletedEvent.class).getDeclaringClass()

if it gives back the name of myBattleAdaptor it is overloaded.

Right now i have discarded the tmp file approach, simple because i don't liked it and switched to named pipes. The BattleRunner got some watcher threads where he is communicating with the BattleProcesses, using ObjectStreams and watch out for errors and feed the score class. Don't worry i'm doing all this just for curiosity and will be fine with whatever you come up.

Take Care

Wompi09:24, 1 August 2012
 
 

Congratulations on releasing this!

I've got it working already. Super easy.

It is moving along noticably quicker than RoboResearch! Awesome work!

Tkiesel05:17, 26 July 2012

Cool, so nice to hear! =) I think some people will miss the RoboResearch GUI, but maybe I or someone else can add one sometime. And there's still quite a few little things left on the to do list. But I'm pretty happy with it. =)

Voidious05:28, 26 July 2012
 

Hi Voidious. Nice program have you put there together, respect. I'm really excited about the melee feature. I had a couple of tries with RoboResearch but got it never to work on melee benchmarks. Easy to install and use, nice job.

Have you thought about some sort of dynamic score output? For me it would be very useful if i could write my own benchmark score, because in melee it is sometimes better to get a score view along some certain battle states. Like start/middle/end game or score against every opponent by its own. If you have for example Diamond :) and some samples together i would like to know how much score i loose to the samples (or in general weaker bots) if a top bot is on the field.

I will have a look at the sources, and maybe it is possible to make the scores dynamic. Maybe you have something in mind and we could share some ideas. I'm very fond of the idea to have a nice and easy melee test platform.

The remote client feature of Jdev Distributed_Robocode would be awesome. Unfortunately it seems to need Java 7 and therefore is out of my reach.

Take Care

Wompi09:11, 28 July 2012
 

Hey Wompi, thanks for the thoughts. The score output could definitely use a lot more features/options, it's pretty bare bones right now. You also make me realize that I don't even score per bot scores in the data file, so I'll need to fix that first. One thing is, as much as possible, I want to make the right decision automatically about how to show scores instead of making you remember lots of settings, but in cases where different things make sense to different people I'm OK with adding optional flags or whatever.

So for Melee, right now we have something like:

  voidious.Diamond 1.8.4.x12 vs abc.Shadow 3.84i, sample.Crazy 1.0: 61.02, positive.Portia 1.26e took 34.8s, avg: 59.93.
Overall score: 55.34, 1.5 seasons

So maybe if there's more than one opponent, we'd add the per bot scores each on their own line after that? Like:

  voidious.Diamond 1.8.4.x12 vs abc.Shadow 3.84i, sample.Crazy 1.0: 61.02, positive.Portia 1.26e took 34.8s, avg: 59.93.
    vs abc.Shadow 3.84i: 55.05 (22000 : 19000), avg: 53.70
    vs sample.Crazy 1.0: 90.1 (22000 : 2000), avg: 90.2
    vs positive.Portia 1.26e: 53.43 (22000 : 20341), avg: 54.15
Overall score: 55.34, 1.5 seasons

What do you think? Would you also like to see bullet damage / survival data? I always collect all the different fields for scoring, but for the most part was only going to show whatever you had configured as the scoring style. But I've been thinking lately it might be nice to show bullet damage / survival too.

Voidious14:28, 28 July 2012
 

Oh, and about the options for mid-battle score data, that sounds like a really cool idea. Do you mean like you could write and plugin your own scoring code? I'm not really sure the best way to set it up so you could write your scoring class and pass it to RoboRunner, but from a technical standpoint I don't think it would be too tough.

I was just thinking yesterday that it would be cool to integrate some stuff like what Rednaxela did here for collecting hit rates and stuff during a battle, too.

Voidious14:34, 28 July 2012
 

Yes, each per line would be great.

For the damage/survival data, hmm, personally i look at the damage in very rare cases (mostly if i run my 100+k/40k benchmark against the samples) and survival is most interesting if you see all places (to spot some movement leaks in early/mid game) but it couldn't hurt to show these data :)

As i said, i have quite a bunch of 'odd' scoring pattern and a way to implement these dynamic would be great. My fist thought was to provide an dynamic ClassLoader and a directory, where you can put your own written score pattern. There you can release RoboRunner with some default pattern (score,damage) and still provide the possibility to write your own. I guess this would be fairly easy just to provide an interface and pass the 'ScoreObject'. In that way you don't have to put much interest in the scoring table. Some challenges also need some unusual score i guess.

To use the 'robocode.control' (like Rednaxela did) would be extraordinaire, i constantly write new output classes and pass the results to GnuPlot but having this bundled - awesome!

If you like, i can try to put a first draft together for the scoring tomorrow. Not sure if you are fond with the idea to have someone messing with your code.

Take Care

Wompi16:27, 28 July 2012
 

Well, for the next round of changes, I think I'll add the per bot data for Melee battles and the other basic scoring options (like survival and bullet damage). There's still a bunch of basic things I need to check off my to-do list before I get too deep into the custom scoring stuff. But I do think it sounds awesome and really powerful.

Reading a custom battle listener at runtime and attaching it to the Robocode engine via control API (which I'm already using to run battles) should be pretty easy. Then you could listen to whatever events you want to and do whatever you want with the data. And if you're comfortable doing everything outside the RoboRunner infrastructure, that will be all you need.

What I see as the hard part is crafting a nice way for you to store/retrieve these score data in the data files and format custom output, which I think is necessary to make this a lot more useful. It's not rocket surgery, but I think the data file format would need an overhaul - maybe switch to something XML based. And I'm not sure about just passing a custom ScoreObject, because, for instance, right now I'm only listening for the final score from the Robocode engine, so I don't even have the data you'd want. You'd need to listen to other stuff for the per-round survival and stuff. There's a lot of options for what type of data to collect, so I don't think I want to guess and try to record all types of data you might want and just pass it along.

And sure, feel free to experiment some of this stuff, or fork the GitHub repo and go nuts. =) I made the code public domain so people can do whatever they want with it. I'm super stoked that anybody's even interested in this - I thought I'd be the only one using it while everyone else just stuck with their RoboResearch setups. =)

Voidious17:21, 28 July 2012
 
First page
First page
Next page
Next page
Last page
Last page