Talk:Dynamic Clustering

Storing Data

Do you store all your data between matches for each robot you encounter? Or do you start with a fresh tree each time? Or store a subset of the data that will give you a good start? I assume the full data set would be too large to fit in the bot's data directory. But it would be good to store some of your tree so that you don't have to re-learn every time. I couldn't find details on how many rounds in a row the RoboRumble runs for each bot match. Is it enough that you have time to populate the tree with useful data?

Wolfman‎

I don't save data between battles. It's probably worth a few RR points but so inconsistent based on how your battles get distributed, I just avoid the headache.

Voidious‎

Does the Robocode security manager disable network access? I had an idea of storing data on a cloud server and then loading that data when a robot loads, meaning your robot could store data for all RoboRumble results and share it between distributed runs! ;) But I'm guessing that's not possible?

It's just interesting because my gun definitely gets better the larger the KD-Tree becomes. It's worth a good few percent between 100 rounds and 500 rounds of data, for instance. I guess I'll have to tune it for 35 rounds :(

Wolfman‎

Yep, network access is disabled. Sharing saved data across clients has always seemed like a potentially cool RoboRumble feature, but personally I'm more in the "get rid of all saved data" camp. :-) It's helpful to log exceptions for debugging, though.

Voidious‎

Ok fair enough. So what is the number of rounds that the rumble runs each match? Is it 35 like the latest targeting challenges?

Wolfman‎

Yep, 35.

Voidious‎

The roborumble (800X600) and meleerumble (1000X1000) are 35 rounds, the teamrumble (1200X1200) is 10 rounds, and the twinduelrumble (800X800) is 75 rounds (with all forms of data-saving between matches banned).

Sheldor‎

And not trying to discourage you from saving data - if it's interesting to you and there's points to be had, go for it!

What I like about not having it is that it makes for a more clearly defined problem, and you don't run the risk of real improvement getting hidden by fluctuations in the performance of your data saving.

Voidious‎

I guess what is interesting to me is the identification of patterns in large data sets that saving would give you. Whereas not saving data becomes either how do I gather data fast enough to be useful or how do I perform well with small data sets. For instance I would assume that wave surfers would perform better after more rounds.

Do you pre-populate any of your data at the start of a match even if you don't save data about specific robots?

Wolfman‎

I have two things that are a form of pre-populated data:

In surfing, I surf as if there's one hit at GF=0 until I have any data.
In the gun, I have this silly RetroGirl/Gun that I use for the early ticks of round 1 before my real gun has data. (The real gun quickly outperforms it with even a tiny amount of data.)

Voidious‎

Yup, I think I may have to investigate switching guns depending on the amount of data in my dataset to improve performance in early rounds (or in round 1). At the moment I am only using a DC gun.

Wolfman‎

What other kind of gun would you be using?

Sheldor‎

I don't know, but I have a hunch that a preloaded, simple segmented GF gun may be more accurate for the first few shots compared to an almost-empty DC gun *shrug*. My gun at the moment has a noticeable increase in accuracy between 35 rounds and 100. :/

Are there any tricks to make a DC gun better in the first few rounds that I am missing?

Wolfman‎

Pre-loading of any sort I'll give you, but I consider fast learning one of the strengths of DC over VCS. As you get more data, it automatically tightens its bounds and uses more relevant data. To achieve the same with VCS you need to layer buffers of different complexities or dynamically segment.

Circular targeting or RetroGirl/Gun certainly outperform a DC gun with no data, and I am crazy enough to have such a gun just for that purpose... But from what I recall, it's like 2-5 data points (like 50 ticks into round 1) where my DC gun starts outperforming them.

Voidious‎

Wolfman,
I would say, don't worry about the first round so much. I don't know much about megas, but I don't think that missing two shots due to a lack of data would do any noticeable damage to your score.

Sheldor‎

Probably true it's something to save for later, but there's good reason to put extra emphasis on those early ticks/rounds. There are a lot of mid-range bots that have a chance at taking a round from (e.g.) Diamond early on, even if Diamond will reliably crush them every round after round 5. That 1 round is big in percent score. Every shot counts! :-)

Voidious‎

Between versions 1.7.2 and 1.7.4, Diamond gained .15% APS, 4.2 Glicko, 1 PL, and .06% Survival.

Sheldor‎

<Squinting Fry> Can't tell if agreeing or disagreeing with me... :-)

Totally agree that improving the accuracy of the first few shots is only a tiny (but measurable) improvement to Diamond's already state-of-the-art gun. That it's measurable only lends credence to the notion that Wolfman is right to put some emphasis on performance in the first few rounds (not necessarily ticks) in his gun, which is still early in development.

Voidious‎

You added a very powerful pre-loaded gun to an extremely powerful learning gun, and you got an improvement of .15% APS. That's significant when you're grappling for the throne, but I don't really think that it's worth Wolfman's time.

If I were you, Wolfman, I would focus on making a gun that works extremely well in the last 34 rounds, before worrying about squeezing every last point out of the first.

Sheldor‎

Dynamic Clustering - How many matches do you look for?

From what I understand of dynamic clustering, and the way I am currently looking at implementing mine, you store a history of all stats and the angles at which you would have hit the target. Then when choosing your targeting angle you select the top N closest matches to the target's current state, and then select the angle to fire from those top N. My question is, does anyone have a good ballpark figure for the value of N?

If N is too small you might not have enough data about the target to get accuracy. If N is too large you might end up including too much information, polluting your pool with bad matches.

Or do you not take N all the time, but instead only take matches that satisfy some criterion on how good the match is, i.e. only matches that are within 5% of the target's current state?

Anyone have opinions on this?

P.S. If this is the wrong place to discuss, tell me and I will move it to the correct place! :)

Wolfman‎

It's worth noting that only taking the matches to within 5% might not produce enough matches, and will then have the same problem as N being too small. So you could combine the two: select matches to within 5%, and if there are not enough, top up with the best of the rest until you have N. If you have more than N matches within 5%, take all of them. Thoughts?

Of course then we would need to start discussing both N and match accuracy % values! ;)
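
The combined rule described above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, distances are assumed to be already normalised so that a threshold such as 0.05 stands in for "within 5%", and a real gun would query a KD-tree rather than sort a plain array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not taken from any bot in this thread.
final class MatchSelector {
    /**
     * distances[i] is the (assumed normalised) distance between stored scan i
     * and the target's current state. Returns the indices of the chosen scans,
     * closest first: every scan within the threshold, topped up with the next
     * closest scans until at least n are selected.
     */
    static List<Integer> select(double[] distances, double threshold, int n) {
        Integer[] order = new Integer[distances.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort scan indices by ascending distance.
        Arrays.sort(order, (a, b) -> Double.compare(distances[a], distances[b]));

        List<Integer> picked = new ArrayList<>();
        for (int idx : order) {
            boolean closeEnough = distances[idx] <= threshold;
            if (closeEnough || picked.size() < n) {
                picked.add(idx);
            } else {
                break; // sorted order: everything after this is farther away
            }
        }
        return picked;
    }
}
```

With distances {0.01, 0.5, 0.03, 0.2}, a threshold of 0.05 and N = 3, two scans pass the threshold and the third closest is added to reach N.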

Wolfman‎

I take the top sqrt(tree.size()) scans, limited at 100. I think it's a pretty good balance between 'generality' and 'specificity'.

Skilgannon‎

I just take the size divided by some number and limit it to an upper bound. In my new gun, the divisor is 14 and the maximum is 80.

Chase‎

Right now Diamond's main gun uses max(1, min(225, numDataPoints / 9)). So it scales linearly from 1 at start of round 1 to 225 data points at about 2000 ticks (~3rd round). I've many times evolved these settings from scratch with genetic algorithms and gotten max-k values from 150 to 350 and divisor from 9 to 14 without much change in performance.
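
For comparison, the three cluster-size rules quoted in this thread can be written side by side. This is a sketch only: the class and method names are made up for illustration, and the handling of very small data sets in the first two rules (which can return 0 here) is an assumption, since the posts do not say whether a lower bound is applied.

```java
// Hypothetical names; only the formulas come from the posts above.
final class ClusterSize {
    // Square root of the number of stored scans, capped at 100.
    static int sqrtRule(int treeSize) {
        return (int) Math.min(100, Math.sqrt(treeSize));
    }

    // Size divided by a constant (14), capped at 80.
    static int divisorRule(int treeSize) {
        return Math.min(80, treeSize / 14);
    }

    // max(1, min(225, numDataPoints / 9)), as quoted for Diamond's main gun.
    static int diamondRule(int numDataPoints) {
        return Math.max(1, Math.min(225, numDataPoints / 9));
    }
}
```

The last rule reaches its cap at 225 * 9 = 2025 data points, which matches the "about 2000 ticks" figure above.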

It's important to note that I (and I think most of us) also weight the data points by distance when doing kernel density to decide on an actual firing angle, which is why the actual choice of k (er, N in your post) doesn't matter so much.
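
A rough sketch of distance-weighted kernel density over the selected neighbours is shown below. Everything specific in it is an assumption made for illustration: the Gaussian kernel, the 1 / (1 + distance) weighting, the bandwidth parameter, and the class name are not taken from Diamond or any other bot.

```java
// Hypothetical sketch; kernel shape and weight function are assumptions.
final class WeightedKernelDensity {
    /**
     * guessFactors[i]: guess factor at which neighbour i would have been hit.
     * distances[i]:    distance from neighbour i to the current situation.
     * Returns the neighbour guess factor with the highest weighted density,
     * or 0.0 (head-on) when there is no data.
     */
    static double bestGuessFactor(double[] guessFactors, double[] distances, double bandwidth) {
        double best = 0.0;
        double bestDensity = Double.NEGATIVE_INFINITY;
        for (double candidate : guessFactors) {
            double density = 0.0;
            for (int i = 0; i < guessFactors.length; i++) {
                double u = (candidate - guessFactors[i]) / bandwidth;
                // Assumed weighting: closer neighbours count for more.
                double weight = 1.0 / (1.0 + distances[i]);
                density += weight * Math.exp(-0.5 * u * u);
            }
            if (density > bestDensity) {
                bestDensity = density;
                best = candidate;
            }
        }
        return best;
    }
}
```

Because far-away neighbours contribute little to any candidate's density, adding a few extra poor matches changes the chosen angle only slightly, which is one way to read the remark that the exact k matters less once points are weighted.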

Btw if you're spending a lot of time in the gun lab, you might like WaveSim.

Voidious‎

Combat uses a constant K, and weights each data point relative to the distance of the farthest of the K closest data points. It's a kind of variable kernel density.

weight = 1 - distance/(max(allDistances)+1)
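
As a small worked sketch of that formula (the class and method names are hypothetical, and distances are assumed to be non-negative):

```java
// Hypothetical helper implementing weight = 1 - distance / (max(allDistances) + 1).
final class RelativeWeights {
    /** distances: distances of the K closest data points to the current situation. */
    static double[] weights(double[] distances) {
        double max = 0.0;
        for (double d : distances) {
            max = Math.max(max, d);
        }
        double[] w = new double[distances.length];
        for (int i = 0; i < distances.length; i++) {
            // The +1 keeps the farthest point's weight above zero
            // and avoids dividing by zero when all distances are 0.
            w[i] = 1.0 - distances[i] / (max + 1.0);
        }
        return w;
    }
}
```

For distances 0, 1 and 3 the weights are 1.0, 0.75 and 0.25.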

K is currently set at 19 for gun (it is this low because it uses only real waves), 17 for wave surfing and 18 for flattener.

Never done any serious tuning except trying a few adjacent K values in RoboResearch and picking the one with highest APS. Some kind of manual hill climbing.

Using only real waves and not doing any fine tuning is the main reason Combat performs badly in APS league, but does ok in PL league.

MN‎

Thanks for all the replies. I might implement weighting of points based on distance, something I hadn't considered before.

Voidious: I took a look at WaveSim. It looks cool, but perhaps I'm misunderstanding the point: if it is just playing back recorded battles, how can you ever improve your gun against bots that take into account how often and where you hit them and react accordingly?

Wolfman‎

Well... My first answer is you can't, and for tuning against surfers you still need real battles. Most of the rumble is still not very adaptive, so it's good to have a gun that crushes all those bots, so I'm just tuning my "Main Gun" with WaveSim.

That said, not all weaknesses that guns prey on are things that even surfers adapt to very well. Surfers are reluctant to get too close, have preferred distances, and other tendencies even if they try to flatten their profiles. Skilgannon has tuned against pre-gathered data for surfers and supposedly had success with it, though I'm not sure there's enough data to say it really worked that way or if just tuning his gun to weird new settings is what helped.

Voidious‎

The huge increase in simulation speed somewhat compensates for not simulating adaptable movements.

And you can try recording a battle, tuning over the static data, then record another battle under new parameters. Iterating many times can make tuning converge to an optimum against adaptable movements, but it is only a theory (never tried).

MN‎

There is also the problem of noise when tuning against real battles.

MN‎

I tried using WaveSim but am having some issues. When we try to classify tick N, we have only been fed ticks up to around 50 before tick N, so I cannot get anything from the classify tick data about the target bot's state at ticks N-1 to N-50. This means I cannot do classification using attributes like "distance moved in the last 10 ticks", for instance. Is there any way to do this?

Wolfman‎

I had that same problem. The suggestion I got was to modify the robot with whatever data you want to record and rerun the battles, and then use that in your classifier.

But I ended up just using the Tick Classifier (or whatever it's called).

Chase‎

Yeah, WaveSim is really at its best if the data set has all the attributes you use for targeting. Then you can just use the wave classifier and everything's very clean. The TickClassifier gets fed every tick as it happens, so you can use that to supplement the data from the wave classifiers. (Your classifier can implement both.)
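
As an illustration of keeping per-tick state for an attribute such as "distance moved in the last 10 ticks", here is a minimal rolling-history sketch. It deliberately does not use WaveSim's actual interfaces: the class, its methods, and the way positions are passed in are all hypothetical, and "distance moved" is interpreted as straight-line displacement.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper; feed it from whatever per-tick callback is available.
final class MovementHistory {
    private final int window;
    private final Deque<double[]> positions = new ArrayDeque<>();

    MovementHistory(int window) {
        this.window = window;
    }

    /** Call once per tick with the enemy's position. */
    void onTick(double x, double y) {
        positions.addLast(new double[] {x, y});
        // Keep window + 1 positions so the oldest is exactly `window` ticks old.
        if (positions.size() > window + 1) {
            positions.removeFirst();
        }
    }

    /** Displacement between the newest position and the oldest one retained. */
    double displacement() {
        if (positions.isEmpty()) {
            return 0.0;
        }
        double[] oldest = positions.peekFirst();
        double[] newest = positions.peekLast();
        return Math.hypot(newest[0] - oldest[0], newest[1] - oldest[1]);
    }
}
```

A bot moving at 8 units per tick in a straight line would report a displacement of 80 with a window of 10 once enough ticks have been seen.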

It shouldn't be too hard to modify TripHammer to collect different attributes, or to modify your own bot to collect the data. I have a newer/better TripHammer I never got around to releasing, rebased off Diamond after a bit of a refactor/rewrite: [1] ... I'd work off that one if you go that route. voidious/gun/TripHammerGunDataManager has all the data writing stuff pretty cleanly separated out.

Voidious‎

Wouldn't it be good to store the ScannedRobotEvents that TripHammer receives, and pipe them into a scanned function in WaveSim, alongside the wave data feeds? After all, that's all the data any robot has to go on, so you would then have everything you need. Scanned data, waves, and classify calls would arrive in exactly the same order as they did for TripHammer, allowing any bot to use WaveSim no matter its configuration.

Wolfman‎

Hmm, I think that's a pretty awesome idea in terms of usability, which is probably the main place WaveSim is lacking. I'll definitely look into that if/when I next work on WaveSim.

But another way WaveSim gains speed over real battles is that if you record all the attribute data ahead of time, you don't have to do any complex math (trig, square roots) in your WaveSim test runs to deduce that stuff. So you'd lose that part.