why virtual waves help
So, I am wondering why virtual waves help. I think the reason is that many robots make a movement decision every turn, rather than every shot. Excluding that as a reason, as far as I can see, the reasoning behind them would be: "My enemy needs to move at least when I shoot. The GFs he ends up at while moving are related to the final GF. So training based on the intermediate GFs will approximate the final GF." My problems with these reasons are a) I don't like making my gun better for what happens to be common (making a decision about movement every turn) rather than about what should be common. b) I don't want to approximate the final GF, I want to use it exactly.
Am I missing anything about virtual waves that would explain why they are a good idea?
My current plan is to remove them, and then, since my data set will be about 15 times smaller, I could add 5 or so new dimensions like "average turn rate in the past 10 turns", etc. that would hopefully make up for the lost data.
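For illustration, an attribute such as "average turn rate in the past 10 turns" could be computed roughly as below. This is only a sketch: the class name `RollingTurnRate`, the use of absolute heading change in radians, and the choice of Python rather than the bot's own language are all assumptions, not anything taken from an existing gun.

```python
import math
from collections import deque


class RollingTurnRate:
    """Mean absolute heading change over the last `window` turns (hypothetical helper)."""

    def __init__(self, window=10):
        self.changes = deque(maxlen=window)  # oldest entries fall off automatically
        self.last_heading = None

    def update(self, heading_radians):
        """Feed the enemy heading observed this turn."""
        if self.last_heading is not None:
            # Wrap the difference into [-pi, pi) so crossing 0 / 2*pi is not a huge turn.
            delta = (heading_radians - self.last_heading + math.pi) % (2 * math.pi) - math.pi
            self.changes.append(abs(delta))
        self.last_heading = heading_radians

    def average(self):
        """Average turn rate in radians per turn, or 0.0 before any data exists."""
        return sum(self.changes) / len(self.changes) if self.changes else 0.0
```

Calling `update` once per scan and reading `average()` when a wave is created would give one extra dimension per wave; whether signed or absolute turn rate works better is something to test.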
Even for surfers, lots of decisions are the same no matter what you're doing in terms of bullet dodging, things like dive protection and adjusting distancing. I think it's a good idea to experiment, but I'm willing to bet you'll find the 15x bigger data set is worth it. :-) And couldn't you afford a few extra attributes regardless? I know my gun is plenty fast for however many attributes I want, if I found any new ones to be worthwhile, which I haven't.
I have my main gun, which uses virtual waves, and my anti-surfer gun, which doesn't. The problem with not using virtual waves is that you don't start learning their movement patterns until you spend energy, so at the beginning it will not be very accurate. I've considered using virtual waves until my actual waves have enough data, then dropping the virtual waves, but I'm not sure it would help that much.
I'd phrase it as being a tradeoff between "small data set with no bias" versus "big data set with bias". Unless the enemy robot is intentionally creating the bias, the big data set is pretty much always worth it. Even when the enemy robot is intentionally creating a bias, it's not always a big enough bias to outweigh the benefit of the larger data set.
I don't think the "small data set with no bias" vs "large data set with bias" is quite correct though. If it were 15 times more unique situations with some bias thrown in I would think you would be right, but these are all related to the unbiased situations, so we have more of a "data set with more details and bias" rather than just a larger data set. Of course, I do lose a significant amount on the TCRM by removing them.
Eh, the gaps in time between bullet waves are kind of huge really. "15x larger data set" would be an overstatement because strongly correlated data points are partly redundant, but I'd still characterise it as a larger data set by some magnitude, given how large the gaps between bullet waves are.
Mind you the firing waves are not totally independent, either, so maybe you should only collect a wave every 50 ticks? =) Just kidding... But I do agree with penalizing non-firing waves, and penalizing them more against surfers. Another good reason to do so against non-surfers is that in a KNN gun, you are only selecting so many data points each time you aim, so favoring firing waves will give you more unique situations. But I've never found improved performance by ignoring virtual waves completely. And hitting top surfers is kind of a black art in any case, and they're the only targets where my Anti-Surfer gun outperforms my Main Gun consistently.
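One way to penalize non-firing waves in a KNN gun is to inflate their distance before picking the nearest neighbors. The sketch below assumes a squared-Euclidean distance, a multiplicative penalty, and a hypothetical wave layout (`attrs`, `gf`, `firing`); it is not the method used by any particular gun, and the penalty value of 4.0 is arbitrary.

```python
import heapq


def nearest_waves(waves, query, k, virtual_penalty=4.0):
    """Return the k stored waves closest to `query`, penalizing virtual waves.

    Each wave is a dict with "attrs" (tuple of floats), "gf" (guess factor)
    and "firing" (True for a real bullet wave). All names are hypothetical.
    """
    def penalized_distance(wave):
        d = sum((a - b) ** 2 for a, b in zip(wave["attrs"], query))
        # Virtual waves are treated as farther away than they really are, so a
        # firing wave wins whenever the two are comparably close to the query.
        return d if wave["firing"] else d * virtual_penalty

    return heapq.nsmallest(k, waves, key=penalized_distance)
```

With `virtual_penalty=1.0` this reduces to a plain KNN lookup, so the same code can be used to compare both settings. A larger penalty against surfers would match the suggestion above.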
Btw, WaveSim is a good platform for testing this kind of stuff. On my new quad-core box (with 8 threads), I can run my Main Gun through 10k battles of gun data in about 3 minutes. I'd be happy to post some more of my data sets if anyone's interested. =)
As an aside... I almost wonder if there should be a wavesim-based challenge with a huge dataset, to complement the full-robot challenges... I think that could be interesting.
Well, the problem with WaveSim, at least for me, is that I have some unique attributes (based on PPMEA), so I would need to rewrite a lot of it and collect my own data. I tend to think that once the data is collected, it would be faster and almost as easy to run a classification algorithm in a C++ program.
The point of such a challenge would be to focus on the classification algorithms with a standardized dataset that different people can try to get the best results they can with it. Compared to the conventional challenges that currently exist the results would be much more repeatable and smaller differences could be more clearly discerned.
Also, IIRC WaveSim does store enough information to derive most if not all possible attributes (including PPMEA based ones), due to the way that I added PIF support to WaveSim.
Yeah, you should be able to do anything - I've written a Pattern Matcher with PIF, which is about as non-Wave as you can get. But to get the full speed benefits, and for actually improving your own gun, I'd definitely recommend collecting your own data with your own attributes and bullet power scheme. Adding attributes should be pretty easy, really, just adding them to the Wave class and then to the reads and writes - but then again I wrote the code. =)
I also have an updated TripHammer that I haven't posted yet, after the big Diamond refactor. The code is a lot cleaner and integrates better with Diamond's code, just modifying the gun data manager to write the data files. It also interpolates waves for skipped turns (as Diamond does now), which is kind of cool.
A WaveSim based challenge could be fun, kind of like the Netflix challenge. Not really sure how many would be interested, though, and we'd have to come up with some kind of rule set to prevent over-fitting. Maybe a training data set, then you submit your code to an App Engine instance to run against the real data set, which is how Netflix did it I think.
This makes me think... I wonder if the data collection and the conversion to attributes should be made into separate steps. The data collection bot could collect the raw data, and a separate program could convert it to the attributes one's gun uses. That way you get the full speed benefits without having to modify the data collection bot or collect new data.
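A rough sketch of that split, under assumed names: `RawWave` stands in for whatever the collection bot would log, and `ATTRIBUTES` is a per-gun table of functions applied afterwards. None of this reflects WaveSim's actual file format or classes; the only fact used is Robocode's bullet speed rule of 20 - 3 * power.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawWave:
    """Raw values logged by the collection bot (hypothetical field set)."""
    distance: float
    enemy_velocity: float
    bullet_power: float
    fired: bool
    guess_factor: float


def bullet_flight_time(wave):
    # Robocode bullets travel at 20 - 3 * power units per tick.
    return wave.distance / (20.0 - 3.0 * wave.bullet_power)


# Each gun supplies its own attribute table; the raw data never has to change.
ATTRIBUTES = {
    "flight_time": bullet_flight_time,
    "abs_velocity": lambda wave: abs(wave.enemy_velocity),
}


def to_training_rows(raw_waves, attributes):
    """Convert raw waves into (attribute vector, guess factor) pairs."""
    return [([f(w) for f in attributes.values()], w.guess_factor) for w in raw_waves]
```

Adding a new attribute then means adding one entry to the table and re-running the conversion, with no new battles needed, provided the raw record already contains what the attribute depends on.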
Against non-dodgers, your gun learns 15x faster. It's a good crush-the-weak strategy.
Against dodgers, all the extra data makes your gun more unpredictable. The dodger needs to keep track of 15x more data as well to turn your gun's deterministic behavior against you and push your hit rate below that of random targeting.