3.1.3DC vs 3.1.3
For some reason I've never managed to get the DC to perform as well as the VCS, so it still uses VCS. I remember Jdev commenting that a range search worked better for him than a KNN search in movement, so I'll be trying that next.
Have you tried doing something similar to your many randomized attribute buffers with kD-Trees? You could make 100 trees, each with a random subset of the predictors, then combine the results. You could even start weighting some trees' results higher if they perform better.
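To make the suggestion concrete, here is a minimal sketch of the idea, not anyone's actual implementation. It assumes a brute-force nearest-neighbour search in place of real kD-Trees, guess factors in [-1, 1] binned into a histogram, and a simple per-model weight that drifts toward how often the model voted for the correct bin. The names `SubspaceEnsemble`, `gf_to_bin`, `danger` and `reward` are all made up for illustration.

```python
import random

def gf_to_bin(gf, bins):
    """Map a guess factor in [-1, 1] to a histogram bin index."""
    index = int((gf + 1.0) / 2.0 * (bins - 1) + 0.5)
    return max(0, min(bins - 1, index))

def knn(points, query, dims, k):
    """Brute-force k nearest neighbours, comparing only the attributes in dims."""
    def dist(point):
        attrs = point[0]
        return sum((attrs[d] - query[d]) ** 2 for d in dims)
    return sorted(points, key=dist)[:k]

class SubspaceEnsemble:
    """Many small KNN models, each seeing a random subset of the attributes."""

    def __init__(self, n_attrs, n_models=100, subset_size=3, k=5, seed=0):
        rng = random.Random(seed)
        self.subsets = [tuple(rng.sample(range(n_attrs), subset_size))
                        for _ in range(n_models)]
        self.weights = [1.0] * n_models   # every model starts equal
        self.k = k
        self.points = []                  # list of (attributes, guess_factor)

    def add(self, attrs, gf):
        self.points.append((tuple(attrs), gf))

    def model_votes(self, query, dims, bins):
        votes = [0.0] * bins
        for _, gf in knn(self.points, query, dims, self.k):
            votes[gf_to_bin(gf, bins)] += 1.0
        return votes

    def danger(self, query, bins=31):
        """Weighted sum of every model's histogram."""
        total = [0.0] * bins
        for dims, weight in zip(self.subsets, self.weights):
            for i, v in enumerate(self.model_votes(query, dims, bins)):
                total[i] += weight * v
        return total

    def reward(self, query, actual_gf, bins=31, rate=0.1):
        """Move each weight toward the share of its votes that hit the real bin.

        Call this before adding the new observation, so a model is not
        scored on a point it has already stored.
        """
        hit = gf_to_bin(actual_gf, bins)
        for i, dims in enumerate(self.subsets):
            votes = self.model_votes(query, dims, bins)
            share = votes[hit] / max(1.0, sum(votes))
            self.weights[i] = (1.0 - rate) * self.weights[i] + rate * share
```

In a real bot each subset would get its own kD-Tree so the search stays fast; the brute-force scan here only keeps the sketch short.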
In my mind, a tree is more heavyweight than a buffer. You need multiple buffers to begin to approximate the smoothing you get from KNN and kernel density. I use multiple trees in my surfing for the same reasons, but more on the order of 10 than 100.
I was running some tests over the holidays, and it seems I've improved my DC to the point where against weaker bots it is as good as VCS. However, against top bots VCS movement is still much stronger, ~15% difference on the MC2K7.
Adjusting rolling speed and changing the shape functions of some attributes.
Rolling speed is part of his moving average algorithm, basically how much new values supersede older ones based on time.
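For readers who haven't seen one, a rolling average of this kind is often written like the sketch below. This is the generic form commonly used for stat buffers, assumed here for illustration; it is not taken from DrussGT's source, and `roll` and `depth` are made-up names. A small `depth` means new values replace old ones quickly, a large one means slow rolling.

```python
def roll(old_value, new_value, samples_seen, depth):
    """Blend a new sample into a running value.

    Until `depth` samples have been seen this is a plain average;
    after that, each new sample carries a fixed weight of 1 / (depth + 1).
    """
    n = min(samples_seen, depth)
    return (old_value * n + new_value) / (n + 1)

# Example: a hit (1.0) lands in a bin that has been 0.0 for a long time.
fast = roll(0.0, 1.0, samples_seen=100, depth=1)    # reacts strongly
slow = roll(0.0, 1.0, samples_seen=100, depth=10)   # reacts weakly
```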
I assume by shape functions he means the shape of the kernel function he uses.
Close. The shape functions are the nonlinear scalings I do on attributes before adding them to the tree.
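As a rough illustration of what "nonlinear scaling before adding to the tree" can look like, here is a sketch. The particular curves (square root for distance, a saturating ratio for a timer) and the constants are invented for the example and are not DrussGT's actual functions; only the maximum robot speed of 8 comes from Robocode itself.

```python
import math

def shape_distance(distance, max_distance=1000.0):
    """Square root scaling: spends more of the [0, 1] range on short distances."""
    return math.sqrt(min(distance, max_distance) / max_distance)

def shape_time_since(ticks, scale=20.0):
    """Saturating scaling: differences between large timer values matter less."""
    return ticks / (ticks + scale)

def to_tree_point(distance, lateral_velocity, ticks_since_decel):
    """Build the point that would be stored in (or used to query) the tree."""
    return (
        shape_distance(distance),
        abs(lateral_velocity) / 8.0,      # linear; 8 is Robocode's max speed
        shape_time_since(ticks_since_decel),
    )
```

Changing a shape function changes which situations count as "near" each other along that attribute, which is a different knob from the attribute's overall weight.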
I compared DrussGT 2.8.16 to DrussGT 3.2.1, and it was way better: http://www.2shared.com/photo/bKABJFb4/2816_vs_321.html
Could someone explain why averaging the results from many random trees is stronger than using a single well-tuned tree?
I would suspect it might make your nearest-neighbours come from multiple perspectives, giving you areas of concavity in your nearest-neighbour function instead of just a pure convex search area. I also suspect using some fancy pre-processing on tree attributes (perhaps dimension reduction/PCA) before adding could give equivalent search patterns.
I'd answer this in 3 parts.
- There are some high level movement classes that are worth segmenting. Against simple targeters, time since velocity change is just noise. Against most bots, a flattener would be noise. But for a bot where a flattener helps, those lower levels of stats don't hurt. I think they even add "harmless noise" - they are still bullet dodging, so they won't make horrible decisions. So I have a few tiers (simple, normal / decaying, light flattener, flattener) in my movement stats, enabled at different enemy hit percentages.
- I found VCS to be easier to tune than DC. Similarly, I think layering a few trees is easier than trying to add features to your KNN system to create the exact "shapes" (or however you imagine it) that you want. "5 of last 150 + 5 of last 500 + 5 of last 1500" is easy to understand. Adjusting the weights and distancing to produce the same results from one KNN call seems hard.
- I can't prove that it is.
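The "5 of last 150 + 5 of last 500 + 5 of last 1500" layering from the second point can be sketched roughly as follows. This is only an illustration under simple assumptions: the history is a plain list ordered oldest first, the search is brute force rather than a tree, and `layered_neighbours` is a made-up name.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two attribute tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def layered_neighbours(history, query, windows=(150, 500, 1500), k=5):
    """Take the k nearest neighbours from each of several recency windows.

    history is a list of (attributes, guess_factor), oldest first.
    A recent point that is also close can be picked by several windows,
    so it ends up counted more than once, which is the intended effect.
    """
    picked = []
    for window in windows:
        recent = history[-window:]
        recent = sorted(recent, key=lambda p: sq_dist(p[0], query))
        picked.extend(recent[:k])
    return picked
```

Getting the same behaviour out of a single KNN call would mean folding recency into the distance function and tuning its weight against every other attribute, which is the part that seems hard.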
I believe that the trees tend to cancel each other's errors and overfitting. That is why Random Forest works. In addition, like Voidious said, simpler trees may be better vs simple enemies.
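A toy illustration of the error-cancelling argument (not a Random Forest, and not a proof about Robocode): assume each model's estimate of some true value is off by independent random noise. The squared error of the averaged estimate is then never worse than the average squared error of the individual models, and is usually much smaller. The function name and all the numbers are arbitrary.

```python
import random

def ensemble_vs_single(true_value=0.3, n_models=25, noise=0.4, seed=1):
    """Return (squared error of the averaged estimate,
               mean squared error of the individual estimates)."""
    rng = random.Random(seed)
    # Assumption: every model's error is independent Gaussian noise.
    estimates = [true_value + rng.gauss(0.0, noise) for _ in range(n_models)]
    average = sum(estimates) / n_models
    ensemble_error = (average - true_value) ** 2
    mean_single_error = sum((e - true_value) ** 2 for e in estimates) / n_models
    return ensemble_error, mean_single_error
```

The catch is the independence assumption: if all the trees share the same bias, averaging them keeps that bias, so the benefit depends on the trees being genuinely different from each other.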
But then each tree should be specifically tuned against a specific kind of gun. Then each tree outputs a spike at a different GF, which shouldn't be a problem since you can dodge many GFs at once.
But generating dimensions at random to mimic DrussGT's 100 buffers is another matter entirely. A combination of dimensions which doesn't relate to any gun is supposed to hurt classification. Although I can't prove it either.
I'd tend to expect that when the "correct" parameters of the model (i.e. weightings of dimensions) have more uncertainty than there is in the resulting prediction of any one model, the consensus among a diverse set of models is less likely to be completely wrong than any one model. Or to put it another way, perhaps there is no single well-tuned tree that fits all opponents of a large-ish category (i.e. "specific kind of gun") well enough to outperform a consensus of different models, and while there may exist well-tuned trees for smaller categories of opponents, the battles might not be long enough to reliably detect which would be the best category. That's all just conjecture of course.